Best Practices for Prompt Engineering in Azure OpenAI Applications

Summary: This post explores the art and science of prompt engineering for Azure OpenAI applications, covering techniques to craft effective prompts, optimize token usage, and implement robust error handling for production-ready AI features in .NET applications.

Introduction

Since Azure OpenAI Service became generally available in January 2023, developers have been integrating powerful AI capabilities into their .NET applications. However, many quickly discover that the quality of results depends heavily on how you communicate with these models. This communication happens through “prompts” – the instructions you provide to guide the AI’s responses.

Prompt engineering is the practice of designing, refining, and optimizing these instructions to get the best possible results from AI models. It’s a crucial skill for developers working with large language models (LLMs) like GPT-3.5 and GPT-4 through Azure OpenAI Service.

In this post, we’ll explore best practices for prompt engineering specifically for .NET developers working with Azure OpenAI Service. We’ll cover techniques to craft effective prompts, optimize token usage, and implement robust error handling for production-ready AI features.

Understanding Prompt Engineering

What Makes a Good Prompt?

Effective prompts share several characteristics:

  1. Clarity: Clear, specific instructions that leave little room for misinterpretation
  2. Context: Sufficient background information for the model to understand the task
  3. Structure: Organized in a way that guides the model toward the desired output format
  4. Examples: When needed, demonstrations of expected inputs and outputs
  5. Constraints: Defined boundaries and limitations for the response

Let’s look at a simple example comparing a poor prompt with an improved one:

Poor Prompt:

Write about C#.

Improved Prompt:

Write a concise overview of C# 11's key features for an experienced .NET developer. 
Focus on the most impactful changes that improve code readability and performance. 
Format your response with feature names as subheadings followed by 2-3 sentences of explanation and a brief code example for each feature.

The improved prompt provides clear instructions about the content, audience, focus, and format, leading to a much more useful response.

Prompt Engineering Techniques for .NET Developers

Let’s explore specific techniques that are particularly useful for .NET developers working with Azure OpenAI Service.

1. Role-Based Prompting

Assigning a specific role to the AI can significantly improve the quality and relevance of responses:

csharp

using Azure;
using Azure.AI.OpenAI;
using System;
using System.Threading.Tasks;

// Note: these samples target the Azure.AI.OpenAI 1.0.0-beta SDK; newer
// versions rename ChatMessage to ChatRequestSystemMessage,
// ChatRequestUserMessage, and ChatRequestAssistantMessage.
public async Task<string> GetCodeReviewAsync(string code)
{
    // In production, load the endpoint and key from configuration
    // (or use Azure.Identity) instead of hardcoding them.
    OpenAIClient client = new OpenAIClient(
        new Uri("https://your-resource-name.openai.azure.com/"),
        new AzureKeyCredential("your-api-key"));

    // Role-based prompt for code review
    ChatCompletionsOptions options = new ChatCompletionsOptions
    {
        DeploymentName = "gpt-4", // Or your deployment name
        Messages =
        {
            new ChatMessage(ChatRole.System, "You are an expert C# code reviewer with 15+ years of experience. You specialize in identifying performance issues, security vulnerabilities, and maintainability concerns. Provide constructive feedback with specific recommendations for improvement. Format your review with clear sections for 'Issues', 'Recommendations', and 'Positive Aspects'."),
            new ChatMessage(ChatRole.User, $"Please review this C# code:\n\n```csharp\n{code}\n```")
        },
        Temperature = 0.3f,
        MaxTokens = 1000
    };

    Response<ChatCompletions> response = await client.GetChatCompletionsAsync(options);
    return response.Value.Choices[0].Message.Content;
}

By defining the AI as an “expert C# code reviewer,” we guide it to adopt a specific perspective and expertise, resulting in more relevant and technically accurate feedback.

2. Few-Shot Learning with Examples

For complex or specific tasks, providing examples can dramatically improve results:

csharp

public async Task<string> TranslateErrorMessagesToUserFriendlyAsync(string technicalError)
{
    OpenAIClient client = new OpenAIClient(
        new Uri("https://your-resource-name.openai.azure.com/"),
        new AzureKeyCredential("your-api-key"));

    // Few-shot learning with examples
    ChatCompletionsOptions options = new ChatCompletionsOptions
    {
        DeploymentName = "gpt-35-turbo", // Or your deployment name
        Messages =
        {
            new ChatMessage(ChatRole.System, "You translate technical error messages into user-friendly explanations that non-technical users can understand and act upon."),
            new ChatMessage(ChatRole.User, "System.IO.FileNotFoundException: Could not find file 'C:\\Projects\\App\\config.json'."),
            new ChatMessage(ChatRole.Assistant, "The application couldn't find an important configuration file (config.json) that it needs to run properly. This might happen if the file was accidentally deleted or moved. You could try reinstalling the application to restore the missing file."),
            new ChatMessage(ChatRole.User, "System.Net.WebException: The remote server returned an error: (401) Unauthorized."),
            new ChatMessage(ChatRole.Assistant, "The application couldn't access an online service because your login credentials weren't accepted. This might happen if your password has expired or been changed. Try signing out and signing back in with your username and password."),
            new ChatMessage(ChatRole.User, technicalError)
        },
        Temperature = 0.5f,
        MaxTokens = 300
    };

    Response<ChatCompletions> response = await client.GetChatCompletionsAsync(options);
    return response.Value.Choices[0].Message.Content;
}

By providing examples of technical errors and their user-friendly translations, we help the model understand the expected style, tone, and level of detail for its responses.

3. Chain-of-Thought Prompting

For complex reasoning tasks, guiding the model through a step-by-step thinking process can improve accuracy:

csharp

public async Task<string> AnalyzePerformanceBottleneckAsync(string performanceIssueDescription)
{
    OpenAIClient client = new OpenAIClient(
        new Uri("https://your-resource-name.openai.azure.com/"),
        new AzureKeyCredential("your-api-key"));

    // Chain-of-thought prompting
    ChatCompletionsOptions options = new ChatCompletionsOptions
    {
        DeploymentName = "gpt-4", // Or your deployment name
        Messages =
        {
            new ChatMessage(ChatRole.System, @"You are a .NET performance optimization expert. When analyzing performance issues:
1. First, identify the potential bottlenecks based on the symptoms described
2. For each potential bottleneck, evaluate the likelihood and impact
3. For the most likely bottlenecks, suggest specific diagnostic approaches
4. Finally, recommend potential solutions with code examples where appropriate
Explain your reasoning at each step."),
            new ChatMessage(ChatRole.User, performanceIssueDescription)
        },
        Temperature = 0.4f,
        MaxTokens = 1500
    };

    Response<ChatCompletions> response = await client.GetChatCompletionsAsync(options);
    return response.Value.Choices[0].Message.Content;
}

By explicitly instructing the model to follow a specific reasoning process, we encourage more thorough analysis and better-structured responses.

4. Output Structuring

When you need responses in a specific format, especially for programmatic processing, be explicit about the structure:

csharp

public async Task<string> GenerateApiEndpointSpecificationAsync(string endpointDescription)
{
    OpenAIClient client = new OpenAIClient(
        new Uri("https://your-resource-name.openai.azure.com/"),
        new AzureKeyCredential("your-api-key"));

    // Output structuring
    ChatCompletionsOptions options = new ChatCompletionsOptions
    {
        DeploymentName = "gpt-4", // Or your deployment name
        Messages =
        {
            new ChatMessage(ChatRole.System, @"You generate OpenAPI specifications for REST API endpoints. 
Return your response as a JSON object with the following structure:
{
  ""path"": ""string (the API endpoint path)"",
  ""method"": ""string (HTTP method: GET, POST, PUT, DELETE, etc.)"",
  ""summary"": ""string (brief description)"",
  ""parameters"": [
    {
      ""name"": ""string"",
      ""in"": ""string (query, path, header, or body)"",
      ""required"": boolean,
      ""type"": ""string (string, integer, boolean, etc.)"",
      ""description"": ""string""
    }
  ],
  ""responses"": {
    ""200"": {
      ""description"": ""string"",
      ""schema"": {}
    },
    ""400"": {
      ""description"": ""string""
    }
  }
}
Do not include any explanations or markdown formatting, just the JSON object."),
            new ChatMessage(ChatRole.User, endpointDescription)
        },
        Temperature = 0.2f,
        MaxTokens = 1000
    };

    Response<ChatCompletions> response = await client.GetChatCompletionsAsync(options);
    return response.Value.Choices[0].Message.Content;
}

By providing a detailed template and explicitly requesting JSON without additional explanations, we increase the likelihood of getting a properly structured response that can be directly parsed by our application.
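Even with these instructions, a model can occasionally wrap its JSON in markdown code fences or prepend stray text, so it's worth parsing the response defensively before handing it to the rest of your application. A minimal sketch (the `TryParseSpec` helper and its fence-stripping logic are illustrative, not part of the SDK):

```csharp
using System;
using System.Text.Json;

public static class SpecParser
{
    // Attempts to parse the model's response as JSON, stripping any
    // markdown code fences the model may have added despite instructions.
    // Returns null if the content still isn't valid JSON, so the caller
    // can retry the request or log the raw response.
    public static JsonDocument? TryParseSpec(string response)
    {
        string trimmed = response.Trim();

        if (trimmed.StartsWith("```"))
        {
            int firstNewline = trimmed.IndexOf('\n');
            int lastFence = trimmed.LastIndexOf("```", StringComparison.Ordinal);
            if (firstNewline >= 0 && lastFence > firstNewline)
                trimmed = trimmed.Substring(firstNewline + 1, lastFence - firstNewline - 1).Trim();
        }

        try
        {
            return JsonDocument.Parse(trimmed);
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

A null return is a useful signal for a retry loop: re-send the request (perhaps at a lower temperature) rather than letting a malformed response propagate into your application.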

Optimizing Token Usage

Azure OpenAI Service charges based on the number of tokens processed, so optimizing token usage is important for cost efficiency. Here are some strategies:

1. Token-Aware Prompt Design

Be concise and focused in your prompts, removing unnecessary information:

csharp

// Less efficient (more tokens)
string verbosePrompt = @"
I would like you to help me generate some C# code for a simple web API controller. 
The controller should handle CRUD operations for a 'Product' entity. 
A Product has the following properties: Id (int), Name (string), Description (string), 
Price (decimal), and Category (string). Please create a controller with all the necessary 
methods to handle these operations. Make sure to include proper documentation comments 
and follow best practices for RESTful API design.
";

// More efficient (fewer tokens)
string concisePrompt = @"
Generate a C# Web API controller for CRUD operations on Product entity:
- Properties: Id(int), Name(string), Description(string), Price(decimal), Category(string)
- Include standard REST endpoints
- Add XML documentation
- Follow RESTful best practices
";

2. Implement a Token Counting Utility

Create a utility to estimate token counts for better monitoring and optimization:

csharp

public class TokenCounter
{
    // Rough approximation: 1 token ≈ 4 characters for English text.
    // This is fine for monitoring and logging; for exact counts, use a
    // tokenizer library that implements the model's actual encoding.
    private const double CharactersPerToken = 4.0;

    public static int EstimateTokenCount(string text)
    {
        if (string.IsNullOrEmpty(text))
            return 0;
            
        return (int)Math.Ceiling(text.Length / CharactersPerToken);
    }
    
    public static int EstimateTokenCount(ChatCompletionsOptions options)
    {
        // Note: ignores the small per-message overhead the chat format adds.
        int totalTokens = 0;
        
        foreach (var message in options.Messages)
        {
            totalTokens += EstimateTokenCount(message.Content);
        }
        
        return totalTokens;
    }
    
    public static void LogTokenUsage(string prompt, string response)
    {
        int promptTokens = EstimateTokenCount(prompt);
        int responseTokens = EstimateTokenCount(response);
        int totalTokens = promptTokens + responseTokens;
        
        Console.WriteLine($"Token usage - Prompt: {promptTokens}, Response: {responseTokens}, Total: {totalTokens}");
    }
}
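As a quick sanity check, you can run the estimator against the two prompts from the previous section (this assumes the `verbosePrompt` and `concisePrompt` strings defined above are in scope):

```csharp
// Compare the estimated cost of the two prompt styles before sending either.
int verboseTokens = TokenCounter.EstimateTokenCount(verbosePrompt);
int conciseTokens = TokenCounter.EstimateTokenCount(concisePrompt);

Console.WriteLine($"Verbose: ~{verboseTokens} tokens, Concise: ~{conciseTokens} tokens");
```

Logging estimates like this alongside the actual usage reported by the service helps you spot prompts that are worth tightening up.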

3. Implement Response Streaming for Long Outputs

For longer responses, use streaming to process content incrementally. Streaming doesn't reduce token charges by itself, but it improves perceived latency and lets you cancel generation early once you have enough output:

csharp

public async Task StreamResponseAsync(string prompt, Action<string> onChunk)
{
    OpenAIClient client = new OpenAIClient(
        new Uri("https://your-resource-name.openai.azure.com/"),
        new AzureKeyCredential("your-api-key"));

    ChatCompletionsOptions options = new ChatCompletionsOptions
    {
        DeploymentName = "gpt-35-turbo", // Or your deployment name
        Messages =
        {
            new ChatMessage(ChatRole.User, prompt)
        },
        Temperature = 0.7f,
        MaxTokens = 2000
    };

    await foreach (StreamingChatCompletionsUpdate update in client.GetChatCompletionsStreaming(options))
    {
        if (update.ContentUpdate != null)
        {
            onChunk(update.ContentUpdate);
        }
    }
}

// Usage example
await StreamResponseAsync("Generate a detailed tutorial on Entity Framework Core migrations", 
    chunk => Console.Write(chunk)); // In a real app, you might append to a UI element