Summary: This post explores how to implement sophisticated agent-like behaviors using Microsoft Semantic Kernel. Learn how to create autonomous agents that can plan, reason, and execute complex tasks by combining skills, memory, and planning capabilities.
Introduction
Microsoft Semantic Kernel has evolved significantly since its introduction, moving beyond simple skill execution to enable more sophisticated agent-like behaviors. With the right architecture and implementation patterns, developers can create autonomous agents that can plan, reason, and execute complex tasks with minimal human intervention.
In this post, we’ll explore how to implement agent patterns using Microsoft Semantic Kernel. We’ll go beyond basic skills to create systems that can understand user goals, break them down into steps, execute those steps, and adapt to changing conditions. By the end of this article, you’ll have the knowledge to build sophisticated AI agents that can tackle complex tasks in your .NET applications.
Understanding Agent Architecture
Before diving into implementation, let’s understand what makes an AI agent different from a simple skill-based system.
Key Components of an Agent
A sophisticated AI agent typically consists of these components:
- Goal Understanding: The ability to interpret user requests and understand the underlying goals
- Planning: Breaking down goals into actionable steps
- Memory: Storing and retrieving relevant information
- Execution: Carrying out the planned steps
- Reflection: Evaluating results and adapting plans
- Tool Use: Leveraging various tools and APIs to accomplish tasks
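The components above can be read as stages of one loop. The following is a minimal, framework-free sketch of that loop; `AgentPipeline` and `AgentRun` are hypothetical names used only for illustration, and the delegates stand in for the LLM calls and tools introduced later in the post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; none of these come from Semantic Kernel.
public sealed record AgentRun(
    string Goal,
    IReadOnlyList<string> Steps,
    IReadOnlyList<string> Outputs,
    string Reflection);

public sealed class AgentPipeline
{
    private readonly Func<string, string> _understand;                 // goal understanding
    private readonly Func<string, IReadOnlyList<string>> _plan;        // planning
    private readonly Func<string, string> _execute;                    // execution / tool use
    private readonly Func<string, IReadOnlyList<string>, string> _reflect; // reflection
    private readonly List<string> _memory = new();                     // memory

    public AgentPipeline(
        Func<string, string> understand,
        Func<string, IReadOnlyList<string>> plan,
        Func<string, string> execute,
        Func<string, IReadOnlyList<string>, string> reflect)
    {
        _understand = understand;
        _plan = plan;
        _execute = execute;
        _reflect = reflect;
    }

    public IReadOnlyList<string> Memory => _memory;

    public AgentRun Run(string request)
    {
        var goal = _understand(request);                 // 1. interpret the request
        var steps = _plan(goal);                         // 2. break the goal into steps
        var outputs = steps.Select(_execute).ToList();   // 3. carry out each step
        var reflection = _reflect(goal, outputs);        // 4. evaluate the outcome
        _memory.Add(reflection);                         // 5. remember it for later runs
        return new AgentRun(goal, steps, outputs, reflection);
    }
}
```

Keeping each stage behind its own delegate mirrors how the rest of this post assigns each responsibility to a separate skill.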
Agent Architecture in Semantic Kernel
In Semantic Kernel, we can implement these components using:
- Semantic Functions (called prompt functions in Semantic Kernel 1.x): For goal understanding, planning, and reflection
- Native Functions: For execution and tool use
- Memory Store: For storing and retrieving information
- Planner: For breaking down goals into steps
- Kernel Arguments (`KernelArguments`, which replaced the older context variables): For passing state between steps
Setting Up Your Development Environment
Let’s start by setting up your development environment.
Prerequisites
To follow along with this tutorial, you’ll need:
- Visual Studio 2022 or Visual Studio Code
- .NET 8 SDK
- An Azure subscription with access to Azure OpenAI Service
- Basic familiarity with Microsoft Semantic Kernel
Creating a New Project
Let’s create a new console application:
bash
dotnet new console -n SemanticKernelAgent
cd SemanticKernelAgent
Installing Required Packages
Add the necessary packages to your project:
bash
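dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI
# The planner, memory, and Azure AI Search packages ship as prerelease builds
dotnet add package Microsoft.SemanticKernel.Planners.Handlebars --prerelease
dotnet add package Microsoft.SemanticKernel.Plugins.Memory --prerelease
dotnet add package Microsoft.SemanticKernel.Connectors.AzureAISearch --prerelease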
Basic Project Setup
Let’s set up the basic structure of our project:
csharp
using System;
using System.Collections.Generic;
using System.ComponentModel; // [Description] attributes on skill methods
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Planning.Handlebars;

class Program
{
    static async Task Main(string[] args)
    {
        // Configure Semantic Kernel
        var builder = Kernel.CreateBuilder();

        // Add the Azure OpenAI chat completion service
        builder.AddAzureOpenAIChatCompletion(
            deploymentName: Environment.GetEnvironmentVariable("OPENAI_DEPLOYMENT_NAME") ?? "gpt-4",
            endpoint: Environment.GetEnvironmentVariable("OPENAI_ENDPOINT") ?? "",
            apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY") ?? "");

        // Build the kernel
        var kernel = builder.Build();

        // Initialize agent
        var agent = new SemanticKernelAgent(kernel);

        // Run the agent
        await agent.RunAsync("Help me plan a trip to Seattle for 3 days");
    }
}
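The setup code falls back to an empty string when an environment variable is missing, which postpones the failure until the first service call. A small fail-fast helper can surface the problem at startup instead. This is a sketch only; `AgentConfig` and `RequireEnv` are hypothetical names, not part of Semantic Kernel.

```csharp
using System;

// Hypothetical helper: reads required settings and fails early with a clear message.
public static class AgentConfig
{
    public static string RequireEnv(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{name}' is not set. Set it before starting the agent.");
        }
        return value;
    }
}
```

With this in place, a call such as `endpoint: AgentConfig.RequireEnv("OPENAI_ENDPOINT")` would replace the `?? ""` fallback.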
Implementing the Agent Core
Now, let’s implement the core agent class:
csharp
public class SemanticKernelAgent
{
    private readonly Kernel _kernel;
    private readonly ISemanticTextMemory _memory;
    private readonly HandlebarsPlanner _planner;

    public SemanticKernelAgent(Kernel kernel)
    {
        _kernel = kernel;

        // Initialize in-process semantic memory. Semantic memory needs an embedding model;
        // OPENAI_EMBEDDING_DEPLOYMENT_NAME is an assumed variable name for that deployment.
        // MemoryBuilder, VolatileMemoryStore, and the planner are experimental APIs, so the
        // compiler reports SKEXP diagnostics that must be suppressed. The namespace of the
        // embedding extension method can differ between Semantic Kernel versions.
        _memory = new MemoryBuilder()
            .WithAzureOpenAITextEmbeddingGeneration(
                deploymentName: Environment.GetEnvironmentVariable("OPENAI_EMBEDDING_DEPLOYMENT_NAME") ?? "text-embedding-ada-002",
                endpoint: Environment.GetEnvironmentVariable("OPENAI_ENDPOINT") ?? "",
                apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY") ?? "")
            .WithMemoryStore(new VolatileMemoryStore())
            .Build();

        // Initialize planner
        _planner = new HandlebarsPlanner();

        // Register agent skills
        RegisterAgentSkills();
    }

    private void RegisterAgentSkills()
    {
        // Register core agent skills
        _kernel.ImportPluginFromObject(new GoalUnderstandingSkill(), "GoalUnderstanding");
        _kernel.ImportPluginFromObject(new ReflectionSkill(), "Reflection");
        _kernel.ImportPluginFromObject(new MemorySkill(_memory), "Memory");

        // Register domain-specific skills
        _kernel.ImportPluginFromObject(new WebSearchSkill(), "WebSearch");
        _kernel.ImportPluginFromObject(new CalendarSkill(), "Calendar");
        _kernel.ImportPluginFromObject(new WeatherSkill(), "Weather");
        _kernel.ImportPluginFromObject(new TravelSkill(), "Travel");
    }

    public async Task RunAsync(string userInput)
    {
        Console.WriteLine($"User: {userInput}");
        try
        {
            // Step 1: Understand the goal
            var goal = await UnderstandGoalAsync(userInput);
            Console.WriteLine($"Goal: {goal}");

            // Step 2: Create a plan
            var plan = await CreatePlanAsync(goal);
            Console.WriteLine($"Plan: {plan}");

            // Step 3: Execute the plan
            var result = await ExecutePlanAsync(plan);
            Console.WriteLine($"Agent: {result}");

            // Step 4: Reflect on the result
            await ReflectOnResultAsync(goal, result);
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error: {ex.Message}");
        }
    }

    private async Task<string> UnderstandGoalAsync(string userInput)
    {
        var variables = new KernelArguments
        {
            ["input"] = userInput
        };
        var result = await _kernel.InvokeAsync("GoalUnderstanding", "ExtractGoal", variables);
        return result.GetValue<string>() ?? userInput;
    }

    private async Task<HandlebarsPlan> CreatePlanAsync(string goal)
    {
        // Create a plan using the Handlebars planner
        return await _planner.CreatePlanAsync(_kernel, goal);
    }

    private async Task<string> ExecutePlanAsync(HandlebarsPlan plan)
    {
        // A Handlebars plan is invoked against the kernel and renders to a string
        var result = await plan.InvokeAsync(_kernel, new KernelArguments());
        return string.IsNullOrWhiteSpace(result) ? "I couldn't complete the task." : result;
    }

    private async Task ReflectOnResultAsync(string goal, string result)
    {
        var variables = new KernelArguments
        {
            ["goal"] = goal,
            ["result"] = result
        };
        var reflection = await _kernel.InvokeAsync("Reflection", "EvaluateResult", variables);

        // Store the reflection in memory for future reference.
        // The third parameter is a record id, so the description is passed by name.
        await _memory.SaveInformationAsync(
            collection: "reflections",
            text: reflection.GetValue<string>() ?? "",
            id: Guid.NewGuid().ToString(),
            description: $"Reflection on goal: {goal}");
    }
}
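`RunAsync` makes a single pass: it plans, executes, and reflects once. To let the agent adapt when a result falls short, the reflection step can drive a bounded retry. The sketch below shows one way to structure that loop without any Semantic Kernel dependency; `ReplanLoop` and its parameters are hypothetical names, and the delegates stand in for the agent's own plan, execute, and reflect calls.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: retries an attempt until a check accepts the result
// or the attempt budget is exhausted.
public static class ReplanLoop
{
    public static async Task<(string Result, int Attempts)> RunAsync(
        Func<string?, Task<string>> attempt,      // receives the previous result as feedback (null on first try)
        Func<string, Task<bool>> isAcceptable,    // decides whether the result satisfies the goal
        int maxAttempts = 3)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        string? feedback = null;
        var result = string.Empty;

        for (var i = 1; i <= maxAttempts; i++)
        {
            result = await attempt(feedback);
            if (await isAcceptable(result))
            {
                return (result, i);
            }
            feedback = result; // let the next attempt see what went wrong
        }

        // Budget exhausted: return the last result so the caller can report it.
        return (result, maxAttempts);
    }
}
```

In the agent, `attempt` could wrap `CreatePlanAsync` and `ExecutePlanAsync` with the feedback appended to the goal, and `isAcceptable` could interpret the output of the `Reflection` skill. The cap on attempts keeps token usage bounded.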
Implementing Core Agent Skills
Let’s implement the core skills that our agent needs:
Goal Understanding Skill
csharp
// Requires: using System.ComponentModel; (for [Description])
public class GoalUnderstandingSkill
{
    [KernelFunction]
    [Description("Extracts the core goal from a user request")]
    public async Task<string> ExtractGoal(
        [Description("The user's input request")] string input,
        Kernel kernel)
    {
        var prompt = @"
Extract the core goal from the user's request. Focus on what they're trying to achieve,
not just the literal request. Return only the goal, concisely stated.
User request: {{$input}}
Core goal:";

        // Temperature 0 keeps the extraction as deterministic as the model allows
        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var result = await kernel.InvokeAsync(function, new KernelArguments { ["input"] = input });
        return result.GetValue<string>() ?? input;
    }

    [KernelFunction]
    [Description("Identifies constraints and preferences in a user request")]
    public async Task<string> IdentifyConstraints(
        [Description("The user's input request")] string input,
        Kernel kernel)
    {
        var prompt = @"
Identify any constraints and preferences in the user's request.
These could include time constraints, budget limitations, personal preferences, etc.
Format the output as a JSON object with 'constraints' and 'preferences' arrays.
User request: {{$input}}
Constraints and preferences:";

        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var result = await kernel.InvokeAsync(function, new KernelArguments { ["input"] = input });

        // Fall back to an empty JSON object so callers can always parse the output
        return result.GetValue<string>() ?? "{}";
    }
}
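`IdentifyConstraints` asks the model for a JSON object with `constraints` and `preferences` arrays, but model output is not guaranteed to be valid JSON. The following sketch shows one defensive way to parse it with `System.Text.Json`; `RequestConstraints` and `ConstraintParser` are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical types for illustration only.
public sealed record RequestConstraints(
    IReadOnlyList<string> Constraints,
    IReadOnlyList<string> Preferences)
{
    public static readonly RequestConstraints Empty =
        new(Array.Empty<string>(), Array.Empty<string>());
}

public static class ConstraintParser
{
    public static RequestConstraints Parse(string? modelOutput)
    {
        if (string.IsNullOrWhiteSpace(modelOutput))
        {
            return RequestConstraints.Empty;
        }

        // Models sometimes wrap JSON in prose or code fences; keep only the outermost object.
        var start = modelOutput.IndexOf('{');
        var end = modelOutput.LastIndexOf('}');
        if (start < 0 || end <= start)
        {
            return RequestConstraints.Empty;
        }

        try
        {
            using var doc = JsonDocument.Parse(modelOutput.Substring(start, end - start + 1));
            return new RequestConstraints(
                ReadStrings(doc.RootElement, "constraints"),
                ReadStrings(doc.RootElement, "preferences"));
        }
        catch (JsonException)
        {
            // Malformed JSON: treat as "no constraints found" rather than failing the run.
            return RequestConstraints.Empty;
        }
    }

    private static IReadOnlyList<string> ReadStrings(JsonElement root, string property)
    {
        var items = new List<string>();
        if (root.TryGetProperty(property, out var array) && array.ValueKind == JsonValueKind.Array)
        {
            foreach (var item in array.EnumerateArray())
            {
                if (item.ValueKind == JsonValueKind.String)
                {
                    var text = item.GetString();
                    if (!string.IsNullOrWhiteSpace(text))
                    {
                        items.Add(text);
                    }
                }
            }
        }
        return items;
    }
}
```

Returning an empty result on bad input is a design choice for this sketch; an agent that depends on the constraints might prefer to re-prompt the model instead.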
Reflection Skill
csharp
public class ReflectionSkill
{
    [KernelFunction]
    [Description("Evaluates the result of a task against the original goal")]
    public async Task<string> EvaluateResult(
        [Description("The original goal")] string goal,
        [Description("The result of the task execution")] string result,
        Kernel kernel)
    {
        var prompt = @"
Evaluate how well the result satisfies the original goal. Consider:
1. Did the result fully address the goal?
2. Were there any aspects of the goal that weren't addressed?
3. What could be improved in future attempts?
Goal: {{$goal}}
Result: {{$result}}
Evaluation:";

        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var evaluation = await kernel.InvokeAsync(function, new KernelArguments { ["goal"] = goal, ["result"] = result });
        return evaluation.GetValue<string>() ?? "";
    }

    [KernelFunction]
    [Description("Suggests improvements to the agent's approach")]
    public async Task<string> SuggestImprovements(
        [Description("The original goal")] string goal,
        [Description("The result of the task execution")] string result,
        Kernel kernel)
    {
        var prompt = @"
Based on the goal and the result, suggest specific improvements to the agent's approach.
Focus on concrete changes that could lead to better results in similar tasks.
Goal: {{$goal}}
Result: {{$result}}
Suggested improvements:";

        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var suggestions = await kernel.InvokeAsync(function, new KernelArguments { ["goal"] = goal, ["result"] = result });
        return suggestions.GetValue<string>() ?? "";
    }
}
Memory Skill
csharp
public class MemorySkill
{
    private readonly ISemanticTextMemory _memory;

    public MemorySkill(ISemanticTextMemory memory)
    {
        _memory = memory;
    }

    [KernelFunction]
    [Description("Saves information to the agent's memory")]
    public async Task<string> SaveInformation(
        [Description("The collection to save the information to")] string collection,
        [Description("The information to save")] string information,
        [Description("A description or key for the information")] string description)
    {
        // SaveInformationAsync expects a record id as its third parameter, so generate one
        // and pass the description by name; otherwise Metadata.Description stays empty.
        await _memory.SaveInformationAsync(
            collection: collection,
            text: information,
            id: Guid.NewGuid().ToString(),
            description: description);
        return $"Information saved to {collection}";
    }

    [KernelFunction]
    [Description("Retrieves information from the agent's memory")]
    public async Task<string> RetrieveInformation(
        [Description("The collection to search")] string collection,
        [Description("The query to search for")] string query,
        [Description("The maximum number of results to return")] int limit = 5)
    {
        // SearchAsync streams results ordered by relevance to the query
        var results = _memory.SearchAsync(collection, query, limit);
        var memories = new List<string>();
        await foreach (var memory in results)
        {
            memories.Add($"- {memory.Metadata.Description}: {memory.Metadata.Text}");
        }

        if (memories.Count == 0)
        {
            return "No relevant information found in memory.";
        }
        return string.Join("\n", memories);
    }
}
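`RetrieveInformation` relies on semantic search: the stored text and the query are turned into embedding vectors, and entries are ranked by how close their vectors are to the query. The toy sketch below illustrates that ranking with cosine similarity over hand-written vectors. It is not Semantic Kernel's implementation; `VectorSearch` and its members are hypothetical names, and real embeddings would come from an embedding model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper showing similarity-based ranking over embeddings.
public static class VectorSearch
{
    // Cosine similarity: 1 for vectors pointing the same way, 0 for orthogonal vectors.
    public static double CosineSimilarity(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        if (a.Count != b.Count)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Count; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
        {
            return 0; // a zero vector has no direction to compare
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns up to `limit` entries with a score of at least `minScore`, best match first.
    public static List<(string Text, double Score)> TopMatches(
        IEnumerable<(string Text, double[] Embedding)> entries,
        double[] query,
        int limit,
        double minScore = 0.0)
    {
        return entries
            .Select(e => (e.Text, Score: CosineSimilarity(e.Embedding, query)))
            .Where(m => m.Score >= minScore)
            .OrderByDescending(m => m.Score)
            .Take(limit)
            .ToList();
    }
}
```

The `minScore` cutoff corresponds to the idea behind the `minRelevanceScore` parameter of `SearchAsync`: results below the threshold are dropped, which is one reason a query can come back empty even when the collection has entries.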
Implementing Domain-Specific Skills
Now, let’s implement some domain-specific skills that our agent can use:
Web Search Skill
csharp
public class WebSearchSkill
{
    // Share one HttpClient for the lifetime of the process
    private static readonly HttpClient _httpClient = new HttpClient();

    [KernelFunction]
    [Description("Searches the web for information")]
    public async Task<string> SearchWeb(
        [Description("The search query")] string query)
    {
        // In a real implementation, you would use a search API like Bing
        // This is a simplified example
        try
        {
            // Simulate a web search
            await Task.Delay(1000);
            return $"Here are some search results for '{query}':\n" +
                "1. Result 1: Description of the first result\n" +
                "2. Result 2: Description of the second result\n" +
                "3. Result 3: Description of the third result";
        }
        catch (Exception ex)
        {
            return $"Error searching the web: {ex.Message}";
        }
    }

    [KernelFunction]
    [Description("Gets information from a specific URL")]
    public async Task<string> GetWebPage(
        [Description("The URL to retrieve")] string url)
    {
        try
        {
            // In a real implementation, you would parse the HTML and extract meaningful content
            var response = await _httpClient.GetStringAsync(url);

            // Simulate content extraction; only the size of the download is reported
            return $"Content from {url} ({response.Length} characters retrieved):\n" +
                "This is a simplified simulation of content extraction from a web page.";
        }
        catch (Exception ex)
        {
            // Return the error as text so the planner can react to it
            return $"Error retrieving web page: {ex.Message}";
        }
    }
}
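`GetWebPage` notes that a real implementation would parse the HTML and extract meaningful content. As a rough starting point, the sketch below strips markup with regular expressions and truncates the result so that it fits in a prompt. `HtmlText` is a hypothetical helper, and regex-based stripping is an assumption made for brevity; a dedicated HTML parser would be more robust on real pages.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper: crude HTML-to-text conversion for prompt-sized snippets.
public static class HtmlText
{
    public static string Extract(string html, int maxLength = 2000)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        // Drop <script> and <style> elements together with their contents.
        var text = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space so adjacent words do not merge.
        text = Regex.Replace(text, @"<[^>]+>", " ");

        // Decode entities such as &amp; and collapse runs of whitespace.
        text = WebUtility.HtmlDecode(text);
        text = Regex.Replace(text, @"\s+", " ").Trim();

        // Truncate so the snippet stays within a predictable size.
        return text.Length <= maxLength ? text : text.Substring(0, maxLength);
    }
}
```

Inside `GetWebPage`, the downloaded string could be passed through `HtmlText.Extract` and returned in place of the simulated content.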