Implementing Agents with Microsoft Semantic Kernel: Beyond Basic Skills

Summary: This post explores how to implement sophisticated agent-like behaviors using Microsoft Semantic Kernel. Learn how to create autonomous agents that can plan, reason, and execute complex tasks by combining skills, memory, and planning capabilities.

Introduction

Microsoft Semantic Kernel has moved beyond simple skill execution (skills are called plugins in current releases) to enable more sophisticated agent-like behaviors. With the right architecture and implementation patterns, developers can create autonomous agents that plan, reason, and execute complex tasks with minimal human intervention.

In this post, we’ll explore how to implement agent patterns using Microsoft Semantic Kernel. We’ll go beyond basic skills to create systems that can understand user goals, break them down into steps, execute those steps, and adapt to changing conditions. By the end of this article, you’ll have the knowledge to build sophisticated AI agents that can tackle complex tasks in your .NET applications.

Understanding Agent Architecture

Before diving into implementation, let’s understand what makes an AI agent different from a simple skill-based system.

Key Components of an Agent

A sophisticated AI agent typically consists of these components:

Goal Understanding: The ability to interpret user requests and understand the underlying goals
Planning: Breaking down goals into actionable steps
Memory: Storing and retrieving relevant information
Execution: Carrying out the planned steps
Reflection: Evaluating results and adapting plans
Tool Use: Leveraging various tools and APIs to accomplish tasks
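
To see how these components fit together, the following is a minimal, framework-free sketch of the loop they form. It is not Semantic Kernel code: `AgentLoop`, `AgentOutcome`, and the stubbed lambdas are hypothetical names introduced only for illustration, and each stage that would normally call a model or a tool is replaced by a plain delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Stub wiring: each stage is a lambda standing in for an LLM or tool call.
var loop = new AgentLoop(
    understand: input => Task.FromResult(input.Trim().ToLowerInvariant()),
    plan: goal => Task.FromResult<IReadOnlyList<string>>(
        new[] { $"research {goal}", $"summarize {goal}" }),
    execute: step => Task.FromResult($"done: {step}"),
    reflect: (goal, results) => Task.FromResult(results.Count > 0));

var outcome = await loop.RunAsync("  Plan a trip  ");
Console.WriteLine($"{outcome.Goal} | steps={outcome.Results.Count} | satisfied={outcome.Satisfied}");
// prints "plan a trip | steps=2 | satisfied=True"

public sealed record AgentOutcome(string Goal, IReadOnlyList<string> Results, bool Satisfied);

public sealed class AgentLoop
{
    private readonly Func<string, Task<string>> _understand;
    private readonly Func<string, Task<IReadOnlyList<string>>> _plan;
    private readonly Func<string, Task<string>> _execute;
    private readonly Func<string, IReadOnlyList<string>, Task<bool>> _reflect;
    private readonly List<string> _memory = new();

    public AgentLoop(
        Func<string, Task<string>> understand,
        Func<string, Task<IReadOnlyList<string>>> plan,
        Func<string, Task<string>> execute,
        Func<string, IReadOnlyList<string>, Task<bool>> reflect)
    {
        _understand = understand;
        _plan = plan;
        _execute = execute;
        _reflect = reflect;
    }

    // Everything the agent has produced so far, in execution order.
    public IReadOnlyList<string> Memory => _memory;

    public async Task<AgentOutcome> RunAsync(string userInput)
    {
        var goal = await _understand(userInput);        // goal understanding
        var steps = await _plan(goal);                  // planning

        var results = new List<string>();
        foreach (var step in steps)
        {
            var result = await _execute(step);          // execution / tool use
            results.Add(result);
            _memory.Add(result);                        // memory
        }

        var satisfied = await _reflect(goal, results);  // reflection
        return new AgentOutcome(goal, results, satisfied);
    }
}
```

Keeping each stage behind a delegate makes the loop easy to test with stubs before any model is involved.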

Agent Architecture in Semantic Kernel

In Semantic Kernel, we can implement these components using:

Semantic Functions (prompt functions in current releases): For goal understanding, planning, and reflection
Native Functions: For execution and tool use
Memory Store: For storing and retrieving information
Planner: For breaking down goals into steps
Kernel Arguments (context variables before version 1.0): For maintaining state across steps
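
To make this mapping concrete, here is a minimal sketch of a native function being registered and invoked, with its inputs passed through `KernelArguments`. It assumes a 1.x release of the `Microsoft.SemanticKernel` package; `TripMathPlugin` and `LodgingCost` are hypothetical names used only for this example. No AI service is registered, since native functions do not need one.

```csharp
using System;
using System.ComponentModel;
using Microsoft.SemanticKernel;

// Build a kernel without any AI service: native functions run locally.
var kernel = Kernel.CreateBuilder().Build();
kernel.ImportPluginFromObject(new TripMathPlugin(), "TripMath");

// Arguments are matched to method parameters by name.
var arguments = new KernelArguments
{
    ["nights"] = 3,
    ["nightlyRate"] = 180
};

var result = await kernel.InvokeAsync("TripMath", "LodgingCost", arguments);
Console.WriteLine(result.GetValue<int>()); // 3 * 180

// Hypothetical plugin, present only to illustrate registration and invocation.
public sealed class TripMathPlugin
{
    [KernelFunction, Description("Computes the total lodging cost for a stay")]
    public int LodgingCost(
        [Description("Number of nights")] int nights,
        [Description("Price per night")] int nightlyRate) => nights * nightlyRate;
}
```

The `Description` attributes matter beyond documentation: a planner reads them when deciding which function fits a step.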

Setting Up Your Development Environment

Let’s start by setting up your development environment.

Prerequisites

To follow along with this tutorial, you’ll need:

Visual Studio 2022 or Visual Studio Code
.NET 8 SDK
An Azure subscription with access to Azure OpenAI Service
Basic familiarity with Microsoft Semantic Kernel

Creating a New Project

Let’s create a new console application:

bash

dotnet new console -n SemanticKernelAgent
cd SemanticKernelAgent

Installing Required Packages

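Add the necessary packages to your project. The names below follow the 1.x package layout, which matches the Kernel.CreateBuilder() API used in this post; the planner and memory packages are published as prerelease, so they need the --prerelease flag:

bash

dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.SemanticKernel.Planners.Handlebars --prerelease
dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI
dotnet add package Microsoft.SemanticKernel.Plugins.Memory --prerelease
dotnet add package Microsoft.SemanticKernel.Connectors.AzureAISearch --prerelease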

Basic Project Setup

Let’s set up the basic structure of our project:

csharp

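// The planner and memory APIs are marked experimental in 1.x releases; suppress the
// SKEXP diagnostics the compiler reports (via #pragma warning disable or <NoWarn>).
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Planning.Handlebars;
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Net.Http;
using System.Threading.Tasks;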

class Program
{
    static async Task Main(string[] args)
    {
        // Configure Semantic Kernel
        var builder = Kernel.CreateBuilder();
        
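        // Add the Azure OpenAI chat completion service.
        // The endpoint and key have no usable default, so fail fast when they are missing.
        builder.AddAzureOpenAIChatCompletion(
            deploymentName: Environment.GetEnvironmentVariable("OPENAI_DEPLOYMENT_NAME") ?? "gpt-4",
            endpoint: Environment.GetEnvironmentVariable("OPENAI_ENDPOINT")
                ?? throw new InvalidOperationException("Set the OPENAI_ENDPOINT environment variable."),
            apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")
                ?? throw new InvalidOperationException("Set the OPENAI_API_KEY environment variable."));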
        
        // Build the kernel
        var kernel = builder.Build();
        
        // Initialize agent
        var agent = new SemanticKernelAgent(kernel);
        
        // Run the agent
        await agent.RunAsync("Help me plan a trip to Seattle for 3 days");
    }
}

Implementing the Agent Core

Now, let’s implement the core agent class:

csharp

public class SemanticKernelAgent
{
    private readonly Kernel _kernel;
    private readonly ISemanticTextMemory _memory;
    private readonly HandlebarsPlanner _planner;
    
    public SemanticKernelAgent(Kernel kernel)
    {
        _kernel = kernel;
        
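        // Initialize memory. Semantic memory needs an embedding service as well as a store.
        // OPENAI_EMBEDDING_DEPLOYMENT_NAME is an assumed variable name for your embedding deployment,
        // and the builder extension shown is the one from the 1.x OpenAI connector.
        _memory = new MemoryBuilder()
            .WithAzureOpenAITextEmbeddingGeneration(
                Environment.GetEnvironmentVariable("OPENAI_EMBEDDING_DEPLOYMENT_NAME") ?? "text-embedding-ada-002",
                Environment.GetEnvironmentVariable("OPENAI_ENDPOINT") ?? "",
                Environment.GetEnvironmentVariable("OPENAI_API_KEY") ?? "")
            .WithMemoryStore(new VolatileMemoryStore())
            .Build();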
        
        // Initialize planner
        _planner = new HandlebarsPlanner();
        
        // Register agent skills
        RegisterAgentSkills();
    }
    
    private void RegisterAgentSkills()
    {
        // Register core agent skills (imported as plugins in Semantic Kernel 1.x)
        _kernel.ImportPluginFromObject(new GoalUnderstandingSkill(), "GoalUnderstanding");
        _kernel.ImportPluginFromObject(new ReflectionSkill(), "Reflection");
        _kernel.ImportPluginFromObject(new MemorySkill(_memory), "Memory");
        
        // Register domain-specific skills
        _kernel.ImportPluginFromObject(new WebSearchSkill(), "WebSearch");
        _kernel.ImportPluginFromObject(new CalendarSkill(), "Calendar");
        _kernel.ImportPluginFromObject(new WeatherSkill(), "Weather");
        _kernel.ImportPluginFromObject(new TravelSkill(), "Travel");
    }
    
    public async Task RunAsync(string userInput)
    {
        Console.WriteLine($"User: {userInput}");
        
        try
        {
            // Step 1: Understand the goal
            var goal = await UnderstandGoalAsync(userInput);
            Console.WriteLine($"Goal: {goal}");
            
            // Step 2: Create a plan
            var plan = await CreatePlanAsync(goal);
            Console.WriteLine($"Plan: {plan}");
            
            // Step 3: Execute the plan
            var result = await ExecutePlanAsync(plan);
            Console.WriteLine($"Agent: {result}");
            
            // Step 4: Reflect on the result
            await ReflectOnResultAsync(goal, result);
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error: {ex.Message}");
        }
    }
    
    private async Task<string> UnderstandGoalAsync(string userInput)
    {
        var variables = new KernelArguments
        {
            ["input"] = userInput
        };
        
        var result = await _kernel.InvokeAsync("GoalUnderstanding", "ExtractGoal", variables);
        return result.GetValue<string>() ?? userInput;
    }
    
    private async Task<HandlebarsPlan> CreatePlanAsync(string goal)
    {
        // Ask the Handlebars planner to compose the registered functions into a plan for the goal
        return await _planner.CreatePlanAsync(_kernel, goal);
    }
    
    private async Task<string> ExecutePlanAsync(HandlebarsPlan plan)
    {
        // A HandlebarsPlan is invoked against the kernel and returns the rendered output as a string
        var result = await plan.InvokeAsync(_kernel);
        return string.IsNullOrWhiteSpace(result) ? "I couldn't complete the task." : result;
    }
    
    private async Task ReflectOnResultAsync(string goal, string result)
    {
        var variables = new KernelArguments
        {
            ["goal"] = goal,
            ["result"] = result
        };
        
        var reflection = await _kernel.InvokeAsync("Reflection", "EvaluateResult", variables);
        
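        // Store the reflection in memory for future reference.
        // The third positional parameter is the record id, so pass id and description by name.
        await _memory.SaveInformationAsync(
            collection: "reflections",
            text: reflection.GetValue<string>() ?? "",
            id: Guid.NewGuid().ToString(),
            description: $"Reflection on goal: {goal}");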
    }
}

Implementing Core Agent Skills

Let’s implement the core skills that our agent needs:

Goal Understanding Skill

csharp

public class GoalUnderstandingSkill
{
    [KernelFunction]
    [Description("Extracts the core goal from a user request")]
    public async Task<string> ExtractGoal(
        [Description("The user's input request")] string input,
        Kernel kernel)
    {
        var prompt = @"
            Extract the core goal from the user's request. Focus on what they're trying to achieve, 
            not just the literal request. Return only the goal, concisely stated.

            User request: {{$input}}
            
            Core goal:";
        
        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var result = await kernel.InvokeAsync(function, new KernelArguments { ["input"] = input });
        
        return result.GetValue<string>() ?? input;
    }
    
    [KernelFunction]
    [Description("Identifies constraints and preferences in a user request")]
    public async Task<string> IdentifyConstraints(
        [Description("The user's input request")] string input,
        Kernel kernel)
    {
        var prompt = @"
            Identify any constraints and preferences in the user's request. 
            These could include time constraints, budget limitations, personal preferences, etc.
            Format the output as a JSON object with 'constraints' and 'preferences' arrays.

            User request: {{$input}}
            
            Constraints and preferences:";
        
        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var result = await kernel.InvokeAsync(function, new KernelArguments { ["input"] = input });
        
        return result.GetValue<string>() ?? "{}";
    }
}
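
IdentifyConstraints asks the model for a JSON object, but a model may wrap that object in a code fence or add stray prose. The following sketch shows one way to read the output defensively with System.Text.Json. `ConstraintParser` and `GoalConstraints` are hypothetical helpers, not part of Semantic Kernel, and the sample model output is made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Simulated model output wrapped in a Markdown code fence.
var fence = new string('`', 3);
var modelOutput = fence + "json\n" +
    "{ \"constraints\": [\"3 days\"], \"preferences\": [\"walkable areas\", \"seafood\"] }\n" +
    fence;

var parsed = ConstraintParser.Parse(modelOutput);
Console.WriteLine($"{parsed.Constraints.Count} constraint(s), {parsed.Preferences.Count} preference(s)");
// prints "1 constraint(s), 2 preference(s)"

public sealed record GoalConstraints(
    IReadOnlyList<string> Constraints,
    IReadOnlyList<string> Preferences)
{
    public static readonly GoalConstraints Empty =
        new(Array.Empty<string>(), Array.Empty<string>());
}

public static class ConstraintParser
{
    public static GoalConstraints Parse(string modelOutput)
    {
        // Keep only the outermost {...}; this drops code fences or prose around the JSON.
        var start = modelOutput.IndexOf('{');
        var end = modelOutput.LastIndexOf('}');
        if (start < 0 || end <= start) return GoalConstraints.Empty;

        try
        {
            using var doc = JsonDocument.Parse(modelOutput.Substring(start, end - start + 1));
            return new GoalConstraints(
                ReadStrings(doc.RootElement, "constraints"),
                ReadStrings(doc.RootElement, "preferences"));
        }
        catch (JsonException)
        {
            // Malformed JSON is treated as "nothing identified" rather than a fatal error.
            return GoalConstraints.Empty;
        }
    }

    private static IReadOnlyList<string> ReadStrings(JsonElement root, string property)
    {
        if (!root.TryGetProperty(property, out var array) || array.ValueKind != JsonValueKind.Array)
            return Array.Empty<string>();

        // Copy the strings out before the JsonDocument is disposed.
        return array.EnumerateArray()
            .Where(item => item.ValueKind == JsonValueKind.String)
            .Select(item => item.GetString()!)
            .ToList();
    }
}
```

Returning an empty result on bad input keeps a single malformed model response from aborting the whole agent run.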

Reflection Skill

csharp

public class ReflectionSkill
{
    [KernelFunction]
    [Description("Evaluates the result of a task against the original goal")]
    public async Task<string> EvaluateResult(
        [Description("The original goal")] string goal,
        [Description("The result of the task execution")] string result,
        Kernel kernel)
    {
        var prompt = @"
            Evaluate how well the result satisfies the original goal. Consider:
            1. Did the result fully address the goal?
            2. Were there any aspects of the goal that weren't addressed?
            3. What could be improved in future attempts?
            
            Goal: {{$goal}}
            Result: {{$result}}
            
            Evaluation:";
        
        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var evaluation = await kernel.InvokeAsync(function, new KernelArguments { ["goal"] = goal, ["result"] = result });
        
        return evaluation.GetValue<string>() ?? "";
    }
    
    [KernelFunction]
    [Description("Suggests improvements to the agent's approach")]
    public async Task<string> SuggestImprovements(
        [Description("The original goal")] string goal,
        [Description("The result of the task execution")] string result,
        Kernel kernel)
    {
        var prompt = @"
            Based on the goal and the result, suggest specific improvements to the agent's approach.
            Focus on concrete changes that could lead to better results in similar tasks.
            
            Goal: {{$goal}}
            Result: {{$result}}
            
            Suggested improvements:";
        
        var function = kernel.CreateFunctionFromPrompt(prompt, new OpenAIPromptExecutionSettings { Temperature = 0.0 });
        var suggestions = await kernel.InvokeAsync(function, new KernelArguments { ["goal"] = goal, ["result"] = result });
        
        return suggestions.GetValue<string>() ?? "";
    }
}
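
Reflection becomes more useful when its output feeds the next attempt. The following framework-free sketch shows a bounded retry loop driven by an evaluation. `ReflectiveRunner` is a hypothetical helper, and the convention that an evaluation starts with "PASS" or "FAIL" is an assumption made for this example; the EvaluateResult prompt above would need to be adjusted to produce such a verdict.

```csharp
using System;
using System.Threading.Tasks;

var attempts = 0;
var final = await ReflectiveRunner.RunAsync(
    goal: "3-day Seattle itinerary",
    attempt: (goal, feedback) =>
    {
        attempts++;
        // Stub: the first attempt is incomplete; later attempts "use" the feedback.
        return Task.FromResult(feedback is null ? "Day 1 only" : "Day 1, Day 2, Day 3");
    },
    evaluate: (goal, result) => Task.FromResult(
        result.Contains("Day 3") ? "PASS: covers all days" : "FAIL: days 2 and 3 are missing"),
    maxAttempts: 3);

Console.WriteLine($"{final} (after {attempts} attempts)");
// prints "Day 1, Day 2, Day 3 (after 2 attempts)"

public static class ReflectiveRunner
{
    public static async Task<string> RunAsync(
        string goal,
        Func<string, string?, Task<string>> attempt,   // (goal, feedback from the last evaluation)
        Func<string, string, Task<string>> evaluate,   // (goal, result) -> "PASS..." or "FAIL..."
        int maxAttempts)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        string? feedback = null;
        var result = "";
        for (var i = 0; i < maxAttempts; i++)
        {
            result = await attempt(goal, feedback);
            var evaluation = await evaluate(goal, result);
            if (evaluation.StartsWith("PASS", StringComparison.OrdinalIgnoreCase))
                return result;

            // Carry the critique into the next attempt.
            feedback = evaluation;
        }

        // Attempt budget exhausted: return the last result as a best effort.
        return result;
    }
}
```

The attempt cap is the important design choice here: without it, a model that never produces a passing evaluation would loop indefinitely and keep consuming tokens.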

Memory Skill

csharp

public class MemorySkill
{
    private readonly ISemanticTextMemory _memory;
    
    public MemorySkill(ISemanticTextMemory memory)
    {
        _memory = memory;
    }
    
    [KernelFunction]
    [Description("Saves information to the agent's memory")]
    public async Task<string> SaveInformation(
        [Description("The collection to save the information to")] string collection,
        [Description("The information to save")] string information,
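        [Description("A description of the information")] string description)
    {
        // SaveInformationAsync takes (collection, text, id, description). Generate a unique id
        // so the description is stored as the description rather than as the record key.
        await _memory.SaveInformationAsync(collection, information, Guid.NewGuid().ToString(), description);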
        return $"Information saved to {collection}";
    }
    
    [KernelFunction]
    [Description("Retrieves information from the agent's memory")]
    public async Task<string> RetrieveInformation(
        [Description("The collection to search")] string collection,
        [Description("The query to search for")] string query,
        [Description("The maximum number of results to return")] int limit = 5)
    {
        var results = _memory.SearchAsync(collection, query, limit);
        
        var memories = new List<string>();
        await foreach (var memory in results)
        {
            memories.Add($"- {memory.Metadata.Description}: {memory.Metadata.Text}");
        }
        
        if (memories.Count == 0)
        {
            return "No relevant information found in memory.";
        }
        
        return string.Join("\n", memories);
    }
}
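
RetrieveInformation relies on similarity search: the query is embedded and compared against the stored embeddings, and the `limit` parameter caps how many matches come back. The following toy sketch illustrates that idea with hand-written three-dimensional vectors. It is a conceptual illustration, not how any particular memory store is implemented; `ToyMemory` and the sample vectors are invented for the example, and real embeddings come from an embedding model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hand-written vectors standing in for real embeddings.
var store = new List<(string Text, float[] Embedding)>
{
    ("Pike Place Market opens at 9am", new[] { 0.9f, 0.1f, 0.0f }),
    ("Seattle rain is common in November", new[] { 0.1f, 0.9f, 0.1f }),
    ("Ferry schedules change seasonally", new[] { 0.0f, 0.2f, 0.9f }),
};

var query = new[] { 0.8f, 0.2f, 0.0f };
foreach (var (text, score) in ToyMemory.Search(store, query, limit: 2, minRelevance: 0.7))
{
    Console.WriteLine($"{score:F2}  {text}");
}

public static class ToyMemory
{
    // Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        return normA == 0 || normB == 0 ? 0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Score every entry, drop weak matches, then return the best `limit` hits.
    public static IReadOnlyList<(string Text, double Score)> Search(
        IEnumerable<(string Text, float[] Embedding)> store,
        float[] query,
        int limit,
        double minRelevance) =>
        store.Select(item => (item.Text, Score: CosineSimilarity(item.Embedding, query)))
             .Where(hit => hit.Score >= minRelevance)
             .OrderByDescending(hit => hit.Score)
             .Take(limit)
             .ToList();
}
```

With these vectors only the Pike Place entry clears the 0.7 threshold, even though the limit allows two results. A relevance threshold can therefore return fewer items than `limit`, which is worth remembering if a memory search comes back empty.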

Implementing Domain-Specific Skills

Now, let’s implement some domain-specific skills that our agent can use:

Web Search Skill

csharp

public class WebSearchSkill
{
    private readonly HttpClient _httpClient = new HttpClient();
    
    [KernelFunction]
    [Description("Searches the web for information")]
    public async Task<string> SearchWeb(
        [Description("The search query")] string query)
    {
        // In a real implementation, you would use a search API like Bing
        // This is a simplified example
        
        try
        {
            // Simulate a web search
            await Task.Delay(1000);
            
            return $"Here are some search results for '{query}':\n" +
                   "1. Result 1: Description of the first result\n" +
                   "2. Result 2: Description of the second result\n" +
                   "3. Result 3: Description of the third result";
        }
        catch (Exception ex)
        {
            return $"Error searching the web: {ex.Message}";
        }
    }
    
    [KernelFunction]
    [Description("Gets information from a specific URL")]
    public async Task<string> GetWebPage(
        [Description("The URL to retrieve")] string url)
    {
        try
        {
            // Download the raw HTML
            var html = await _httpClient.GetStringAsync(url);
            
            // A real implementation would parse the HTML and extract meaningful content;
            // this simplified version returns a truncated preview of the raw markup
            var preview = html.Length > 500 ? html.Substring(0, 500) : html;
            return $"Content from {url}:\n{preview}";
        }
        catch (Exception ex)
        {
            return $"Error retrieving web page: {ex.Message}";
        }
    }
}
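
The comments in GetWebPage mention extracting meaningful content from the HTML. As a rough sketch of what that step could look like using only the base class library, the helper below strips scripts, styles, and tags, then decodes entities. `HtmlText` is a hypothetical helper, and regex-based stripping is a crude approximation; a dedicated HTML parser is more robust for real pages.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

var html = "<html><head><style>p{color:red}</style><script>alert('x')</script></head>" +
           "<body><h1>Seattle</h1><p>Rain &amp; coffee.</p></body></html>";

Console.WriteLine(HtmlText.Extract(html, maxLength: 200));
// prints "Seattle Rain & coffee."

public static class HtmlText
{
    public static string Extract(string html, int maxLength)
    {
        // Drop script and style elements together with their contents.
        var text = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space so adjacent words stay separated.
        text = Regex.Replace(text, "<[^>]+>", " ");

        // Decode entities such as &amp; and collapse runs of whitespace.
        text = Regex.Replace(WebUtility.HtmlDecode(text), @"\s+", " ").Trim();

        // Cap the length so the text fits comfortably in a prompt.
        return text.Length <= maxLength ? text : text.Substring(0, maxLength);
    }
}
```

Truncating the extracted text matters for agents in particular, because whatever this function returns is fed back into the model's context.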