Summary: This article explores how to build AI-powered document processing systems using .NET and Azure AI Document Intelligence. Learn how to extract structured data from various document types, implement intelligent document classification, and create end-to-end document processing workflows that integrate with your existing .NET applications.
Introduction
Document processing is a critical function for many organizations, involving the extraction, analysis, and management of information from various document types such as invoices, receipts, contracts, forms, and reports. Traditionally, document processing has been labor-intensive, requiring manual data entry and review, which is time-consuming, error-prone, and costly.
Artificial intelligence has revolutionized document processing by automating the extraction of structured data from unstructured or semi-structured documents. AI-powered document processing systems can recognize text, understand document layouts, classify document types, and extract relevant information with high accuracy, significantly reducing manual effort and improving efficiency.
For .NET developers, building AI-powered document processing systems has become more accessible with the availability of Azure AI Document Intelligence (formerly Form Recognizer) and other AI services that can be seamlessly integrated with .NET applications. These services provide pre-built models for common document types and the ability to train custom models for specific document formats.
In this article, we’ll explore how to build AI-powered document processing systems using .NET and Azure AI services. We’ll cover the fundamentals of document processing, examine different approaches to document analysis, and provide detailed code examples for implementing document processing workflows in .NET applications. By the end, you’ll have a comprehensive understanding of how to leverage AI to create efficient and accurate document processing solutions.
Understanding AI-Powered Document Processing
Before diving into implementation details, let’s establish a clear understanding of what AI-powered document processing is and how it can benefit your applications.
What is AI-Powered Document Processing?
AI-powered document processing refers to the use of artificial intelligence techniques to automate the extraction, classification, and analysis of information from documents. Unlike traditional optical character recognition (OCR), which simply converts images of text into machine-readable text, AI-powered document processing goes further by:
- Understanding Document Structure: Recognizing the layout and organization of documents
- Classifying Document Types: Automatically identifying the type of document (invoice, receipt, contract, etc.)
- Extracting Structured Data: Pulling specific fields and their values from documents
- Interpreting Context: Understanding the meaning and relationships between extracted data
- Learning from Examples: Improving accuracy over time through machine learning
Benefits of AI-Powered Document Processing
Implementing AI-powered document processing in your .NET applications can provide numerous benefits:
- Increased Efficiency: Automate manual data entry and reduce processing time from hours to seconds
- Improved Accuracy: Reduce errors associated with manual data entry
- Cost Reduction: Lower operational costs by reducing manual processing
- Scalability: Process thousands of documents simultaneously
- Enhanced Data Accessibility: Convert unstructured document data into structured, searchable information
- Better Compliance: Ensure consistent processing and maintain audit trails
- Faster Decision-Making: Provide quick access to document data for business decisions
- Improved Customer Experience: Accelerate document-dependent processes like loan approvals or claims processing
Document Processing Pipeline
A typical AI-powered document processing pipeline consists of several stages:
- Document Capture: Acquiring document images through scanning, uploading, or email
- Preprocessing: Enhancing image quality, correcting orientation, and removing noise
- Document Classification: Identifying the type of document
- Text Recognition: Extracting text from document images (OCR)
- Layout Analysis: Understanding the document’s structure and organization
- Data Extraction: Identifying and extracting specific fields and their values
- Post-processing: Validating, normalizing, and enriching extracted data
- Integration: Sending processed data to downstream systems (databases, ERP, CRM, etc.)
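To make the flow concrete, the stages above can be modeled as a simple ordered enum that a pipeline orchestrator walks through. This is an illustrative sketch only; the type and member names are hypothetical and not part of any Azure SDK:

```csharp
using System;

// Hypothetical model of the eight pipeline stages described above, in processing order
public enum PipelineStage
{
    Capture,
    Preprocessing,
    Classification,
    TextRecognition,
    LayoutAnalysis,
    DataExtraction,
    PostProcessing,
    Integration
}

// Outcome of a single stage; a real pipeline would carry extracted data forward too
public record StageResult(PipelineStage Stage, bool Succeeded, string? Detail = null);

public static class PipelineDemo
{
    public static void Main()
    {
        // A document flows through the stages in order; a real orchestrator
        // would short-circuit (or route to manual review) when a stage fails
        foreach (PipelineStage stage in Enum.GetValues<PipelineStage>())
        {
            var result = new StageResult(stage, Succeeded: true);
            Console.WriteLine($"{(int)stage + 1}. {result.Stage}");
        }
    }
}
```

Keeping the stages explicit like this makes it easy to log, retry, or skip individual steps (for example, skipping classification when the document type is already known).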
Setting Up the Development Environment
Let’s start by setting up our development environment for building AI-powered document processing systems with .NET.
Prerequisites
To follow along with this tutorial, you’ll need:
- Visual Studio 2022 or Visual Studio Code
- .NET 9 SDK
- Azure account with access to Azure AI Document Intelligence
- Basic knowledge of C# and .NET development
- Sample documents for testing (invoices, receipts, forms, etc.)
Creating a New .NET Project
Let’s create a new .NET console application that we’ll use to implement our document processing system:
```bash
# Create a new .NET console application
dotnet new console -n DocumentProcessingSystem
cd DocumentProcessingSystem

# Add required packages
dotnet add package Azure.AI.FormRecognizer
dotnet add package Azure.Storage.Blobs
dotnet add package Microsoft.Extensions.Configuration
dotnet add package Microsoft.Extensions.Configuration.Json
dotnet add package Microsoft.Extensions.DependencyInjection
dotnet add package Microsoft.Extensions.Logging
dotnet add package Microsoft.Extensions.Logging.Console
```
Setting Up Azure Resources
We’ll need several Azure resources for our document processing system:
- Azure AI Document Intelligence: For document analysis and data extraction
- Azure Blob Storage: For storing documents
- Azure Key Vault: For securely storing API keys and connection strings
You can set up these resources using the Azure Portal, Azure CLI, or Azure Resource Manager templates. Here’s an example using Azure CLI:
```bash
# Set variables
resourceGroup="document-processing"
location="eastus"
storageAccount="docprocessingstorage"
formRecognizer="doc-intelligence"
keyVault="doc-processing-kv"

# Create resource group
az group create --name $resourceGroup --location $location

# Create storage account
az storage account create --name $storageAccount --resource-group $resourceGroup --location $location --sku Standard_LRS

# Create container
az storage container create --name "documents" --account-name $storageAccount --auth-mode login

# Create Form Recognizer resource
az cognitiveservices account create --name $formRecognizer --resource-group $resourceGroup --kind FormRecognizer --sku S0 --location $location

# Create Key Vault
az keyvault create --name $keyVault --resource-group $resourceGroup --location $location
```
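Once the resources exist, you will need the Document Intelligence endpoint and key to fill in the configuration file shown below. Assuming the shell variables from the script above are still set, one way to retrieve them (and optionally park the key in Key Vault rather than in a config file) is:

```shell
# Look up the Document Intelligence endpoint
az cognitiveservices account show \
  --name $formRecognizer \
  --resource-group $resourceGroup \
  --query "properties.endpoint" --output tsv

# Retrieve the primary key
key=$(az cognitiveservices account keys list \
  --name $formRecognizer \
  --resource-group $resourceGroup \
  --query "key1" --output tsv)

# Optionally store the key as a Key Vault secret instead of keeping it in appsettings.json
az keyvault secret set --vault-name $keyVault --name "DocumentIntelligenceKey" --value "$key"
```

The secret name `DocumentIntelligenceKey` here is an arbitrary choice; pick whatever naming convention your team uses.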
Setting Up Configuration
Let’s create a configuration file to store our Azure resource information:
```json
// appsettings.json
{
  "AzureSettings": {
    "DocumentIntelligenceEndpoint": "https://your-form-recognizer-resource.cognitiveservices.azure.com/",
    "DocumentIntelligenceKey": "your-form-recognizer-key",
    "BlobStorageConnectionString": "your-blob-storage-connection-string",
    "BlobContainerName": "documents"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "System": "Warning"
    }
  }
}
```
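As a side note, instead of reading individual keys with `configuration["AzureSettings:..."]`, the whole `AzureSettings` section can be bound to a strongly typed class. This is an optional sketch, not part of the article's required setup, and binding with `Get<T>()` requires the Microsoft.Extensions.Configuration.Binder package, which is not in the package list above:

```csharp
// Hypothetical POCO mirroring the AzureSettings section of appsettings.json
public class AzureSettings
{
    public string DocumentIntelligenceEndpoint { get; set; } = "";
    public string DocumentIntelligenceKey { get; set; } = "";
    public string BlobStorageConnectionString { get; set; } = "";
    public string BlobContainerName { get; set; } = "";
}

// Usage at startup, after building IConfiguration:
// var settings = configuration.GetSection("AzureSettings").Get<AzureSettings>();
```

Typed settings give compile-time checking of property names, at the cost of one extra package reference.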
Now, let’s set up our application to use this configuration:
```csharp
// Program.cs
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using DocumentProcessingSystem.Services;

namespace DocumentProcessingSystem
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Build configuration
            var configuration = new ConfigurationBuilder()
                .SetBasePath(Directory.GetCurrentDirectory())
                .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
                .Build();

            // Configure services
            var serviceProvider = new ServiceCollection()
                .AddLogging(builder =>
                {
                    builder.AddConsole();
                    builder.SetMinimumLevel(LogLevel.Information);
                })
                .AddSingleton<IConfiguration>(configuration)
                .AddSingleton<DocumentProcessingService>()
                .AddSingleton<DocumentStorageService>()
                .BuildServiceProvider();

            // Resolve services; GetRequiredService throws a clear error if a
            // registration is missing, instead of returning null
            var logger = serviceProvider.GetRequiredService<ILogger<Program>>();
            var documentProcessingService = serviceProvider.GetRequiredService<DocumentProcessingService>();

            logger.LogInformation("Document Processing System started");

            // Process a sample document
            if (args.Length > 0)
            {
                var filePath = args[0];
                if (File.Exists(filePath))
                {
                    logger.LogInformation("Processing document: {FilePath}", filePath);
                    await documentProcessingService.ProcessDocumentAsync(filePath);
                }
                else
                {
                    logger.LogError("File not found: {FilePath}", filePath);
                }
            }
            else
            {
                logger.LogInformation("No document specified. Please provide a file path as an argument.");
            }
        }
    }
}
```
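Note that `Program.cs` resolves a `DocumentProcessingService` that we have not written yet. To keep the project compiling while you follow along, here is a minimal sketch of what that service could look like, using the `DocumentAnalysisClient` from the Azure.AI.FormRecognizer package (4.x) with the general-purpose `prebuilt-document` model. Treat it as a placeholder, not the article's full implementation:

```csharp
// Services/DocumentProcessingService.cs (minimal sketch; a fuller version would
// add classification, validation, and integration with downstream systems)
using System;
using System.IO;
using System.Threading.Tasks;
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;

namespace DocumentProcessingSystem.Services
{
    public class DocumentProcessingService
    {
        private readonly DocumentAnalysisClient _client;
        private readonly ILogger<DocumentProcessingService> _logger;

        public DocumentProcessingService(IConfiguration configuration, ILogger<DocumentProcessingService> logger)
        {
            _logger = logger;
            var endpoint = configuration["AzureSettings:DocumentIntelligenceEndpoint"];
            var key = configuration["AzureSettings:DocumentIntelligenceKey"];
            _client = new DocumentAnalysisClient(new Uri(endpoint), new AzureKeyCredential(key));
        }

        public async Task ProcessDocumentAsync(string filePath)
        {
            using var stream = File.OpenRead(filePath);

            // "prebuilt-document" extracts text, key-value pairs, and tables
            // from arbitrary documents without training a custom model
            var operation = await _client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-document", stream);
            AnalyzeResult result = operation.Value;

            _logger.LogInformation("Extracted {Count} key-value pairs", result.KeyValuePairs.Count);
            foreach (var kvp in result.KeyValuePairs)
            {
                _logger.LogInformation("{Key}: {Value}", kvp.Key?.Content, kvp.Value?.Content);
            }
        }
    }
}
```

With this stub in place you can run `dotnet run -- path/to/sample.pdf` and see raw key-value pairs in the console before building out the richer pipeline.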
Implementing Document Storage
Let’s implement a service for storing and retrieving documents from Azure Blob Storage:
```csharp
// Services/DocumentStorageService.cs
using System;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;

namespace DocumentProcessingSystem.Services
{
    public class DocumentStorageService
    {
        private readonly BlobServiceClient _blobServiceClient;
        private readonly BlobContainerClient _containerClient;
        private readonly ILogger<DocumentStorageService> _logger;

        public DocumentStorageService(IConfiguration configuration, ILogger<DocumentStorageService> logger)
        {
            _logger = logger;
            var connectionString = configuration["AzureSettings:BlobStorageConnectionString"];
            var containerName = configuration["AzureSettings:BlobContainerName"];
            _blobServiceClient = new BlobServiceClient(connectionString);
            _containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        }

        public async Task<string> UploadDocumentAsync(string filePath)
        {
            try
            {
                var fileName = Path.GetFileName(filePath);

                // Prefix with a timestamp so repeated uploads of the same file don't collide
                var blobName = $"{DateTime.UtcNow:yyyyMMddHHmmss}_{fileName}";
                var blobClient = _containerClient.GetBlobClient(blobName);

                _logger.LogInformation("Uploading document {FileName} to blob storage as {BlobName}", fileName, blobName);

                using (var fileStream = File.OpenRead(filePath))
                {
                    await blobClient.UploadAsync(fileStream, new BlobUploadOptions
                    {
                        HttpHeaders = new BlobHttpHeaders
                        {
                            ContentType = GetContentType(fileName)
                        }
                    });
                }

                _logger.LogInformation("Document uploaded successfully. Blob URI: {BlobUri}", blobClient.Uri);
                return blobClient.Uri.ToString();
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Error uploading document {FilePath}", filePath);
                throw;
            }
        }

        public async Task<Stream> DownloadDocumentAsync(string blobUri)
        {
            try
            {
                var blobUriObj = new Uri(blobUri);
                var blobName = Path.GetFileName(blobUriObj.LocalPath);
                var blobClient = _containerClient.GetBlobClient(blobName);

                _logger.LogInformation("Downloading document {BlobName} from blob storage", blobName);

                var memoryStream = new MemoryStream();
                await blobClient.DownloadToAsync(memoryStream);
                memoryStream.Position = 0;
                return memoryStream;
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Error downloading document {BlobUri}", blobUri);
                throw;
            }
        }

        public async Task DeleteDocumentAsync(string blobUri)
        {
            try
            {
                var blobUriObj = new Uri(blobUri);
                var blobName = Path.GetFileName(blobUriObj.LocalPath);
                var blobClient = _containerClient.GetBlobClient(blobName);

                _logger.LogInformation("Deleting document {BlobName} from blob storage", blobName);
                await blobClient.DeleteAsync();
                _logger.LogInformation("Document deleted successfully");
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Error deleting document {BlobUri}", blobUri);
                throw;
            }
        }

        private string GetContentType(string fileName)
        {
            var extension = Path.GetExtension(fileName).ToLowerInvariant();
            return extension switch
            {
                ".pdf" => "application/pdf",
                ".jpg" => "image/jpeg",
                ".jpeg" => "image/jpeg",
                ".png" => "image/png",
                ".tiff" => "image/tiff",
                ".tif" => "image/tiff",
                ".bmp" => "image/bmp",
                ".heic" => "image/heic",
                _ => "application/octet-stream"
            };
        }
    }
}
```