Summary: This article explores how to build AI-powered document processing systems using .NET and Azure AI Document Intelligence. Learn how to extract structured data from various document types, implement intelligent document classification, and create end-to-end document processing workflows that integrate with your existing .NET applications.
Introduction
Document processing is a critical function for many organizations, involving the extraction, analysis, and management of information from various document types such as invoices, receipts, contracts, forms, and reports. Traditionally, document processing has been labor-intensive, requiring manual data entry and review, which is time-consuming, error-prone, and costly.
Artificial intelligence has revolutionized document processing by automating the extraction of structured data from unstructured or semi-structured documents. AI-powered document processing systems can recognize text, understand document layouts, classify document types, and extract relevant information with high accuracy, significantly reducing manual effort and improving efficiency.
For .NET developers, building AI-powered document processing systems has become more accessible with the availability of Azure AI Document Intelligence (formerly Form Recognizer) and other AI services that can be seamlessly integrated with .NET applications. These services provide pre-built models for common document types and the ability to train custom models for specific document formats.
In this article, we’ll explore how to build AI-powered document processing systems using .NET and Azure AI services. We’ll cover the fundamentals of document processing, examine different approaches to document analysis, and provide detailed code examples for implementing document processing workflows in .NET applications. By the end, you’ll have a comprehensive understanding of how to leverage AI to create efficient and accurate document processing solutions.
Understanding AI-Powered Document Processing
Before diving into implementation details, let’s establish a clear understanding of what AI-powered document processing is and how it can benefit your applications.
What is AI-Powered Document Processing?
AI-powered document processing refers to the use of artificial intelligence techniques to automate the extraction, classification, and analysis of information from documents. Unlike traditional optical character recognition (OCR), which simply converts images of text into machine-readable text, AI-powered document processing goes further by:
- Understanding Document Structure: Recognizing the layout and organization of documents
- Classifying Document Types: Automatically identifying the type of document (invoice, receipt, contract, etc.)
- Extracting Structured Data: Pulling specific fields and their values from documents
- Interpreting Context: Understanding the meaning and relationships between extracted data
- Learning from Examples: Improving accuracy over time through machine learning
Benefits of AI-Powered Document Processing
Implementing AI-powered document processing in your .NET applications can provide numerous benefits:
- Increased Efficiency: Automate manual data entry and reduce processing time from hours to seconds
- Improved Accuracy: Reduce errors associated with manual data entry
- Cost Reduction: Lower operational costs by reducing manual processing
- Scalability: Process thousands of documents simultaneously
- Enhanced Data Accessibility: Convert unstructured document data into structured, searchable information
- Better Compliance: Ensure consistent processing and maintain audit trails
- Faster Decision-Making: Provide quick access to document data for business decisions
- Improved Customer Experience: Accelerate document-dependent processes like loan approvals or claims processing
Document Processing Pipeline
A typical AI-powered document processing pipeline consists of several stages:
- Document Capture: Acquiring document images through scanning, uploading, or email
- Preprocessing: Enhancing image quality, correcting orientation, and removing noise
- Document Classification: Identifying the type of document
- Text Recognition: Extracting text from document images (OCR)
- Layout Analysis: Understanding the document’s structure and organization
- Data Extraction: Identifying and extracting specific fields and their values
- Post-processing: Validating, normalizing, and enriching extracted data
- Integration: Sending processed data to downstream systems (databases, ERP, CRM, etc.)
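To make the flow concrete, the stages above can be modeled as a simple ordered enum that a pipeline orchestrator walks through. This is an illustrative sketch only; the type and member names are hypothetical and not part of any Azure SDK:

```csharp
using System;

// Hypothetical model of the eight pipeline stages described above, in processing order
public enum PipelineStage
{
    Capture,
    Preprocessing,
    Classification,
    TextRecognition,
    LayoutAnalysis,
    DataExtraction,
    PostProcessing,
    Integration
}

// Outcome of a single stage; a real pipeline would carry extracted data forward too
public record StageResult(PipelineStage Stage, bool Succeeded, string? Detail = null);

public static class PipelineDemo
{
    public static void Main()
    {
        // A document flows through the stages in order; a real orchestrator
        // would short-circuit (or route to manual review) when a stage fails
        foreach (PipelineStage stage in Enum.GetValues<PipelineStage>())
        {
            var result = new StageResult(stage, Succeeded: true);
            Console.WriteLine($"{(int)stage + 1}. {result.Stage}");
        }
    }
}
```

Keeping the stages explicit like this makes it easy to log, retry, or skip individual steps (for example, skipping classification when the document type is already known).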
Setting Up the Development Environment
Let’s start by setting up our development environment for building AI-powered document processing systems with .NET.
Prerequisites
To follow along with this tutorial, you’ll need:
- Visual Studio 2022 or Visual Studio Code
- .NET 9 SDK
- Azure account with access to Azure AI Document Intelligence
- Basic knowledge of C# and .NET development
- Sample documents for testing (invoices, receipts, forms, etc.)
Creating a New .NET Project
Let’s create a new .NET console application that we’ll use to implement our document processing system:
```bash
# Create a new .NET console application
dotnet new console -n DocumentProcessingSystem
cd DocumentProcessingSystem

# Add required packages
dotnet add package Azure.AI.FormRecognizer
dotnet add package Azure.Storage.Blobs
dotnet add package Microsoft.Extensions.Configuration
dotnet add package Microsoft.Extensions.Configuration.Json
dotnet add package Microsoft.Extensions.DependencyInjection
dotnet add package Microsoft.Extensions.Logging
dotnet add package Microsoft.Extensions.Logging.Console
```
Setting Up Azure Resources
We’ll need several Azure resources for our document processing system:
- Azure AI Document Intelligence: For document analysis and data extraction
- Azure Blob Storage: For storing documents
- Azure Key Vault: For securely storing API keys and connection strings
You can set up these resources using the Azure Portal, Azure CLI, or Azure Resource Manager templates. Here’s an example using Azure CLI:
```bash
# Set variables
resourceGroup="document-processing"
location="eastus"
storageAccount="docprocessingstorage"
formRecognizer="doc-intelligence"
keyVault="doc-processing-kv"

# Create resource group
az group create --name $resourceGroup --location $location

# Create storage account
az storage account create --name $storageAccount --resource-group $resourceGroup --location $location --sku Standard_LRS

# Create container
az storage container create --name "documents" --account-name $storageAccount --auth-mode login

# Create Form Recognizer resource
az cognitiveservices account create --name $formRecognizer --resource-group $resourceGroup --kind FormRecognizer --sku S0 --location $location

# Create Key Vault
az keyvault create --name $keyVault --resource-group $resourceGroup --location $location
```
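Once the resources exist, you will need the Document Intelligence endpoint and key to fill in the configuration file shown below. Assuming the shell variables from the script above are still set, one way to retrieve them (and optionally park the key in Key Vault rather than in a config file) is:

```shell
# Look up the Document Intelligence endpoint
az cognitiveservices account show \
  --name $formRecognizer \
  --resource-group $resourceGroup \
  --query "properties.endpoint" --output tsv

# Retrieve the primary key
key=$(az cognitiveservices account keys list \
  --name $formRecognizer \
  --resource-group $resourceGroup \
  --query "key1" --output tsv)

# Optionally store the key as a Key Vault secret instead of keeping it in appsettings.json
az keyvault secret set --vault-name $keyVault --name "DocumentIntelligenceKey" --value "$key"
```

The secret name `DocumentIntelligenceKey` here is an arbitrary choice; pick whatever naming convention your team uses.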
Setting Up Configuration
Let’s create a configuration file to store our Azure resource information:
```json
// appsettings.json
{
  "AzureSettings": {
    "DocumentIntelligenceEndpoint": "https://your-form-recognizer-resource.cognitiveservices.azure.com/",
    "DocumentIntelligenceKey": "your-form-recognizer-key",
    "BlobStorageConnectionString": "your-blob-storage-connection-string",
    "BlobContainerName": "documents"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "System": "Warning"
    }
  }
}
```
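As a side note, instead of reading individual keys with `configuration["AzureSettings:..."]`, the whole `AzureSettings` section can be bound to a strongly typed class. This is an optional sketch, not part of the article's required setup, and binding with `Get<T>()` requires the Microsoft.Extensions.Configuration.Binder package, which is not in the package list above:

```csharp
// Hypothetical POCO mirroring the AzureSettings section of appsettings.json
public class AzureSettings
{
    public string DocumentIntelligenceEndpoint { get; set; } = "";
    public string DocumentIntelligenceKey { get; set; } = "";
    public string BlobStorageConnectionString { get; set; } = "";
    public string BlobContainerName { get; set; } = "";
}

// Usage at startup, after building IConfiguration:
// var settings = configuration.GetSection("AzureSettings").Get<AzureSettings>();
```

Typed settings give compile-time checking of property names, at the cost of one extra package reference.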
Now, let’s set up our application to use this configuration:
```csharp
// Program.cs
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using DocumentProcessingSystem.Services;

namespace DocumentProcessingSystem
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Build configuration
            var configuration = new ConfigurationBuilder()
                .SetBasePath(Directory.GetCurrentDirectory())
                .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
                .Build();

            // Configure services
            var serviceProvider = new ServiceCollection()
                .AddLogging(builder =>
                {
                    builder.AddConsole();
                    builder.SetMinimumLevel(LogLevel.Information);
                })
                .AddSingleton<IConfiguration>(configuration)
                .AddSingleton<DocumentProcessingService>()
                .AddSingleton<DocumentStorageService>()
                .BuildServiceProvider();

            // Resolve services; GetRequiredService throws a clear error if a
            // registration is missing, instead of returning null
            var logger = serviceProvider.GetRequiredService<ILogger<Program>>();
            var documentProcessingService = serviceProvider.GetRequiredService<DocumentProcessingService>();

            logger.LogInformation("Document Processing System started");

            // Process a sample document
            if (args.Length > 0)
            {
                var filePath = args[0];
                if (File.Exists(filePath))
                {
                    logger.LogInformation("Processing document: {FilePath}", filePath);
                    await documentProcessingService.ProcessDocumentAsync(filePath);
                }
                else
                {
                    logger.LogError("File not found: {FilePath}", filePath);
                }
            }
            else
            {
                logger.LogInformation("No document specified. Please provide a file path as an argument.");
            }
        }
    }
}
```
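Note that `Program.cs` resolves a `DocumentProcessingService` that we have not written yet. To keep the project compiling while you follow along, here is a minimal sketch of what that service could look like, using the `DocumentAnalysisClient` from the Azure.AI.FormRecognizer package (4.x) with the general-purpose `prebuilt-document` model. Treat it as a placeholder, not the article's full implementation:

```csharp
// Services/DocumentProcessingService.cs (minimal sketch; a fuller version would
// add classification, validation, and integration with downstream systems)
using System;
using System.IO;
using System.Threading.Tasks;
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;

namespace DocumentProcessingSystem.Services
{
    public class DocumentProcessingService
    {
        private readonly DocumentAnalysisClient _client;
        private readonly ILogger<DocumentProcessingService> _logger;

        public DocumentProcessingService(IConfiguration configuration, ILogger<DocumentProcessingService> logger)
        {
            _logger = logger;
            var endpoint = configuration["AzureSettings:DocumentIntelligenceEndpoint"];
            var key = configuration["AzureSettings:DocumentIntelligenceKey"];
            _client = new DocumentAnalysisClient(new Uri(endpoint), new AzureKeyCredential(key));
        }

        public async Task ProcessDocumentAsync(string filePath)
        {
            using var stream = File.OpenRead(filePath);

            // "prebuilt-document" extracts text, key-value pairs, and tables
            // from arbitrary documents without training a custom model
            var operation = await _client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-document", stream);
            AnalyzeResult result = operation.Value;

            _logger.LogInformation("Extracted {Count} key-value pairs", result.KeyValuePairs.Count);
            foreach (var kvp in result.KeyValuePairs)
            {
                _logger.LogInformation("{Key}: {Value}", kvp.Key?.Content, kvp.Value?.Content);
            }
        }
    }
}
```

With this stub in place you can run `dotnet run -- path/to/sample.pdf` and see raw key-value pairs in the console before building out the richer pipeline.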
Implementing Document Storage
Let’s implement a service for storing and retrieving documents from Azure Blob Storage:
```csharp
// Services/DocumentStorageService.cs
using System;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;

namespace DocumentProcessingSystem.Services
{
    public class DocumentStorageService
    {
        private readonly BlobServiceClient _blobServiceClient;
        private readonly BlobContainerClient _containerClient;
        private readonly ILogger<DocumentStorageService> _logger;

        public DocumentStorageService(IConfiguration configuration, ILogger<DocumentStorageService> logger)
        {
            _logger = logger;
            var connectionString = configuration["AzureSettings:BlobStorageConnectionString"];
            var containerName = configuration["AzureSettings:BlobContainerName"];
            _blobServiceClient = new BlobServiceClient(connectionString);
            _containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        }

        public async Task<string> UploadDocumentAsync(string filePath)
        {
            try
            {
                var fileName = Path.GetFileName(filePath);

                // Prefix with a timestamp so repeated uploads of the same file don't collide
                var blobName = $"{DateTime.UtcNow:yyyyMMddHHmmss}_{fileName}";
                var blobClient = _containerClient.GetBlobClient(blobName);

                _logger.LogInformation("Uploading document {FileName} to blob storage as {BlobName}", fileName, blobName);

                using (var fileStream = File.OpenRead(filePath))
                {
                    await blobClient.UploadAsync(fileStream, new BlobUploadOptions
                    {
                        HttpHeaders = new BlobHttpHeaders
                        {
                            ContentType = GetContentType(fileName)
                        }
                    });
                }

                _logger.LogInformation("Document uploaded successfully. Blob URI: {BlobUri}", blobClient.Uri);
                return blobClient.Uri.ToString();
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Error uploading document {FilePath}", filePath);
                throw;
            }
        }

        public async Task<Stream> DownloadDocumentAsync(string blobUri)
        {
            try
            {
                var blobUriObj = new Uri(blobUri);
                var blobName = Path.GetFileName(blobUriObj.LocalPath);
                var blobClient = _containerClient.GetBlobClient(blobName);

                _logger.LogInformation("Downloading document {BlobName} from blob storage", blobName);

                var memoryStream = new MemoryStream();
                await blobClient.DownloadToAsync(memoryStream);
                memoryStream.Position = 0;
                return memoryStream;
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Error downloading document {BlobUri}", blobUri);
                throw;
            }
        }

        public async Task DeleteDocumentAsync(string blobUri)
        {
            try
            {
                var blobUriObj = new Uri(blobUri);
                var blobName = Path.GetFileName(blobUriObj.LocalPath);
                var blobClient = _containerClient.GetBlobClient(blobName);

                _logger.LogInformation("Deleting document {BlobName} from blob storage", blobName);
                await blobClient.DeleteAsync();
                _logger.LogInformation("Document deleted successfully");
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Error deleting document {BlobUri}", blobUri);
                throw;
            }
        }

        private string GetContentType(string fileName)
        {
            var extension = Path.GetExtension(fileName).ToLowerInvariant();
            return extension switch
            {
                ".pdf" => "application/pdf",
                ".jpg" => "image/jpeg",
                ".jpeg" => "image/jpeg",
                ".png" => "image/png",
                ".tiff" => "image/tiff",
                ".tif" => "image/tiff",
                ".bmp" => "image/bmp",
                ".heic" => "image/heic",
                _ => "application/octet-stream"
            };
        }
    }
}
```