A collection of samples demonstrating techniques for processing documents with Azure AI including AI Studio, OpenAI, Document Intelligence, etc.

Azure-Samples/azure-ai-document-processing-samples

Document Processing with Azure AI Samples

This repository contains a collection of code samples that demonstrate how to use various Azure AI capabilities to process documents.

The samples are intended to help engineering teams establish techniques with Azure AI Foundry, Azure OpenAI, and Azure Document Intelligence to build solutions to extract structured data, classify, and analyze documents.

The techniques demonstrated take advantage of various capabilities from each service to:

  • Reduce complexity of custom model training by taking advantage of the capabilities of Generative AI models to analyze and classify documents.
  • Improve reliability in document processing by combining AI service capabilities to extract structured data from any document type, with high accuracy and confidence.
  • Simplify document processing workflows by providing reusable code and patterns that can be easily modified and evaluated for most use cases.

Samples

Note

All data extraction samples provide both an accuracy and confidence score for the extracted data. The accuracy score is calculated based on the similarity between the extracted data and the ground truth data. The confidence score is calculated based on OCR analysis confidence and logprobs in Azure OpenAI requests.
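As a rough illustration of the two scores described in the note above, the following sketch computes an accuracy score by comparing extracted fields with ground truth, and a confidence score from token log probabilities and an OCR confidence value. The function names, the exact-match comparison, and the rule of taking the weaker of the two signals are assumptions made for this example; the samples in the repository may calculate these scores differently.

```python
import math


def field_accuracy(extracted: dict, ground_truth: dict) -> float:
    """Fraction of ground-truth fields whose extracted value matches exactly."""
    if not ground_truth:
        return 0.0
    matches = sum(
        1 for key, expected in ground_truth.items() if extracted.get(key) == expected
    )
    return matches / len(ground_truth)


def logprob_confidence(token_logprobs: list[float]) -> float:
    """Geometric-mean token probability: exp of the mean log probability."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def combined_confidence(ocr_confidence: float, token_logprobs: list[float]) -> float:
    """Assumed combination rule: take the weaker of the OCR and model signals."""
    return min(ocr_confidence, logprob_confidence(token_logprobs))


# Hypothetical values, for illustration only.
extracted = {"invoice_number": "INV-001", "total": "120.00", "currency": "USD"}
ground_truth = {"invoice_number": "INV-001", "total": "125.00", "currency": "USD"}

accuracy = field_accuracy(extracted, ground_truth)
confidence = combined_confidence(0.98, [-0.01, -0.02, -0.30])
print(f"accuracy={accuracy:.2f} confidence={confidence:.2f}")
```

Taking the minimum is a deliberately conservative choice: a field is only treated as trustworthy when both the OCR result and the model output look reliable.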

  • Data Extraction - Azure AI Document Intelligence + Azure OpenAI GPT-4o
    • Description: Demonstrates how to use Azure AI Document Intelligence pre-built layout and Azure OpenAI GPT models to extract structured data from documents.
    • Example use cases: Predominantly text-based documents such as invoices, receipts, and forms.
  • Data Extraction - Azure AI Document Intelligence + Phi-3.5 MoE
    • Description: Demonstrates how to use Azure AI Document Intelligence pre-built layout and Microsoft's Phi-3 models to extract structured data from documents.
    • Example use cases: Predominantly text-based documents such as invoices, receipts, and forms.
  • Data Extraction - Azure OpenAI GPT-4o with Vision
    • Description: Demonstrates how to use Azure OpenAI GPT-4o and GPT-4o-mini models to extract structured data from documents using their built-in vision capabilities.
    • Example use cases: Complex documents such as reports and contracts that mix text and images, including diagrams, signatures, and selection marks.
  • Data Extraction - Comprehensive Azure AI Document Intelligence + Azure OpenAI GPT-4o with Vision
    • Description: Demonstrates how to improve accuracy and confidence when extracting structured data from documents by combining Azure AI Document Intelligence and Azure OpenAI GPT-4o models with vision capabilities.
    • Example use cases: Any structured or unstructured document type.
  • Classification - Azure OpenAI GPT-4o with Vision
    • Description: Demonstrates how to use Azure OpenAI GPT-4o and GPT-4o-mini models to classify documents using their built-in vision capabilities.
    • Example use cases: Processing multiple document types or documents with varying purposes, such as contracts, legal documents, and emails.
  • Classification - Azure AI Document Intelligence + Embeddings
    • Description: Demonstrates how to use Azure AI Document Intelligence pre-built layout and embeddings models to classify documents based on their content.
    • Example use cases: Processing multiple document types or documents with varying purposes, such as contracts, legal documents, and emails.
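To give a feel for embeddings-based classification, here is a minimal sketch that assigns a document to the class whose reference embedding is most similar by cosine similarity. The `classify` helper, the class labels, and the three-dimensional toy vectors are all hypothetical; in practice the vectors would come from an embeddings model, and the repository's sample may use a different matching strategy.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(
    document_embedding: list[float], class_embeddings: dict[str, list[float]]
) -> tuple[str, float]:
    """Return the (label, score) pair of the most similar class embedding."""
    scores = {
        label: cosine_similarity(document_embedding, embedding)
        for label, embedding in class_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


# Toy vectors standing in for real embeddings (hypothetical values).
class_embeddings = {
    "contract": [0.9, 0.1, 0.0],
    "email": [0.1, 0.9, 0.1],
}

label, score = classify([0.8, 0.2, 0.1], class_embeddings)
print(label, round(score, 3))
```

A similarity threshold could be added so that documents resembling none of the known classes are flagged for review instead of being forced into the nearest one.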

Getting Started

The sample repository comes with a Dev Container that contains all the necessary tools and dependencies to run the sample.

Important

An Azure subscription is required to run these samples. If you don't have an Azure subscription, create an account.

Setup on GitHub Codespaces

To use the Dev Container in GitHub Codespaces, follow these steps:

  1. Click on the Code button in the repository and select Codespaces.
  2. Click on the + button to create a new Codespace using the provided .devcontainer/devcontainer.json configuration.
  3. Once the Codespace is created, continue to the "Deploy the Azure environment" section.

Setup on Local

To use the Dev Container, you need the required tools installed on your local machine, including Docker Desktop and Visual Studio Code with support for Dev Containers.

To set up a local development environment, follow these steps:

Important

Ensure that Docker Desktop is running on your local machine.

  1. Clone the repository to your local machine.
  2. Open the repository in Visual Studio Code.
  3. Press F1 to open the command palette and type Dev Containers: Reopen in Container.

Once the Dev Container is up and running, continue to the "Deploy the Azure environment" section.

Deploy the Azure environment

With the Dev Container running, sign in to Azure and deploy the services the samples need by running the following commands in a pwsh terminal:

Note

For the best experience, run the samples in East US, which supports all the services used in the samples. Find out more about region availability for Azure AI Document Intelligence and for the GPT-4o, Phi-3.5 MoE, and text-embedding-3-large models.

az login

./Setup-Environment.ps1 -DeploymentName <UniqueDeploymentName> -Location <AzureRegion> -SkipInfrastructure $false

Note

If a specific tenant is required, use the --tenant <TenantId> parameter in the az login command.

The script will deploy the following resources to your Azure subscription:

  • Azure AI Foundry Hub & Project, a development platform for building AI solutions that integrates with Azure AI Services in a secure manner using Microsoft Entra ID for authentication.
  • Azure AI Services, a single managed resource that provides access to Azure AI services, including Azure OpenAI and Azure AI Document Intelligence.
    • Note: GPT-4o and GPT-4o-mini will be deployed as Global Standard models with 10K TPM quota allocation. text-embedding-3-large will be deployed as a Standard model with 115K TPM quota allocation. These can be adjusted based on your quota availability in the main.bicep file.
  • Azure Storage Account, required by Azure AI Foundry.
  • Azure Monitor, used to store logs and traces for monitoring and troubleshooting purposes.
  • Azure Container Registry, used to store container images for the Azure AI Foundry environment.
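
To put the TPM (tokens per minute) allocations mentioned in the list above into perspective, the following back-of-the-envelope sketch estimates how many requests per minute each quota could sustain. The quota figures come from the note above; the tokens-per-request values and the `max_requests_per_minute` helper are hypothetical, and the estimate ignores any separate requests-per-minute limit that the service may apply.

```python
def max_requests_per_minute(tpm_quota: int, tokens_per_request: int) -> int:
    """Upper bound on requests per minute for a given tokens-per-minute quota."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return tpm_quota // tokens_per_request


# Quota allocations stated in the deployment note.
GPT_4O_TPM = 10_000
TEXT_EMBEDDING_TPM = 115_000

# Hypothetical token counts (prompt plus completion) for a single document.
tokens_per_extraction = 2_500
tokens_per_embedding = 1_000

print(max_requests_per_minute(GPT_4O_TPM, tokens_per_extraction))       # 4
print(max_requests_per_minute(TEXT_EMBEDDING_TPM, tokens_per_embedding))  # 115
```

If the estimate is too low for your workload, the allocations can be raised in the main.bicep file, subject to the quota available in your subscription.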

Note

All resources are secured by default with Microsoft Entra ID using Azure RBAC. Your user client ID will be added with the necessary least-privilege roles to access the resources created. A user-assigned managed identity will also be deployed for the Azure AI Foundry environment.

After the script completes, you can run any of the samples in the repository by following their instructions.

Contributing

You can contribute to the repository by opening an issue or submitting a pull request. For more information, see the Contributing guide.

License

This project is licensed under the MIT License.
