title | titleSuffix | description | author | manager | ms.service | ms.topic | ms.date | ms.author | monikerRange |
---|---|---|---|---|---|---|---|---|---|
What is Azure AI Document Intelligence? | Azure AI services | Azure AI Document Intelligence is a machine-learning based OCR and intelligent document processing service to automate extraction of key data from forms and documents. | laujan | nitinme | azure-ai-document-intelligence | overview | 02/06/2025 | lajanuar | <=doc-intel-4.0.0 |
:::moniker range="doc-intel-4.0.0"
[!INCLUDE applies to v4.0]
:::moniker-end
:::moniker range="doc-intel-3.1.0"
[!INCLUDE applies to v3.1]
:::moniker-end
:::moniker range="doc-intel-3.0.0"
[!INCLUDE applies to v3.0]
:::moniker-end
:::moniker range="doc-intel-2.1.0"
[!INCLUDE applies to v2.1]
:::moniker-end
Azure AI Document Intelligence is a cloud-based Azure AI service that enables you to build intelligent document processing solutions. Massive amounts of data, spanning a wide variety of data types, are stored in forms and documents. Document Intelligence enables you to effectively manage the velocity at which data is collected and processed and is key to improved operations, informed data-driven decisions, and enlightened innovation.
For information on region access, see Azure AI Services Product Availability by Region.
| ✔️ Document analysis models | ✔️ Prebuilt models | ✔️ Custom models |
Document analysis (general extraction) models extract text from forms and documents and return structured, business-ready content for your organization's action, use, or development.
:::moniker range="doc-intel-4.0.0"
:::row::: :::column::: Read | Extract printed and handwritten text. :::column-end::: :::column span=""::: Layout | Extract text, tables, and document structure. :::column-end::: :::row-end:::
:::moniker-end
:::moniker range="<=doc-intel-3.1.0"
:::row:::
:::column:::
Read | Extract printed
and handwritten text.
:::column-end:::
:::column span="":::
Layout | Extract text, tables,
and document structure.
:::column-end:::
:::column span="":::
General document | Extract text,
structure, and key-value pairs.
:::column-end:::
:::row-end:::
:::moniker-end
Prebuilt models enable you to add intelligent document processing to your apps and flows without having to train and build your own models.
:::moniker range="doc-intel-4.0.0"
:::row::: :::column span=""::: Bank Statement | Extract account information and details from bank statements. :::column-end::: :::column span=""::: Check | Extract relevant information from checks. :::column-end::: :::column span=""::: Contract | Extract agreement and party details. :::column-end::: :::row-end::: :::row::: :::column span=""::: Credit card | Extract payment card information. :::column-end::: :::column span=""::: Invoice | Extract customer and vendor details. :::column-end::: :::column span=""::: Pay Stub | Extract pay stub details. :::column-end::: :::column span=""::: Receipt | Extract sales transaction details. :::column-end::: :::row-end:::
:::row:::
:::column span="":::
Unified US tax | Extract data from any supported US tax form.
:::column-end:::
:::column span="":::
US Tax W-2 | Extract taxable compensation details.
:::column-end:::
:::column span="":::
US Tax 1098 | Extract 1098
variation details.
:::column-end:::
:::column span="":::
US Tax 1099 | Extract 1099
variation details.
:::column-end:::
:::column span="":::
US Tax 1040 | Extract 1040
variation details.
:::column-end:::
:::row-end:::
:::row::: :::column span=""::: US mortgage 1003 | Extract loan application details. :::column-end::: :::column span=""::: US mortgage 1004 | Extract information from appraisal. :::column-end::: :::column span=""::: US mortgage 1005 | Extract information from validation of employment. :::column-end::: :::column span=""::: US mortgage 1008 | Extract loan transmittal details. :::column-end::: :::column span=""::: US mortgage disclosure | Extract final closing loan terms. :::column-end::: :::row-end:::
:::row::: :::column span=""::: Health Insurance card | Extract insurance coverage details. :::column-end::: :::column span=""::: Identity | Extract verification details. :::column-end::: :::column span=""::: Marriage certificate | Extract certified marriage information. :::column-end::: :::row-end:::
:::moniker-end
:::moniker range="<=doc-intel-3.1.0"
:::row:::
:::column span="":::
Invoice | Extract customer
and vendor details.
:::column-end:::
:::column span="":::
Receipt | Extract sales
transaction details.
:::column-end:::
:::column span="":::
Identity | Extract identification
and verification details.
:::column-end:::
:::row-end:::
:::row:::
:::column span="":::
Health Insurance card | Extract health insurance details.
:::column-end:::
:::column span="":::
Business card | Extract business contact details.
:::column-end:::
:::column span="":::
Contract | Extract agreement
and party details.
:::column-end:::
:::row-end:::
:::row:::
:::column span="":::
US Tax W-2 | Extract taxable
compensation details.
:::column-end:::
:::column span="":::
US Tax 1098 | Extract 1098
variation details.
:::column-end:::
:::row-end:::
:::moniker-end
Custom models are trained using your labeled datasets to extract distinct data from forms and documents, specific to your use cases. Standalone custom models can be combined to create composed models.
✔️ Document field extraction models are trained to extract labeled fields from documents.
:::row::: :::column::: :::column-end::: :::column span=""::: Custom neural | Extract data from mixed-type documents. :::column-end::: :::column span=""::: Custom template | Extract data from static layouts. :::column-end::: :::column span=""::: Custom composed | Extract data using a collection of models. :::column-end::: :::row-end:::
✔️ Custom classifiers identify document types before invoking an extraction model.
:::row::: :::column span=""::: Custom classifier | Identify designated document types (classes) before invoking an extraction model. :::column-end::: :::row-end:::
Document Intelligence supports optional features that can be enabled and disabled depending on the document extraction scenario:
[!INCLUDE model analysis features]
You can use Document Intelligence to automate document processing in applications and workflows, enhance data-driven strategies, and enrich document search capabilities. Use the links in the table to learn more about each model and browse development options.
:::image type="content" source="media/overview/analyze-read.png" alt-text="Screenshot of Read model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-read | ● Extract text from documents. ● Data extraction | ● Digitizing any document. ● Compliance and auditing. ● Processing handwritten notes before translation. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
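Each table in this article lists a model ID alongside REST API and SDK development options. As a minimal sketch of how a model ID fits into a REST call, the following Python helper builds the URL for an analyze request. The route prefix `documentintelligence`, the `documentModels/{modelId}:analyze` path, and the API version `2024-11-30` are assumptions based on the v4.0 REST API and aren't stated in this article; earlier API versions use a different route and version string, so check the REST API reference for your resource. The function name `build_analyze_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumed values for the v4.0 REST API. Verify them against the REST API
# reference for your resource before relying on them.
ROUTE_PREFIX = "documentintelligence"
API_VERSION = "2024-11-30"


def build_analyze_url(endpoint: str, model_id: str, api_version: str = API_VERSION) -> str:
    """Return the URL for an analyze request against the given model ID.

    `endpoint` is your resource endpoint, for example
    https://<your-resource>.cognitiveservices.azure.com/.
    """
    if not model_id:
        raise ValueError("model_id must be a non-empty string")
    base = endpoint.rstrip("/")
    # quote() leaves '.', '-', and '_' untouched, so IDs such as
    # prebuilt-tax.us.w2 pass through unchanged.
    path = f"{base}/{ROUTE_PREFIX}/documentModels/{quote(model_id, safe='')}:analyze"
    return f"{path}?{urlencode({'api-version': api_version})}"


if __name__ == "__main__":
    print(build_analyze_url("https://example.cognitiveservices.azure.com/", "prebuilt-read"))
```

The same helper works for any model ID listed in this article, such as `prebuilt-layout` or `prebuilt-invoice`. Sending the request still requires your resource key or a Microsoft Entra token, which this sketch leaves out.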
:::image type="content" source="media/overview/analyze-layout.png" alt-text="Screenshot of the layout model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-layout | ● Extract text and layout information from documents. ● Data extraction | ● Document indexing and retrieval by structure. ● Financial and medical report analysis. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::moniker range="doc-intel-3.1.0 || doc-intel-3.0.0"
:::image type="content" source="media/overview/analyze-general-document.png" alt-text="Screenshot of General document model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-document | ● Extract text, layout, and key-value pairs from documents. ● Data and field extraction | ● Key-value pair extraction. ● Form processing. ● Survey data collection and analysis. | ● Document Intelligence Studio ● REST API |
[!div class="nextstepaction"] Return to model types
:::moniker-end
:::image type="content" source="media/overview/analyze-invoice.png" alt-text="Screenshot of Invoice model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-invoice | ● Extract key information from invoices. ● Data and field extraction | ● Accounts payable processing. ● Automated tax recording and reporting. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-receipt.png" alt-text="Screenshot of Receipt model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-receipt | ● Extract key information from receipts. ● Data and field extraction ● Receipt model v3.0 supports processing of single-page hotel receipts. | ● Expense management. ● Consumer behavior data analysis. ● Customer loyalty program. ● Merchandise return processing. ● Automated tax recording and reporting. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-id-document.png" alt-text="Screenshot of Identity (ID) document model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-idDocument | ● Extract key information from passports and ID cards. ● Document types ● Extract endorsements, restrictions, and vehicle classifications from US driver's licenses. | ● Know your customer (KYC) financial services guidelines compliance. ● Medical account management. ● Identity checkpoints and gateways. ● Hotel registration. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/studio/overview-check.png" alt-text="Screenshot of Check model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-check | ● Extract key information from checks. ● Data and field extraction | ● Credit management. ● Automated lender management. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/studio/overview-pay-stub.png" alt-text="Screenshot of pay stub model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-paystub | ● Extract key information from pay stubs. ● Data and field extraction | ● Employee payroll detail verification. ● Fraud detection for employment. ● Automated tax processing. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/studio/overview-bank-statement.png" alt-text="Screenshot of Bank statement model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-bankStatement | ● Extract key information from bank statements. ● Data and field extraction | ● Tax processing. ● Automated accounting management. ● Credit-debit management. ● Loan documentation processing. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-health-insurance.png" alt-text="Screenshot of Health insurance card model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-healthInsuranceCard.us | ● Extract key information from US health insurance cards. ● Data and field extraction | ● Coverage and eligibility verification. ● Predictive modeling. ● Value-based analytics. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-contract.png" alt-text="Screenshot of Contract model extraction using Document Intelligence Studio.":::
Model ID | Description | Development options |
---|---|---|
prebuilt-contract | ● Extract contract agreement and party details. ● Data and field extraction | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-credit-debit.png" alt-text="Screenshot of Credit card image model analysis using Document Intelligence Studio.":::
Model ID | Description | Development options |
---|---|---|
prebuilt-creditCard | ● Extract payment card information. ● Data and field extraction | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-marriage-certificate.png" alt-text="Screenshot of Marriage certificate document model analysis using Document Intelligence Studio.":::
Model ID | Description | Development options |
---|---|---|
prebuilt-marriageCertificate.us | ● Extract certified marriage information. ● Data and field extraction | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
:::image type="content" source="media/overview/analyze-1003.png" alt-text="Screenshot of US Mortgage 1003 document model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-mortgage.us.1003 | ● Extract key information from 1003 loan applications. ● Data and field extraction | ● Fannie Mae and Freddie Mac documentation requirements. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/studio/overview-mortgage-1004.png" alt-text="Screenshot of US Mortgage 1004 document model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-mortgage.us.1004 | ● Extract key information from 1004 appraisals. ● Data and field extraction | ● Fannie Mae and Freddie Mac documentation requirements. ● Uniform Residential Appraisal Report processing to help the lender or client determine the market value of the subject property. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/studio/overview-mortgage-1005.png" alt-text="Screenshot of US Mortgage 1005 document model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-mortgage.us.1005 | ● Extract key information from 1005 validation of employment. ● Data and field extraction | ● Fannie Mae and Freddie Mac documentation requirements. ● Verification of employment to determine qualification as a prospective mortgagor. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-1008.png" alt-text="Screenshot of US Mortgage 1008 document model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-mortgage.us.1008 | ● Extract key information from Uniform Underwriting and Transmittal Summary. ● Data and field extraction | ● Loan underwriting processing using summary data. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-closing-disclosure.png" alt-text="Screenshot of US Mortgage closing disclosure document model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-mortgage.us.closingDisclosure | ● Extract final closing loan terms from closing disclosures. ● Data and field extraction | ● Mortgage loan final details requirements. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-w2.png" alt-text="Screenshot of W-2 model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-tax.us.w2 | ● Extract key information from IRS US W-2 tax forms (years 2018-2021). | ● Automated tax document management. ● Mortgage loan application processing. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-1098.png" alt-text="Screenshot of US 1098 tax form analyzed in the Document Intelligence Studio.":::
Model ID | Description | Development options |
---|---|---|
prebuilt-tax.us.1098{variation} | ● Extract key information from 1098-form variations. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-1099.png" alt-text="Screenshot of US 1099 tax form analyzed in the Document Intelligence Studio." lightbox="media/overview/analyze-1099.png":::
Model ID | Description | Development options |
---|---|---|
prebuilt-tax.us.1099{variation} | ● Extract information from 1099-form variations. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::image type="content" source="media/overview/analyze-1040.png" alt-text="Screenshot of US tax 1040 tax form model analysis using Document Intelligence Studio.":::
Model ID | Description | Development options |
---|---|---|
prebuilt-tax.us.1040{variation} | ● Extract information from 1040-form variations. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
:::moniker range=">=doc-intel-4.0.0"
Model ID | Description | Development options |
---|---|---|
prebuilt-tax.us | ● Extract information from any of the supported US tax forms. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
:::moniker-end
:::moniker range="<=doc-intel-3.1.0"
:::image type="content" source="media/overview/analyze-business-card.png" alt-text="Screenshot of Business card model analysis using Document Intelligence Studio.":::
Model ID | Description | Automation use cases | Development options |
---|---|---|---|
prebuilt-businessCard | ● Extract key information from business cards. ● Data and field extraction | ● Sales lead and marketing management. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
[!div class="nextstepaction"] Return to model types
:::moniker-end
:::image type="content" source="media/overview/custom-train.png" alt-text="Screenshot of Custom model training using Document Intelligence Studio.":::
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom model | Extracts information from forms and documents into structured data, using a model trained on a representative set of training documents. | Extract distinct data from forms and documents specific to your business and use cases. | ● Document Intelligence Studio ● REST API ● C# SDK ● Java SDK ● JavaScript SDK ● Python SDK |
[!div class="nextstepaction"] Return to custom model types
:::image type="content" source="media/overview/analyze-custom-neural.png" alt-text="Screenshot of Custom Neural model analysis using Document Intelligence Studio.":::
Note
To train a custom neural model, set the `buildMode` property to `neural`.
For more information, see Training a neural model.
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom Neural model | The custom neural model is used to extract labeled data from structured (surveys, questionnaires), semi-structured (invoices, purchase orders), and unstructured documents (contracts, letters). | Extract text data, checkboxes, and tabular fields from structured and unstructured documents. | ● Document Intelligence Studio ● REST API ● C# SDK ● Java SDK ● JavaScript SDK ● Python SDK |
[!div class="nextstepaction"] Return to custom model types
:::image type="content" source="media/overview/analyze-custom-template.png" alt-text="Screenshot of Custom Template model analysis using Document Intelligence Studio.":::
Note
To train a custom template model, set the `buildMode` property to `template`.
For more information, see Training a template model.
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom Template model | The custom template model extracts labeled values and fields from structured and semi-structured documents. | Extract key data from highly structured documents and forms that have defined visual templates or common visual layouts. | ● Document Intelligence Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript SDK |
[!div class="nextstepaction"] Return to custom model types
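The notes above say that the `buildMode` property selects between a neural and a template custom model. The following Python helper is a rough sketch of a build request body that carries that property. Only `buildMode` and its two values come from this article. The other field names (`modelId`, `azureBlobSource`, `containerUrl`) are assumptions modeled on the REST build operation, and `build_model_request` is a hypothetical helper name, so confirm the exact schema in the REST API reference for your API version.

```python
import json

# The two build modes named in this article.
VALID_BUILD_MODES = ("template", "neural")


def build_model_request(model_id: str, build_mode: str, container_url: str) -> dict:
    """Assemble a request body for training a custom model.

    Field names other than buildMode are assumptions for illustration.
    """
    if build_mode not in VALID_BUILD_MODES:
        raise ValueError(
            f"build_mode must be one of {VALID_BUILD_MODES}, got {build_mode!r}"
        )
    return {
        "modelId": model_id,              # assumed field name
        "buildMode": build_mode,          # property named in this article
        "azureBlobSource": {              # assumed field name
            "containerUrl": container_url # assumed field name
        },
    }


if __name__ == "__main__":
    body = build_model_request(
        "my-neural-model",  # hypothetical model ID
        "neural",
        "https://example.blob.core.windows.net/training-data",  # placeholder URL
    )
    print(json.dumps(body, indent=2))
```

Validating `build_mode` before sending the request catches a misspelled mode locally instead of waiting for the service to reject it.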
About | Description | Automation use cases | Development options |
---|---|---|---|
Composed custom models | A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. | Useful when you train several models and want to group them to analyze similar form types like purchase orders. | ● Document Intelligence Studio ● REST API ● C# SDK ● Java SDK ● JavaScript SDK ● Python SDK |
[!div class="nextstepaction"] Return to custom model types
:::moniker range=">=doc-intel-3.1.0"
:::image type="content" source="media/overview/custom-classifier-labeling.png" alt-text="Screenshot of Custom classification model labeling in Document Intelligence Studio.":::
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom classification model | Custom classification models combine layout and language features to detect, identify, and classify documents within an input file. | ● A loan application package containing an application form, payslip, and bank statement. ● A collection of scanned invoices. | ● Document Intelligence Studio ● REST API |
[!div class="nextstepaction"] Return to custom model types
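A custom classifier identifies the document type before an extraction model is invoked. The following Python sketch shows one way an application might route a classified document to an extraction model ID. It runs entirely locally and doesn't call the service. The `Classification` type, the class names in `ROUTES`, the confidence threshold, and the fallback to `prebuilt-layout` are all hypothetical choices for illustration; the model IDs themselves are taken from the tables in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """Hypothetical stand-in for one classifier result."""
    doc_type: str
    confidence: float


# Hypothetical mapping from classifier classes to extraction model IDs.
ROUTES = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id": "prebuilt-idDocument",
}


def choose_extraction_model(
    result: Classification,
    routes: dict = ROUTES,
    min_confidence: float = 0.8,       # assumed threshold, tune for your data
    fallback: str = "prebuilt-layout", # assumed fallback for unknown types
) -> str:
    """Pick the extraction model ID for a classified document."""
    if result.confidence < min_confidence:
        # Low-confidence classifications fall back to general extraction.
        return fallback
    return routes.get(result.doc_type, fallback)


if __name__ == "__main__":
    print(choose_extraction_model(Classification("invoice", 0.95)))
    print(choose_extraction_model(Classification("invoice", 0.40)))
```

Falling back to a general extraction model for low-confidence or unknown classes is one design choice among several; an application could instead queue those documents for human review.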
:::moniker-end
:::moniker range="doc-intel-2.1.0"
Azure AI Document Intelligence is a cloud-based Azure AI service for developers to build intelligent document processing solutions. Document Intelligence applies machine-learning-based optical character recognition (OCR) and document understanding technologies to extract text, tables, structure, and key-value pairs from documents. You can also label and train custom models to automate data extraction from structured, semi-structured, and unstructured documents. To learn more about each model, see the Concepts articles:
Model type | Model name |
---|---|
Document analysis model | ● Layout analysis model |
Prebuilt models | ● Invoice model ● Receipt model ● Identity document (ID) model ● Business card model |
Custom models | ● Custom model ● Composed model |
:::moniker-end
:::moniker range="doc-intel-2.1.0"
[!INCLUDE applies to v2.1]
Tip
- For an enhanced experience and advanced model quality, try the Document Intelligence v3.0 Studio:
- The v3.0 Studio supports any model trained with v2.1 labeled data.
- You can refer to the API migration guide for detailed information about migrating from v2.1 to v3.0.
Use the links in the table to learn more about each model and browse the API references:
:::moniker-end
As with all AI services, developers using the Document Intelligence service should be aware of Microsoft policies on customer data. See our Data, privacy, and security for Document Intelligence page.
:::moniker range=">=doc-intel-3.0.0"
- Try processing your own forms and documents with the Document Intelligence Studio.
- Complete a Document Intelligence quickstart and get started creating a document processing app in the development language of your choice.
:::moniker-end
:::moniker range="doc-intel-2.1.0"
- Try processing your own forms and documents with the Document Intelligence Sample Labeling tool.
- Complete a Document Intelligence quickstart and get started creating a document processing app in the development language of your choice.
:::moniker-end