n8n community node for Azure Document Intelligence (Form Recognizer)
This is an n8n community node that integrates Azure Document Intelligence (formerly Form Recognizer) into your n8n workflows.
Azure Document Intelligence is a cloud-based service that uses machine learning models to extract text, key-value pairs, tables, and structure from documents. It is suited to automated document processing, form recognition, invoice extraction, and OCR tasks.
n8n is a fair-code licensed workflow automation platform.
This is an unofficial community node and is not affiliated with, endorsed by, or supported by Microsoft Corporation or n8n GmbH.
Azure, Azure Document Intelligence, Form Recognizer, and related trademarks are property of Microsoft Corporation. Users must comply with Microsoft's Azure AI Services terms and conditions.
This package is provided "as is" under the MIT License without warranty of any kind.
- Installation
- Features
- Credentials
- Usage
- Supported Models
- Parameters
- Multiple Outputs
- Examples
- Resources
- Version History
## Installation

Follow the installation guide in the n8n community nodes documentation.

```bash
npm install n8n-nodes-azure-document-intelligence
```

```bash
# Clone this repository
git clone https://github.com/mlangcode/n8n-nodes-azure-document-intelligence.git
cd n8n-nodes-azure-document-intelligence
```

## Features

- ✅ Multiple Prebuilt Models: Support for 9 prebuilt models (read, layout, invoice, receipt, ID, business card, etc.)
- ✅ Flexible Input: Binary data, URL, or base64-encoded content
- ✅ Three Outputs: Separate outputs for content, structured data, and tables
- ✅ Markdown Support: Extract documents in markdown or plain text format
- ✅ Table Processing: Automatically identifies headers and converts tables to structured data
- ✅ Page Selection: Analyze specific pages from multi-page documents
- ✅ Locale Support: Specify language hints for better recognition
- ✅ Long-Running Operations: Automatic polling for document analysis completion
- ✅ Error Handling: Comprehensive error messages and validation
- ✅ Binary Data Support: Seamlessly integrate with n8n's binary data field
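
The long-running-operation handling listed above can be pictured as a polling loop. Azure's analyze request returns `202 Accepted` with an `Operation-Location` header, and the result is fetched by polling that URL until the status is `succeeded` or `failed`. The sketch below is not the node's actual code: `pollUntilDone`, `getStatus`, and `sleep` are hypothetical names, and the status fetcher is injected so the loop stays independent of any HTTP client.

```typescript
// Minimal shape of the operation status document returned while polling.
interface OperationStatus {
  status: string; // Azure reports e.g. "notStarted", "running", "succeeded", "failed"
  analyzeResult?: unknown;
  error?: { message?: string };
}

// Poll until the operation finishes, fails, or the attempt budget runs out.
// getStatus and sleep are injected (hypothetical) so no HTTP client is assumed.
async function pollUntilDone(
  getStatus: () => Promise<OperationStatus>,
  sleep: (ms: number) => Promise<void>,
  options: { intervalMs: number; maxAttempts: number },
): Promise<unknown> {
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    const operation = await getStatus();
    if (operation.status === "succeeded") {
      return operation.analyzeResult;
    }
    if (operation.status === "failed") {
      const message = operation.error && operation.error.message;
      throw new Error(message || "Document analysis failed");
    }
    // Any other status means the analysis is still in progress.
    await sleep(options.intervalMs);
  }
  throw new Error("Timed out waiting for document analysis to complete");
}
```

Capping the number of attempts keeps a stuck operation from blocking a workflow indefinitely; the interval and the cap shown here are placeholders, not the node's settings.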
## Credentials

This node uses Azure Document Intelligence credentials with the following fields:

- Endpoint: Your Azure Document Intelligence endpoint URL (e.g., `https://your-resource.cognitiveservices.azure.com`)
- API Key: Your Azure Document Intelligence subscription key
- API Version: The API version to use (default: `2024-11-30`)

To set up the credentials:

1. In n8n, go to Credentials → New
2. Search for "Azure Document Intelligence"
3. Fill in your endpoint URL and API key
4. Click Save
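
For reference, the three credential fields map onto an Azure REST call roughly as follows. This is a minimal sketch rather than the node's implementation: `buildAnalyzeRequest` and `normalizeEndpoint` are hypothetical helpers, and the URL path assumes the `2024-11-30` API version (older API versions use a different path).

```typescript
// Mirrors the credential fields described above (property names are illustrative).
interface DocumentIntelligenceCredentials {
  endpoint: string; // e.g. https://your-resource.cognitiveservices.azure.com
  apiKey: string;
  apiVersion: string; // e.g. 2024-11-30
}

// Reject endpoints that do not start with https:// and drop trailing slashes.
function normalizeEndpoint(endpoint: string): string {
  const trimmed = endpoint.trim();
  if (!/^https:\/\//i.test(trimmed)) {
    throw new Error("Endpoint must start with https://");
  }
  return trimmed.replace(/\/+$/, "");
}

// Build the analyze URL and the authentication header for one model ID.
function buildAnalyzeRequest(
  creds: DocumentIntelligenceCredentials,
  modelId: string,
): { url: string; headers: { [name: string]: string } } {
  const base = normalizeEndpoint(creds.endpoint);
  const url =
    base +
    "/documentintelligence/documentModels/" +
    encodeURIComponent(modelId) +
    ":analyze?api-version=" +
    encodeURIComponent(creds.apiVersion);
  // Azure AI services authenticate key-based calls with this header.
  return { url, headers: { "Ocp-Apim-Subscription-Key": creds.apiKey } };
}
```

Checking the endpoint up front catches the most common misconfiguration (a missing `https://` or a stray trailing slash) before any request is sent.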
## Usage

1. Add the "Azure Document Intelligence" node to your workflow
2. Configure your Azure Document Intelligence credentials
3. Select the appropriate prebuilt model for your document type
4. Choose input source (binary data, URL, or base64)
5. Configure additional options as needed
The node subtitle will display the selected model for easy identification.
## Supported Models

The node supports the following prebuilt models:

- Read (OCR): Basic optical character recognition for extracting printed and handwritten text
- Layout: Extract text, tables, selection marks, and document structure
- General Document: Extract key-value pairs, entities, and general structure from any document type
- Invoice: Extract vendor name, invoice date, total, line items, and other invoice fields
- Receipt: Extract merchant name, transaction date, total, and line items from receipts
- ID Document: Extract information from passports, driver's licenses, and identity cards
- Business Card: Extract contact information including names, companies, emails, and phone numbers
- Health Insurance Card (US): Extract member information, group numbers, and insurance details
- W-2 Tax Form (US): Extract employer information, wages, and tax withholding data

## Parameters

- Model: Select the prebuilt model appropriate for your document type
- Input Source: Choose how to provide the document:
  - Binary Data: Use the document from a previous node's binary field
  - URL: Provide a public URL to the document
  - Base64: Provide base64-encoded document content
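
As a rough illustration of what these two parameters turn into, the sketch below maps the model names used in this README to Azure's prebuilt model IDs and builds a request payload for each input source. The model IDs and the `urlSource` / `base64Source` body fields come from Azure's REST API; whether the node uses this exact table or structure internally is an assumption, and `MODEL_IDS`, `InputSource`, and `buildAnalyzePayload` are hypothetical names.

```typescript
// Display name -> Azure prebuilt model ID (the mapping inside the node is assumed).
const MODEL_IDS: { [displayName: string]: string } = {
  "Read (OCR)": "prebuilt-read",
  "Layout": "prebuilt-layout",
  "General Document": "prebuilt-document",
  "Invoice": "prebuilt-invoice",
  "Receipt": "prebuilt-receipt",
  "ID Document": "prebuilt-idDocument",
  "Business Card": "prebuilt-businessCard",
  "Health Insurance Card (US)": "prebuilt-healthInsuranceCard.us",
  "W-2 Tax Form (US)": "prebuilt-tax.us.w2",
};

// The three input sources, modeled as a discriminated union.
type InputSource =
  | { kind: "binary"; data: Uint8Array; mimeType: string }
  | { kind: "url"; url: string }
  | { kind: "base64"; content: string };

interface AnalyzePayload {
  contentType: string;
  body: Uint8Array | string;
}

// Binary data is sent as raw bytes; URL and base64 inputs are wrapped in JSON.
function buildAnalyzePayload(source: InputSource): AnalyzePayload {
  switch (source.kind) {
    case "binary":
      return { contentType: source.mimeType, body: source.data };
    case "url":
      return {
        contentType: "application/json",
        body: JSON.stringify({ urlSource: source.url }),
      };
    case "base64":
      return {
        contentType: "application/json",
        body: JSON.stringify({ base64Source: source.content }),
      };
  }
}
```

The URL variant only works when Azure can fetch the document itself, which is why the parameter asks for a public URL.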
#### Binary Data

- Binary Property: Name of the binary property (default: `data`)

#### URL

- Document URL: Public URL to the document
#### Base64
- Base64 Content: Base64-encoded string of the document
### Additional Options

- Content Type: Specify the document MIME type (PDF, JPEG, PNG, TIFF, BMP, HEIF)
- Output Content Format: Choose between `text` and `markdown` for extracted content (for read/layout models)
- Pages: Specify which pages to analyze (e.g., `1-3,5` or `1,3,5-7`)
- Locale: Language hint for text recognition (e.g., `en-US`, `de-DE`, `fr-FR`)

## Multiple Outputs

The node has three outputs for flexible workflow routing:
### Content Output

Contains: Raw text or markdown content extracted from the document

```json
{
  "content": "# Invoice\n\nVendor: Acme Corp...",
  "contentLength": 1234,
  "model": "prebuilt-layout"
}
```

Use this for:
- Text extraction and OCR workflows
- Full document content for further processing
- Feeding to LLMs or text analysis nodes
### Structured Data Output

Contains: Extracted fields, key-value pairs, and structured information

```json
{
  "model": "prebuilt-invoice",
  "pageCount": 2,
  "documents": [
    {
      "docType": "invoice",
      "fields": {
        "VendorName": { "content": "Acme Corp", "confidence": 0.99 },
        "InvoiceTotal": { "content": "1,234.56", "confidence": 0.98 },
        "InvoiceDate": { "content": "2024-01-15", "confidence": 0.97 }
      }
    }
  ],
  "pages": [...]
}
```

Use this for:
- Extracting specific fields (invoice data, receipt information)
- Key-value pair extraction
- Document field validation and processing
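
A downstream step will often keep only the fields whose confidence clears a threshold. The following sketch works on the structured output shape shown above; `pickConfidentFields` and the interface names are hypothetical, and the threshold is something to tune per document type rather than a recommended value.

```typescript
// Shapes taken from the sample structured output above.
interface ExtractedField {
  content?: string;
  confidence?: number;
}

interface StructuredOutput {
  model: string;
  pageCount?: number;
  documents: { docType: string; fields: { [name: string]: ExtractedField } }[];
}

// Flatten the first document's fields to name -> content, dropping any field
// that has no content or whose confidence is below minConfidence.
function pickConfidentFields(
  output: StructuredOutput,
  minConfidence: number,
): { [name: string]: string } {
  const result: { [name: string]: string } = {};
  const firstDocument = output.documents[0];
  if (!firstDocument) {
    return result;
  }
  for (const name of Object.keys(firstDocument.fields)) {
    const field = firstDocument.fields[name];
    const confidence = field.confidence === undefined ? 0 : field.confidence;
    if (field.content !== undefined && confidence >= minConfidence) {
      result[name] = field.content;
    }
  }
  return result;
}
```

With the sample above and a threshold of `0.98`, `VendorName` and `InvoiceTotal` are kept while `InvoiceDate` (0.97) is dropped and could be routed to manual review.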
### Tables Output

Contains: Processed tables with identified headers and structured row data

```json
{
  "tableCount": 2,
  "tables": [
    {
      "headers": ["Item", "Quantity", "Price", "Total"],
      "dataRows": [
        { "Item": "Widget A", "Quantity": "5", "Price": "$10.00", "Total": "$50.00" },
        { "Item": "Widget B", "Quantity": "3", "Price": "$15.00", "Total": "$45.00" }
      ]
    }
  ],
  "model": "prebuilt-layout"
}
```

Use this for:
- Extracting tabular data from documents
- Processing invoice line items
- Converting document tables to structured data for databases
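
Spreadsheet and database nodes usually expect rows as arrays in a fixed column order. Assuming the tables output shape shown above, a small helper (the name `tableToRows` is hypothetical) can turn one table into a header row followed by data rows:

```typescript
// Shape of one entry in the "tables" array from the sample output above.
interface ProcessedTable {
  headers: string[];
  dataRows: { [header: string]: string }[];
}

// Return the header row followed by one array per data row, using the header
// order for every row and an empty string for any missing cell.
function tableToRows(table: ProcessedTable): string[][] {
  const rows: string[][] = [table.headers.slice()];
  for (const dataRow of table.dataRows) {
    rows.push(
      table.headers.map((header) => {
        const cell = dataRow[header];
        return cell === undefined ? "" : cell;
      }),
    );
  }
  return rows;
}
```

Driving the column order from `headers` rather than from each row's keys keeps the columns aligned even when a row is missing a cell.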
### Error Handling

When errors occur (and "Continue on Fail" is enabled):
- Error details are sent to all three outputs
- Includes HTTP status codes and error messages
- Workflow continues instead of stopping
## Examples

Workflow:

```
HTTP Request (download PDF)
  → Azure Document Intelligence
      Model: Read (OCR)
      Input Source: Binary Data
      Binary Property: data
  → [Content Output] → Process extracted text
```

---
Workflow:

```
HTTP Request (get invoice PDF)
  → Azure Document Intelligence
      Model: Invoice
      Input Source: Binary Data
  → [Structured Data Output]
  → Code Node: Extract $.documents[0].fields
  → Store in database
```

Extracted Fields:
- VendorName
- CustomerName
- InvoiceDate
- InvoiceTotal
- DueDate
- Line items
---
Workflow:

```
Read Binary File (read document)
  → Azure Document Intelligence
      Model: Layout
      Input Source: Binary Data
      Output Format: Markdown
  → [Tables Output]
  → Code Node: Process table rows
  → Send to Google Sheets
```

---
Workflow:

```
Azure Document Intelligence
    Model: Read (OCR)
    Input Source: URL
    Document URL: https://example.com/document.pdf
  → [Content Output]
  → Send extracted text to analysis
```

---
Workflow:

```
Webhook (receive uploaded image)
  → Azure Document Intelligence
      Model: Business Card
      Input Source: Binary Data
  → [Structured Data Output]
  → Extract contact fields:
      - Name
      - Company
      - Email
      - Phone
  → Add to CRM
```

---
Workflow:

```
Azure Document Intelligence
    Model: Layout
    Input Source: Binary Data
    Pages: 1-5,10
    Output Format: Markdown
  → Process only specified pages
```

---
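
The `Pages` value in the page-selection example above (`1-5,10`) follows a simple comma-and-range syntax. If the value is built dynamically in a workflow, it can help to validate it before calling the node. The parser below is a sketch of that syntax as described in this README; `parsePages` is a hypothetical helper and not part of the node.

```typescript
// Expand a page specification such as "1-5,10" into a sorted list of unique
// 1-based page numbers, throwing on anything that is not N or N-M.
function parsePages(spec: string): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    const match = /^\s*(\d+)(?:\s*-\s*(\d+))?\s*$/.exec(part);
    if (!match) {
      throw new Error('Invalid page range: "' + part + '"');
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error('Invalid page range: "' + part + '"');
    }
    for (let page = start; page <= end; page++) {
      pages.add(page);
    }
  }
  return Array.from(pages).sort((a, b) => a - b);
}
```

For example, `parsePages("1-5,10")` yields pages 1 through 5 plus page 10, while an input such as `"0-2"` or `"a"` raises an error instead of reaching the API.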
Workflow:

```
Email Trigger (receipt attachments)
  → Azure Document Intelligence
      Model: Receipt
      Input Source: Binary Data
  → [Structured Data Output]
  → Extract:
      - MerchantName
      - TransactionDate
      - Total
      - Items
  → Log to expense tracking system
```

---
## Resources

- n8n community nodes documentation
- Azure Document Intelligence documentation
- Azure Document Intelligence models
- Azure AI Services
## Compatibility

- Requires n8n version 1.60.0 or later
- Compatible with Azure Document Intelligence API version 2024-11-30 (GA)
- Supports the nine prebuilt models listed under Supported Models
## Supported Document Types

- PDF (application/pdf)
- JPEG (image/jpeg)
- PNG (image/png)
- TIFF (image/tiff)
- BMP (image/bmp)
- HEIF (image/heif)
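
When the Content Type option has to be set from a file name, a lookup like the one below covers the types listed above. It is only a sketch: `mimeTypeFor` is a hypothetical helper, and the extension list (including `.tif` and `.jpg` as aliases) is an assumption about which file names a workflow is likely to see.

```typescript
// File extension -> MIME type for the supported document types listed above.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["pdf", "application/pdf"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["tif", "image/tiff"],
  ["tiff", "image/tiff"],
  ["bmp", "image/bmp"],
  ["heif", "image/heif"],
]);

// Return the MIME type for a file name, or undefined if the extension is
// missing or not one of the supported types.
function mimeTypeFor(fileName: string): string | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return undefined;
  }
  return MIME_BY_EXTENSION.get(fileName.slice(dot + 1).toLowerCase());
}
```

Returning `undefined` for unknown extensions lets a workflow branch on unsupported files instead of sending them to Azure.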
## Troubleshooting

- Verify your API key is correct
- Ensure the endpoint URL is correct and includes `https://`

Contributions are welcome! Please feel free to submit a Pull Request.
mlangcode
For issues, questions, or contributions, please visit the GitHub repository.