n8n node for Mistral OCR API integration with structured annotations
A powerful n8n community node for document OCR (Optical Character Recognition) using Mistral AI's OCR API. Extract text and structured data from documents with ease!
- Basic OCR: Extract text from documents (PDFs, images)
- Smart Templates: Pre-configured templates for common document types (invoices, contracts, IDs, etc.)
- Custom Fields: Define your own data extraction fields
- Element Analysis: Extract data from charts, tables, and figures
- Advanced Mode: Full JSON schema control for power users
- User-Friendly UI: No JSON knowledge required for basic use
- Invoices/Bills - Extract amounts, dates, customer info
- Letters/Correspondence - Extract sender, recipient, dates, references
- Contracts - Extract parties, dates, amounts, terms
- Receipts - Extract store info, amounts, items
- ID Documents - Extract names, birth dates, ID numbers
- Research Papers - Extract titles, authors, abstracts, keywords
```bash
npm install n8n-nodes-mistral-ocr
```
1. Get your Mistral API key from Mistral AI
2. In n8n, create a new credential of type "Mistral API"
3. Enter your API key
The project has been modularized for better maintainability:
```
nodes/MistralOcr/
├── MistralOcr.node.ts          # Main node implementation
├── types/
│   └── index.ts                # TypeScript type definitions
├── templates/
│   └── documentTemplates.ts    # Predefined document templates
├── utils/
│   ├── nodeProperties.ts       # UI property definitions
│   └── schemaUtils.ts          # Schema helper functions
├── constants/
│   └── defaults.ts             # Default values and constants
└── mistral.svg                 # Node icon
```
- types/: Contains all TypeScript interfaces and type definitions
- templates/: Predefined schemas for common document types (invoices, contracts, etc.)
- utils/: Helper functions for schema building, parsing, and UI configuration
- constants/: Default values, API endpoints, and limits
```json
{
  "contract_value": {
    "type": "number",
    "description": "Total contract value"
  },
  "client_name": {
    "type": "string",
    "description": "Name of the client"
  },
  "due_date": {
    "type": "string",
    "description": "Payment due date"
  }
}
```
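
The field map above is flat, while structured-output APIs generally expect a full JSON Schema object. The following is a minimal sketch of how such a map might be wrapped before it is sent. The names `FieldDefinition`, `FieldMap`, and `buildAnnotationFormat`, as well as the wrapper keys (`type`, `json_schema`, `name`, `strict`), are assumptions for illustration; the node's own `schemaUtils.ts` may do this differently.

```typescript
// Hypothetical shape of one custom field, mirroring the JSON example above.
interface FieldDefinition {
  type: "string" | "number" | "boolean";
  description: string;
}

type FieldMap = Record<string, FieldDefinition>;

// Wraps a flat field map in a JSON Schema object.
// The wrapper keys are an assumption about the request format,
// not something taken from this node's source code.
function buildAnnotationFormat(fields: FieldMap, name = "document_annotation") {
  return {
    type: "json_schema" as const,
    json_schema: {
      name,
      strict: true,
      schema: {
        type: "object" as const,
        properties: fields,
        required: Object.keys(fields), // treat every custom field as required
        additionalProperties: false,
      },
    },
  };
}

const fields: FieldMap = {
  contract_value: { type: "number", description: "Total contract value" },
  client_name: { type: "string", description: "Name of the client" },
  due_date: { type: "string", description: "Payment due date" },
};

const format = buildAnnotationFormat(fields);
console.log(JSON.stringify(format, null, 2));
```

Marking every field as required is a design choice of this sketch; optional fields would need to be left out of `required`.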
- Document Annotations: Maximum 8 pages per request
- File Size: Up to 50MB per document
- Total Pages: Up to 1000 pages per document
- File Expiry: 1-168 hours (default: 24 hours)
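
As an illustration of working within these limits, the sketch below checks a document against the size and page caps and splits a longer document into page ranges of at most 8 pages for annotation requests. The constant and function names are hypothetical, 50MB is assumed to mean 50 × 1024 × 1024 bytes, and whether the node itself splits documents this way is not stated here.

```typescript
// Limits copied from the list above; the names are illustrative only.
const MAX_ANNOTATION_PAGES = 8;
const MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024; // assumes 1 MB = 1024 * 1024 bytes
const MAX_TOTAL_PAGES = 1000;

// 1-based, inclusive page range.
interface PageRange {
  start: number;
  end: number;
}

// Returns a list of human-readable problems; an empty list means the file fits.
function checkLimits(fileSizeBytes: number, totalPages: number): string[] {
  const problems: string[] = [];
  if (fileSizeBytes > MAX_FILE_SIZE_BYTES) {
    problems.push(`File is ${fileSizeBytes} bytes; the limit is ${MAX_FILE_SIZE_BYTES}.`);
  }
  if (totalPages > MAX_TOTAL_PAGES) {
    problems.push(`Document has ${totalPages} pages; the limit is ${MAX_TOTAL_PAGES}.`);
  }
  return problems;
}

// Splits a document into consecutive ranges small enough for one annotation request.
function splitIntoAnnotationBatches(
  totalPages: number,
  batchSize: number = MAX_ANNOTATION_PAGES,
): PageRange[] {
  const batches: PageRange[] = [];
  for (let start = 1; start <= totalPages; start += batchSize) {
    batches.push({ start, end: Math.min(start + batchSize - 1, totalPages) });
  }
  return batches;
}

// A 20-page document yields the ranges 1-8, 9-16, and 17-20.
console.log(splitIntoAnnotationBatches(20));
```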
The node includes intelligent rate limiting and error handling:
- Automatic Retry: 429 errors (rate limits) are automatically retried with exponential backoff
- Smart Backoff: Delays increase exponentially (1s, 2s, 4s) with randomization to avoid thundering herd
- Descriptive Errors: Clear error messages when rate limits are exceeded
- File Validation: Pre-upload validation for file size and format
- Graceful Degradation: Continue-on-fail support for batch processing
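
The retry behaviour described above can be sketched roughly as follows. This is not the node's actual code: `HttpError`, `backoffDelayMs`, and `withRateLimitRetry` are hypothetical names, and the jitter of up to 25% is an assumed value, since only "randomization" is specified.

```typescript
// Hypothetical error type carrying an HTTP status code.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

const BASE_DELAY_MS = 1000;
const MAX_RETRIES = 3;

// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, plus random jitter.
// The `random` parameter exists so the function can be tested deterministically.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const exponential = BASE_DELAY_MS * 2 ** attempt;
  const jitter = random() * exponential * 0.25; // up to 25% extra (assumed fraction)
  return Math.round(exponential + jitter);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `request`, retrying only on HTTP 429 and rethrowing everything else.
async function withRateLimitRetry<T>(request: () => Promise<T>): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (error) {
      const isRateLimit = error instanceof HttpError && error.status === 429;
      if (!isRateLimit || attempt >= MAX_RETRIES) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The jitter spreads out retries from many workflow executions that hit the limit at the same moment, so they do not all retry in lockstep.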
```bash
npm run build
```
```bash
npm run lint
npm run lint:fix
```
The project follows n8n's coding standards:
- TypeScript for all implementations
- ESLint + Prettier for code formatting
- Comprehensive JSDoc documentation
- Modular architecture for maintainability
The node includes comprehensive error handling:
- Invalid JSON schema detection
- API rate limiting awareness
- File upload validation
- Graceful degradation with continue-on-fail
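
To make the first item concrete, here is a minimal sketch of how a custom-field definition could be checked before any request is made. `validateFieldsJson` and `ALLOWED_TYPES` are hypothetical names, and the set of accepted types is an assumption; the node's real validation rules may be stricter or looser.

```typescript
// Assumed set of accepted field types (the standard JSON Schema primitives).
const ALLOWED_TYPES = ["string", "number", "integer", "boolean", "array", "object"];

// Returns a list of problems found in a raw JSON string of field definitions.
// An empty list means the input passed these basic checks.
function validateFieldsJson(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (error) {
    return [`Not valid JSON: ${(error as Error).message}`];
  }

  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["Top level must be a JSON object mapping field names to definitions."];
  }

  const errors: string[] = [];
  for (const [name, definition] of Object.entries(parsed as Record<string, unknown>)) {
    // Non-object definitions simply yield an undefined type and are reported below.
    const type = (definition as { type?: unknown } | null)?.type;
    if (typeof type !== "string" || !ALLOWED_TYPES.includes(type)) {
      errors.push(`Field "${name}" has a missing or unsupported "type".`);
    }
  }
  return errors;
}

console.log(validateFieldsJson('{"client_name": {"type": "string"}}'));
console.log(validateFieldsJson('{"client_name": {"type": "money"}}'));
```

Returning a list of messages instead of throwing on the first problem lets a caller report every invalid field at once.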
Contributions are welcome! The modular structure makes it easy to:
- Add new document templates in `templates/documentTemplates.ts`
- Extend type definitions in `types/index.ts`
- Add utility functions in `utils/`
- Update constants in `constants/defaults.ts`
MIT
- Documentation: GitHub Repository
- Issues: GitHub Issues
- Mistral API: Mistral Documentation