Using Document AI for Invoice Coding
The Invoice Coding API can extract structured invoice data directly from documents using Kaunt Document AI. Instead of mapping and uploading pre-structured invoice data, you submit the document itself and Kaunt handles both extraction and coding in a single flow.
This is the recommended approach for integrating invoices that exist as PDFs, scans, or images. It replaces the previous integration using Azure Form Recognizer.
How It Works
The Invoice Coding API provides dedicated document endpoints that internally invoke Document AI:
- Training data: POST .../postedinvoices/document submits historical invoice documents for training the AI model.
- Coding proposals: POST .../invoicecodingproposals/document/process submits new invoice documents to receive coding proposals.
Both endpoints also have batch variants (/document/createmany and /document/processmany) for submitting multiple documents in a single request.
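The four endpoint paths described above can be collected in a small lookup, sketched here in Python. The path suffixes come from the text, on the reading that each batch variant hangs off the same resource as its single-document endpoint. The prefix is elided in this guide, so the caller supplies it. The helper name, the purpose labels "training" and "proposal", and the example host are illustrative assumptions, not part of the API.

```python
# Path suffixes as given in the text. The prefix before them is elided in the
# source ("...") and is therefore left to the caller.
DOCUMENT_ENDPOINT_SUFFIXES = {
    ("training", False): "/postedinvoices/document",
    ("training", True): "/postedinvoices/document/createmany",
    ("proposal", False): "/invoicecodingproposals/document/process",
    ("proposal", True): "/invoicecodingproposals/document/processmany",
}


def document_endpoint(prefix: str, purpose: str, batch: bool = False) -> str:
    """Join a caller-supplied prefix with the suffix for the given purpose."""
    try:
        suffix = DOCUMENT_ENDPOINT_SUFFIXES[(purpose, batch)]
    except KeyError:
        raise ValueError(f"unknown purpose: {purpose!r}") from None
    return prefix.rstrip("/") + suffix


if __name__ == "__main__":
    # "https://api.example.invalid/base" is a placeholder, not a real Kaunt URL.
    print(document_endpoint("https://api.example.invalid/base", "proposal", batch=True))
```

Keeping the suffixes in one table makes it harder to mix up the training and proposal paths when a client supports both single and batch submission.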
When you submit a document to one of these endpoints, the following happens:
- The document is sent to Document AI using the tenant and company from the request path.
- Document AI extracts structured invoice data from the document.
- The extracted data is mapped into the Kaunt invoice format.
- For proposals, the mapped invoice is run through the Invoice Coding pipeline.
The request body is a DocumentRequestDto containing the document file alongside metadata such as externalInvoiceId and optional vendor/buyer identifiers.
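As a rough illustration of such a request body, the Python sketch below assembles a dictionary that could be serialized to JSON. Only externalInvoiceId is a field name taken from this guide. The keys "file", "vendorId", and "buyerId", and the choice of base64 encoding for the document, are assumptions made for the example. Consult the API reference for the actual DocumentRequestDto schema.

```python
import base64
from typing import Optional


def build_document_request(
    file_bytes: bytes,
    external_invoice_id: str,
    vendor_id: Optional[str] = None,
    buyer_id: Optional[str] = None,
) -> dict:
    """Build a dictionary shaped like a document request (illustrative only)."""
    body = {
        # Field name taken from the text.
        "externalInvoiceId": external_invoice_id,
        # Hypothetical key and encoding: the real schema may differ.
        "file": base64.b64encode(file_bytes).decode("ascii"),
    }
    # Hypothetical keys for the optional vendor/buyer identifiers;
    # they are omitted entirely when not provided.
    if vendor_id is not None:
        body["vendorId"] = vendor_id
    if buyer_id is not None:
        body["buyerId"] = buyer_id
    return body


if __name__ == "__main__":
    request = build_document_request(b"%PDF-1.7 ...", "INV-0001", vendor_id="V-42")
    print(sorted(request))
```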
Response and Document AI Access
The response includes a documentExtractionInfo object with:
- documentId: the Document AI document identifier
- extractionUrl: URL to retrieve the raw extraction result
This allows you to access the full Document AI output via the Document AI APIs, for example to display the extracted content to users or to submit feedback on the extraction.
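A minimal Python sketch of reading these two values from a response follows. It assumes the response is JSON with documentExtractionInfo at the top level. The field names come from the text, while the sample values and the surrounding response shape are placeholders.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentExtractionInfo:
    document_id: str
    extraction_url: str


def parse_extraction_info(response_body: str) -> DocumentExtractionInfo:
    """Extract documentId and extractionUrl from a JSON response body."""
    payload = json.loads(response_body)
    # Assumption: documentExtractionInfo sits at the top level of the response.
    info = payload["documentExtractionInfo"]
    return DocumentExtractionInfo(
        document_id=info["documentId"],
        extraction_url=info["extractionUrl"],
    )


if __name__ == "__main__":
    # Placeholder values, not real identifiers or URLs.
    sample = json.dumps({
        "documentExtractionInfo": {
            "documentId": "doc-123",
            "extractionUrl": "https://api.example.invalid/extractions/doc-123",
        }
    })
    print(parse_extraction_info(sample))
```

Storing the documentId next to your own invoice record makes it possible to fetch the extraction or submit feedback on it later.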
Webhooks
Both Document AI and Invoice Coding webhooks are triggered during processing:
- DocumentAIDocumentResultReady / DocumentAIDocumentResultFailed: when extraction completes
- InvoiceCodingProposalReady / InvoiceCodingProposalFailed: when coding proposals are ready (proposals endpoint only)
For details on subscribing to webhooks, see the Webhook Guide.
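A receiver typically needs to tell the four events apart by processing stage and outcome. The Python sketch below maps each event name from this section to a stage and a success flag. The event names are taken from the text. The stage labels, the helper name, and how the event name is read from the webhook payload are assumptions and depend on the payload format described in the Webhook Guide.

```python
from typing import NamedTuple


class WebhookOutcome(NamedTuple):
    stage: str       # "extraction" or "coding" (labels chosen for this sketch)
    succeeded: bool


# Event names as listed in this guide.
EVENT_OUTCOMES = {
    "DocumentAIDocumentResultReady": WebhookOutcome("extraction", True),
    "DocumentAIDocumentResultFailed": WebhookOutcome("extraction", False),
    "InvoiceCodingProposalReady": WebhookOutcome("coding", True),
    "InvoiceCodingProposalFailed": WebhookOutcome("coding", False),
}


def classify_event(event_name: str) -> WebhookOutcome:
    """Return the processing stage and outcome for a webhook event name."""
    try:
        return EVENT_OUTCOMES[event_name]
    except KeyError:
        raise ValueError(f"unhandled webhook event: {event_name!r}") from None


if __name__ == "__main__":
    outcome = classify_event("InvoiceCodingProposalReady")
    print(outcome.stage, outcome.succeeded)
```

For documents sent to the training endpoint, only the two extraction events are expected, so a handler should not wait for a coding event there.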
Custom Fields and Instructions
Custom fields and instructions configured through the Document AI API apply when documents are processed through the Invoice Coding document endpoints.
Custom fields defined for the tenant or company are extracted from the document and mapped to invoice fields. The Invoice Coding AI model uses these as input alongside standard fields, which means well-defined custom fields can improve coding accuracy.
Instructions created via Document AI feedback also apply during extraction. Any instruction targeting the tenant, company, vendor, or buyer scope is evaluated and applied when the document is processed.
For details on configuring custom fields and writing effective instructions, see the Document AI Integration Guide and the Best Practices for Writing Feedback and Instructions.
Migrating from Azure Form Recognizer
If you are currently using the legacy Document Text Extraction integration (Azure Form Recognizer), the document endpoints described above are a direct replacement. The same /document/ endpoints are used; Document AI is now the extraction service behind them. No changes to your API calls are required, and you gain access to Document AI features such as custom fields, instructions, and feedback.