Invoice & PO Data Extractor

Upload an invoice, purchase order, or credit note (PDF or image) and get back structured data: line items, totals, payment terms, and dates. The output follows a strict schema so it can feed directly into your ERP, accounting system, or reconciliation workflow.

You describe it

Inputs

| Field | Type | Details |
| --- | --- | --- |
| Document | File | PDF or image (JPG, PNG) of the invoice, PO, or credit note |
| Document Type | Dropdown | Invoice / Purchase Order / Credit Note |

Output schema

Header

| Field | Format | Notes |
| --- | --- | --- |
| Document Number | Text | Invoice number, PO number, or credit note number |
| Document Date | YYYY-MM-DD | The date printed on the document |
| Vendor Name | Text | The company issuing the document |
| Vendor Address | Text | Full address as shown |
| Vendor Tax ID | Text (optional) | EIN, VAT number, GST number if present |
| Buyer Name | Text | The company receiving the goods/services |
| Buyer Address | Text | Full address as shown |
| Buyer Tax ID | Text (optional) | If present |
| Currency | ISO 4217 code | e.g., USD, EUR, GBP |
| Purchase Order Reference | Text (optional) | PO number referenced on an invoice, if present |

Line items

Each line item contains:

| Field | Format | Notes |
| --- | --- | --- |
| Line Number | Integer | Sequential, starting at 1 |
| Description | Text | Product or service description |
| Quantity | Decimal | Number of units |
| Unit | Text (optional) | e.g., "hours," "units," "licenses" |
| Unit Price | Decimal | Price per unit before tax |
| Tax Rate | Percentage (optional) | Tax rate applied to this line, if shown |
| Tax Amount | Decimal (optional) | Tax for this line item |
| Line Total | Decimal | Quantity x Unit Price (pre-tax) |

Totals

| Field | Format | Notes |
| --- | --- | --- |
| Subtotal | Decimal | Sum of all line totals |
| Tax Total | Decimal | Total tax across all line items |
| Shipping / Handling | Decimal (optional) | If present |
| Discount | Decimal (optional) | If present, shown as a positive number |
| Total Due | Decimal | Final amount owed |

Payment terms

| Field | Format | Notes |
| --- | --- | --- |
| Due Date | YYYY-MM-DD | When payment is due |
| Payment Terms | Text | e.g., "Net 30," "Due on receipt," "2/10 Net 30" |
| Payment Method | Text (optional) | e.g., "Wire transfer," "ACH," "Check" |
| Bank Details | Text (optional) | Account number, routing number, IBAN if provided |
| Early Payment Discount | Text (optional) | e.g., "2% if paid within 10 days" |
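
To make the schema concrete, here is one way the output could be modeled in Python. This is an abridged sketch, not the template's actual output format: the snake_case field names, the class names, and the choice of `Decimal` for monetary values are all assumptions made for illustration, and several optional fields (addresses, tax IDs, bank details) are omitted for brevity.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class LineItem:
    # Field names are hypothetical; they mirror the "Line items" table.
    line_number: int
    description: str
    quantity: Decimal
    unit_price: Decimal
    line_total: Decimal                   # Quantity x Unit Price (pre-tax)
    unit: Optional[str] = None
    tax_rate: Optional[Decimal] = None    # assumed stored as a percentage, e.g. 20 for 20%
    tax_amount: Optional[Decimal] = None


@dataclass
class Totals:
    subtotal: Decimal
    tax_total: Decimal
    total_due: Decimal
    shipping_handling: Optional[Decimal] = None
    discount: Optional[Decimal] = None    # positive number when present


@dataclass
class ExtractedDocument:
    document_type: str                    # "Invoice", "Purchase Order", or "Credit Note"
    document_number: str
    document_date: str                    # YYYY-MM-DD
    vendor_name: str
    buyer_name: str
    currency: str                         # ISO 4217 code
    line_items: List[LineItem] = field(default_factory=list)
    totals: Optional[Totals] = None
    due_date: Optional[str] = None        # YYYY-MM-DD
    payment_terms: Optional[str] = None
    validation_warnings: List[str] = field(default_factory=list)
    extraction_notes: List[str] = field(default_factory=list)
```

Using `Decimal` rather than `float` keeps currency arithmetic exact, which matters once totals are compared against each other.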

Validation rules

These checks run automatically against the extracted data:

  • Line items must sum to subtotal. If the sum of all Line Total values doesn't equal Subtotal (within a $0.02 rounding tolerance), flag the discrepancy in a validation_warnings field.

  • Tax calculation check. If individual tax rates are present, verify that each line's Tax Amount equals Line Total x Tax Rate (within the same rounding tolerance).

  • Total Due check. Verify that Subtotal + Tax Total + Shipping / Handling - Discount = Total Due. Flag any mismatch.

  • Date sanity. Due Date should be on or after Document Date. If it's before, flag it.

  • Currency consistency. All monetary values should be in the same currency. If mixed currencies appear, flag it.
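
The five checks above could be sketched roughly as follows. This is an illustration under stated assumptions rather than the template's implementation: the dictionary keys, the `validate` function name, the percentage representation of Tax Rate, reusing the 0.02 tolerance for the tax check, and the optional per-line `currency` key are all hypothetical.

```python
from datetime import date
from decimal import Decimal

TOLERANCE = Decimal("0.02")  # rounding tolerance named in the first rule


def validate(doc: dict) -> list:
    """Return warning strings, e.g. to store in validation_warnings."""
    warnings = []
    items, totals = doc["line_items"], doc["totals"]

    # 1. Line totals must sum to Subtotal, within the tolerance.
    line_sum = sum((item["line_total"] for item in items), Decimal("0"))
    if abs(line_sum - totals["subtotal"]) > TOLERANCE:
        warnings.append(f"Line totals sum to {line_sum}; Subtotal is {totals['subtotal']}")

    # 2. Per-line tax, checked only where both rate and amount were extracted.
    #    Assumes tax_rate is a percentage (10 means 10%).
    for item in items:
        rate, tax = item.get("tax_rate"), item.get("tax_amount")
        if rate is None or tax is None:
            continue
        expected = item["line_total"] * rate / Decimal("100")
        if abs(expected - tax) > TOLERANCE:
            warnings.append(f"Line {item['line_number']}: tax is {tax}, expected {expected}")

    # 3. Total Due; missing optional amounts are treated as zero (an assumption).
    shipping = totals.get("shipping") or Decimal("0")
    discount = totals.get("discount") or Decimal("0")
    expected_due = totals["subtotal"] + totals["tax_total"] + shipping - discount
    if expected_due != totals["total_due"]:
        warnings.append(f"Total Due is {totals['total_due']}, expected {expected_due}")

    # 4. Due Date must not precede Document Date (skipped if either is missing).
    if doc.get("document_date") and doc.get("due_date"):
        if date.fromisoformat(doc["due_date"]) < date.fromisoformat(doc["document_date"]):
            warnings.append("Due Date is before Document Date")

    # 5. Currency consistency, assuming lines may carry their own currency code.
    currencies = {doc["currency"]} | {i["currency"] for i in items if i.get("currency")}
    if len(currencies) > 1:
        warnings.append(f"Mixed currencies: {sorted(currencies)}")

    return warnings
```

Each rule appends a message instead of raising, so a single pass reports every discrepancy at once.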

Handling poor-quality documents

  • If the document is a scanned image with low resolution, extract what you can and set a confidence field to low on any values you're uncertain about.

  • If a field is completely unreadable, set its value to null and add a note in extraction_notes explaining what couldn't be read.

  • If the document contains handwritten annotations or corrections, extract the corrected value and note the original in extraction_notes.
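
As a rough illustration of these three rules, the sketch below records each extracted field with an optional confidence marker and collects notes alongside. The `record_field` helper, the `fields` container, and the sample values are hypothetical; only the `confidence` and `extraction_notes` names and the `low` and null values come from the rules above.

```python
from typing import Any, Optional


def record_field(result: dict, name: str, value: Any,
                 confidence: Optional[str] = None,
                 note: Optional[str] = None) -> None:
    """Store one extracted value, plus an optional confidence flag and note."""
    entry = {"value": value}
    if confidence is not None:
        entry["confidence"] = confidence
    result["fields"][name] = entry
    if note is not None:
        result["extraction_notes"].append(f"{name}: {note}")


result = {"fields": {}, "extraction_notes": []}

# Low-resolution scan: keep the best reading, but mark it as uncertain.
record_field(result, "vendor_name", "Acme Ltd", confidence="low")

# Completely unreadable: null value plus an explanation.
record_field(result, "vendor_tax_id", None, note="smudged; could not be read")

# Handwritten correction: keep the corrected value, note the original.
record_field(result, "quantity", "12", note="handwritten correction; printed value was 10")
```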

Edge cases

  • Multi-page documents: extract data across all pages. Line items often span pages.

  • Documents with no line items (e.g., a flat-fee invoice with just a total): create a single line item with the description and total.

  • Credit notes: amounts should still be expressed as positive numbers, but the Document Type field distinguishes this from an invoice.

  • Multiple tax rates: handle per-line-item tax rates. If different lines have different rates, capture each rate individually.

  • Foreign currency with conversion: if the document shows both original and converted currency amounts, extract both and note which is primary.
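
Two of these cases lend themselves to a short sketch: synthesizing a line item for a flat-fee document and keeping credit note amounts positive. The function names and dictionary keys are hypothetical, and setting Quantity to 1 with Unit Price equal to the total is an assumption; the rule itself only asks for the description and total.

```python
from decimal import Decimal


def flat_fee_line_item(description: str, total: Decimal) -> dict:
    """Build the single line item for a document that lists only a total."""
    return {
        "line_number": 1,
        "description": description,
        "quantity": Decimal("1"),   # assumed: one unit at the full amount
        "unit_price": total,
        "line_total": total,
    }


def as_positive(amount: Decimal) -> Decimal:
    """Credit note amounts stay positive; Document Type carries the meaning."""
    return abs(amount)
```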

We build it

Extract Data

Upload an invoice, purchase order, or credit note file and extract structured header, line item, totals, payment terms, and validation metadata suitable for ERP or accounting workflows.

Document Upload & Settings

Upload the invoice, purchase order, or credit note and specify its type.
