Extraction API
Turn messy documents into structured data
Extract the fields you need from PDFs, forms, images, and unstructured documents with a clean API.
Create your account, then enable API access to get your API key and 50 free credits.
What you get
A reliable extraction layer that turns documents into data your workflows can use.
Structured JSON output
Get consistent fields and normalized output your systems can consume.
Deskew and preprocess first
Correct perspective and image quality issues before extraction for better reliability.
Flexible field extraction
Define the fields you need and evolve the schema over time.
Easy workflow integration
Connect extraction results into downstream tools and automations.
PDFs, images, receipts, and forms
Configurable schema
Batch and async patterns
API keys and usage controls
From document to structured output
A simple flow that fits into existing systems.
Send document
Request
Send a PDF, image, or form through a clean API.
Deskew, extract, and normalize
Seconds
Preprocess image quality, extract fields, and normalize formats.
Reduce manual retyping and cleanup across the workflow.
Return structured output
Response
Get structured JSON back and pass it to downstream systems.
FAQ
Quick answers before you start.
What file types are supported?
JPEG, PNG, WEBP, HEIC/HEIF, and PDF.
Send a file URL or base64 to POST /v1/parse/receipt. HEIC/HEIF images convert to PNG before OCR, and tilted photos are deskewed automatically.
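As a rough sketch of what such a request might look like in Python: only the path POST /v1/parse/receipt comes from this page. The host name, the JSON field names (`file_url`, `file_base64`), and the Bearer-token header are assumptions made for illustration, so check the API docs for the real ones.

```python
import base64
import json
import urllib.request

# Assumption: placeholder host. Only the path /v1/parse/receipt is from this page.
BASE_URL = "https://api.example.com"


def build_parse_request(api_key, file_url=None, file_bytes=None):
    """Build (but do not send) a POST request for /v1/parse/receipt.

    Pass either a publicly reachable file URL or the raw file bytes,
    which are base64-encoded into the JSON body.
    """
    if (file_url is None) == (file_bytes is None):
        raise ValueError("provide exactly one of file_url or file_bytes")

    if file_url is not None:
        body = {"file_url": file_url}  # assumed field name
    else:
        encoded = base64.b64encode(file_bytes).decode("ascii")
        body = {"file_base64": encoded}  # assumed field name

    return urllib.request.Request(
        BASE_URL + "/v1/parse/receipt",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
        },
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen(req, timeout=60)` and read the JSON from the response body. Building the request separately from sending it keeps the payload easy to inspect and test.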
What fields do I get back?
Structured JSON: store_name, purchase_date, total, and a line-item array.
Receipts and invoices return fields like store_name, store_address, purchase_date, and total, plus an items array (name, category, price), the raw OCR text, and a requestId for tracing.
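One way to consume that response is to map it onto small typed records. The sketch below uses the field names listed above; the key for the raw OCR text (`raw_text` here) is an assumption, and the sample payload is made up for illustration rather than taken from a real response.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Item:
    name: str
    category: Optional[str]
    price: Optional[float]


@dataclass
class Receipt:
    store_name: Optional[str]
    store_address: Optional[str]
    purchase_date: Optional[str]
    total: Optional[float]
    items: List[Item]
    raw_text: Optional[str]
    request_id: Optional[str]


def parse_receipt(payload: Dict[str, Any]) -> Receipt:
    """Convert a decoded JSON response into a Receipt, tolerating missing fields."""
    items = [
        Item(name=i.get("name", ""), category=i.get("category"), price=i.get("price"))
        for i in payload.get("items") or []
    ]
    return Receipt(
        store_name=payload.get("store_name"),
        store_address=payload.get("store_address"),
        purchase_date=payload.get("purchase_date"),
        total=payload.get("total"),
        items=items,
        raw_text=payload.get("raw_text"),  # assumed key for the raw OCR text
        request_id=payload.get("requestId"),
    )


# Illustrative payload only; values are invented for this example.
sample = {
    "store_name": "Corner Market",
    "store_address": "1 Main St",
    "purchase_date": "2024-05-01",
    "total": 4.75,
    "items": [
        {"name": "Coffee", "category": "Beverages", "price": 3.50},
        {"name": "Bagel", "category": "Bakery", "price": 1.25},
    ],
    "raw_text": "CORNER MARKET ...",
    "requestId": "req_123",
}
receipt = parse_receipt(sample)
```

Keeping `request_id` on the record makes it easy to log alongside any downstream error, which is what the tracing ID is for.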
When do I get results back?
Synchronously, in the same API response.
You POST the file and the structured JSON comes back in the same response, so there's no polling or webhooks to manage. PDFs are processed per page, so larger files take a bit longer.
How does pricing work?
Start with 50 free credits, then pay as you go.
Images cost 1 credit, PDFs cost 1 credit per page, and standalone deskewing costs 0.5 credits. There is no subscription; buy more credits in bundles when you need them.
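For budgeting, the per-item rates above reduce to simple arithmetic. The helper below is a hypothetical sketch (the function name and parameters are not part of the API) that estimates credit usage for a batch.

```python
def estimate_credits(images=0, pdf_pages=0, deskews=0):
    """Estimate credits: 1 per image, 1 per PDF page, 0.5 per standalone deskew."""
    if min(images, pdf_pages, deskews) < 0:
        raise ValueError("counts must be non-negative")
    return images * 1.0 + pdf_pages * 1.0 + deskews * 0.5


# Example: 10 images, PDFs totalling 35 pages, and 4 standalone deskew calls.
needed = estimate_credits(images=10, pdf_pages=35, deskews=4)
remaining_free = 50 - needed  # against the 50 free starting credits
```

In this example the batch needs 47 credits, which leaves 3 of the 50 free credits.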
Start with docs and a working example, then integrate extraction into your workflow.
Includes 50 free credits to get started.