Extraction API

Turn messy documents into structured data

Extract the fields you need from PDFs, forms, images, and unstructured documents with a clean API.

Create your account, then enable API access to get your API key and 50 free credits.

What you get

A reliable extraction layer that turns documents into data your workflows can use.

Structured JSON output

Get consistent fields and normalized output your systems can consume.

Deskew and preprocess first

Correct perspective and image quality issues before extraction for better reliability.

Flexible field extraction

Define the fields you need and evolve the schema over time.

Easy workflow integration

Connect extraction results into downstream tools and automations.

PDFs, images, receipts, and forms

Configurable schema

Batch and async patterns

API keys and usage controls

From document to structured output

A simple flow that fits into existing systems.

Send document

Request

Send a PDF, image, or form through a clean API.

Deskew, extract, and normalize

Seconds

Preprocess image quality, extract fields, and normalize formats.

Reduce manual retyping and cleanup across the workflow.

Return structured output

Response

Get structured JSON back and pass it to downstream systems.

FAQ

Quick answers before you start.

What file types are supported?

JPEG, PNG, WEBP, HEIC/HEIF, and PDF.

Send a file URL or a base64-encoded file to POST /v1/parse/receipt. HEIC/HEIF images are converted to PNG before OCR, and tilted photos are deskewed automatically.
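As a rough illustration of the request described above, here is a minimal Python sketch. Only the POST /v1/parse/receipt path comes from this page. The host name, the bearer-token header, and the file_url / file_base64 body keys are assumptions made for the example, so check the API docs for the real names.

```python
import base64
import json
import urllib.request

# Assumptions, not taken from the product docs: the host, the bearer-token
# header, and the "file_url" / "file_base64" body keys are placeholders.
API_BASE = "https://api.example.com"
PARSE_RECEIPT_PATH = "/v1/parse/receipt"  # endpoint path as named in the FAQ


def build_parse_request(api_key, *, file_url=None, file_bytes=None):
    """Build (but do not send) a POST request for one document."""
    if (file_url is None) == (file_bytes is None):
        raise ValueError("pass exactly one of file_url or file_bytes")
    if file_url is not None:
        body = {"file_url": file_url}
    else:
        # Raw bytes are base64-encoded so they can travel inside JSON.
        body = {"file_base64": base64.b64encode(file_bytes).decode("ascii")}
    return urllib.request.Request(
        API_BASE + PARSE_RECEIPT_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_receipt(api_key, **source):
    """Send the request and return the decoded JSON response."""
    request = build_parse_request(api_key, **source)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Building the request separately from sending it makes it easy to inspect the payload before spending a credit.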

What fields do I get back?

Structured JSON: store_name, purchase_date, total, and a line-item array.

Receipts and invoices return fields like store_name, store_address, purchase_date, and total, plus an items array (name, category, price), the raw OCR text, and a requestId for tracing.
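To show how those fields might be consumed, here is a small Python sketch that maps a response onto typed objects. The field names come from the answer above. The value types (numeric totals and prices, string dates) and the sample payload are assumptions for illustration, and the raw OCR text is left out because its key name is not given here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class LineItem:
    name: str
    category: str
    price: float


@dataclass
class Receipt:
    store_name: str
    store_address: str
    purchase_date: str
    total: float
    items: list[LineItem]
    request_id: str


def receipt_from_json(payload: dict[str, Any]) -> Receipt:
    """Map the response fields named in the FAQ onto typed objects."""
    items = [
        LineItem(name=i["name"], category=i["category"], price=float(i["price"]))
        for i in payload.get("items", [])
    ]
    return Receipt(
        store_name=payload["store_name"],
        store_address=payload["store_address"],
        purchase_date=payload["purchase_date"],
        total=float(payload["total"]),
        items=items,
        request_id=payload["requestId"],
    )


# Made-up sample payload, for illustration only (not real API output).
sample = {
    "store_name": "Corner Market",
    "store_address": "1 Example St",
    "purchase_date": "2024-01-15",
    "total": 7.5,
    "items": [
        {"name": "Coffee", "category": "Beverages", "price": 4.5},
        {"name": "Bagel", "category": "Bakery", "price": 3.0},
    ],
    "requestId": "req_example",
}

receipt = receipt_from_json(sample)
# Simple sanity check: do the line items add up to the reported total?
items_sum = sum(item.price for item in receipt.items)
print(receipt.store_name, receipt.total, items_sum == receipt.total)
```

Keeping the requestId on the parsed object makes it easier to trace a specific extraction later.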

When do I get results back?

Synchronously, in the same API response.

You POST the file and the structured JSON comes back in the same response, so there's no polling or webhooks to manage. PDFs are processed per page, so larger files take a bit longer.

How does pricing work?

Start with 50 free credits, then pay as you go.

Images cost 1 credit, PDFs cost 1 credit per page, and standalone deskewing costs 0.5 credits. There is no subscription: buy credit bundles when you need more.
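The credit arithmetic above can be sketched in a few lines of Python. The rates and the 50 free credits come from this page. The function name and its parameters are hypothetical helpers, not part of the API.

```python
from decimal import Decimal

# Rates as stated in the pricing answer above.
IMAGE_CREDITS = Decimal("1")
PDF_CREDITS_PER_PAGE = Decimal("1")
STANDALONE_DESKEW_CREDITS = Decimal("0.5")
FREE_CREDITS = Decimal("50")


def estimate_credits(images=0, pdf_pages=0, standalone_deskews=0):
    """Estimate the credits used by a batch of documents."""
    return (
        images * IMAGE_CREDITS
        + pdf_pages * PDF_CREDITS_PER_PAGE
        + standalone_deskews * STANDALONE_DESKEW_CREDITS
    )


# Example: 10 images, one 12-page PDF, and 4 standalone deskew calls.
used = estimate_credits(images=10, pdf_pages=12, standalone_deskews=4)
remaining = FREE_CREDITS - used
print(used, remaining)
```

Here the batch uses 10 + 12 + 2 = 24 credits, which leaves 26 of the 50 free credits.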

Ready to test with your own documents?

Start with docs and a working example, then integrate extraction into your workflow.

Includes 50 free credits to get started.