Vision | Allen Institute for AI

olmOCR

Open-source OCR that converts PDFs and scans into clean, structured text.

Categories: Vision, Data Ops
Pricing: FREE
Source: Open source
Hosting: Self-host
Platforms: CLI, API, Web
Models: Self-contained (on-device)
Verified: Jun 19, 2026

olmOCR is an open-source toolkit from the Allen Institute for AI that turns PDFs and document images into clean plain text in reading order, preserving tables, equations, and handwriting. It runs a fine-tuned 7B vision-language model with a prompting technique called document anchoring, and is built for low-cost, dataset-scale conversion for LLM training and retrieval. It is released with model weights, training data, and inference code, and runs on your own GPUs or via third-party inference providers.
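To make the document-anchoring idea concrete, here is a minimal sketch. It assumes that anchoring means supplementing the page image with text snippets and their positions taken from the PDF's embedded text layer, all placed in the prompt. The `TextBlock` type, the `build_anchor_prompt` function, the prompt layout, and the `max_chars` budget are hypothetical names and choices for illustration; this is not olmOCR's actual API or prompt format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical record for one snippet from a PDF's embedded text layer."""
    x: float  # left edge, in page units
    y: float  # distance from the top of the page, in page units
    text: str


def build_anchor_prompt(blocks, page_width, page_height, max_chars=4000):
    """Build a text prompt listing positioned snippets, within a size budget.

    Assumption: the returned string would be sent to a vision-language model
    together with a rendered image of the same page.
    """
    header = f"Page dimensions: {page_width:.1f}x{page_height:.1f}"
    lines = [header]
    used = len(header)
    # Naive ordering: top-to-bottom, then left-to-right. Real layouts
    # (multi-column pages, sidebars) need something smarter than this.
    for block in sorted(blocks, key=lambda b: (b.y, b.x)):
        line = f"[{block.x:.0f}x{block.y:.0f}]{block.text.strip()}"
        # Stop once adding this line (plus its newline) would exceed the budget.
        if used + len(line) + 1 > max_chars:
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)


# Example: two snippets on a US Letter page (612x792 points).
blocks = [TextBlock(72, 140, "Body text"), TextBlock(72, 90, "Title")]
prompt = build_anchor_prompt(blocks, 612, 792)
```

The character budget keeps the prompt bounded on text-dense pages; scanned pages with no embedded text layer would simply produce a prompt containing only the header line, leaving the model to rely on the image.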

Pros & cons

Pros:
Fully open source (Apache 2.0)
Strong accuracy on complex layouts
Very low cost to run at scale
Handles tables, equations, handwriting
Self-hostable, data stays on your infra

Cons:
Requires a capable GPU to self-host
Not a turnkey hosted product
Built for batch, dataset-scale workflows

olmOCR

Docling

Reducto

LlamaParse

Mindee