Search

InfiniFlow Inc.

RAGFlow

Open-source RAG engine with deep document understanding.

Categories: Search, Orchestration
Pricing: Freemium
Source: Open core
Hosting: Hybrid
Platforms: Web, API
Models: Model-agnostic
Verified: Jun 20, 2026

RAGFlow is an open-source retrieval-augmented generation engine that turns complex documents—PDFs, slides, spreadsheets, scans, and web pages—into grounded, citation-backed context for LLMs. Its DeepDoc parser and hybrid vector plus full-text search aim for reliable, hallucination-resistant question answering, and it now bundles an agent-orchestration layer. Self-host the Apache-2.0 engine or use the managed cloud.
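To make "hybrid vector plus full-text search" concrete, here is a minimal sketch of one common way to merge the two result lists: reciprocal rank fusion. It is a generic illustration, not RAGFlow's actual ranking code; the function name, the document IDs, and the constant k=60 are all assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector index and a full-text index.
vector_hits = ["doc_b", "doc_a", "doc_c"]
keyword_hits = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that both retrievers return (doc_a and doc_b here) outrank those found by only one. That is the usual argument for hybrid search over either method alone.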

Pros & cons

Pros:
Apache-2.0, fully self-hostable
Deep document, table, and scan parsing
Citation-backed answers
Hybrid vector + full-text search
Built-in agent orchestration

Cons:
Heavier setup than hosted RAG APIs
Cloud tiers cap apps and storage
Resource-intensive to self-host

RAGFlow

LlamaIndex

Haystack

Ragie

Onyx