Search

InfiniFlow Inc.

RAGFlow

Open-source RAG engine with deep document understanding.

Pricing: FREEMIUM
Source: Open core
Hosting: Hybrid
Platforms: Web, API
Models: Model-agnostic
Verified: Jun 20, 2026

RAGFlow is an open-source retrieval-augmented generation engine that turns complex documents—PDFs, slides, spreadsheets, scans, and web pages—into grounded, citation-backed context for LLMs. Its DeepDoc parser and hybrid vector plus full-text search aim for reliable, hallucination-resistant question answering, and it now bundles an agent-orchestration layer. Self-host the Apache-2.0 engine or use the managed cloud.

Pros & cons

  Pros
  • Apache-2.0, fully self-hostable
  • Deep document, table, and scan parsing
  • Citation-backed answers
  • Hybrid vector + full-text search
  • Built-in agent orchestration

  Cons
  • Heavier setup than hosted RAG APIs
  • Cloud tiers cap apps and storage
  • Resource-intensive to self-host

Tags

View all Search
  • LlamaIndex
    Orchestration · FREEMIUM · Open core

    LlamaIndex

    The data framework for LLM apps — RAG, agents, and document workflows.

    An open-source framework (Python + TypeScript) for connecting LLMs to your data: ingestion, indexing, retrieval, and agentic document workflows. Pairs with the managed LlamaCloud (LlamaParse) for production parsing and extraction. Widely regarded as one of the most-used RAG frameworks, alongside LangChain.

    Best-in-class RAG primitives
    Narrower than full orchestration frameworks
    • framework
    • rag
    • agents
    • open-source
  • Haystack
    Orchestration · FREE · OSS

    deepset

    Open-source Python framework for production RAG and agents.

    Orchestration framework for building LLM applications as modular pipelines — retrieval, routing, memory, and generation wired together with explicit, traceable data flow. It is model- and store-agnostic, integrating major providers and vector databases behind a stable component API. Aimed at production: serialization, logging, and deployment across cloud or on-prem.

    Composable, typed pipeline architecture
    Pipeline model has a learning curve
    • rag
    • pipelines
    • python
    • open-source
  • Ragie
    Search · FREEMIUM

    Ragie, Corp

    Managed RAG-as-a-service — the context engine for AI agents and apps.

    Ragie is a fully managed retrieval-augmented-generation platform. It ingests data through native connectors like Google Drive and Notion, parses multimodal content (PDFs, images, audio, video), and serves hybrid vector + keyword + summary retrieval over an API and MCP server. Developers add accurate, grounded context to LLM apps without building their own ingestion and retrieval pipeline.

    Fully managed, fast to integrate
    Production tier starts at $500/month
    • rag
    • retrieval
    • context
    • mcp
  • Onyx
    Assistant · FREEMIUM · Open core

    Onyx

    Open-source, self-hosted AI chat and enterprise search over your own docs.

    Onyx (formerly Danswer) is an open-source AI chat and RAG platform that connects to your company's docs and apps for grounded, cited answers, and works with any LLM. It is self-hosted via Docker/Kubernetes and supports local models, keeping data on your own infrastructure. The core is MIT-licensed and free; an open-core model puts optional enterprise features under a separate license, and the vendor also offers a managed cloud.

    Self-hosted, data stays on your infra
    Some features under separate license
    • open-source
    • self-hosted
    • rag
    • enterprise-search