Relace

Purpose-built AI models and infrastructure for coding agents.

Category: Inference
Pricing: FREEMIUM
Hosting: Hybrid
Platforms: Web, API
Models: Self-contained (on-device)
Verified: Jun 20, 2026

Relace builds specialized models and infrastructure that slot into AI code-generation products. Its Instant Apply model merges partial diffs from frontier models into full files at thousands of tokens per second, and its two-stage code retrieval (embedding search plus a code reranker) finds the right context fast. Relace also offers managed repository hosting with automatic per-commit indexing, so coding agents get cheaper, faster, more reliable edits and search.
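The two-stage retrieval pattern described above (a cheap embedding search for recall, then a reranker for precision) can be sketched in a few lines. This is only an illustration of the general technique, not Relace's models or API: the bag-of-words "embedding", the token-overlap "reranker", and the names `first_stage`, `rerank`, `retrieve`, and `CORPUS` are all hypothetical stand-ins.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumerics, so create_session -> create, session.
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Stand-in "embedding": a sparse bag-of-words vector (assumption, not a real model).
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def first_stage(query, corpus, k):
    # Stage 1: score every chunk by vector similarity and keep the top k candidates.
    q = embed(query)
    ranked = sorted(corpus, key=lambda name: cosine(q, embed(corpus[name])), reverse=True)
    return ranked[:k]


def rerank(query, corpus, candidates):
    # Stage 2: re-score only the candidates with a different (here, toy) scorer:
    # the fraction of distinct query tokens that appear in the chunk.
    q_tokens = set(tokenize(query))

    def score(name):
        return len(q_tokens & set(tokenize(corpus[name]))) / max(len(q_tokens), 1)

    return sorted(candidates, key=score, reverse=True)


def retrieve(query, corpus, k=2):
    return rerank(query, corpus, first_stage(query, corpus, k))


# Hypothetical mini-repository: file name -> code chunk.
CORPUS = {
    "auth.py": "def login(user, password): check password hash and create session",
    "billing.py": "def charge(card, amount): create invoice and charge card",
    "session.py": "def create_session(user): store session token for user",
}

print(retrieve("create session for user", CORPUS))
```

In a real system the first stage would typically query a vector index and the second stage would call a learned reranker; the split matters because the expensive scorer only ever sees the small candidate set.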

Pros & cons

  Pros:
  • Instant Apply merges diffs very fast
  • State-of-the-art code retrieval + reranker
  • Drops into existing codegen stacks
  • Powers Lovable, Create, Magic Patterns
  • Self-host / VPC option for enterprise

  Cons:
  • No public pricing detail
  • Narrow focus: codegen infra only
  • Newer and smaller than general LLM APIs

Further reading

  • Morph (Inference, FREEMIUM)

    Fast models that apply AI code edits to files in milliseconds.

    Infrastructure for coding agents centered on Fast Apply, a specialized model that merges AI-generated edits into files at ~10,500 tokens/sec, replacing full-file rewrites and brittle search-and-replace. Also serves WarpGrep code search, context compaction, and a model router via an OpenAI-compatible API. Used in production by JetBrains, Vercel, and Webflow.

    Pro: Very fast edit-apply (~10,500 tok/s)
    Con: Narrow, infra-layer use case
    Tags: code-editing, fast-apply, coding-agents, api