LiveKit, Inc.

LiveKit

Open-source framework and cloud for realtime voice, video, and physical AI agents.

Category: Infra
Pricing: FREEMIUM
Source: Open core
Hosting: Hybrid
Platforms: Web, API, CLI
Models: Model-agnostic
Verified: Jun 13, 2026

LiveKit is the realtime infrastructure layer most voice-AI stacks are built on. Its open-source Agents framework wires together streaming speech-to-text, an LLM, and text-to-speech with reliable turn detection, interruption handling, and telephony, so agents can join a session and converse in near real time. You bring your own STT, LLM, and TTS providers; LiveKit handles the low-latency WebRTC transport. It runs self-hosted under Apache 2.0 or as a managed cloud platform.
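The STT, LLM, and TTS pipeline with interruption handling described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the LiveKit Agents API: `fake_stt`, `fake_llm`, `fake_tts`, `speak`, and `run_turn` are hypothetical stand-ins, and the "audio" is just a list of words. It shows one conversational turn in which the agent's speech runs as a cancellable task, so a user interruption (barge-in) can cut playback short.

```python
import asyncio
from typing import AsyncIterator, List, Optional

# NOTE: every function here is a hypothetical stand-in for illustration.
# None of these names come from LiveKit or any provider SDK.


async def fake_stt(utterance: str) -> str:
    """Stand-in for a streaming speech-to-text provider."""
    await asyncio.sleep(0)
    return utterance.strip().lower()


async def fake_llm(transcript: str) -> str:
    """Stand-in for an LLM call that produces the agent's reply."""
    await asyncio.sleep(0)
    return f"you said: {transcript}"


async def fake_tts(text: str) -> AsyncIterator[str]:
    """Stand-in for streaming TTS: yields one 'audio chunk' per word."""
    for word in text.split():
        await asyncio.sleep(0.01)  # simulated synthesis time per chunk
        yield word


async def speak(text: str, played: List[str]) -> None:
    """Append TTS chunks to `played` until done or cancelled."""
    async for chunk in fake_tts(text):
        played.append(chunk)


async def run_turn(
    utterance: str,
    played: List[str],
    interrupt_after: Optional[float] = None,
) -> bool:
    """Run one STT -> LLM -> TTS turn.

    Returns True if playback finished, False if it was interrupted.
    `interrupt_after` simulates the user starting to talk after N seconds.
    """
    transcript = await fake_stt(utterance)
    reply = await fake_llm(transcript)

    # Playback runs as its own task so it can be cancelled mid-sentence.
    playback = asyncio.create_task(speak(reply, played))
    if interrupt_after is not None:
        await asyncio.sleep(interrupt_after)
        playback.cancel()  # barge-in: stop the agent's audio

    try:
        await playback
        return True
    except asyncio.CancelledError:
        return False


if __name__ == "__main__":
    chunks: List[str] = []
    finished = asyncio.run(run_turn("Hello there", chunks))
    print(finished, chunks)
```

Calling `run_turn` without `interrupt_after` plays the whole reply; passing a small value such as `0.025` cancels playback partway through. A real deployment would replace the stand-ins with provider clients and would detect interruptions from live audio rather than a timer.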

Pros & cons

  Pros
  • Powers ChatGPT Advanced Voice in production
  • Apache-2.0 open source, self-hostable
  • BYO STT/LLM/TTS: no model lock-in
  • Sub-second latency, turn detection, telephony
  • Managed cloud option alongside the OSS

  Cons
  • Developer infrastructure, not a no-code product
  • You assemble and pay for STT/LLM/TTS separately
  • Realtime media ops add operational complexity

Further reading

  • Vapi
    Voice, FREEMIUM

    Voice agent infrastructure. Build a phone agent in a weekend.

    Production voice-agent platform — telephony, STT, LLM, TTS, and interrupt handling stitched together so you call an endpoint and get a working phone agent. Pluggable models at every layer.

    Worth knowing

    Hit a ~$500M valuation in 2026 after Amazon picked it to power Ring's voice AI over 40 rival platforms; it has handled 1B+ calls.

    • voice-agents
    • telephony
    • phone
    • real-time
  • Retell AI
    Voice, FREEMIUM

    Build, test, and deploy AI voice agents for phone calls.

    A no-code platform for humanlike voice agents that handle inbound and outbound phone calls — receptionists, IVR, and outbound campaigns. It bundles telephony (SIP / Twilio), a proprietary turn-taking model for low-latency conversations, prompts, tools, and call analytics. Pay-as-you-go pricing with free starter credits.

    Worth knowing

    Founded in 2023 by ByteDance, Google, and Meta alumni; a YC W24 startup at ~$40M annualized revenue with a ~25-person team.

    • voice-agents
    • telephony
    • call-automation
    • no-code
  • Cartesia
    Voice, FREEMIUM

    Low-latency streaming TTS. Sub-100ms first audio.

    Streaming-first speech synthesis built around the Sonic family of state-space models. Aims at real-time agent voices where latency between turns is the product. Strong choice for sub-200ms voice loops.

    Worth knowing

    Founded in 2023 by the Stanford AI Lab team behind state-space models and Mamba, including Albert Gu and Karan Goel.

    • tts
    • streaming
    • low-latency
    • real-time