Pipecat

Open-source framework for real-time voice and multimodal AI agents.

Categories: Voice, Orchestration
Pricing: FREEMIUM
Source: Open core
Hosting: Hybrid
Platforms: API, Web, iOS, Android
Models: Multi-model
Verified: Jun 15, 2026

Pipecat is a Python framework for building voice and multimodal conversational agents that can listen, speak, and see in real time. It orchestrates streaming speech-to-text, an LLM, and text-to-speech into one low-latency pipeline, wiring together 40+ AI services with no vendor lock-in. Client SDKs cover JavaScript, React, React Native, Swift, Kotlin, C++, and ESP32, and Pipecat Cloud offers managed hosting at scale.
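The speech-to-text, LLM, and text-to-speech chain described above can be illustrated with a minimal sketch in plain `asyncio`. This is not Pipecat's API: `microphone`, `fake_stt`, `fake_llm`, `fake_tts`, and `run_pipeline` are hypothetical stand-ins, and the "audio" is just UTF-8 bytes, so the example only shows the streaming shape of such a pipeline.

```python
import asyncio
from typing import AsyncIterator, List


async def microphone(chunks: List[bytes]) -> AsyncIterator[bytes]:
    """Hypothetical audio source: yields pre-recorded chunks one at a time."""
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield chunk


async def fake_stt(audio: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stand-in speech-to-text stage: here the 'audio' is simply UTF-8 text."""
    async for chunk in audio:
        yield chunk.decode("utf-8")


async def fake_llm(transcripts: AsyncIterator[str]) -> AsyncIterator[str]:
    """Stand-in LLM stage: echoes each transcript fragment back as a reply."""
    async for text in transcripts:
        yield f"echo: {text}"


async def fake_tts(replies: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stand-in text-to-speech stage: encodes each reply back into bytes."""
    async for reply in replies:
        yield reply.encode("utf-8")


async def run_pipeline(chunks: List[bytes]) -> List[bytes]:
    """Chain source -> STT -> LLM -> TTS and collect the synthesized output."""
    stream = fake_tts(fake_llm(fake_stt(microphone(chunks))))
    return [audio async for audio in stream]


if __name__ == "__main__":
    print(asyncio.run(run_pipeline([b"hello", b"world"])))
```

Because the stages are chained async generators, each chunk passes through all three stages before the next chunk is read, which is the property that keeps a streaming pipeline responsive. A real deployment would replace each stand-in with a call to an actual STT, LLM, or TTS service.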

Capabilities 5

What it actually does — grouped by capability family.

Agent framework (primary capability)
Voice agent (primary capability)
Multi-agent orchestration (secondary capability)
Tool / function calling (secondary capability)

App / agent deployment (secondary capability)

Pros & cons

Pros
Self-hostable, no vendor lock-in
Mix and match any STT, LLM, and TTS
Real-time listen, speak, and see
Client SDKs across web/mobile/embedded
Optional managed Pipecat Cloud

Cons
Python framework, not a no-code tool
You wire and pay for each AI service
Tuning latency needs real expertise
Smaller ecosystem than hosted rivals


LiveKit
LiveKit, Inc.
Categories: Infra
Pricing: FREEMIUM
Source: Open core
Open-source framework and cloud for realtime voice, video, and physical AI agents.
LiveKit is the realtime infrastructure layer most voice-AI stacks are built on. Its open-source Agents framework wires together streaming speech-to-text, an LLM, and text-to-speech with reliable turn detection, interruption handling, and telephony, so agents can join a session and converse in near real time. You bring your own STT, LLM, and TTS providers; LiveKit handles the low-latency WebRTC transport. It runs self-hosted under Apache 2.0 or as a managed cloud platform.
Powers ChatGPT Advanced Voice in production
Developer infrastructure, not a no-code product
- voice-ai
- webrtc
- realtime
- agents

Vapi
Vapi
Categories: Voice
Pricing: FREEMIUM
Voice agent infrastructure. Build a phone-agent in a weekend.
Production voice-agent platform — telephony, STT, LLM, TTS, and interrupt handling stitched together so you call an endpoint and get a working phone agent. Pluggable models at every layer.
Telephony and interrupts handled
Per-minute costs stack across layers
- voice-agents
- telephony
- phone
- real-time

Retell AI
Retell AI
Categories: Voice
Pricing: FREEMIUM
Build, test, and deploy AI voice agents for phone calls.
A no-code platform for humanlike voice agents that handle inbound and outbound phone calls — receptionists, IVR, and outbound campaigns. It bundles telephony (SIP / Twilio), a proprietary turn-taking model for low-latency conversations, prompts, tools, and call analytics. Pay-as-you-go pricing with free starter credits.
Inbound and outbound call handling
Per-minute costs stack with LLM/TTS
- voice-agents
- telephony
- call-automation
- no-code

Synthflow
Synthflow AI
Categories: Voice
Pricing: FREEMIUM
No-code platform for AI voice agents that automate phone calls.
Synthflow is an enterprise voice-AI platform for building and deploying AI agents that handle phone calls — inbound and outbound — without code. A visual flow designer, in-house telephony, real-time monitoring, and 200+ integrations across calendars, CRMs, and telephony let teams stand up agents for customer service, appointment scheduling, and lead qualification. It runs on a pay-as-you-go model: build and test free, then pay per call once live.
No-code visual flow designer
Focused on phone/voice, not broad chat
- voice-agents
- no-code
- telephony
- contact-center