LMNT

Streaming text-to-speech with voice cloning for real-time apps.

Category: Voice
Pricing: FREEMIUM
Source: Proprietary
Hosting: Cloud
Platforms: Web, API
Models: Self-contained
Verified: Jun 16, 2026

LMNT is an AI text-to-speech platform that turns text into natural speech with ultra-low latency, built for conversational agents, games, and real-time apps. It supports instant voice cloning from a short sample and multilingual synthesis, and is exposed as a developer API plus a web playground. It is offered as a built-in voice provider across major voice-agent frameworks.

Capabilities (2)

What it actually does — grouped by capability family.

Speech synthesis (TTS) (primary capability)
Voice cloning (primary capability)

Pros & cons

Pros
Multilingual streaming synthesis
Instant voice cloning from a sample
Free tier plus affordable paid plans
Integrates with major voice-agent stacks
Commercial license on paid tiers

Cons
Smaller voice library than ElevenLabs
Quality trails top expressive TTS models
Less brand recognition than incumbents


Cartesia
Category: Voice
Pricing: FREEMIUM
Low-latency streaming text-to-speech for real-time voice.
Streaming-first speech synthesis built around the Sonic family of state-space models. Aims at real-time agent voices where latency between turns is the product. Strong choice for sub-200ms voice loops.
Pro: Streaming over WebSocket for fast first audio
Con: Long-form expressive texture trails ElevenLabs
Tags: tts, streaming, low-latency, real-time

Rime
Category: Voice
Pricing: FREEMIUM
Enterprise text-to-speech built for real-time voice agents.
Rime builds AI voice models for high-stakes business conversations like IVRs, contact centers, and AI phone agents. Its Arcana and Mist models target ultra-low latency and natural, conversational delivery, with deterministic pronunciation control so terms are spoken consistently without retraining. Rime can be deployed on-prem, in a VPC, or via cloud API, and is offered directly or through voice-AI partner platforms.
Pro: Very low latency for real-time voice agents
Con: Enterprise-focused, not a consumer tool
Tags: text-to-speech, voice-ai, tts, contact-center

ElevenLabs
Category: Voice
Pricing: FREEMIUM
Text-to-speech, voice cloning, and multilingual dubbing.
Hosted speech synthesis at near-human quality: TTS, voice cloning, multilingual dubbing, and conversational voice agents. Default choice when you need a voice that sounds like a person, not a robot.
Pro: Best-in-class voice realism
Con: Pricier than commodity TTS at scale
Tags: tts, voice-cloning, dubbing, multilingual

Neuphonic
Category: Voice
Pricing: FREEMIUM
Source: Open core
Ultra-low-latency text-to-speech that runs on-device.
Neuphonic is a voice-AI company building text-to-speech and voice cloning that run locally with very low latency. Its cloud API targets real-time voice agents, and in October 2025 it open-sourced NeuTTS Air, a 748M-parameter speech language model that runs on CPU via llama.cpp and clones a voice from a few seconds of audio. Aimed at private, offline, and voice-agent use cases.
Pro: On-device, CPU-only synthesis
Con: Cloud API pricing not clearly published
Tags: text-to-speech, voice-cloning, on-device, open-source
