Cohere

Enterprise-grade LLMs, embeddings, and retrieval built for private deployment.

Category: Inference
Pricing: FREEMIUM
Source: Proprietary
Hosting: Hybrid
Platforms: WebAPI
Models: Self-contained (on-device)
Verified: Jun 13, 2026

Cohere builds large language models for the enterprise rather than the consumer. Its Command models cover agentic, multimodal, and multilingual generation; Embed and Rerank power high-quality search and retrieval; Aya is a multilingual research family spanning 70+ languages; and North is a workplace AI platform built on top of these models. Cohere's emphasis is data control: models can run in a private VPC, on-premises, or via a managed Model Vault.
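To make the Embed and Rerank pairing concrete, here is a minimal sketch of the two-stage retrieval pattern those models are typically used for: an embedding step shortlists candidates, then a reranker rescores only the shortlist. Nothing here calls Cohere's API. The functions toy_embed, toy_rerank_score, and search are hypothetical stand-ins (a bag-of-words vector and a term-overlap score) chosen so the example runs offline; a real deployment would swap in the hosted models.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_rerank_score(query: str, doc: str) -> float:
    """Stand-in for a reranker (assumption): fraction of query terms found in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def search(query: str, docs: list[str], first_stage_k: int = 3, top_n: int = 2) -> list[str]:
    # Stage 1: embed the query and every document, keep the k nearest by cosine similarity.
    q_vec = toy_embed(query)
    candidates = sorted(docs, key=lambda d: cosine(q_vec, toy_embed(d)), reverse=True)
    candidates = candidates[:first_stage_k]
    # Stage 2: rescore only the shortlisted candidates with the pairwise reranker.
    return sorted(candidates, key=lambda d: toy_rerank_score(query, d), reverse=True)[:top_n]


if __name__ == "__main__":
    corpus = [
        "private vpc deployment guide",
        "multilingual model overview",
        "on-premises deployment checklist for private clusters",
        "pricing and billing faq",
    ]
    for hit in search("private deployment", corpus):
        print(hit)
```

The split matters because the first stage is cheap enough to run over a whole corpus, while the second stage looks at each query and document pair together and is usually applied to a small shortlist only.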

Capabilities (5)

What it actually does — grouped by capability family.

Tool / function calling (secondary capability)

Model inference / serving (primary capability)
Fine-tuning / training (secondary capability)

Embeddings (primary capability)

Transcription (STT) (secondary capability)

Pros & cons

Pros:
Strong Rerank/Embed retrieval models
Command models for agentic generation
Multilingual (Aya, 70+ languages)
Enterprise data-control focus

Cons:
No consumer chat product to speak of
Smaller ecosystem than OpenAI/Anthropic
Production usage is paid

Further reading

Together AI
Category: Inference
Pricing: FREEMIUM
Hosted inference and fine-tuning for open-weights models.
Hosted inference and fine-tuning across hundreds of open-weights models (Llama, Mistral, DeepSeek, Qwen, etc.). Strong pricing for inference at scale; LoRA and full fine-tuning supported.
Pro: LoRA and full fine-tuning
Con: Open models only, no frontier closed models
Tags: inference, fine-tuning, open-weights, lora

Fireworks AI
Category: Inference
Pricing: FREEMIUM
Fast inference and fine-tuning. Production deployments at scale.
Optimized inference platform for open-weights models with strong latency numbers and serverless and dedicated deployment options. Fine-tuning supported; vision and audio models alongside text.
Pro: Custom FireAttention inference stack
Con: Usage pricing scales with traffic
Tags: inference, fine-tuning, low-latency, production

DeepSeek
Category: Assistant
Pricing: FREEMIUM
Open, low-cost chat with strong reasoning. Free to use.
DeepSeek's assistant: chat with a reasoning mode and web search, backed by the open-weight DeepSeek models that reset the cost curve for frontier-grade quality.
Pro: Reasoning mode (R1)
Con: China-hosted data concerns
Tags: chat, assistant, reasoning, open-weights
