Runware

One pay-as-you-go API for multi-modal AI inference.

Category: Inference
Pricing: FREEMIUM
Source: Proprietary
Hosting: Cloud
Platforms: APIWeb
Models: Multi-model
Verified: Jun 14, 2026

Runware is a unified AI inference platform that exposes 400K+ models — image, video, audio, text, and 3D — behind a single pay-as-you-go API. It runs a proprietary Sonic Inference Engine on custom GPU hardware, claiming sub-second cold starts and up to 10x lower cost per generation than typical hosted inference. A REST and WebSocket API plus a web playground let developers swap models without per-provider integrations.

Capabilities 6

What it actually does — grouped by capability family.

Model inference / serving (primary capability)
Multi-model access (primary capability)

Text-to-image (secondary capability)
Image editing (secondary capability)
Text-to-video (secondary capability)
Image upscaling (secondary capability)

Pros & cons

400K+ models via one API
Pay-per-request, no commitments
Image, video, audio, 3D, and LLMs
Swap models without per-provider work

Proprietary, cloud-only
Only $2 free credits to trial
Pricing varies by model/params

View fal details
InferenceFREEMIUM
fal
fal
Serverless inference API for image, video, audio, and 3D models.
A generative-media inference platform exposing FLUX, Kling, Veo, Wan, Stable Diffusion, and 600+ image/video/audio/3D models through one fast, serverless API — no GPUs to manage and near-zero cold starts. Pay per output or per GPU-second; free starter credits to test. Popular as the production backend for AI media features.
600+ generative-media models
Media-focused, not a general LLM host
- generative-media
- image-gen
- video-gen
- serverless
Open
View Replicate details
InferenceFREEMIUM
Replicate
Replicate
Run, fine-tune, and deploy thousands of open models via one API.
A platform to run open-source models with one API call — image, video, audio, and language — plus fine-tuning and custom deploys with pay-per-second billing. No infra to manage.
Image, video, audio, and language models
Cold starts on less-popular models
- model-hosting
- fine-tuning
- api
- open-source
Open
View Together AI details
InferenceFREEMIUM
Together AI
Together
Hosted inference and fine-tuning for open-weights models.
Hosted inference and fine-tuning across hundreds of open-weights models (Llama, Mistral, DeepSeek, Qwen, etc.). Strong pricing for inference-at-scale; LoRA + full fine-tuning supported.
LoRA and full fine-tuning
Open models only, no frontier closed models
- inference
- fine-tuning
- open-weights
- lora
Open
View OpenRouter details
InferenceFREEMIUM
OpenRouter
OpenRouter
One OpenAI-compatible API in front of models from every provider.
A unified gateway that routes a single endpoint and API key to models from Anthropic, OpenAI, Google, Meta, DeepSeek, xAI, and more — swap models by changing one parameter, with automatic fallbacks and one consolidated bill. Pass-through token pricing plus dozens of free models.
Swap models by changing one parameter
Adds a routing hop vs direct provider
- gateway
- routing
- multi-model
- fallbacks
Open

Open Runware

Runware

Capabilities 6

Pros & cons

Tags

Further reading

fal

Replicate

Together AI

OpenRouter