Skip to content

InferenceRunware

Runware

One pay-as-you-go API for 400K+ image, video, audio, and 3D models.

Category
Inference
Pricing
FREEMIUM
Hosting
Cloud
Platforms
APIWeb
Models
Multi-model
Verified
Jun 14, 2026

Runware is a unified AI inference platform that exposes 400K+ models — image, video, audio, text, and 3D — behind a single pay-as-you-go API. It runs a proprietary Sonic Inference Engine on custom GPU hardware, claiming sub-second cold starts and up to 10x lower cost per generation than typical hosted inference. A REST and WebSocket API plus a web playground let developers swap models without per-provider integrations.

Pros & cons

  • 400K+ models via one API
  • Pay-per-request, no commitments
  • Claims up to 10x lower cost
  • Sub-second cold starts
  • Image, video, audio, 3D, and LLMs
  • Proprietary, cloud-only
  • Only $2 free credits to trial
  • Pricing varies by model/params

Tags

Further reading

View all Inference
  • View fal details
    InferenceFREEMIUM

    fal

    fal

    Serverless inference API for image, video, audio, and 3D models.

    A generative-media inference platform exposing FLUX, Kling, Veo, Wan, Stable Diffusion, and 600+ image/video/audio/3D models through one fast, serverless API — no GPUs to manage and near-zero cold starts. Pay per output or per GPU-second; free starter credits to test. Popular as the production backend for AI media features.

    Worth knowing

    Raised a $140M Series D led by Sequoia in December 2025 at a $4.5B valuation.

    • generative-media
    • image-gen
    • video-gen
    • serverless
  • View Replicate details
    InferenceFREEMIUM

    Replicate

    Replicate

    Run, fine-tune, and deploy thousands of open models via one API.

    A platform to run open-source models with one API call — image, video, audio, and language — plus fine-tuning and custom deploys with pay-per-second billing. No infra to manage.

    Worth knowing

    Co-founded by Ben Firshman, who built the original Docker Compose; its Cog packaging format is essentially 'Docker for machine learning.'

    • model-hosting
    • fine-tuning
    • api
    • open-source
  • View Together AI details
    InferenceFREEMIUM

    Together AI

    Together

    Fine-tuning + inference for open-weights models. Broad coverage.

    Hosted inference and fine-tuning across hundreds of open-weights models (Llama, Mistral, DeepSeek, Qwen, etc.). Strong pricing for inference-at-scale; LoRA + full fine-tuning supported.

    Worth knowing

    Co-founded by Stanford's Percy Liang and FlashAttention author Tri Dao; raised $305M at a $3.3B valuation.

    • inference
    • fine-tuning
    • open-weights
    • lora
  • View OpenRouter details
    InferenceFREEMIUM

    OpenRouter

    OpenRouter

    One OpenAI-compatible API in front of 300+ models from every provider.

    A unified gateway that routes a single endpoint and API key to models from Anthropic, OpenAI, Google, Meta, DeepSeek, xAI, and more — swap models by changing one parameter, with automatic fallbacks and one consolidated bill. Pass-through token pricing plus dozens of free models.

    Worth knowing

    Founded by OpenSea co-founder Alex Atallah; hit unicorn status in 2025 with a $113M Series B led by Alphabet's CapitalG.

    • gateway
    • routing
    • multi-model
    • fallbacks