Fine-tuning AI apps

Platforms and tooling for fine-tuning, distilling, and adapting foundation models on your own data.

9 apps · researched & kept current by Claude Code

Filter & search these 9 apps

View Lamini details
Fine-tuningPAID
Lamini
Lamini
Enterprise platform to tune and run open LLMs in your own environment.
Lamini is an enterprise LLM platform for fine-tuning open models and serving them, designed to run on-prem, in a VPC, or on Lamini's cloud — including on AMD GPUs. It pairs tuning (LoRA/PEFT and memory tuning to reduce hallucinations) with an inference stack and agentic pipelines, accessed via a Python client, REST API, or web UI. Built for teams that need to keep models and data in-house.
Keeps models and data fully in-house
Enterprise-focused pricing
- fine-tuning
- llm
- enterprise
- on-prem
Open
View Tinker details
Fine-tuningPAID
Tinker
Thinking Machines Lab
Managed fine-tuning API with low-level control over the training loop.
Tinker is Thinking Machines Lab's training API for fine-tuning open-weight LLMs. It exposes low-level primitives — forward_backward, optim_step, sample — so researchers keep full control of data and algorithms while the service handles distributed GPU scheduling and failure recovery. LoRA-based runs cover models from small Llamas up to large mixture-of-experts like Qwen-235B and Kimi K2, and trained weights can be downloaded.
Exposes forward_backward, optim_step, sample
LoRA-based only (no full fine-tune)
- fine-tuning
- lora
- post-training
- research
- +1
Open
View Axolotl details
Fine-tuningFREEOSS
Axolotl
Axolotl AI
Open-source post-training for LLMs — LoRA to RL, all from one YAML config.
An open-source (Apache-2.0) framework that streamlines post-training for open-weight models: full fine-tuning, LoRA/QLoRA, preference tuning (DPO, IPO, KTO, ORPO), reinforcement learning (GRPO), reward modeling and quantization-aware training, configured through a single YAML file with no scripting. Wraps Hugging Face Transformers, PEFT, TRL and DeepSpeed, and supports dozens of model families including multimodal vision and audio models.
Apache-2.0 with 12k+ GitHub stars
Needs your own GPUs or cloud compute
- fine-tuning
- lora
- rlhf
- open-source
Open
View Predibase details
Fine-tuningPAID
Predibase
Predibase (Rubrik)
Fine-tune open-source LLMs and serve them in production.
Predibase is an enterprise platform for fine-tuning open-source models and serving them in production. It pairs a post-training stack — supervised fine-tuning plus an end-to-end reinforcement fine-tuning (RFT) flow — with an optimized inference engine, and its open-source LoRAX framework serves many fine-tuned LoRA adapters from a single GPU. Runs as managed SaaS or inside your own VPC.
Fine-tune + serve in one place
Enterprise-priced
- fine-tuning
- lora
- rft
- inference
- +1
Open
View Replicate details
InferenceFREEMIUM
Replicate
Replicate
Run, fine-tune, and deploy thousands of open models via one API.
A platform to run open-source models with one API call — image, video, audio, and language — plus fine-tuning and custom deploys with pay-per-second billing. No infra to manage.
Image, video, audio, and language models
Cold starts on less-popular models
- model-hosting
- fine-tuning
- api
- open-source
Open
View Unsloth details
Fine-tuningFREEMIUMOpen core
Unsloth
Unsloth AI
Fine-tune open LLMs faster with far less VRAM.
An open-source (Apache-2.0) framework for fine-tuning and running open-weight models with custom CUDA kernels — roughly 2x faster training and large VRAM savings, so 7B–13B models fit on a single consumer GPU. Free tier runs on Colab/Kaggle or locally; Pro and Enterprise tiers add multi-GPU and multi-node speedups. Exports to GGUF/Safetensors for llama.cpp, vLLM, and Ollama.
LoRA, QLoRA, and full fine-tuning
Multi-GPU speedups are paid tiers
- fine-tuning
- lora
- open-source
- training
Open
View OpenPipe details
Fine-tuningPAID
OpenPipe
OpenPipe
Replace frontier-model spend with a fine-tuned small model.
Captures your production OpenAI / Anthropic calls, builds a dataset, fine-tunes a small open-weights model on your traffic, then serves the swap behind your existing SDK. The pitch: 10x cost reduction at parity.
Uses your production logs as training data
Needs enough quality traffic to distill
- fine-tuning
- cost-reduction
- drop-in
- open-weights
Open
View Together AI details
InferenceFREEMIUM
Together AI
Together
Hosted inference and fine-tuning for open-weights models.
Hosted inference and fine-tuning across hundreds of open-weights models (Llama, Mistral, DeepSeek, Qwen, etc.). Strong pricing for inference-at-scale; LoRA + full fine-tuning supported.
LoRA and full fine-tuning
Open models only, no frontier closed models
- inference
- fine-tuning
- open-weights
- lora
Open
View Fireworks AI details
InferenceFREEMIUM
Fireworks AI
Fireworks AI
Fast inference + fine-tuning. Production deployments at scale.
Optimized inference platform for open-weights models with strong latency numbers and serverless + dedicated deployment options. Fine-tuning supported; vision and audio models alongside text.
Custom FireAttention inference stack
Usage pricing scales with traffic
- inference
- fine-tuning
- low-latency
- production
Open

Fine-tuning AI apps

Lamini

Tinker

Axolotl

Predibase

Replicate

Unsloth

OpenPipe

Together AI

Fireworks AI