Moondream
M87 Labs
Tiny open vision-language model for efficient image understanding.
An open-weights family of small vision-language models for captioning, visual Q&A, pointing, counting, and object detection — small enough to run on-device (checkpoints down to 0.5B on Hugging Face). Run it locally with the Photon engine, or call Moondream Cloud's OpenAI-compatible API with a free monthly credit tier and pay-per-image pricing.
Worth knowing
Built by M87 Labs, founded by AWS veterans; raised a $4.5M pre-seed backed by Felicis and GitHub's M12 fund in 2024.
- vision-language
- open-weights
- on-device
- object-detection