Lightly

Computer-vision data curation, labeling, and model pretraining.

Category: Vision
Pricing: PAID
Hosting: Hybrid
Platforms: Web, API
Models: Model-agnostic
Verified: Jun 13, 2026

Lightly is a computer-vision data platform that helps teams curate the most informative samples from large image and video datasets using embeddings, active learning, and near-duplicate detection. Its suite spans LightlyStudio (curation and labeling), LightlyTrain (self-supervised pretraining and fine-tuning of vision models), and LightlyEdge (smart data selection on devices). The aim is to cut labeling cost by training on the data that actually improves models.
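As a rough illustration of the embedding-based near-duplicate detection described above, here is a minimal pure-Python sketch. It is not Lightly's API: the function names, the toy 3-dimensional embeddings, and the 0.95 cosine-similarity threshold are all assumptions chosen for the example, and a real pipeline would take its embeddings from a vision model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def drop_near_duplicates(embeddings, threshold=0.95):
    """Greedy filter (hypothetical helper): keep a sample only if its
    similarity to every already-kept sample is below the threshold."""
    kept = []
    for idx, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[k]) < threshold for k in kept):
            kept.append(idx)
    return kept


# Toy embeddings standing in for model outputs (assumed values).
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],  # near-duplicate of sample 0
    [0.0, 1.0, 0.0],
    [0.0, 0.98, 0.1],   # near-duplicate of sample 2
    [0.0, 0.0, 1.0],
]

kept_indices = drop_near_duplicates(embeddings)
print(kept_indices)  # [0, 2, 4]
```

The greedy pass compares each sample against every kept sample, so it is quadratic in the worst case; at the dataset sizes this kind of tool targets, an approximate nearest-neighbor index would normally replace the inner loop.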

Pros & cons

Pros
  • Embedding-based data curation and deduplication
  • Active learning surfaces useful samples
  • Self-supervised pretraining (LightlyTrain)
  • Edge SDK for on-device selection
  • Open-source SSL library to build on

Cons
  • No public pricing; sales-led
  • Focused on vision, not other modalities
  • Best value needs large datasets
  • Smaller ecosystem than labeling giants

View all Vision
  • Roboflow
    Vision, FREEMIUM

    Vision MLOps end-to-end. Annotate, train, deploy.

    Annotation tooling, auto-labelling, hosted training, and edge deployment for computer-vision projects. Strong default when you're shipping a custom vision model rather than reaching for a multimodal LLM.

    Worth knowing

    Roboflow Universe is one of the largest public computer-vision dataset and model hubs. The company raised a $40M Series B led by GV in 2024.

    • annotation
    • training
    • deployment
    • edge
  • Voxel51
    Vision, FREEMIUM, Open core

    FiftyOne — open-source vision data platform.

    Open-source toolkit for exploring, debugging, and curating vision datasets. Strong story for finding model failure modes, balancing classes, and tracking experiment drift across visual data at scale.

    Worth knowing

    Spun out of the University of Michigan in 2016 by robotics professor Jason Corso and PhD student Brian Moore; later raised a Bessemer-led $30M Series B.

    • open-source
    • datasets
    • evaluation
    • python
  • Encord
    Vision, PAID

    Data platform to curate, label, and manage AI training data.

    An enterprise data development platform for preparing high-quality training data across images, video, documents, audio, DICOM, and 3D point clouds. It pairs AI-assisted labeling (SAM auto-segmentation, object tracking) with data curation, model evaluation, and workflow tooling, plus LLM-powered data agents for document tasks. Used heavily in medical imaging, robotics, and other physical-AI domains.

    Worth knowing

    YC W21 company founded by two ex-high-frequency traders; raised a $30M Series B led by Next47 in 2024.

    • data-annotation
    • training-data
    • computer-vision
    • medical-imaging
    • +1
  • Supervisely
    Vision, FREEMIUM

    All-in-one computer vision platform to curate, label, and train models.

    A unified computer vision platform covering data curation, annotation, model training, and deployment across images, video, 3D point clouds, and medical imagery. AI-assisted labeling, experiment tracking, and a large catalog of installable apps make it customizable for most CV workflows. Free for researchers and small teams; Pro and self-hostable Enterprise editions for companies.

    Worth knowing

    Grew out of Deep Systems, a deep-learning consultancy its founders built in 2013, before launching as a product in 2017.

    • computer-vision
    • data-annotation
    • labeling
    • model-training
    • +1