Scale AI

Training data, evaluations, and enterprise GenAI from the data-labeling giant.

Category: Data Ops
Pricing: PAID
Hosting: Cloud
Platforms: Web, API
Models: Model-agnostic
Verified: Jun 11, 2026

Scale supplies the human-annotated training data behind most frontier AI labs through its Data Engine, which spans labeling, RLHF, and expert red-teaming. On top of the data business, it runs evaluation leaderboards, an enterprise GenAI platform, and Donovan, its platform for the US public sector.

Pros & cons

Pros:
  • Frontier-scale human data ops
  • Expert annotator network
  • Evaluation leaderboards
  • Public-sector offerings (Donovan)

Cons:
  • Enterprise sales, no public pricing
  • Meta stake raised neutrality concerns
  • Some labs cut engagements post-deal

View all Data Ops
  • Encord
    Vision · PAID

    Encord

    Data platform to curate, label, and manage AI training data.

    An enterprise data development platform for preparing high-quality training data across images, video, documents, audio, DICOM, and 3D point clouds. It pairs AI-assisted labeling (SAM auto-segmentation, object tracking) with data curation, model evaluation, and workflow tooling, plus LLM-powered data agents for document tasks. Used heavily in medical imaging, robotics, and other physical-AI domains.

    Worth knowing

    YC W21 company founded by two ex-high-frequency traders; raised a $30M Series B led by Next47 in 2024.

    • data-annotation
    • training-data
    • computer-vision
    • medical-imaging
  • Label Studio
    Data Ops · FREEMIUM · Open core

    HumanSignal

    Open-source multi-type data labeling and AI evaluation.

    Widely used open-source tool for labeling and annotating data across images, text, audio, video, and time-series, with a standardized export format for training and fine-tuning. ML backends can pre-label data to speed up human review, and it increasingly doubles as a human-in-the-loop AI evaluation surface. Maintained by HumanSignal, which offers a hosted Starter tier and Label Studio Enterprise.

    Worth knowing

    Maker Heartex rebranded to HumanSignal in June 2023; Label Studio has labeled 200M+ data points.

    • data-labeling
    • open-source
    • annotation
    • human-in-the-loop
  • Dataloop
    Vision · PAID

    Dataloop

    Enterprise data engine for labeling and managing unstructured AI data.

    An AI-ready data platform that manages, labels, and orchestrates unstructured data — images, video, LiDAR, audio, and text — across the model lifecycle. It pairs data management and human-in-the-loop annotation with a serverless pipeline layer for pre/post-processing, RLHF, and RAG, plus a model-and-app marketplace. Originally focused on computer-vision production pipelines.

    Worth knowing

    The Israeli startup (founded 2017) was acquired by Dell in a ~$120M all-cash deal in late 2025.

    • data-labeling
    • computer-vision
    • annotation
    • mlops
  • Supervisely
    Vision · FREEMIUM

    Supervisely

    All-in-one computer vision platform to curate, label, and train models.

    A unified computer vision platform covering data curation, annotation, model training, and deployment across images, video, 3D point clouds, and medical imagery. AI-assisted labeling, experiment tracking, and a large catalog of installable apps make it customizable for most CV workflows. Free for researchers and small teams; Pro and self-hostable Enterprise editions for companies.

    Worth knowing

    Grew out of Deep Systems, a deep-learning consultancy its founders built in 2013, before launching as a product in 2017.

    • computer-vision
    • data-annotation
    • labeling
    • model-training