Iris

MCP-native eval and observability server for AI agents.

Categories: Eval, Observability, MCP
Pricing: Freemium
Source: Open core
Hosting: Hybrid
Platforms: API
Models: BYO key / model
Verified: Jun 20, 2026

Iris scores AI agent output, catches safety failures, and enforces cost budgets. It is exposed as an MCP server rather than an SDK, so any MCP-compatible agent can discover its tools (trace logging, output evaluation, rule management, LLM-as-judge) and call them without code changes; every output that flows through the protocol gets evaluated. It detects PII leaks, prompt injection, hallucinations, and budget anomalies. The core is MIT-licensed and free to self-host; a managed cloud adds dashboards and alerting.
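To make the kinds of checks described above concrete, here is a minimal Python sketch of a rule-based output evaluator. It is not Iris's implementation or API: the function name `evaluate_output`, the `EvalResult` type, the regex patterns, and the injection phrases are all hypothetical and chosen only for illustration. Hallucination detection and LLM-as-judge scoring are left out because they need a model call.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only; real PII and injection
# detection needs far broader coverage than these examples.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
INJECTION_PHRASES = (
    "ignore previous instructions",
    "disregard the system prompt",
)


@dataclass
class EvalResult:
    passed: bool
    failures: list[str] = field(default_factory=list)


def evaluate_output(text: str, cost_usd: float, budget_usd: float) -> EvalResult:
    """Run three simple checks on one agent output and collect failures."""
    failures = []
    # PII check: flag anything that looks like an email or a US SSN.
    if EMAIL_RE.search(text) or SSN_RE.search(text):
        failures.append("pii")
    # Prompt-injection check: case-insensitive phrase match.
    lowered = text.lower()
    if any(phrase in lowered for phrase in INJECTION_PHRASES):
        failures.append("prompt_injection")
    # Budget check: compare the reported cost of this call to the limit.
    if cost_usd > budget_usd:
        failures.append("over_budget")
    return EvalResult(passed=not failures, failures=failures)
```

Calling `evaluate_output("Contact me at jane@example.com", 0.02, 0.10)` under these assumptions would return a failing result with a single `"pii"` entry. In an MCP setting, a function like this would sit behind a tool that the agent calls over the protocol, rather than being imported into the agent's code.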

Pros & cons

Pros:
MIT-licensed open-source core
No-code, MCP-native integration
Free self-host, free cloud tier
PII, injection, and cost checks

Cons:
Newer, niche MCP-focused tool
Best fit for MCP-based agents
Smaller ecosystem than SDK evals

Iris

Braintrust

DeepEval

Promptfoo