Benchmarks

Public dataset

Raw benchmark data is available for free, and read access requires no API key. Use the data in your own tools, dashboards, or research.

Models tracked: 256 (136 active)

Providers

active APIs

Benchmark runs: 37806 (all time)

Test questions

Q3 2026

Download

Full benchmark dataset as JSON — models, providers, and most recent run per model. Updated every 6 hours. CORS-open for browser fetch.

Download JSON → GET /api/md/es/dataset

The /api/md/[lang]/dataset endpoint returns the full benchmark dataset as JSON.
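A minimal sketch of reading this endpoint from Python with only the standard library. The host is not stated on this page, so `BASE_URL` is a placeholder assumption, and the helper names `dataset_url` and `fetch_dataset` are hypothetical. The top-level shape of the returned JSON is not documented here either, so the code returns the parsed value without assuming any key names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder host (assumption): replace with the site that serves this page.
BASE_URL = "https://example.com"


def dataset_url(lang: str, base_url: str = BASE_URL) -> str:
    """Build the dataset URL for a language code such as "es"."""
    return f"{base_url.rstrip('/')}/api/md/{quote(lang, safe='')}/dataset"


def fetch_dataset(lang: str = "es", base_url: str = BASE_URL, timeout: float = 30.0):
    """Download and parse the dataset. No API key or auth header is sent."""
    with urlopen(dataset_url(lang, base_url), timeout=timeout) as response:
        return json.load(response)
```

Since the dataset is refreshed every 6 hours, a client that polls more often than that would likely download the same data repeatedly.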

Benchmark run fields

Field	Type	Description
id	bigint	Unique run ID
model_id	bigint	FK → models.id
run_type	varchar(20)	"speed" | "intelligence" | "health"
started_at	timestamptz	Run start time (UTC)
ended_at	timestamptz	Run end time (UTC)
latency_p50_ms	integer	Median latency (ms) — null if not applicable
latency_p95_ms	integer	95th-percentile latency (ms)
quality_score	integer	Judge score 0–100 — null until Q3 2026
error_count	integer	API errors in this run
raw_data	jsonb	Provider-specific response payload
created_at	timestamptz	Row creation time (UTC)
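As an illustration of working with the run fields above, the sketch below computes a run's wall-clock duration and handles the nullable `latency_p50_ms` field. It assumes timestamps are serialized as ISO 8601 strings in the JSON (this page lists only the database types), the helper names are hypothetical, and the sample record is made up for the example.

```python
from datetime import datetime
from typing import Optional


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp (assumed format).

    A trailing "Z" is rewritten to "+00:00" so that datetime.fromisoformat
    also accepts it on Python versions before 3.11.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def run_duration_seconds(run: dict) -> float:
    """Seconds elapsed between started_at and ended_at."""
    delta = parse_utc(run["ended_at"]) - parse_utc(run["started_at"])
    return delta.total_seconds()


def describe_latency(run: dict) -> str:
    """Format the median latency, which is null when not applicable."""
    p50: Optional[int] = run.get("latency_p50_ms")
    if p50 is None:
        return "latency not applicable"
    return f"p50 {p50} ms"


# Made-up sample record, not real benchmark data.
sample_run = {
    "id": 1,
    "model_id": 7,
    "run_type": "speed",
    "started_at": "2026-01-01T00:00:00Z",
    "ended_at": "2026-01-01T00:00:42Z",
    "latency_p50_ms": 850,
    "latency_p95_ms": 1900,
    "quality_score": None,
    "error_count": 0,
}

print(run_duration_seconds(sample_run))  # 42.0
print(describe_latency(sample_run))      # p50 850 ms
```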

Model fields

Field	Type	Description
id	bigint	Unique model ID
provider_id	bigint	FK → providers.id
slug	varchar(100)	URL-safe identifier (e.g. claude-sonnet-4-6)
name	varchar(200)	Display name
parameter_size	varchar(20)	e.g. "70B", "unknown"
context_window	integer	Max context in tokens
price_input_per_1m_cents	integer	Input price in cents per 1M tokens
price_output_per_1m_cents	integer	Output price in cents per 1M tokens
tier	varchar(2)	"A" | "B" | "C" — content priority tier
is_active	boolean	Whether model is currently tested
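Because prices are stored as integer cents per 1M tokens, the cost of a single request is the token count times the price, divided by 1,000,000. A minimal sketch with made-up prices follows. The function name `request_cost_cents` is hypothetical, and since the page does not state the currency, the result is left in cents.

```python
from fractions import Fraction


def request_cost_cents(model: dict, input_tokens: int, output_tokens: int) -> Fraction:
    """Exact cost of one request in cents.

    Fraction keeps the result exact; call float() on it for display.
    """
    total = (
        input_tokens * model["price_input_per_1m_cents"]
        + output_tokens * model["price_output_per_1m_cents"]
    )
    return Fraction(total, 1_000_000)


# Made-up sample model record, not real pricing.
sample_model = {
    "slug": "example-model",
    "price_input_per_1m_cents": 300,
    "price_output_per_1m_cents": 1500,
}

# 2000 * 300 + 500 * 1500 = 1,350,000; divided by 1,000,000 gives 1.35 cents.
cost = request_cost_cents(sample_model, input_tokens=2000, output_tokens=500)
print(float(cost))  # 1.35
```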