
Daily Arena

Match replay

Replaying a stored match — no models are called.

⚖ Multi-judge consensus — our trademark
Tokonomix multi-council + judge + blind-spot detection — lower cost, and it catches the mistakes one model misses.
Multi-council · lower cost
Multi-judge · cross-family
Blind-spot detection · catch the missed mistake
N-team · groups vs each other
Game type
Turns: 3
Speed: 1×
multilingual_support · round
Turn 0 / 3
The cheapest model that keeps up on quality appears here.
Claude Haiku 4.5 · Anthropic · € — score · HP 100
Claude Opus 4.1 · Anthropic · € — score · HP 100
Claude Sonnet 4.5 · Anthropic · € — score · HP 100
Deep Research Preview (Apr-21-2026) · Google Gemini · € — score · HP 100
Deep Research Max Preview (Apr-21-2026) · Google Gemini · € — score · HP 100
DeepSeek v4 Pro · OpenRouter · € — score · HP 100

Final verdict: cost, quality & advantage

| Players | Cost | Quality | Wins | Advantage / status |
|---|---|---|---|---|
| Claude Haiku 4.5 | €0.0053 | 96 | 1 | 100 HP |
| Claude Opus 4.1 | €0.0600 | 92 | 0 | 100 HP |
| Claude Sonnet 4.5 | €0.0132 | 91 | 0 | 100 HP |
| Deep Research Preview (Apr-21-2026) | €0.0000 | 0 | 0 | 82 HP |
| Deep Research Max Preview (Apr-21-2026) | €0.0000 | 0 | 0 | 100 HP |
| DeepSeek v4 Pro | €0.0008 | 85 | 0 | 100 HP |
Drone damage = jury-majority strength · HP = live advantage · € = real cost
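
The page says the cheapest model that keeps up on quality is highlighted, but it does not define "keeps up". A minimal sketch of one possible reading, assuming a hypothetical `tolerance` in quality points below the best score; the function name, the tolerance rule, and the `Result` type are illustrative assumptions, and the figures are read from the final verdict table:

```python
from typing import List, NamedTuple, Optional


class Result(NamedTuple):
    name: str
    cost_eur: float
    quality: int


def cheapest_keeping_up(results: List[Result], tolerance: int) -> Optional[Result]:
    """Cheapest result whose quality is within `tolerance` points of the best.

    Assumption: "keeps up" means quality >= best quality - tolerance.
    The page itself does not state the rule.
    """
    if not results:
        return None
    best = max(r.quality for r in results)
    contenders = [r for r in results if r.quality >= best - tolerance]
    return min(contenders, key=lambda r: r.cost_eur)


# Cost and quality as read from the final verdict table.
verdict = [
    Result("Claude Haiku 4.5", 0.0053, 96),
    Result("Claude Opus 4.1", 0.0600, 92),
    Result("Claude Sonnet 4.5", 0.0132, 91),
    Result("Deep Research Preview (Apr-21-2026)", 0.0000, 0),
    Result("Deep Research Max Preview (Apr-21-2026)", 0.0000, 0),
    Result("DeepSeek v4 Pro", 0.0008, 85),
]

tight = cheapest_keeping_up(verdict, tolerance=5)
loose = cheapest_keeping_up(verdict, tolerance=11)
```

Under this reading the pick depends heavily on the tolerance: a tight band favours Claude Haiku 4.5, while a band wide enough to admit a quality of 85 favours DeepSeek v4 Pro.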

Honesty boundary

Advantage (HP) starts at 100. Each turn, the weakest active model loses the derived damage: damage = 16 + 24·margin, where margin = (winner − runner-up) ÷ score-scale (deriveRoundOutcomes v8.1-tokonomix).

An exact tie has no decisive winner — no strike, no damage that turn.
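
The damage rule and the tie rule can be sketched as follows. This is not the source of deriveRoundOutcomes; it is a minimal illustration that assumes "weakest" means the lowest turn score and that every model passed in counts as active. The name `derive_damage` and the constants' names are hypothetical.

```python
from typing import Dict, Optional, Tuple

BASE_DAMAGE = 16.0    # constant term in damage = 16 + 24 * margin
MARGIN_WEIGHT = 24.0  # multiplier on the margin


def derive_damage(
    scores: Dict[str, float], score_scale: float
) -> Optional[Tuple[str, float]]:
    """Return (weakest_model, damage) for one turn, or None when there is no strike."""
    if len(scores) < 2 or score_scale <= 0:
        return None
    ranked = sorted(scores.values(), reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    if winner == runner_up:
        return None  # exact tie at the top: no strike, no damage
    margin = (winner - runner_up) / score_scale
    damage = BASE_DAMAGE + MARGIN_WEIGHT * margin
    # Assumption: the "weakest" model is the one with the lowest turn score.
    weakest = min(scores, key=lambda name: scores[name])
    return weakest, damage


decisive = derive_damage({"model_a": 75, "model_b": 50, "model_c": 40}, 100)
tied = derive_damage({"model_a": 80, "model_b": 80, "model_c": 10}, 100)
```

With scores of 75 and 50 on a scale of 100, the margin is 0.25 and the damage is 16 + 24 × 0.25 = 22. If scores lie between 0 and the score-scale, a decisive turn always costs more than 16 and at most 40.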

Reaching 0 advantage is NOT elimination: every model still answers each turn. The real winner is the end-of-round judge panel below, shown for all models.
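
A small sketch of the bookkeeping this implies, assuming advantage is floored at 0 and the model stays in the roster afterwards. The floor and the name `apply_strike` are assumptions; the page only says that reaching 0 does not eliminate a model.

```python
from typing import Dict


def apply_strike(advantage: Dict[str, float], model: str, damage: float) -> Dict[str, float]:
    """Return a new advantage table with `damage` subtracted from `model`.

    Assumption: advantage never drops below 0, and a model at 0 is kept
    in the table because it still answers every turn.
    """
    updated = dict(advantage)  # copy so the caller's table is left untouched
    updated[model] = max(0.0, updated[model] - damage)
    return updated


before = {"model_a": 100.0, "model_b": 10.0}
after = apply_strike(before, "model_b", 22.0)
```

Here `model_b` ends at 0 but remains in `after`, so it would still be scored by the end-of-round judge panel.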

Damage reflects the relative gap between the top two scores, not absolute quality: for the same gap, winning a low-scoring turn deals the same damage as winning a high-scoring one.

Score-scale is the highest turn-score seen in this replay (0–10 or 0–100). Because every margin is divided by it, one high turn shrinks the margins of all the others and makes them look closer.
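
To illustrate the effect, here is a sketch that takes the sentence above literally: the scale is the maximum score across all turns in the replay. The helper names and the sample scores are made up for illustration.

```python
from typing import Dict, List


def score_scale(turns: List[Dict[str, float]]) -> float:
    """Highest turn score seen anywhere in the replay (literal reading)."""
    return max(score for turn in turns for score in turn.values())


def margin(turn: Dict[str, float], scale: float) -> float:
    """(winner - runner-up) / score-scale for a single turn."""
    top = sorted(turn.values(), reverse=True)
    return (top[0] - top[1]) / scale


# Two turns scored on a 0-10 range.
turns = [{"model_a": 8, "model_b": 6}, {"model_a": 7, "model_b": 5}]
margin_before = margin(turns[0], score_scale(turns))  # 2 / 8

# One later turn scored on a 0-100 range raises the scale for every turn.
turns_with_outlier = turns + [{"model_a": 80, "model_b": 40}]
margin_after = margin(turns[0], score_scale(turns_with_outlier))  # 2 / 80
```

The first turn is unchanged, yet its margin drops from 0.25 to 0.025 once the high-scoring turn enters the replay, so its damage falls from 22 to about 16.6.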

Zero model dispatch — pure render of the stored round. Switching the view changes the picture, never the numbers.