Daily Arena
Match replay
Replaying a stored match — no models are called.
Final verdict: cost, quality & advantage
| Players | Cost | Quality | Wins | Advantage / status |
|---|---|---|---|---|
| Claude Haiku 4.5 | €0.0029 | 71.8 | 0 | 100 HP |
| Gemini 2.5 Flash | €0.0015 | 65.2 | 0 | 100 HP |
| Gemini Pro Latest | €0.0099 | 6 | 0 | 100 HP |
| gpt-4.1 | €0.0029 | 64.8 | 0 | 100 HP |
| gpt-4o-2024-05-13 | €0.0080 | 66.4 | 0 | 100 HP |
| gpt-5.5-2026-04-23 | €0.0141 | 71.4 | 0 | 100 HP |
Honesty boundary
Advantage starts at 100. Each turn, the weakest active model loses the derived damage: damage = 16 + 24·margin, where margin = (winner − runner-up) ÷ score-scale (deriveRoundOutcomes v8.1-tokonomix).
An exact tie has no decisive winner, so there is no strike and no damage that turn.
Reaching 0 advantage is NOT elimination: every model still answers each turn. The real winner is decided by the end-of-round judge panel below, which is shown for all models.
Damage reflects the relative gap between the top two scores, not absolute quality: winning a low-scoring turn deals the same damage as winning a high-scoring one by the same margin.
Score-scale is the highest turn-score seen in this replay (0–10 or 0–100); one unusually high turn raises the scale and makes the gaps in every other turn look smaller.
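The rules above can be sketched in a few lines of Python. This is an illustrative reading of the stated formula, not the actual deriveRoundOutcomes code: the helper names (`score_scale`, `turn_damage`, `replay`) are hypothetical, "active" is assumed to mean a model whose advantage is still above 0, and the advantage is assumed to be clamped at 0.

```python
from typing import Dict, List

BASE_DAMAGE = 16.0    # constant term in: damage = 16 + 24 * margin
MARGIN_WEIGHT = 24.0  # multiplier on the margin in the same formula


def score_scale(turns: List[Dict[str, float]]) -> float:
    """Highest turn-score seen anywhere in the replay."""
    return max(score for turn in turns for score in turn.values())


def turn_damage(scores: Dict[str, float], scale: float) -> float:
    """Damage for one turn; 0.0 when the top two scores tie exactly."""
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) < 2 or ranked[0] == ranked[1] or scale <= 0:
        return 0.0
    margin = (ranked[0] - ranked[1]) / scale
    return BASE_DAMAGE + MARGIN_WEIGHT * margin


def replay(turns: List[Dict[str, float]], start: float = 100.0) -> Dict[str, float]:
    """Apply each turn's damage to the weakest active model."""
    if not turns:
        return {}
    scale = score_scale(turns)
    advantage = {name: start for turn in turns for name in turn}
    for scores in turns:
        damage = turn_damage(scores, scale)
        # Assumption: "active" means advantage is still above 0.
        active = [name for name in scores if advantage[name] > 0]
        if damage == 0.0 or not active:
            continue  # exact tie at the top, or nobody left to strike
        weakest = min(active, key=lambda name: scores[name])
        # Assumption: advantage never drops below 0.
        advantage[weakest] = max(0.0, advantage[weakest] - damage)
    return advantage


if __name__ == "__main__":
    turns = [
        {"a": 10.0, "b": 5.0, "c": 2.0},  # margin 0.5, so damage is 28
        {"a": 4.0, "b": 4.0, "c": 1.0},   # exact tie at the top, no damage
    ]
    print(replay(turns))
```

With these made-up scores the scale is 10, the first turn costs model `c` 28 points, and the tied second turn costs nothing. Because the scale is taken from the single highest score in the replay, adding one turn with a much higher score would shrink every other turn's margin, which is the effect the last note describes.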