Daily Arena
Match replay
Replaying a stored match — no models are called.
Final verdict — cost, quality & voorsprong
| Players | Cost | Quality | Wins | Voorsprong / status |
|---|---|---|---|---|
| Claude Opus 4.7 | €0.2375 | 65 | 0 | 100 HP |
| gpt-5.5 | €0.1857 | 68 | 6 | 100 HP |
| DeepSeek v3.2 | €0.0065 | 58.5 | 1 | 100 HP |
| Llama 3.3 70B Instruct | €0.0025 | 72.5 | 0 | 100 HP |
| Llama 4 Scout | €0.0020 | 72.5 | 0 | 82 HP |
| Nous Hermes 3 70B | €0.0082 | 2.5 | 0 | drained |
Honesty boundary
Advantage starts at 100; each turn the weakest active model loses the derived damage — damage = 16 + 24·margin, margin = (winner − runner-up) ÷ score-scale (deriveRoundOutcomes v8.1-tokonomix).
An exact tie has no decisive winner — no strike, no damage that turn.
Reaching 0 advantage is NOT elimination: every model still answers each turn. The real winner is the end-of-round judge panel below, shown for all models.
Damage reflects the relative gap between top scores, not absolute quality — winning a low-scoring turn deals the same as winning a high-scoring one.
Score-scale is the highest turn-score seen in this replay (0–10 or 0–100); one high turn can make the others look closer.