Benchmarks

Speed test

P50 is the median response time for a standard 500-token output, measured from the EU (Amsterdam). P95 is the tail latency: 95% of requests complete within this time. Each test cycle makes three runs per model; the reported values are medians across cycles.
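To make the two definitions concrete, here is a minimal sketch of computing P50 and P95 from a list of latency samples. It assumes the nearest-rank percentile method; the page does not say which method it uses, and the sample values are made up for illustration.

```python
import math

def percentile(samples_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds (not taken from the leaderboard).
latencies_ms = [110, 115, 118, 120, 122, 125, 130, 140, 160, 390]

p50 = percentile(latencies_ms, 50)  # 122
p95 = percentile(latencies_ms, 95)  # 390
print(f"P50 = {p50} ms, P95 = {p95} ms")
```

A single slow request (390 ms here) barely moves P50 but dominates P95, which is why the two numbers can diverge widely for the same model. Note that nearest-rank picks an actual sample, so for an even number of samples its P50 can differ slightly from the conventional median that averages the two middle values.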

Tier S: < 200 ms
Tier A: < 500 ms
Tier B: < 1000 ms
Tier C: > 1000 ms
Rank  Model                                 Tier  P50 (median)  P95 (tail)
01    ppl                                   S     22 ms         389 ms
02    Mistral-7B-Instruct-v0.3              S     115 ms        176 ms
03    Mistral-Nemo-Instruct-2407            S     117 ms        191 ms
04    Mistral-Small-3.2-24B-Instruct-2506   S     120 ms        158 ms
05    Qwen2.5-VL-72B-Instruct               S     125 ms        541 ms
06    Meta-Llama-3_3-70B-Instruct           S     127 ms        172 ms
07    Llama-3.1-8B-Instruct                 S     130 ms        232 ms
08    Mistral Voxtral Small 24B             S     134 ms        137 ms
09    Llama 4 Scout                         S     165 ms        430 ms
10    Llama 4 Maverick                      S     179 ms        238 ms
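The tier legend can be sketched as a small lookup. This assumes the tier is assigned from P50, which is consistent with the entries above (several Tier S models have a P95 over 200 ms) but is not stated on the page. The legend also leaves exactly 1000 ms undefined, so the sketch places it in Tier C as a further assumption. The function name `tier_for` is hypothetical.

```python
def tier_for(p50_ms: float) -> str:
    """Map a P50 latency in milliseconds to a tier letter using the legend thresholds."""
    if p50_ms < 200:
        return "S"
    if p50_ms < 500:
        return "A"
    if p50_ms < 1000:
        return "B"
    # Assumption: the legend says "> 1000 ms"; exactly 1000 ms is treated as Tier C here.
    return "C"

# P50 values from the first and last leaderboard rows, plus two made-up slower ones.
for p50 in (22, 179, 350, 1200):
    print(p50, "ms ->", "Tier", tier_for(p50))
```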
How we measure: Each model receives an identical prompt targeting an output of about 500 tokens. We run 3 sequential calls per test cycle and compute P50/P95 across the resulting distribution. Tests run 4 times per day from a single EU endpoint. Network overhead is included in the measured times.
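The sampling procedure described above could look roughly like the following sketch. Everything here is illustrative: `call_model` is a hypothetical callable standing in for one request with the fixed prompt, the stand-in `fake_model_call` just sleeps, and the hours-long wait between real cycles is omitted. Timing is taken on the client side around the whole call, so network overhead is part of each sample. Whether "response" means the first token or the complete output is not specified on the page, so that choice is left to the callable.

```python
import time
from typing import Callable, List

def run_cycle(call_model: Callable[[], None], runs: int = 3) -> List[float]:
    """Time `runs` sequential calls and return their wall-clock latencies in ms."""
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        call_model()  # hypothetical: one request with the fixed benchmark prompt
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms

def collect_day(call_model: Callable[[], None], cycles: int = 4, runs: int = 3) -> List[float]:
    """Pool the samples from all cycles of one day (4 cycles x 3 runs = 12 samples)."""
    samples: List[float] = []
    for _ in range(cycles):
        samples.extend(run_cycle(call_model, runs))
        # A real scheduler would wait several hours here before the next cycle.
    return samples

def fake_model_call() -> None:
    """Stand-in for a real API request; sleeps about 10 ms."""
    time.sleep(0.01)

samples = collect_day(fake_model_call)
print(f"collected {len(samples)} samples, fastest {min(samples):.1f} ms")
```

With only 12 samples per model per day, P95 is effectively set by the slowest one or two calls, so day-to-day swings in the tail figure are to be expected.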