
Benchmarks

Leaderboard

All active models ranked by P50 latency: the median response time, in milliseconds, for a standard 500-token output, measured from EU (Amsterdam). Green < 500 ms, yellow 500–1000 ms, red > 1000 ms.

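| # | Model | Badge | P50 (ms) |
|---|-------|-------|----------|
| 1 | Mistral-Nemo-Instruct-2407 | | 110 |
| 2 | Llama-3.1-8B-Instruct | | 126 |
| 3 | Mistral-7B-Instruct-v0.3 | | 131 |
| 4 | Meta-Llama-3_3-70B-Instruct | | 140 |
| 5 | Qwen2.5-VL-72B-Instruct | | 141 |
| 6 | Mistral-Small-3.2-24B-Instruct-2506 | | 157 |
| 7 | NVIDIA Nemotron Super 49B v1.5 | A | 173 |
| 8 | Llama 3.3 70B Instruct | A | 174 |
| 9 | DeepSeek v3.2 | A | 181 |
| 10 | Llama 4 Scout | A | 182 |
| 11 | Qwen3.5-397B-A17B | | 184 |
| 12 | Nous Hermes 3 70B | A | 184 |
| 13 | MiniMax M2.5 | A | 199 |
| 14 | Mistral Voxtral Small 24B | A | 226 |
| 15 | Llama 4 Maverick | A | 249 |
| 16 | gpt-oss-20b | C | 280 |
| 17 | gpt-oss-120b | C | 330 |
| 18 | Qwen3-Coder-30B-A3B-Instruct | | 407 |
| 19 | gpt-5.4-nano | C | 432 |
| 20 | Qwen 2.5 VL 72B Instruct | A | 439 |
| 21 | Qwen3-32B | | 451 |
| 22 | gpt-5-chat-latest | C | 453 |
| 23 | gpt-5.4-mini | A | 487 |
| 24 | gpt-4o-mini | C | 504 |
| 25 | Qwen3.5-9B | | 516 |
| 26 | Cohere Command-A | A | 521 |
| 27 | gpt-4o | C | 542 |
| 28 | Gemini 2.5 Flash-Lite | B | 581 |
| 29 | gpt-4.1 | B | 639 |
| 30 | o3 | C | 669 |
| 31 | Claude Haiku 4.5 | A | 711 |
| 32 | DeepSeek v4 Pro | A | 715 |
| 33 | gpt-4.1-mini | C | 721 |
| 34 | gpt-4.1-nano | C | 772 |
| 35 | gpt-5.1-chat-latest | C | 787 |
| 36 | gpt-5.2-chat-latest | C | 800 |
| 37 | gpt-5.1 | B | 803 |
| 38 | gpt-5.3-chat-latest | C | 823 |
| 39 | o4-mini | C | 840 |
| 40 | gpt-5.2 | B | 856 |
| 41 | Claude Opus 4.8 | A | 867 |
| 42 | Qwen 3.7 Max | A | 877 |
| 43 | o3-mini | C | 906 |
| 44 | gpt-5 | C | 907 |
| 45 | Gemini 2.5 Flash | A | 1030 |
| 46 | gpt-5.4 | A | 1069 |
| 47 | Claude Sonnet 4.6 | A | 1165 |
| 48 | gpt-5-mini | C | 1322 |
| 49 | gpt-5-nano | C | 1329 |
| 50 | Claude Opus 4.7 | B | 1339 |
| 51 | Claude Opus 4.5 | B | 1344 |
| 52 | gpt-5.5 | C | 1369 |
| 53 | gpt-3.5-turbo-16k | | 1440 |
| 54 | Gemini 2.5 Pro | A | 1440 |
| 55 | Claude Sonnet 4.5 | B | 1876 |
| 56 | gpt-3.5-turbo | C | 2203 |
| 57 | gpt-3.5-turbo-0125 | | 2209 |
| 58 | Gemini 3.1 Flash Lite | | 2301 |
| 59 | Claude Opus 4.1 | C | 2368 |
| 60 | gpt-3.5-turbo-1106 | | 2399 |
| 61 | Gemini Flash-Lite Latest | C | 2524 |
| 62 | Qwen 3.6 Plus | A | 2613 |
| 63 | gpt-4o-2024-11-20 | C | 2617 |
| 64 | gpt-4o-2024-05-13 | C | 2665 |
| 65 | Nano Banana | | 2873 |
| 66 | Claude Opus 4.6 | B | 3069 |
| 67 | gpt-4.1-nano-2025-04-14 | | 3655 |
| 68 | Gemini Flash Latest | B | 3827 |
| 69 | Gemini 3.5 Flash | A | 3984 |
| 70 | gpt-4o-search-preview | C | 4031 |
| 71 | Gemini Robotics-ER 1.6 Preview | | 4190 |
| 72 | Gemini 3 Flash Preview | C | 4311 |
| 73 | Nano Banana 2 | | 4330 |
| 74 | gpt-4.1-2025-04-14 | | 4906 |
| 75 | gpt-5-search-api-2025-10-14 | | 5380 |
| 76 | gpt-4 | C | 5540 |
| 77 | gpt-4o-mini-search-preview | C | 5662 |
| 78 | gpt-4.1-mini-2025-04-14 | | 5732 |
| 79 | gpt-4o-search-preview-2025-03-11 | | 5743 |
| 80 | Gemini Pro Latest | C | 6398 |
| 81 | gpt-5-search-api | C | 6462 |
| 82 | gpt-4o-mini-search-preview-2025-03-11 | | 6736 |
| 83 | Gemini 3.1 Pro Preview | C | 6790 |
| 84 | gpt-4o-2024-08-06 | C | 6866 |
| 85 | gpt-4o-mini-2024-07-18 | C | 7064 |
| 86 | Gemini 3.1 Pro Preview Custom Tools | C | 7298 |
| 87 | gpt-4-0613 | | 8426 |
| 88 | Nano Banana Pro | | 10741 |
| 89 | Nano Banana Pro | | 11201 |
| 90 | gpt-4-turbo | C | 14489 |
| 91 | gpt-4-turbo-2024-04-09 | C | 16898 |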

91 of 91 models

Fast (< 500 ms)
Medium (500–1000 ms)
Slow (> 1000 ms)
Updated every 6 hours · P50 = median latency · P95 = tail latency
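
As a rough illustration of the metrics named above, the sketch below computes P50 (median) and P95 (tail) latency from a list of per-request timings and maps a P50 value to the Fast/Medium/Slow bands. The sample values, the function names, and the nearest-rank percentile method are assumptions made for this example; the page does not say how the leaderboard computes its figures.

```python
import math
import statistics


def p50(samples_ms):
    """Median latency in milliseconds."""
    return statistics.median(samples_ms)


def percentile(samples_ms, q):
    """Nearest-rank percentile (an assumed method, not the site's)."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def band(p50_ms):
    """Map a P50 value to the bands in the legend above."""
    if p50_ms < 500:
        return "fast"
    if p50_ms <= 1000:
        return "medium"
    return "slow"


# Hypothetical timings for one model, in milliseconds.
samples = [110, 120, 130, 140, 900]
print(p50(samples), percentile(samples, 95), band(p50(samples)))
```

A single slow request moves P95 sharply while leaving P50 almost unchanged, which is why the two are reported separately.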