
Benchmarks

Leaderboard

All active models ranked by P50 latency — the median response time for a standard 500-token output, measured from EU (Amsterdam). Green < 500 ms, yellow 500–1000 ms, red > 1000 ms.

# | Model | | P50 (ms)
1 | ppl | C | 22
2 | Qwen2.5-VL-72B-Instruct | | 112
3 | Mistral-Nemo-Instruct-2407 | | 119
4 | Mistral-7B-Instruct-v0.3 | | 128
5 | Meta-Llama-3_3-70B-Instruct | | 132
6 | Llama-3.1-8B-Instruct | | 133
7 | Mistral Voxtral Small 24B | A | 141
8 | Mistral-Small-3.2-24B-Instruct-2506 | | 173
9 | Llama 4 Scout | A | 175
10 | Nous Hermes 3 70B | A | 190
11 | Qwen3-Coder-30B-A3B-Instruct | | 203
12 | Qwen3.5-397B-A17B | | 243
13 | NVIDIA Nemotron Super 49B v1.5 | A | 250
14 | Llama 4 Maverick | A | 315
15 | Llama 3.3 70B Instruct | A | 316
16 | gpt-oss-120b | C | 343
17 | gpt-4.1-nano | C | 354
18 | gpt-4o-mini | C | 373
19 | gpt-5-chat-latest | C | 389
20 | Qwen3-32B | | 420
21 | Qwen 2.5 VL 72B Instruct | A | 430
22 | gpt-4.1-mini | C | 440
23 | gpt-5.4-nano | C | 440
24 | o3-mini | C | 447
25 | Gemini 2.5 Flash-Lite | B | 451
26 | Qwen3.5-9B | | 452
27 | gpt-4o | C | 454
28 | Cohere Command-A | A | 462
29 | Gemini 2.5 Flash | A | 507
30 | MiniMax M2.5 | A | 510
31 | gpt-oss-20b | C | 533
32 | gpt-5.4 | A | 552
33 | o3 | C | 574
34 | gpt-5.4-mini | A | 602
35 | o4-mini | C | 615
36 | gpt-5 | C | 701
37 | Claude Haiku 4.5 | A | 728
38 | gpt-5.1 | B | 757
39 | gpt-5-nano | C | 791
40 | DeepSeek v4 Pro | A | 797
41 | gpt-5.2 | B | 797
42 | gpt-5.2-chat-latest | C | 824
43 | Google Lyria 3 Pro Preview | A | 832
44 | gpt-5.1-chat-latest | C | 861
45 | Claude Opus 4.5 | B | 873
46 | gpt-4.1 | B | 882
47 | Claude Opus 4.8 | A | 889
48 | Qwen 3.6 Plus | A | 909
49 | gpt-5.3-chat-latest | C | 919
50 | Gemini 3.1 Flash Lite | | 957
51 | gpt-5-mini | C | 1017
52 | gpt-4o-2024-05-13 | C | 1049
53 | gpt-4.1-2025-04-14 | | 1072
54 | Claude Sonnet 4.6 | A | 1088
55 | Qwen 3.7 Max | A | 1166
56 | gpt-5.5 | C | 1224
57 | gpt-4o-2024-11-20 | C | 1326
58 | gpt-3.5-turbo-1106 | | 1328
59 | Gemini 2.5 Pro | A | 1331
60 | Gemini Flash-Lite Latest | C | 1366
61 | Claude Sonnet 4.5 | B | 1745
62 | Nano Banana | | 1808
63 | Claude Opus 4.6 | B | 1815
64 | Nano Banana 2 | | 1887
65 | gpt-3.5-turbo | C | 1995
66 | gpt-3.5-turbo-16k | | 2006
67 | gpt-4o-2024-08-06 | C | 2016
68 | gpt-4.1-nano-2025-04-14 | | 2051
69 | Claude Opus 4.1 | C | 2119
70 | Claude Opus 4.7 | B | 2173
71 | gpt-3.5-turbo-0125 | | 2331
72 | DeepSeek v3.2 | A | 2710
73 | Gemini Robotics-ER 1.6 Preview | | 2764
74 | Gemini 3 Flash Preview | C | 2780
75 | gpt-4o-search-preview | C | 2930
76 | gpt-4o-mini-search-preview | C | 3388
77 | gpt-5-search-api | C | 3559
78 | gpt-4.1-mini-2025-04-14 | | 3561
79 | Gemini 3.5 Flash | A | 3938
80 | gpt-4o-mini-2024-07-18 | C | 3960
81 | Gemini Flash Latest | B | 4051
82 | gpt-4o-mini-search-preview-2025-03-11 | | 4627
83 | gpt-4o-search-preview-2025-03-11 | | 4883
84 | gpt-5-search-api-2025-10-14 | | 5351
85 | gpt-4-0613 | | 5810
86 | Gemini 3.1 Pro Preview Custom Tools | C | 6069
87 | Gemini Pro Latest | C | 6574
88 | Gemini 3.1 Pro Preview | C | 6937
89 | gpt-4-turbo-2024-04-09 | C | 7386
90 | gpt-4 | C | 7408
91 | Nano Banana Pro | | 8045
92 | gpt-4-turbo | C | 9151
93 | Lyria 3 Clip Preview | | 9402
94 | Gemma 4 31B IT | C | 11240
95 | Gemma 4 26B A4B IT | C | 12943
96 | Lyria 3 Pro Preview | | 21413

96 of 96 models shown

Fast (< 500 ms)
Medium (500–1000 ms)
Slow (> 1000 ms)
Updated every 6 hours · P50 = median latency · P95 = tail latency
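
The P50 and P95 figures and the three speed bands can be illustrated with a short Python sketch. This is not the leaderboard's own code: the sample timings are invented for illustration, the function names `percentile` and `speed_band` are hypothetical, and the nearest-rank percentile method is an assumption, since the page does not say how P95 is computed.

```python
import math
import statistics


def percentile(samples_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def speed_band(p50_ms):
    """Map a P50 latency in ms to the bands in the legend:
    fast (< 500), medium (500 to 1000), slow (> 1000)."""
    if p50_ms < 500:
        return "fast"
    if p50_ms <= 1000:
        return "medium"
    return "slow"


# Hypothetical raw timings for one model, in milliseconds (not real measurements).
samples_ms = [410, 455, 470, 520, 1900]

p50 = statistics.median(samples_ms)  # P50 = median latency
p95 = percentile(samples_ms, 95)     # P95 = tail latency (assumed nearest-rank)
print(p50, p95, speed_band(p50))
```

With these made-up samples a single slow request pulls P95 far above P50, which is why a model can rank well on median latency and still have a long tail.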