Tier C — Specialist

Runs in: France · Made in: France

Mistral-Nemo-Instruct-2407

Tokonomix Editorial Team · Reviewed by Mes Kalkan · Published May 27, 2026 · Last reviewed July 30, 2026

Section 01

Speed analysis

Latency is measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and peak load.

Chart: P50 latency (median) and P95 latency · 101 runs
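As a rough illustration of how P50 and P95 can be read off a set of latency samples, here is a minimal sketch using the nearest-rank percentile method. The function name `percentile` and the sample values are made up for this example; the page does not state which percentile method the benchmark actually uses.

```python
import math


def percentile(samples_ms, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Rank is 1-based; clamp to 1 so that very small p still returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds, for illustration only.
latencies_ms = [100, 120, 90, 322, 110, 95]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"P50 = {p50} ms, P95 = {p95} ms")
```

With only a handful of runs, P95 is simply the slowest sample, which is why a single slow response can dominate the tail figure.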

Section 02

Quality scores

Evaluation results from judge-model scoring across diverse task categories. Scores reflect coherence, accuracy and instruction-following.

Chart categories: Creative · Factual · Multilingual · Reasoning

Section 03

Pricing history

Direct provider rates per million tokens, plus a typical-conversation cost estimate.

API rates — Mistral-Nemo-Instruct-2407

$0.1300 per 1M input tokens

$0.1300 per 1M output tokens

≈ $0.0001 per typical conversation (800 tokens)
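The per-conversation estimate follows directly from the per-million rates. The sketch below reproduces it; the function name `conversation_cost` and the 600/200 split between input and output tokens are assumptions for illustration (the page only gives the 800-token total, and because both rates are equal here the split does not change the result).

```python
def conversation_cost(input_tokens, output_tokens,
                      input_rate_per_m=0.13, output_rate_per_m=0.13):
    """Cost in USD for one conversation, given rates per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_rate_per_m
    output_cost = output_tokens / 1_000_000 * output_rate_per_m
    return input_cost + output_cost


# Assumed split of the 800-token "typical conversation".
cost = conversation_cost(input_tokens=600, output_tokens=200)
print(f"${cost:.6f}")   # 800 tokens at $0.13 per 1M
print(f"${cost:.4f}")   # rounded to four decimals, as on the page
```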

Input vs output price (per 1M tokens): input $0.1300 · output $0.1300

Pricing over time

Input & output per 1M tokens · step-line = price changes

Input: $0.1300 / 1M (stable) · Output: $0.1300 / 1M (stable)

Chart dates: 2026-06-14 · 2026-07-05 · 2026-07-26

Legend: Input · Output · Price change

⟳ synced weekly

Section 04

Tokens per second

Throughput in tokens per second, derived from measured P50 latency. Higher is better; fluctuations track provider-side load.

Throughput (tokens / s): 2000 / avg 1943

Estimated as an assumed 200 output tokens divided by P50 latency. The absolute number depends on this assumption; the trend is what matters.
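A minimal sketch of this estimate, assuming throughput is the fixed 200-token output length divided by the P50 latency in seconds. That reading is consistent with the 100 ms P50 and 2000 tokens/s reported on this page, but the exact formula and the name `estimated_throughput` are assumptions.

```python
def estimated_throughput(p50_latency_ms, assumed_output_tokens=200):
    """Tokens per second, assuming a fixed output length per request."""
    if p50_latency_ms <= 0:
        raise ValueError("latency must be positive")
    # Multiply before dividing to convert milliseconds to seconds.
    return assumed_output_tokens * 1000 / p50_latency_ms


print(estimated_throughput(100))    # P50 from the latest speed benchmark
print(estimated_throughput(4372))   # P50 from the latest quality window
```

Because the token count is a constant, the estimate is just a rescaled inverse of latency: doubling P50 halves the reported throughput.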

Section 05

Capabilities

ownedBy: mistralai

Section 06

Availability

No measurements yet

We haven't recorded enough API calls to show availability stats for this model. Data appears once the model starts receiving live traffic.

Section 07

Tokonomix benchmark verdicts

Endorsed by 2 judges

Independent LLM judges evaluated this model on our weekly intelligence tests

cohere/command-a: 20/100 · 1 run

0 correct · 1 partial · 0 wrong · 0% accuracy

claude-sonnet-4-5: 78/100 · 47 runs

31 correct · 6 partial · 10 wrong · 66% accuracy
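The accuracy figures match a simple ratio of fully correct answers to total runs, with partial answers not counted as correct. The sketch below assumes that definition; the function name `accuracy_percent` is hypothetical, and the judges' 0 to 100 scores are a separate measure that this does not reproduce.

```python
def accuracy_percent(correct, partial, wrong):
    """Share of runs judged fully correct, as a whole-number percentage."""
    total = correct + partial + wrong
    if total == 0:
        return 0
    # Partial answers count toward the total but not toward the numerator.
    return round(100 * correct / total)


print(accuracy_percent(correct=0, partial=1, wrong=0))     # one-run judge
print(accuracy_percent(correct=31, partial=6, wrong=10))   # 47-run judge
```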

● 2026-07-26

Mistral-Nemo quality plummets 38 points to 46.8, latency up 43%

Mistral-Nemo-Instruct-2407 on OVH AI Endpoints degraded severely in the current benchmark window. Overall quality fell from 84.9 to 46.8, a 38.1-point decline. The multilingual category collapsed from 97 to 26, and creative performance dropped from 75 to 58. The model now scores 50 on factual tasks and 53 on reasoning; these two new categories replace the coding category, which scored 83 in the previous window.

Latency has also deteriorated: p50 response time rose 43%, from 3,051 ms to 4,372 ms.

A simultaneous quality collapse and slowdown suggests a model version change, infrastructure issues, or configuration problems at the provider level. Users should exercise caution and consider alternative providers or models until performance stabilizes and returns to previously demonstrated levels.
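The headline deltas can be checked from the figures quoted in this verdict. The helper names `point_drop` and `percent_change` below are illustrative, not part of any Tokonomix tooling.

```python
def point_drop(previous, current):
    """Absolute decline in score points, rounded to one decimal."""
    return round(previous - current, 1)


def percent_change(previous, current):
    """Relative change from previous to current, as a whole-number percentage."""
    return round(100 * (current - previous) / previous)


print(point_drop(84.9, 46.8))        # overall quality
print(point_drop(97, 26))            # multilingual
print(point_drop(75, 58))            # creative
print(percent_change(3051, 4372))    # p50 latency in ms
```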

Quality

46.8

Latency p50

4,372 ms

Test runs

✗ Quality crashed 38.1 points
✗ Multilingual dropped from 97 to 26
✗ Latency increased 43%
✗ Creative performance down 17 points

Last automated test

Jul 30, 2026 · 08:04 UTC · Speed benchmark

P50 latency

100 ms

P95 latency

322 ms

Errors

0 / 6 runs

Last reviewed by Tokonomix Team · July 30, 2026