Runs in: France · Made in: France
OVH AI Endpoints (GRA)

Mistral-Nemo-Instruct-2407

Tokonomix Editorial Team · Reviewed by Mes Kalkan
Section 01

Speed analysis

Latency measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and peak load.
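As a rough illustration of what P50 and P95 mean, the sketch below computes both with the nearest-rank method. The page does not say which percentile method it uses, so the method, the `percentile` helper, and the sample latencies are all assumptions made for this example.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so that very small p still returns a sample.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds, not measurements from this page.
latencies_ms = [110, 115, 117, 120, 125, 130, 140, 160, 191, 450]

p50 = percentile(latencies_ms, 50)  # typical request
p95 = percentile(latencies_ms, 95)  # slow tail, sensitive to outliers
print(p50, p95)
```

With these made-up samples a single slow request dominates P95 while P50 barely moves, which is why the two are reported together.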

[Chart: P50 latency (median) and P95 latency in ms over 73 runs, 05-28 to 06-15; y-axis 91 to 2615 ms]
Section 02

Quality scores

Evaluation results from judge-model scoring across diverse task categories. Scores reflect coherence, accuracy and instruction-following.

Coding: 100
Multilingual: 93
Reasoning: 75
Section 03

Pricing history

Direct provider rates per million tokens, plus a typical-conversation cost estimate.

💰
API rates — Mistral-Nemo-Instruct-2407
$0.1300 per 1M input tokens
$0.1300 per 1M output tokens
≈ $0.0001 per typical conversation (800 tokens)
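The per-conversation estimate follows from the per-million rates. A minimal sketch, assuming the 800 tokens split into 600 input and 200 output tokens (the page gives only the total; because both rates are equal here, the split does not change the result). The function name is hypothetical.

```python
def conversation_cost_usd(input_tokens, output_tokens,
                          input_rate_per_million=0.13,
                          output_rate_per_million=0.13):
    """Cost of one conversation in USD at per-million-token rates."""
    return (input_tokens * input_rate_per_million
            + output_tokens * output_rate_per_million) / 1_000_000


# Assumed split of the 800-token "typical conversation".
cost = conversation_cost_usd(input_tokens=600, output_tokens=200)
print(f"${cost:.6f}")  # $0.000104
```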

Pricing over time

Input & output per 1M tokens · step-line = price changes

Input: $0.1300 per 1M tokens (stable)
Output: $0.1300 per 1M tokens (stable)

2026-06-14 · synced weekly
Section 04

Tokens per second

Throughput in tokens per second, derived from measured P50 latency. Higher is better; fluctuations track provider-side load.

Throughput (tokens / s): 1709 / avg 1509

Estimated by dividing an assumed 200 output tokens by the P50 latency. The absolute number depends on this assumption; the trend is what matters.
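The estimate can be reproduced with a few lines. This is a sketch under the stated 200-token assumption; the function name is hypothetical. Using the 117 ms P50 from the last automated test gives roughly 1709 tokens per second, which matches the throughput figure shown above.

```python
def estimated_tokens_per_second(p50_latency_ms, assumed_output_tokens=200):
    """Throughput estimate: assumed output tokens divided by P50 latency in seconds."""
    if p50_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return assumed_output_tokens / (p50_latency_ms / 1000)


throughput = estimated_tokens_per_second(117)
print(round(throughput))  # 1709
```

Because the token count is a fixed assumption rather than a measurement, halving the latency doubles the estimate regardless of how many tokens were actually generated.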

Section 05

Capabilities

ownedBy: mistralai
Section 06

Availability

No measurements yet

We haven't recorded enough API calls to show availability stats for this model. Data appears once the model starts receiving live traffic.

Section 07

Tokonomix benchmark verdicts

⚖️
Endorsed by 1 judge
Independent LLM judges evaluated this model on our weekly intelligence tests
claude-sonnet-4-5: 85/100 · 7 runs
5 correct · 1 partial · 1 wrong · 71% accuracy
2026-06-14
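The 71% accuracy figure is consistent with counting only fully correct answers over all runs. A minimal sketch, assuming partial answers do not count toward accuracy (the page does not state this rule, nor how the 85/100 score is derived); the function name is hypothetical.

```python
def accuracy_percent(correct, partial, wrong):
    """Share of runs judged fully correct, as a percentage."""
    total = correct + partial + wrong
    if total == 0:
        raise ValueError("no runs recorded")
    return 100 * correct / total


accuracy = accuracy_percent(correct=5, partial=1, wrong=1)
print(round(accuracy))  # 71
```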

Mistral-Nemo maintains steady baseline with no performance variation

Mistral-Nemo-Instruct-2407 continues to deliver consistent performance across this benchmark window with no measurable changes from the previous period. The model maintains its established baseline characteristics without regression or improvement in any tracked metrics. This stability indicates reliable model serving infrastructure from OVH AI Endpoints in their GRA region, with consistent response patterns and quality outputs. Users can expect the same mid-tier performance levels that were observed during the initial benchmark establishment. The lack of variation suggests no underlying model updates, infrastructure changes, or optimization adjustments have been deployed during this period. For production workloads requiring predictable behavior, this consistency provides operational confidence. However, users seeking performance improvements or enhanced capabilities will need to look at model updates in future releases or consider alternative offerings. The stable serving pattern makes capacity planning and resource allocation straightforward for applications built on this endpoint.

Quality

Latency p50

Test runs

0

Performance remains stable · No quality regressions detected · No performance improvements observed
Last automated test: Jun 15, 2026 · 08:00 UTC · Speed benchmark
P50 latency: 117 ms
P95 latency: 191 ms
Errors: 0 / 6 runs
Last reviewed by Tokonomix Team · June 15, 2026