Mistral-Nemo-Instruct-2407
Speed analysis
Latency measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and peak load.
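As a rough illustration of how P50 and P95 can be read off a set of latency samples, the sketch below uses the nearest-rank method. The page does not state which percentile method it applies, so the method, the function name `latency_percentiles`, and the sample values are assumptions for illustration only.

```python
def latency_percentiles(latencies_ms):
    """Return (p50, p95) of the samples using the nearest-rank method.

    Nearest-rank is an assumed choice; the benchmark may interpolate instead.
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)

    def nearest_rank(p):
        # Integer ceiling of p * n / 100, clamped to at least rank 1.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return nearest_rank(50), nearest_rank(95)


# Hypothetical samples in milliseconds, not measured data.
samples = [120, 80, 100, 400, 90]
p50, p95 = latency_percentiles(samples)
```

With only a handful of samples the P95 collapses to the slowest request, which is why a single slow call can dominate the tail figure.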
Quality scores
Evaluation results from judge-model scoring across diverse task categories. Scores reflect coherence, accuracy and instruction-following.
Pricing history
Direct provider rates per million tokens, plus a typical-conversation cost estimate.
Pricing over time
Input & output per 1M tokens · step-line = price changes
Input: $0.1300 / 1M (stable)
Output: $0.1300 / 1M (stable)
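A typical-conversation cost estimate follows directly from the per-million-token rates. The sketch below uses the listed $0.13 input and output prices; the token counts for a "typical" conversation are not given on the page, so the 1,000 input and 500 output tokens used here are assumed values, and `conversation_cost` is a hypothetical helper name.

```python
INPUT_PRICE_PER_M = 0.13   # USD per 1M input tokens, as listed on the page
OUTPUT_PRICE_PER_M = 0.13  # USD per 1M output tokens, as listed on the page


def conversation_cost(input_tokens, output_tokens,
                      input_price=INPUT_PRICE_PER_M,
                      output_price=OUTPUT_PRICE_PER_M):
    """Return the USD cost of one conversation at per-million-token rates."""
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


# Assumed conversation size: 1,000 input tokens and 500 output tokens.
cost = conversation_cost(1000, 500)
print(f"${cost:.6f} per conversation")
```

Because input and output are priced identically here, only the total token count matters; with asymmetric pricing the split between the two would change the result.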
Tokens per second
Throughput in tokens per second, derived from measured P50 latency. Higher is better; fluctuations track provider-side load.
Estimated by dividing an assumed 200 output tokens by the measured P50 latency. The absolute number depends on this assumption; the trend is what matters.
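The throughput estimate described above can be sketched as a single division: an assumed output length over the P50 latency in seconds. The 200-token figure comes from the page; the function name `tokens_per_second` and the example latency are hypothetical.

```python
ASSUMED_OUTPUT_TOKENS = 200  # output length assumed by the page's estimate


def tokens_per_second(p50_latency_ms, output_tokens=ASSUMED_OUTPUT_TOKENS):
    """Estimate throughput as output tokens divided by P50 latency in seconds."""
    if p50_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return output_tokens / (p50_latency_ms / 1000)


# Hypothetical P50 of 2,000 ms, not a measured value.
estimate = tokens_per_second(2000)
```

Since the result scales linearly with the assumed output length, changing the 200-token assumption shifts every point on the chart by the same factor and leaves the trend unchanged.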
Capabilities
Availability
No measurements yet
We haven't recorded enough API calls to show availability stats for this model. Data appears once the model starts receiving live traffic.
Tokonomix benchmark verdicts
Mistral-Nemo maintains steady baseline with no performance variation
Mistral-Nemo-Instruct-2407 showed no measurable change from the previous period in this benchmark window: no tracked metric regressed or improved, and the model holds its established baseline. This stability is consistent with reliable serving infrastructure at OVH AI Endpoints in the GRA region, and it suggests that no model updates, infrastructure changes, or optimization adjustments were deployed during the period. Users can expect the same mid-tier performance observed when the benchmark baseline was first established. For production workloads that need predictable behavior, this consistency provides operational confidence and makes capacity planning and resource allocation straightforward for applications built on this endpoint. Users who want better performance or additional capabilities will need to wait for future model updates or consider alternative offerings.
Quality: n/a
Latency p50: n/a
Test runs: 0
Mistral-Nemo-Instruct-2407
by OVH AI Endpoints (GRA)
- Context window: n/a
- Input price: $0.1300 / 1M
- Output price: $0.1300 / 1M
- Tier: n/a
- Modality: Text
- API type: REST · streaming
- Benchmark runs: 91