Runs in: France · Made in: China
OVH AI Endpoints (GRA)

Qwen3-Coder-30B-A3B-Instruct

Tokonomix Editorial Team · Reviewed by Mes Kalkan
Section 01

Speed analysis

Latency measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and peak load.
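As a rough illustration of how P50 and P95 summarize a set of runs, the sketch below uses the nearest-rank method. The sample latencies are hypothetical, and the page does not state which percentile method Tokonomix actually uses, so treat this as one common convention rather than the site's implementation.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds (not taken from the benchmark data).
latencies_ms = [510, 526, 530, 541, 498, 570, 522, 1300, 515, 533]

p50 = percentile_nearest_rank(latencies_ms, 50)
p95 = percentile_nearest_rank(latencies_ms, 95)
print(f"P50 = {p50} ms, P95 = {p95} ms")
```

With these made-up samples, a single slow run (1300 ms) leaves the median almost untouched but dominates P95, which is why the two figures are reported together.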

[Chart: P50 latency (median) and P95 latency, in ms, across 73 runs]
Section 02

Quality scores

Evaluation results from judge-model scoring across diverse task categories. Scores reflect coherence, accuracy and instruction-following.

Coding: 100
Multilingual: 98
Reasoning: 100
Section 03

Pricing history

Direct provider rates per million tokens, plus a typical-conversation cost estimate.

💰
API rates — Qwen3-Coder-30B-A3B-Instruct
$0.0700 per 1M input tokens
$0.2600 per 1M output tokens
≈ <$0.0001 per typical conversation (800 tokens)
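The per-conversation estimate can be reproduced from the listed rates. The page does not say how the 800 tokens divide between input and output, so the 600/200 split below is purely an assumption for illustration; the function name is hypothetical as well.

```python
# Rates listed on this page, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 0.07
OUTPUT_USD_PER_MILLION = 0.26


def conversation_cost_usd(input_tokens, output_tokens):
    """Cost of one conversation at the listed per-million-token rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Assumed split of an 800-token conversation (not stated in the source).
cost = conversation_cost_usd(input_tokens=600, output_tokens=200)
print(f"${cost:.6f}")
```

The result is sensitive to the split: if all 800 tokens were billed as output, the same function gives roughly $0.0002, which is above the quoted bound.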

Pricing over time

Input and output price per 1M tokens; steps in the line mark price changes.

Input: $0.0700 per 1M tokens (stable)
Output: $0.2600 per 1M tokens (stable)

Data point: 2026-06-14 · synced weekly
Section 04

Tokens per second

Throughput in tokens per second, derived from measured P50 latency. Higher is better; fluctuations track provider-side load.

Throughput (tokens / s): 380 / avg 1070

Estimated by dividing an assumed 200 output tokens by the measured P50 latency. The absolute number depends on this assumption; the trend is what matters.
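The estimate amounts to dividing the assumed output length by the P50 latency. A minimal sketch follows, with a hypothetical function name; the 526 ms input is the P50 latency reported in the last automated test on this page.

```python
ASSUMED_OUTPUT_TOKENS = 200  # the fixed assumption stated on this page


def estimated_tokens_per_second(p50_latency_ms, output_tokens=ASSUMED_OUTPUT_TOKENS):
    """Approximate throughput as output tokens divided by P50 latency in seconds."""
    if p50_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return output_tokens / (p50_latency_ms / 1000)


tps = estimated_tokens_per_second(526)
print(f"{tps:.0f} tokens/s")
```

This lines up with the 380 tokens/s figure shown for throughput. Because the 200-token output length is fixed, any movement in the estimate simply mirrors movement in P50 latency.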

Section 05

Capabilities

ownedBy: Qwen
Section 06

Availability

No measurements yet

We haven't recorded enough API calls to show availability stats for this model. Data appears once the model starts receiving live traffic.

Section 07

Tokonomix benchmark verdicts

⚖️
Endorsed by 1 judge
Independent LLM judges evaluated this model on our weekly intelligence tests
claude-sonnet-4-5: 92/100 · 7 runs
6 correct · 0 partial · 1 wrong · 86% accuracy
2026-06-14

Pricing updated, performance metrics remain stable

Qwen3-Coder-30B-A3B-Instruct on OVH AI Endpoints shows no measurable change in throughput, latency, or quality between benchmark windows following a pricing update. Coding performance remains strong, so existing users can expect the same response times and code generation quality as before.

For teams already using this endpoint, the update should require no changes to integration patterns or performance expectations. New users evaluating the model can treat both current and previous benchmark data as representative. The unchanged metrics suggest consistent infrastructure and model behavior, which matters for production environments that depend on predictability, and the pricing adjustment appears to be a business decision with no technical implications for model operation or capability.

Quality

Latency p50

Test runs

0

Performance metrics remain stable · Consistent coding capabilities maintained
Last automated test
Jun 15, 2026 · 08:00 UTC · Speed benchmark
P50 latency: 526 ms
P95 latency: 570 ms
Errors: 0 / 6 runs
Last reviewed by Tokonomix Team · June 15, 2026