Qwen3-Coder-30B-A3B-Instruct
Speed analysis
Latency measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and peak load.
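As a rough illustration of how P50 and P95 can be read off a set of latency samples, here is a minimal sketch using Python's standard library. The sample values and the function name `latency_percentiles` are made up for the example; the page does not state which interpolation method it uses, so the "inclusive" method here is an assumption.

```python
import statistics


def latency_percentiles(latencies_ms):
    """Return (p50, p95) for a list of latency samples in milliseconds."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99,
    # so index 49 is the 50th percentile and index 94 is the 95th.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return cuts[49], cuts[94]


# Hypothetical samples, not measurements from this listing.
samples = [420, 450, 470, 480, 510, 530, 560, 610, 900, 1800]
p50, p95 = latency_percentiles(samples)
print(f"P50 = {p50:.0f} ms, P95 = {p95:.0f} ms")
```

With a few slow outliers, P95 sits well above the median, which is why the two figures are reported together.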
Quality scores
Evaluation results from judge-model scoring across diverse task categories. Scores reflect coherence, accuracy and instruction-following.
Pricing history
Direct provider rates per million tokens, plus a typical-conversation cost estimate.
Pricing over time
Input & output per 1M tokens · step-line = price changes
Input: $0.0700 / 1M tokens (stable)
Output: $0.2600 / 1M tokens (stable)
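To make the per-million rates concrete, the sketch below computes the cost of a single conversation from the listed prices. The token counts (2,000 input, 500 output) are an assumed example; the page does not say what it treats as a "typical" conversation, and `conversation_cost` is a hypothetical helper name.

```python
INPUT_PRICE_PER_M = 0.07   # USD per 1M input tokens, from the listing
OUTPUT_PRICE_PER_M = 0.26  # USD per 1M output tokens, from the listing


def conversation_cost(input_tokens, output_tokens):
    """Return the estimated cost in USD for one conversation."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumed conversation size, for illustration only.
cost = conversation_cost(input_tokens=2_000, output_tokens=500)
print(f"Estimated cost: ${cost:.6f}")
```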
Tokens per second
Throughput in tokens per second, derived from measured P50 latency. Higher is better; fluctuations track provider-side load.
Estimated by dividing an assumed 200 output tokens by the P50 latency. The absolute number depends on this assumption; the trend is what matters.
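A minimal sketch of the estimate, reading it as the assumed 200 output tokens divided by the P50 latency. That reading is an interpretation, since the page does not spell out the formula; the latency value and the name `estimated_tokens_per_second` are hypothetical.

```python
ASSUMED_OUTPUT_TOKENS = 200  # the fixed output length the page assumes


def estimated_tokens_per_second(p50_latency_ms, output_tokens=ASSUMED_OUTPUT_TOKENS):
    """Estimate throughput from a median latency given in milliseconds."""
    if p50_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return output_tokens / (p50_latency_ms / 1000.0)


# Hypothetical P50 latency; the listing shows no measured value.
tps = estimated_tokens_per_second(2500)
print(f"Estimated throughput: {tps:.1f} tokens/s")
```

Because the output length is fixed, the estimate moves inversely with latency: halving P50 doubles the reported tokens per second, whatever the true response length was.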
Capabilities
Availability
No measurements yet
We haven't recorded enough API calls to show availability stats for this model. Data appears once the model starts receiving live traffic.
Tokonomix benchmark verdicts
Pricing updated, performance metrics remain stable
The Qwen3-Coder-30B-A3B-Instruct model from OVH AI Endpoints shows consistent performance following a pricing update: throughput, latency, and quality metrics show no measurable change between benchmark windows. The unchanged metrics are consistent with stable infrastructure and model behavior, and the coding-focused model continues to serve its intended use case without degradation. Teams already using this endpoint can expect the same response times and code generation quality as before, with no adjustments needed to integration patterns or performance expectations. That predictability is particularly valuable in production environments. New users evaluating the model can treat both current and previous benchmark data as representative of actual performance. The pricing adjustment appears to be an isolated business decision with no technical implications for model operation or capability.
Quality: no data
Latency p50: no data
Test runs: 0
Qwen3-Coder-30B-A3B-Instruct
by OVH AI Endpoints (GRA)
- Context window: not listed
- Input price: $0.0700 / 1M tokens
- Output price: $0.2600 / 1M tokens
- Tier: not listed
- Modality: Text
- API type: REST · streaming
- Benchmark runs: 91