Tier B: Production

Runs in: France · Made in: China

Qwen3-Coder-30B-A3B-Instruct


Tokonomix editorial team · Reviewed by Mes Kalkan · Published 27 May 2026 · Last reviewed 30 July 2026

Section 01

Speed analysis

Latency measured across all benchmark runs. P50 (the median) reflects response speed under normal load, and P95 (the 95th percentile) reflects it under peak load.

Chart legend: P50 latency (median) · P95 latency · 101 runs
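A minimal sketch of how P50 and P95 could be derived from a list of per-run latencies. The nearest-rank method and the sample values are assumptions for illustration; the page does not state which percentile definition Tokonomix uses.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies in milliseconds (not taken from the benchmark data).
latencies_ms = [120, 95, 110, 300, 105, 98, 102, 115, 99, 101]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"P50 = {p50} ms, P95 = {p95} ms")
```

A single slow run (300 ms here) leaves P50 almost unchanged but sets P95, which is why the two figures are reported together.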

Section 02

Quality scores

Evaluation results from judge-model assessments across a range of task categories. Scores reflect coherence, accuracy, and instruction following.

Chart categories: Creative · Factual · Multilingual · Reasoning (scale up to 100)

Section 03

Price history

Direct provider rates per million tokens, plus a cost estimate for a typical conversation.

API rates: Qwen3-Coder-30B-A3B-Instruct

$0.0700 per 1M input tokens

$0.2600 per 1M output tokens

≈ <$0.0001 per typical conversation (800 tokens)
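The per-conversation estimate can be reproduced roughly as follows. The page does not say how the 800 tokens are divided between input and output, so the 600/200 split below is a hypothetical choice, as are the function and constant names.

```python
INPUT_PRICE_PER_M = 0.07   # USD per 1M input tokens (listed rate)
OUTPUT_PRICE_PER_M = 0.26  # USD per 1M output tokens (listed rate)

def conversation_cost(input_tokens, output_tokens):
    """Return the USD cost of one conversation at the listed rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000

# Assumed split of an 800-token conversation: 600 input, 200 output.
cost = conversation_cost(600, 200)
print(f"${cost:.6f}")
```

The result is sensitive to the split: under the same rates, an even 400/400 split comes to about $0.000132, so the sub-$0.0001 figure only holds when most of the tokens are input.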

Chart: input vs output price per 1M tokens (input $0.0700, output $0.2600)

Pricing over time

Input and output price per 1M tokens; the step line marks price changes.

Input: $0.0700 per 1M (stable)

Output: $0.2600 per 1M (stable)

Chart dates: 2026-06-14, 2026-06-28, 2026-07-26

Synced weekly

Section 04

Tokens per second

Throughput in tokens per second, derived from the measured P50 latency. Higher values are better; fluctuations reflect server load at the provider.

Throughput (tokens/s): 2222 / avg 1425

Estimated from P50 latency and an assumed 200 output tokens per response. The absolute number depends on that assumption; the trend is what matters.
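A sketch of the throughput estimate, assuming it divides the fixed 200-token output by the P50 latency. That reading is an assumption, but it reproduces the 2222 tokens/s figure when fed the 90 ms P50 reported in the latest automated test on this page. The function name is hypothetical.

```python
ASSUMED_OUTPUT_TOKENS = 200  # the page's stated assumption about response length

def estimated_tokens_per_second(p50_latency_ms, output_tokens=ASSUMED_OUTPUT_TOKENS):
    """Estimate throughput as assumed output tokens divided by P50 latency in seconds."""
    if p50_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return output_tokens / (p50_latency_ms / 1000)

tps = estimated_tokens_per_second(90)
print(round(tps))
```

Because the token count is a constant, the estimate is simply inversely proportional to latency: halving P50 doubles the reported throughput, whatever the real response length was.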

Section 05

Capabilities

ownedBy: Qwen

Section 06

Availability

No measurement data yet

Not enough API calls have been recorded yet to show availability statistics for this model. Data will appear once the model receives live traffic.

Section 07

Tokonomix benchmark verdicts

Endorsed by 2 judges

Independent LLM judges evaluated this model on our weekly intelligence tests

cohere/command-a: 100/100 · 1 run

1 correct · 0 partial · 0 wrong · 100% accuracy

claude-sonnet-4-5: 92/100 · 47 runs

41 correct · 2 partial · 4 wrong · 87% accuracy
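The accuracy percentages are consistent with counting only fully correct verdicts over all runs. A small sketch under that assumption (partial verdicts earn no credit; the function name is hypothetical):

```python
def accuracy_percent(correct, partial, wrong):
    """Share of fully correct verdicts among all runs, as a percentage."""
    total = correct + partial + wrong
    if total == 0:
        raise ValueError("no runs recorded")
    return 100 * correct / total

# Verdict counts as listed for each judge.
sonnet = accuracy_percent(correct=41, partial=2, wrong=4)
command_a = accuracy_percent(correct=1, partial=0, wrong=0)
print(round(sonnet), round(command_a))
```

Note that the 100% figure rests on a single run, so it carries far less weight than the 47-run result.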

● 2026-07-26

Quality drops 9.8 points to 86.5 as category mix shifts from coding

Qwen3-Coder-30B-A3B-Instruct saw a notable quality decline this window, falling from 96.3 to 86.5 overall. The biggest change is in the mix of tested categories: coding tests are absent from the current window, while new categories appear. Multilingual remains the model's strongest area, scoring 100 compared with 99 previously. Creative work held relatively steady, moving from 90 to 88. The newly tested reasoning category scored 75 and factual performance came in at 83, both pulling the overall average down. The absence of coding tests is particularly notable given this model's specialized positioning and its perfect 100 coding score in the previous window.

On the positive side, median latency improved by 16 percent, dropping from 4,655 ms to 3,913 ms, which makes the model more responsive for interactive use. With only 5 test runs in each window, these results should be considered preliminary. The model continues to excel at multilingual tasks and holds up reasonably on creative work, but the current test mix suggests more variability in reasoning and factual domains than previously observed.
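The headline numbers can be checked from the category scores quoted in the summary. The sketch below assumes the overall score is an unweighted mean of the category scores and that the previous window contained only the three categories mentioned; both assumptions happen to reproduce the reported 86.5 and 96.3.

```python
from statistics import mean

# Category scores as quoted in the summary.
current = {"multilingual": 100, "creative": 88, "reasoning": 75, "factual": 83}
previous = {"multilingual": 99, "creative": 90, "coding": 100}

current_overall = mean(list(current.values()))
previous_overall = mean(list(previous.values()))
drop = round(round(previous_overall, 1) - round(current_overall, 1), 1)

# Relative improvement in median latency, in percent.
latency_change_pct = (4655 - 3913) / 4655 * 100

print(current_overall, round(previous_overall, 1), drop, round(latency_change_pct))
```

Seen this way, most of the drop comes from swapping a 100-point coding category for lower-scoring reasoning and factual categories rather than from regressions in the categories that were tested in both windows.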

Quality: 86.5

Latency p50: 3,913 ms

Test runs

✗ Quality dropped 9.8 points

✓ Latency improved 16%

✓ Multilingual maintains perfect score

✗ No coding tests this window

Latest automated test

30 Jul 2026 · 08:05 UTC · Speed test

P50 latency: 90 ms

P95 latency: 103 ms

Errors: 0 / 6 runs

Last reviewed by the Tokonomix team · 30 July 2026