Tier B: Production

Runs in: France · Made in: China

Qwen3-Coder-30B-A3B-Instruct

Tokonomix Editorial Team · Reviewed by Mes Kalkan · Published 27 May 2026 · Last reviewed 30 July 2026

Section 01

Speed analysis

Latency measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and heavy load.

[Chart: P50 latency (median) and P95 latency, 101 runs]
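As a rough illustration of how P50 and P95 can be read off a set of latency measurements, here is a minimal sketch using the nearest-rank method. The page does not say which percentile method Tokonomix uses, so the method, the `percentile` helper, and the sample values are all assumptions made for this example.

```python
import math

def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latency samples in milliseconds (not taken from this page).
latencies_ms = [88, 90, 91, 89, 95, 103, 90, 92, 87, 140]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"P50 = {p50} ms, P95 = {p95} ms")
```

In this made-up sample a single slow request (140 ms) leaves the median unchanged but sets the P95, which is why the two figures are reported together.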

Section 02

Quality scores

Evaluation results derived from a judge model's scores across several task categories. Scores reflect consistency, accuracy, and instruction following.

[Chart: scores by category (Creative, Factual, Multilingual, Reasoning); axis label: 100]

Section 03

Price history

Direct provider rates per million tokens, plus a cost estimate for a typical conversation.

API rates: Qwen3-Coder-30B-A3B-Instruct

$0.0700 per 1M input tokens

$0.2600 per 1M output tokens

≈ <$0.0001 per typical conversation (800 tokens)
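The per-conversation estimate can be reproduced from the two rates. The sketch below assumes the 800 tokens are the conversation total and splits them into 600 input and 200 output tokens; the page does not state the split, so that ratio and the `conversation_cost` helper are illustrative assumptions.

```python
INPUT_PRICE_PER_M = 0.07   # USD per 1M input tokens (rate listed on this page)
OUTPUT_PRICE_PER_M = 0.26  # USD per 1M output tokens (rate listed on this page)

def conversation_cost(input_tokens, output_tokens):
    """Cost in USD of one conversation at the listed per-million rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000

# Assumed split of an 800-token conversation (not stated on the page).
cost = conversation_cost(input_tokens=600, output_tokens=200)
print(f"${cost:.6f}")
```

Under this split the cost comes out just below $0.0001. At these rates an 800-token conversation made up entirely of output tokens would cost about $0.0002, so the quoted figure only holds when most of the tokens are input.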

[Chart: input vs. output price per 1M tokens; input $0.0700, output $0.2600]

Pricing over time

[Chart: input and output price per 1M tokens; the step line marks price changes. Input: $0.0700 per 1M (stable). Output: $0.2600 per 1M (stable). Dates shown: 2026-06-14, 2026-06-28, 2026-07-26. Synced weekly.]

Section 04

Tokens per second

Tokens-per-second throughput derived from the measured P50 latency. Higher is better; fluctuations reflect load on the provider side.

Throughput (tokens/s): 2222 / avg 1425

Calculated from the P50 latency and an assumed 200 output tokens. The absolute figure depends on that assumption; the trend is what matters.
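A minimal sketch of this derivation, assuming throughput is the 200-token estimate divided by the P50 latency in seconds. The page does not spell the formula out, so the division and the `tokens_per_second` helper are an interpretation. It is consistent with the figures shown: the 90 ms P50 from the latest automated speed test yields roughly 2222 tokens/s.

```python
ASSUMED_OUTPUT_TOKENS = 200  # the output-token estimate stated on this page

def tokens_per_second(p50_latency_ms, output_tokens=ASSUMED_OUTPUT_TOKENS):
    """Estimated throughput: assumed output tokens divided by P50 latency in seconds."""
    if p50_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return output_tokens * 1000 / p50_latency_ms

# 90 ms is the P50 reported by the latest automated speed test on this page.
throughput = tokens_per_second(90)
print(round(throughput))
```

Because the token count is a fixed guess, doubling the assumed output length doubles the reported throughput, which is why only the trend over time is meaningful.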

Section 05

Capabilities

ownedBy: Qwen

Section 06

Availability

No measurement data yet

Not enough API calls have been recorded yet to show availability statistics for this model. Data will appear once the model starts receiving live traffic.

Section 07

Tokonomix benchmark verdicts

Endorsed by 2 judges

Independent LLM judges evaluated this model on our weekly intelligence tests

cohere/command-a: 100/100 · 1 run

1 correct · 0 partial · 0 wrong · 100% accuracy

claude-sonnet-4-5: 92/100 · 47 runs

41 correct · 2 partial · 4 wrong · 87% accuracy
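The accuracy percentages above are consistent with counting only fully correct runs against all runs, with partial answers earning no credit. The sketch below checks that reading; the `accuracy_pct` helper is hypothetical and the scoring rule is inferred from the numbers, not documented on the page.

```python
def accuracy_pct(correct, partial, wrong):
    """Share of fully correct runs, as a whole-number percentage.
    Assumes partial answers earn no credit (inferred, not documented)."""
    total = correct + partial + wrong
    if total == 0:
        raise ValueError("no runs recorded")
    return round(100 * correct / total)

# Run counts per judge, as listed on this page.
command_a = accuracy_pct(correct=1, partial=0, wrong=0)
sonnet = accuracy_pct(correct=41, partial=2, wrong=4)
print(command_a, sonnet)
```

Note that the 100% figure rests on a single run, so it carries far less weight than the 47-run result.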

● 2026-07-26

Quality drops 9.8 points to 86.5 as category mix shifts away from coding

Qwen3-Coder-30B-A3B-Instruct declined in quality this window, falling from 96.3 to 86.5 overall. The main driver is a change in the categories tested: coding tests are absent from the current window, while new categories appeared.

Multilingual remains the model's strongest area, scoring 100 compared with 99 previously. Creative work held relatively steady, moving from 90 to 88. The newly tested reasoning category scored 75 and factual came in at 83, both pulling the overall average down. The absence of coding tests matters given this model's specialized positioning and its perfect coding score of 100 in the previous window.

On the positive side, median latency improved by 16 percent, dropping from 4,655 ms to 3,913 ms, which makes the model more responsive for interactive use.

With only 5 test runs in each window, these results should be treated as preliminary. The model continues to excel at multilingual tasks and holds up reasonably on creative work, but the current test mix suggests more variability in reasoning and factual domains than previously observed.
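The headline numbers in this summary can be checked with simple arithmetic. The sketch below assumes the overall quality score is an unweighted mean of the category scores and that the previous window contained only the three categories mentioned (coding, multilingual, creative). Neither is stated on the page, though both reproduce the reported figures.

```python
from statistics import mean

# Category scores quoted in the summary above.
current = {"multilingual": 100, "creative": 88, "reasoning": 75, "factual": 83}
# Assumed to be the full previous category set (only these three are mentioned).
previous = {"coding": 100, "multilingual": 99, "creative": 90}

current_quality = round(mean(list(current.values())), 1)
previous_quality = round(mean(list(previous.values())), 1)
quality_drop = round(previous_quality - current_quality, 1)

# Median latency in milliseconds, previous window and current window.
latency_improvement_pct = round(100 * (4655 - 3913) / 4655)

print(previous_quality, current_quality, quality_drop, latency_improvement_pct)
```

Seen this way, much of the 9.8-point drop comes from replacing a 100-point coding score with the 75 and 83 scored in reasoning and factual, and less from any shared category getting worse.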

Quality

86.5

Latency p50

3,913 ms

Test runs

✗ Quality dropped 9.8 points
✓ Latency improved 16%
✓ Multilingual reaches a perfect score (100, up from 99)
✗ No coding tests this window

Latest automated test

30 Jul 2026 · 08:05 UTC · Speed test

P50 latency

90 ms

P95 latency

103 ms

Errors

0 / 6 runs

Last reviewed by the Tokonomix Team · 30 July 2026