Runs in: France · Made in: United States

Archived

This model has been retired by the provider. Historical data is preserved.

Unavailable since June 28, 2026.

OVH AI Endpoints (GRA)

Llama-3.1-8B-Instruct

Tokonomix Editorial Team · Reviewed by Mes Kalkan · Published May 27, 2026 · Last reviewed June 28, 2026

Section 01

Price history

Direct provider rates per million tokens, plus a cost estimate for a typical conversation.

💰

API rates: Llama-3.1-8B-Instruct

$0.1000 per 1M input tokens

$0.1000 per 1M output tokens

≈ <$0.0001 per typical conversation (800 tokens)
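The per-conversation estimate follows directly from the listed rates. The sketch below reproduces the arithmetic under one assumption: the 800 tokens are split between input and output in some way (the 600/200 split here is hypothetical; since both rates are identical, the split does not change the total). The function name `conversation_cost` is illustrative and not part of any provider API.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens, taken from the rate card on this page.
INPUT_RATE_PER_M = Decimal("0.1000")
OUTPUT_RATE_PER_M = Decimal("0.1000")
TOKENS_PER_MILLION = Decimal(1_000_000)


def conversation_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one conversation from its token counts."""
    return (
        Decimal(input_tokens) * INPUT_RATE_PER_M
        + Decimal(output_tokens) * OUTPUT_RATE_PER_M
    ) / TOKENS_PER_MILLION


# Hypothetical 600/200 split of the 800-token "typical conversation".
cost = conversation_cost(input_tokens=600, output_tokens=200)
print(f"Estimated cost: ${cost}")
print("Below $0.0001:", cost < Decimal("0.0001"))
```

At these rates an 800-token conversation costs $0.00008, which is why the page rounds it to "<$0.0001".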

Input vs. output price (per 1M tokens): input $0.1000, output $0.1000

Pricing over time

[Chart: input and output price per 1M tokens, 2026-06-14 to 2026-06-21; the step-line marks price changes. Input: $0.1000 / 1M, stable. Output: $0.1000 / 1M, stable. Synced weekly.]

Section 02

Capabilities

ownedBy: meta-llama

Section 03

Availability

No measurement data yet

Not enough API calls have been recorded yet to show availability statistics for this model. Data will appear once the model starts receiving live traffic.

Section 04

Tokonomix benchmark verdicts

⚖️

Endorsed by 1 judge

Independent LLM judges evaluated this model on our weekly intelligence tests

claude-sonnet-4-5 · 86/100 · 23 runs

15 correct · 7 partial · 1 wrong · 65% accuracy
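The 65% figure is consistent with counting only fully correct answers: 15 of 23 runs is about 65.2%. The sketch below shows that calculation, assuming partial answers earn no credit toward accuracy. The page does not state how the judge's overall score is derived, so that is not reproduced here; the function name is illustrative.

```python
def strict_accuracy(correct: int, partial: int, wrong: int) -> float:
    """Share of runs judged fully correct; partial answers get no credit (assumption)."""
    total = correct + partial + wrong
    if total == 0:
        raise ValueError("no runs recorded")
    return correct / total


# Counts reported by the judge on this page.
accuracy = strict_accuracy(correct=15, partial=7, wrong=1)
print(f"{accuracy:.0%}")  # 65%
```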

● 2026-06-21

Quality drops 29 points as performance degrades across all categories

Llama-3.1-8B-Instruct on OVH AI Endpoints declined significantly in this benchmark window. The overall quality score fell from 99.0 to 70.3, a 28.7-point drop that affects the model's competitive standing. The current window scores factual accuracy at 57, reasoning at 74, and multilingual at 80; the previous window scored coding at 100, multilingual at 97, and reasoning at 100. Because the two windows cover different category sets, only reasoning (100 to 74) and multilingual (97 to 80) can be compared directly, and both fell. Median latency improved from 9,119 ms to 7,942 ms, but that gain does not offset the quality regression. Test volume was the same in both windows, at five runs each. Users relying on this endpoint should expect weaker results, particularly on fact-dependent tasks, where the model now scores below 60. The cause is not yet established; it could stem from infrastructure changes, model configuration, or other factors.
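The window-over-window figures quoted above can be checked with simple arithmetic. This is a minimal sketch using only the numbers reported on this page; the `Window` type and its field names are hypothetical and do not reflect Tokonomix's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """One benchmark window (hypothetical structure for illustration)."""
    quality: float
    latency_p50_ms: int


previous = Window(quality=99.0, latency_p50_ms=9119)
current = Window(quality=70.3, latency_p50_ms=7942)

# Quality regression in points, rounded to one decimal place.
quality_drop = round(previous.quality - current.quality, 1)

# Median latency improvement, absolute and relative to the previous window.
latency_gain_ms = previous.latency_p50_ms - current.latency_p50_ms
latency_gain_pct = latency_gain_ms / previous.latency_p50_ms

print(f"Quality drop: {quality_drop} points")
print(f"Latency gain: {latency_gain_ms} ms ({latency_gain_pct:.1%})")
```

The quality drop comes out at 28.7 points, matching the reported figure, and the median latency gain is 1,177 ms, roughly 13% faster than the previous window.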

Quality

70.3

Latency p50

7,942 ms

Test runs

✗ Quality dropped 29 points

✗ Factual accuracy now only 57

✓ Latency improved to 7942ms

✗ Reasoning declined significantly

Latest automated test

Jun 28, 2026 · 05:12 UTC · Test

P50 latency

—

P95 latency

—

Errors

1 / 6 runs

Last reviewed by Tokonomix Team · June 28, 2026