Runs in: US · Made in: United States
Google Gemini

Gemini 3.1 Flash Lite

1,048,576 tokens

Tokonomix editorial team · Reviewed by Mes Kalkan
Section 01

Speed analysis

Latency measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and peak load.
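As a rough illustration of how P50 and P95 can be derived from per-run latencies, here is a minimal sketch. The sample values are hypothetical, not Tokonomix measurements, and the nearest-rank percentile method is an assumption; the page does not say which method it uses.

```python
import math
from statistics import median


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms for 13 runs (illustrative values only).
latencies_ms = [420, 455, 470, 480, 495, 510, 530, 560, 600, 640, 700, 820, 980]

p50 = median(latencies_ms)                       # typical response time
p95 = percentile_nearest_rank(latencies_ms, 95)  # tail latency under load

print(f"P50: {p50} ms, P95: {p95} ms")
```

With only 13 runs, the nearest-rank P95 is simply the slowest run, so a single outlier moves it a lot.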

[Chart: P50 latency (median) and P95 latency, 13 runs; y-axis 354 to 989 ms, x-axis 05-28 to 05-31]
Section 02

Quality scores

Evaluation results from judge-model assessments across a range of task categories. Scores reflect coherence, accuracy, and instruction following.

Code generation: 99
Creative: 98
Factual: 100
Multilingual: 100
Section 03

Price history

Direct provider rates per million tokens, plus a cost estimate for a typical conversation.

💰
API rates: Gemini 3.1 Flash Lite
$0.2500 per 1M input tokens
$1.50 per 1M output tokens
≈ $0.0004 per typical conversation (800 tokens)
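The per-conversation estimate can be reproduced with a short calculation. This is a sketch under one stated assumption: the page does not say how the 800 tokens divide between input and output, so the 640/160 split below is hypothetical, chosen because it happens to land on the quoted figure.

```python
# Rates taken from the listing above (USD per 1M tokens).
INPUT_USD_PER_M = 0.25
OUTPUT_USD_PER_M = 1.50


def conversation_cost(input_tokens, output_tokens):
    """Return the cost in USD of one conversation at the listed rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Assumed split of the 800-token conversation (not stated by the source).
cost = conversation_cost(input_tokens=640, output_tokens=160)
print(f"${cost:.4f}")
```

Because output tokens cost six times as much as input tokens, the estimate is sensitive to the split: the same 800 tokens cost $0.0002 if all are input and $0.0012 if all are output.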

Pricing over time

Input & output per 1M tokens · step line = price changes

Input: $0.2500 / 1M, no change
Output: $1.50 / 1M, no change

[Chart: input and output price, with price-change markers; x-axis 2026-06-07]
⟳ synced weekly
Section 04

Tokens per second

Throughput in tokens per second, derived from the measured P50 latency. Higher values are better; fluctuations reflect server load at the provider.

Throughput (tokens/s): 425 / avg 444
[Chart: throughput in tokens/s; y-axis 323 to 559]

Estimated from P50 latency and an assumed 200 output tokens per response. The absolute figure depends on this assumption; the trend is what matters.
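One plausible reading of this estimate is the assumed output length divided by the P50 latency in seconds. The sketch below follows that reading; the exact formula is not given on the page, so treat both the function and the 470 ms example input as assumptions.

```python
ASSUMED_OUTPUT_TOKENS = 200  # the fixed response length assumed by the estimate


def throughput_tokens_per_s(p50_ms, output_tokens=ASSUMED_OUTPUT_TOKENS):
    """Estimate tokens per second from a P50 latency given in milliseconds."""
    if p50_ms <= 0:
        raise ValueError("p50_ms must be positive")
    return output_tokens / (p50_ms / 1000)


# Under this reading, about 425 tokens/s corresponds to a P50 near 470 ms.
estimate = throughput_tokens_per_s(470)
print(f"{estimate:.1f} tokens/s")
```

Since the 200-token figure is fixed, the estimate is inversely proportional to latency: halving the P50 doubles the reported throughput, which is why only the trend is meaningful.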

Section 05

Capabilities

tools · source: litellm · vision · json mode · pdf input · reasoning · audio input · json schema · parallel tools · prompt caching · outputTokenLimit: 65536 · max output tokens: 65536
Section 06

Tokonomix benchmark verdicts

⚖️
Endorsed by 1 judge
Independent LLM judges evaluated this model on our weekly intelligence tests
claude-sonnet-4-5: 98/100 · 7 runs
7 correct · 0 partial · 0 wrong · 100% accuracy
2026-06-07

Gemini 3.1 Flash Lite adds capabilities but shows no performance data

Gemini 3.1 Flash Lite has undergone a significant expansion of capabilities since the previous benchmark window. The model now supports a comprehensive suite of features including tool use, vision processing, JSON mode and schema support, PDF input handling, reasoning capabilities, audio input, parallel tools execution, and prompt caching. This represents a substantial evolution from its previous baseline state, transforming it from a simple text model into a multimodal platform with advanced functionality. However, the current benchmark window contains no performance metrics across any evaluation categories, making it impossible to assess how these new capabilities translate into actual performance. Users should note that while the feature set has expanded dramatically and pricing information has been updated, there is currently no empirical data to validate the model's effectiveness at tasks involving these new modalities. The addition of prompt caching and parallel tools suggests optimization for production use cases, but without benchmark results, the practical impact remains unverified. Organizations considering this model should await performance data before making deployment decisions based solely on the expanded capability list.

Quality

Latency p50

Test runs: 0

Multimodal capabilities added · Tool use now supported · No benchmark data available
Latest automated test
7 Jun 2026 · 05:03 UTC · Benchmark
P50 latency
1910 ms
P95 latency
Errors
0 / 6 runs
Last reviewed by the Tokonomix team · 7 June 2026