Speed analysis
Latency measured across all benchmark runs. P50 (median) and P95 (95th percentile) give a realistic picture of response speed under normal and peak load.
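As a rough illustration of how P50 and P95 can be read off a set of latency samples, here is a minimal sketch using the nearest-rank method. The `percentile` helper and the sample values are hypothetical; the page does not say which percentile method the benchmark actually uses.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latencies in seconds, for illustration only (not measured data).
latencies = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.5, 4.0]

p50 = percentile(latencies, 50)  # typical response time
p95 = percentile(latencies, 95)  # slow tail, closer to peak-load behaviour
print(f"P50 = {p50}s, P95 = {p95}s")
```

In this made-up sample the two slow responses barely move P50 but dominate P95, which is why the two figures are reported together.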
Quality scores
Evaluation results from judge-model scoring across diverse task categories. Scores reflect coherence, accuracy and instruction-following.
Pricing history
Direct provider rates per million tokens, plus a typical-conversation cost estimate.
Pricing over time
Input and output price per 1M tokens; each step in the line marks a price change.
Input: $0.2500 / 1M tokens (no change)
Output: $1.50 / 1M tokens (no change)
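A per-conversation cost estimate follows directly from the per-million-token rates. The sketch below uses the listed prices; the token counts (2,000 input, 500 output) are an assumption chosen for illustration, since the page does not state what it treats as a typical conversation.

```python
# Listed rates in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.50

def conversation_cost(input_tokens, output_tokens):
    """Cost in USD for one exchange at the listed per-million-token rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000

# Hypothetical conversation size, not a figure from the page.
cost = conversation_cost(input_tokens=2_000, output_tokens=500)
print(f"${cost:.5f} per conversation")
```

Because output tokens cost six times as much as input tokens at these rates, the output length tends to dominate the estimate.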
Tokens per second
Throughput in tokens per second, derived from measured P50 latency. Higher is better; fluctuations track provider-side load.
Estimated from P50 latency and an assumed 200 output tokens per response. The absolute number depends on this assumption; the trend is what matters.
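One plausible reading of this estimate is the assumed output length divided by the P50 latency. The sketch below works under that assumption; the function name and the example latencies are hypothetical, and the site's exact formula is not given.

```python
ASSUMED_OUTPUT_TOKENS = 200  # the fixed response length assumed by the estimate

def tokens_per_second(p50_latency_seconds, output_tokens=ASSUMED_OUTPUT_TOKENS):
    """Approximate throughput as assumed output tokens / P50 latency."""
    if p50_latency_seconds <= 0:
        raise ValueError("latency must be positive")
    return output_tokens / p50_latency_seconds

# Hypothetical latencies: halving the latency doubles the estimate.
slow = tokens_per_second(4.0)
fast = tokens_per_second(2.0)
print(slow, fast)
```

Changing the 200-token assumption rescales every point by the same factor, so the shape of the trend is unaffected even though the absolute values shift.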
Capabilities
Tokonomix benchmark verdicts
Gemini 3.1 Flash Lite adds capabilities but shows no performance data
Gemini 3.1 Flash Lite has gained a broad set of capabilities since the previous benchmark window. The model now supports tool use, vision, JSON mode and schema support, PDF input, reasoning, audio input, parallel tool execution, and prompt caching, which moves it from a text-only baseline to a multimodal model.
The current benchmark window, however, contains no performance metrics in any evaluation category, so there is no way to assess how these capabilities perform in practice. The feature list has grown and pricing information has been updated, but no empirical data yet validates the model on tasks that use the new modalities. Prompt caching and parallel tools suggest a focus on production use cases, though their practical impact remains unverified. Organizations considering this model should wait for performance data rather than base deployment decisions on the capability list alone.
Quality: no data
Latency p50: no data
Test runs: 0
Gemini 3.1 Flash Lite
by Google Gemini
- Context window: 1,048,576 tokens (about 1.05M)
- Input price: $0.2500 / 1M tokens
- Output price: $1.50 / 1M tokens
- Tier: not listed
- Modality: Text + vision
- API type: REST · streaming
- Benchmark runs: 26