Four Sizes

Gemma 4 Model Family

Four sizes, from mobile devices to developer workstations, each optimized for a different deployment target.

Gemma 4 E2B

Mobile · Edge

Engineered for maximum memory efficiency on edge devices. Activates an effective 2B-parameter footprint to conserve RAM and battery life. Runs completely offline.

Parameters: Effective 2B active
Context: 128K tokens
Hardware: Phones, Raspberry Pi, Jetson Nano
Native audio input · Near-zero latency · Fully offline

Gemma 4 E4B

Mobile · Edge

Higher-capability edge model with audio and visual understanding. Integrates with the Android AICore Developer Preview and the ML Kit GenAI Prompt API for production use.

Parameters: Effective 4B active
Context: 128K tokens
Hardware: Android, iOS, Edge GPUs
Audio + vision · AICore preview · ML Kit GenAI API

Gemma 4 26B MoE

Workstation · #6 Open

Mixture-of-Experts model that activates only 3.8B of its 26B parameters during inference, giving high tokens-per-second throughput. Ranked the #6 open model on the Arena AI leaderboard.
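To make "activating only 3.8B parameters" concrete, here is a minimal sketch of top-k expert routing, the general mechanism behind Mixture-of-Experts layers. It is a generic illustration, not Gemma 4's actual router: the expert count, the `top_k` value, the toy scalar experts, and the names `moe_forward` and `router_logits` are all assumptions made for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, top_k=2):
    """Route one token through only the top_k highest-scoring experts."""
    # Indices of the top_k router scores (hypothetical router output).
    chosen = sorted(range(len(experts)),
                    key=lambda i: router_logits[i],
                    reverse=True)[:top_k]
    # Renormalise gate weights over the chosen experts only.
    gates = softmax([router_logits[i] for i in chosen])
    # Only the chosen experts run; the others cost no compute for this token.
    output = sum(g * experts[i](x) for g, i in zip(gates, chosen))
    return output, chosen

# Toy experts: scalar functions standing in for feed-forward blocks.
experts = [lambda x, k=k: x * (k + 1) for k in range(8)]
router_logits = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.5, 1.0]

y, used = moe_forward(3.0, router_logits, experts, top_k=2)

# Share of parameters active per forward pass, using the figures quoted above.
active_fraction = 3.8 / 26
print(f"experts used: {sorted(used)}, active share: {active_fraction:.1%}")
```

The throughput benefit comes from the last step: compute scales with the experts that actually run, while all 26B parameters still have to be held in memory.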

Parameters: 26B total, 3.8B active (MoE)
Context: 256K tokens
Hardware: Consumer GPU, Single H100
Ultra-fast inference · Low latency · MoE efficiency

Gemma 4 31B Dense

Workstation · #3 Open

Maximum raw quality and capability. The premier model for fine-tuning and research. Fits on a single 80GB H100 GPU. Ranked the #3 open model on the Arena AI text leaderboard as of April 2, 2026.
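The single-GPU claim can be sanity-checked with simple weight-memory arithmetic. The sketch below assumes 2 bytes per parameter for fp16 and counts the weights only; the KV cache, activations, and framework overhead are not included, so real headroom will be smaller. The helper name `weight_memory_gb` and the dtype table are illustrative, not part of any Gemma tooling.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}

def weight_memory_gb(params_billion, dtype="fp16"):
    """Approximate memory for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e9

GPU_MEMORY_GB = 80  # single H100, as stated above

dense_31b = weight_memory_gb(31, "fp16")
headroom = GPU_MEMORY_GB - dense_31b

print(f"31B fp16 weights: {dense_31b:.0f} GB, headroom: {headroom:.0f} GB")
```

Under these assumptions the weights take about 62 GB, leaving roughly 18 GB for everything else, which is consistent with the "Single 80GB H100 (fp16)" hardware note.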

Parameters: 31B dense
Context: 256K tokens
Hardware: Single 80GB H100 (fp16)
Highest quality · Fine-tuning champion · Arena AI #3

Benchmarks

Benchmark Performance

Evaluated against industry-standard datasets. See the full model card for additional benchmarks.

Benchmark            Task                   31B Dense   26B MoE   Gemma 3 27B (prior gen)
Arena AI (Text)      Human preference       1452        1441      1365
MMMLU                Multilingual Q&A       85.2%       82.6%     67.6%
MMMU Pro             Multimodal reasoning   76.9%       73.8%     49.7%
AIME 2026            Mathematics            89.2%       88.3%     20.8%
LiveCodeBench v6     Competitive coding     80.0%       77.1%     29.1%
GPQA Diamond         Scientific knowledge   84.3%       82.3%     42.4%
τ2-bench Agentic     Agentic tool use       86.4%       85.5%     6.6%

Source: Google DeepMind · Arena AI scores as of April 2, 2026.
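For a quick read of the generation-over-generation change, the percentage-point gains can be computed directly from the table. This is a small sketch using only the figures listed above; the Arena AI row is left out because it is a rating rather than a percentage, and the function name `gains_over_prior` is just illustrative.

```python
# Scores copied from the benchmark table: (31B Dense, 26B MoE, Gemma 3 27B).
SCORES = {
    "MMMLU": (85.2, 82.6, 67.6),
    "MMMU Pro": (76.9, 73.8, 49.7),
    "AIME 2026": (89.2, 88.3, 20.8),
    "LiveCodeBench v6": (80.0, 77.1, 29.1),
    "GPQA Diamond": (84.3, 82.3, 42.4),
    "τ2-bench Agentic": (86.4, 85.5, 6.6),
}

def gains_over_prior(scores):
    """Return percentage-point gains of (31B Dense, 26B MoE) over Gemma 3 27B."""
    return {
        name: (round(dense - prior, 1), round(moe - prior, 1))
        for name, (dense, moe, prior) in scores.items()
    }

gains = gains_over_prior(SCORES)

# Benchmark with the largest 31B Dense improvement.
largest = max(gains, key=lambda name: gains[name][0])

for name, (dense_gain, moe_gain) in gains.items():
    print(f"{name:<18} 31B Dense +{dense_gain:.1f} pts   26B MoE +{moe_gain:.1f} pts")
```

These are differences in percentage points, not relative improvements, so a jump from 6.6% to 86.4% appears as +79.8 points.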

Massive context windows

Process entire codebases, books, or long documents in a single prompt.

Gemma 4 E2B / E4B: 128K tokens
Gemma 4 26B MoE: 256K tokens
Gemma 4 31B Dense: 256K tokens
Gemma 3 27B (prior gen): 128K tokens
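Before sending a long document, it can help to estimate whether it fits the target model's window. The sketch below is a rough pre-check under loud assumptions: it treats "K" as 1,000 tokens, uses a heuristic of about 4 characters per token instead of a real tokenizer, and reserves a hypothetical 2,000 tokens for the reply. For an exact count, use the model's own tokenizer.

```python
import math

# Context limits from the list above, reading "K" as 1,000 (an assumption).
CONTEXT_TOKENS = {
    "Gemma 4 E2B": 128_000,
    "Gemma 4 E4B": 128_000,
    "Gemma 4 26B MoE": 256_000,
    "Gemma 4 31B Dense": 256_000,
}

CHARS_PER_TOKEN = 4  # rough heuristic, not a property of the Gemma tokenizer

def estimate_tokens(text):
    """Crude token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def fits_in_context(text, model, reserve_for_output=2_000):
    """True if the estimated prompt plus reserved output fits the model's window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_TOKENS[model]

# A stand-in document of 600,000 characters (about 150K estimated tokens).
document = "x" * 600_000

for model in CONTEXT_TOKENS:
    verdict = "fits" if fits_in_context(document, model) else "too long"
    print(f"{model}: {verdict}")
```

With these assumptions, the sample document is too long for the 128K edge models but fits the two 256K workstation models.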