Gemma 4
Byte for byte, the most capable open models
Purpose-built for advanced reasoning and agentic workflows. Google DeepMind's breakthrough open model family — Apache 2.0 licensed, frontier performance, runs on your hardware.
One family, every deployment target
From a phone in your pocket to a data center GPU — Gemma 4 has a model for you.
E2B
Effective 2B active params. Offline, near-zero latency on phones & Raspberry Pi.
E4B
Effective 4B with native audio input. Android AICore & ML Kit integration.
26B MoE
Mixture of Experts with only 3.8B params active per inference. Blazing-fast token throughput.
31B Dense
Maximum raw quality and the premier fine-tuning base. Fits on a single H100 80GB.
Built for modern AI workflows
Six breakthrough capabilities in a single, open-source model family.
Advanced Reasoning
Multi-step planning and deep logic. Major improvements in math (AIME 2026: 89.2%) and instruction-following benchmarks, enabling complex problem decomposition.
Agentic Workflows
Native function calling, structured JSON output, and system instructions. Build autonomous agents that interact with tools and APIs to execute complex workflows reliably.
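The tool-calling loop described above can be sketched in a few lines. Everything here is illustrative: the `get_weather` tool, its schema, and the model's JSON reply are hypothetical placeholders, since the exact chat format depends on your serving stack.

```python
import json

# Hypothetical tool the agent can invoke; a real deployment would expose
# this function's schema to the model via system instructions.
def get_weather(city: str) -> dict:
    # Stubbed lookup standing in for a real API call.
    return {"city": city, "temp_c": 21, "conditions": "clear"}

TOOLS = {"get_weather": get_weather}

# A structured JSON tool call of the kind the model would emit when
# prompted for function calling (illustrative, not a captured reply).
model_reply = '{"tool": "get_weather", "arguments": {"city": "Zurich"}}'

def dispatch(reply: str) -> dict:
    """Parse the model's JSON output and execute the requested tool."""
    call = json.loads(reply)
    fn = TOOLS[call["tool"]]        # KeyError here means an unknown tool
    return fn(**call["arguments"])  # result gets fed back on the next turn

print(dispatch(model_reply))
```

In a full agent loop, the dict returned by `dispatch` would be serialized back into the conversation so the model can continue reasoning with the tool result.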
Code Generation
High-quality offline code generation. Turn your workstation into a local-first AI code assistant. Scores 80% on LiveCodeBench v6 competitive coding problems.
Multimodal — Vision, Video & Audio
All models natively process images and video at variable resolutions, excelling at OCR and chart understanding. E2B/E4B also support native audio input for speech recognition.
140+ Languages
Natively trained on over 140 languages. Build inclusive, high-performance applications for a global audience with state-of-the-art multilingual understanding (MMMLU: 85.2%).
Ultra-Long Context
Process long-form content seamlessly. Edge models support a 128K context window; larger models extend to 256K tokens — pass entire repositories or long documents in one prompt.
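Before passing an entire repository in one prompt, it helps to estimate whether it fits the window. A minimal sketch, assuming a rough 4-characters-per-token ratio — a common rule of thumb, not Gemma's actual tokenizer; use the model's real tokenizer for exact counts.

```python
# Rough pre-flight check that a batch of documents fits the context window.
# 4 chars/token is a heuristic assumption, not the model's true tokenization.
CHARS_PER_TOKEN = 4

def fits_context(texts: list[str], window: int = 256_000) -> bool:
    """Estimate whether the concatenated texts fit in `window` tokens."""
    est_tokens = sum(len(t) for t in texts) // CHARS_PER_TOKEN
    return est_tokens <= window

# e.g. a handful of source files destined for a single prompt
files = ["def main(): ...\n" * 1000, "# README\n" * 500]
print(fits_context(files))              # small corpus, fits easily
print(fits_context(["x" * 2_000_000]))  # ~500K estimated tokens, too large
```

For the edge models, pass `window=128_000` to match their smaller context limit.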
Outperforms models 20× its size
Gemma 4 31B ranks #3 among all open models on Arena AI's text leaderboard. The 26B MoE holds #6 — achieving frontier-level intelligence at a fraction of the compute cost.
Ready to build with Gemma 4?
Apache 2.0 licensed. Available on Hugging Face, Ollama, Kaggle, LM Studio, and more. Start in minutes.
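A typical local quick start goes through Ollama; note the model tag below is an assumption — check the Ollama library for the published Gemma 4 name before pulling.

```shell
# Hypothetical model tag -- verify the actual name in the Ollama library.
ollama pull gemma4:26b
ollama run gemma4:26b "Summarize the tradeoffs between MoE and dense models."
```

The same weights are downloadable from Hugging Face and Kaggle for use with your own serving stack.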
View Quick Start Guide →