Gemma 4, in four sizes from 2B to 31B parameters, is the first Gemma release under a true open-source license -- and its reasoning benchmarks rival closed competitors at a fraction of the size.
Ars Technica called the Apache 2.0 switch "a major concession"; ZDNET framed it as making cloud AI optional; 9to5Google led with the Gemini 3 research lineage.
Jeff Dean posted benchmarks and announced the Apache 2.0 license shift; the AI developer community on X is treating Gemma 4 as the moment open-source models became viable for agentic workflows.
Google released Gemma 4 on Wednesday, April 2 -- a family of four open models that represents a genuine inflection point in the relationship between large technology companies and the developers who build on their work. The models come in four sizes: 2 billion, 4 billion, 26 billion (mixture-of-experts), and 31 billion (dense) parameters. [1] All four are released under the Apache 2.0 license, which means anyone can use, modify, and distribute them commercially, subject only to minimal conditions such as preserving attribution and license notices. [2]
The license change is the news. Gemma 3, released in 2025, carried Google's custom Gemma Terms of Use -- a license that imposed usage restrictions, required attribution, and reserved rights that made corporate legal departments nervous. Apache 2.0 is the gold standard of permissive open-source licensing. It means a startup can build a product on Gemma 4, modify the weights, and never tell Google. It means a hospital can deploy the model on its own servers, process patient data locally, and satisfy HIPAA requirements that cloud-based models cannot. [3]
The technical capabilities justify the license shift. Google's blog post describes the models as "built from the same world-class research and technology as Gemini 3" -- Google's most advanced closed model. [1] The benchmarks support the claim. The Decoder reported that Gemma 4's 31B dense model achieves an AIME 2026 score of 89.2%, up from 20.8% for Gemma 3. [4] Its Codeforces Elo rating jumped from 110 to 2,150. These are not incremental improvements. They represent a model that has crossed from research curiosity to production capability.
The 26B mixture-of-experts variant is particularly significant for cost-conscious deployment. MoE architectures activate only a subset of parameters for each input, which means the model delivers 26B-class performance while using computational resources closer to those of a 4B model. [4] For organizations running inference on their own hardware -- the entire point of an open model -- this translates to lower GPU requirements and faster response times.
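The routing idea behind a mixture-of-experts layer can be sketched in a few lines. This is a generic top-k gating illustration; the expert count, top-k value, and dimensions are made up for the example and are not Gemma 4's actual architecture, which Google has not detailed at this level:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: 8 experts, route each token to its top-2.
# All sizes here are illustrative, not Gemma 4's real configuration.
NUM_EXPERTS, TOP_K, D_MODEL = 8, 2, 16

router = rng.normal(size=(D_MODEL, NUM_EXPERTS))            # gating weights
experts = rng.normal(size=(NUM_EXPERTS, D_MODEL, D_MODEL))  # one toy weight matrix per expert

def moe_forward(x):
    """Route a single token vector x through only its top-k experts."""
    logits = x @ router                    # score every expert
    top = np.argsort(logits)[-TOP_K:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the chosen experts only
    # Only TOP_K of the NUM_EXPERTS expert matrices are used per token --
    # this is why per-token compute tracks a much smaller dense model.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
out = moe_forward(token)
print(out.shape)  # (16,)
```

The key property is in the last line of `moe_forward`: per-token compute touches only two expert matrices, even though all eight sets of weights exist in memory.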
Google's Jeff Dean posted the release on X, emphasizing the Apache 2.0 decision: "By releasing Gemma 4 under the Apache 2.0 license, we hope to enable more researchers and developers to build with our models." [5] Richard Seroter of Google Cloud noted that Gemma 4 debuted at third and sixth place on the open-source leaderboard, making it "the #1 ranked US open source model." [6] The framing matters. In a field where Meta's Llama has dominated the open-weight landscape, Google is staking a claim not just on capability but on openness.
Ars Technica called the license switch "a major concession," noting that Google's previous Gemma releases had frustrated developers who wanted to use the models in commercial products without navigating restrictive terms. [7] ZDNET went further, framing Gemma 4 as making "cloud AI optional" -- a direct challenge to the business model that funds Google's own AI division. [3] The tension is real. Every developer who runs Gemma 4 locally is a developer not paying for Gemini API calls. Google appears to have concluded that the ecosystem benefits outweigh the revenue cannibalization.
The agentic workflow capability is what distinguishes Gemma 4 from its predecessors and from most open competitors. Google's announcement specifically highlights the models' ability to use tools, follow multi-step instructions, and reason through complex tasks -- the requirements for AI agents that can operate autonomously. [1] Mashable reported that the models are available through Hugging Face, Kaggle, and Google AI Studio, with Ollama support for local deployment. [8] The 2B and 4B variants can run on laptops. The 26B MoE variant fits on a single consumer GPU. The 31B dense model requires a workstation-class setup but remains within reach of a small team.
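Those hardware claims can be sanity-checked with back-of-the-envelope arithmetic: weight memory is roughly parameter count times bits per weight. A minimal sketch, where the quantization levels are common community defaults and an assumption here, not Google's published requirements:

```python
# Rough weight-memory estimates for the four Gemma 4 sizes the article lists.
# 4-bit and bf16 (16-bit) are illustrative quantization choices, not official specs.
def weight_gb(params_b, bits):
    """Approximate weight memory in GB for `params_b` billion parameters."""
    return params_b * 1e9 * bits / 8 / 1e9

for params, label in [(2, "2B"), (4, "4B"), (26, "26B MoE"), (31, "31B dense")]:
    print(f"{label:9s} ~{weight_gb(params, 4):5.1f} GB at 4-bit, "
          f"~{weight_gb(params, 16):5.1f} GB at bf16")
```

At 4-bit quantization the 26B weights come to roughly 13 GB, consistent with the single-consumer-GPU claim; note that an MoE model must still hold all expert weights in memory even though only a fraction are activated per token.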
The timing is coincidental but instructive. Gemma 4 arrived during a week in which the dominant news is a war, an energy crisis, and a space mission. The model will not solve any of those problems. But it will make it cheaper, faster, and more private for anyone to build AI applications that run without an internet connection, without a cloud bill, and without asking Google's permission. That last part is new.
-- KENJI NAKAMURA, Tokyo