Google Launches Gemma 4 Open Vision Models
- Google DeepMind releases Gemma 4, a vision-capable open model suite under the Apache 2.0 license.
- A new "Effective" parameter architecture introduces Per-Layer Embeddings to boost on-device performance and efficiency.
- Models demonstrate high intelligence-per-parameter, excelling in complex reasoning and SVG code generation.
Google DeepMind has unveiled Gemma 4, a suite of vision-capable models that signals a major leap in open-weights efficiency. By focusing on "intelligence-per-parameter," the release offers four models ranging from 2B to 31B parameters, all licensed under the developer-friendly Apache 2.0 terms. This shift addresses the growing demand for powerful models that can run locally on consumer hardware without the latency or privacy concerns associated with cloud-based systems.
The core technical innovation is the introduction of "Effective" parameter sizing, specifically in the E2B and E4B variants. This architecture relies on a technique called Per-Layer Embeddings (PLE). Rather than scaling capability by adding more full-width layers, PLE gives each decoder layer its own small embedding table for rapid per-token lookups. This design allows the models to perform complex reasoning while keeping the memory footprint small enough for mobile devices and laptops.
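The per-layer lookup idea can be sketched in a few lines. The toy code below is an illustration of the general mechanism, not Gemma's actual implementation: all dimensions, the projection step, and the additive mixing rule are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 1000      # toy vocabulary size (illustrative)
D_MODEL = 64      # hidden width (illustrative)
D_PLE = 8         # small per-layer embedding width (illustrative)
N_LAYERS = 4

# One shared input embedding table, as in a standard decoder.
tok_embed = rng.standard_normal((VOCAB, D_MODEL)) * 0.02

# One small table *per layer*. Because each lookup touches only a tiny
# row, these tables can live outside fast accelerator memory, which is
# the claimed on-device memory win.
ple_tables = [rng.standard_normal((VOCAB, D_PLE)) * 0.02 for _ in range(N_LAYERS)]
ple_proj = [rng.standard_normal((D_PLE, D_MODEL)) * 0.02 for _ in range(N_LAYERS)]

def forward(token_ids: np.ndarray) -> np.ndarray:
    """Run a toy decoder stack with per-layer embedding lookups."""
    h = tok_embed[token_ids]                          # (seq, D_MODEL)
    for layer in range(N_LAYERS):
        # Per-layer lookup: token id -> small vector, projected up and
        # added to this layer's input (the real mixing rule may differ).
        ple = ple_tables[layer][token_ids] @ ple_proj[layer]
        h = h + ple
        # ... attention / MLP for this layer would go here ...
    return h

out = forward(np.array([1, 5, 42]))
print(out.shape)  # (3, 64)
```

The key property the sketch shows: the per-layer tables add vocabulary-sized parameters without widening the residual stream, so parameter count grows while per-step activation memory stays flat.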
Beyond pure text, Gemma 4 demonstrates remarkable proficiency in multimodal tasks, specifically reasoning over visual prompts to generate functional code such as Scalable Vector Graphics (SVG). Although community testing on local runners revealed some initial technical hurdles with the largest 31B variant, the overall benchmark performance sets a new standard for what small models can achieve in creative and logic-heavy applications without massive infrastructure.