Alibaba Releases High-Performance Qwen3.5 Small Models
- Alibaba launches Qwen3.5 small models ranging from 0.8B to 9B parameters under the Apache 2.0 license.
- The 9B variant becomes the most intelligent model under 10B parameters, roughly doubling the performance of previous leaders in its size class.
- The compact models feature native vision and 262K context windows, and are optimized for local execution on consumer hardware.
Alibaba has significantly expanded its Qwen3.5 ecosystem with four dense, small-scale models designed for high-efficiency reasoning. The lineup, spanning 0.8B to 9B parameters, marks a major leap in intelligence over the previous Qwen3 generation: the 9B model now leads the sub-10B category, outperforming rivals such as Falcon and NVIDIA's Nemotron by a wide margin. What sets these models apart is a unified "thinking" approach, in which they work through complex problems by generating a high volume of internal reasoning tokens before producing an answer.
Despite their compact size, the models are natively multimodal: they process both text and images without requiring separate vision adapters. On the MMMU-Pro benchmark, which tests multimodal reasoning, the 9B and 4B variants scored 69% and 65% respectively, setting new standards for models under 15B parameters. This makes them well suited to edge-computing applications where memory is limited but visual understanding is required.
However, this intelligence comes with a specific trade-off: high token consumption. The small models spend significantly more output tokens "thinking" through problems than their larger flagship siblings or frontier models such as GPT-5.1. And while their reasoning is sharp, factual accuracy lags, with high hallucination rates on the AA-Omniscience benchmark. Still, with an Apache 2.0 license and low memory requirements, developers can now run these models locally on standard laptops and smartphones.
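The claim that a 9B model fits on consumer hardware can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it estimates the memory needed just to hold the model weights at common precisions, ignoring the KV cache and activations (which matter more as the 262K context window fills up), and the specific quantization formats are an assumption, not something the release notes specify.

```python
def estimated_weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Rough memory footprint of model weights alone.

    Ignores KV cache, activations, and runtime overhead, so treat the
    result as a lower bound on what the hardware actually needs.
    """
    return num_params * bits_per_param / 8 / 1e9


# Hypothetical precisions for the 9B model (illustrative arithmetic only):
fp16_gb = estimated_weight_memory_gb(9e9, 16)  # 18.0 GB -> workstation GPU territory
int4_gb = estimated_weight_memory_gb(9e9, 4)   # 4.5 GB  -> fits typical laptop RAM
```

This is why 4-bit quantization is the usual route to laptop-class deployment: the same weights shrink fourfold relative to fp16, at some cost in accuracy.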