Alibaba's Qwen3.5 Delivers 19x Faster Decoding at Roughly 60% Lower Cost
February 18, 2026
Alibaba’s Qwen3.5 delivers dramatically faster decoding: about 19x faster than Qwen3-Max at 256K context and around 7.2x faster than Qwen3-235B-A22B, while running roughly 60% cheaper than its predecessor and supporting higher concurrency.
Qwen3.5-397B-A17B is Alibaba’s flagship open-weight model, with 397 billion total parameters but only 17 billion active per token, enabling faster inference and lower runtime cost while remaining competitive with larger models.
Qwen3.5 emphasizes agentic capabilities: the open-sourced Qwen Code tool handles natural-language-driven coding tasks, the model is compatible with the OpenClaw framework for autonomous task execution, and the hosted Qwen3.5-Plus offers adaptive inference modes.
Open-weight deployment requires substantial hardware, roughly 256GB of RAM for quantized use or 512GB for comfortable headroom, typically on GPU nodes. All open-weight models are released under Apache 2.0, allowing commercial use, modification, and redistribution without royalties.
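For context, here is a minimal local-inference sketch using Hugging Face transformers, assuming the Qwen/Qwen3.5-397B-A17B repo ID from the availability note below; quantization setup, sharding, and actual memory use will depend on the node, and this is not an official deployment recipe.

```python
# Minimal sketch: load the open weights and generate one reply.
# Assumes the Qwen/Qwen3.5-397B-A17B Hugging Face repo ID and enough
# GPU/CPU memory (~256GB quantized per the article).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3.5-397B-A17B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # shard across available GPUs/CPU
)

messages = [{"role": "user", "content": "Explain MoE routing in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```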
Qwen3.5 is the successor to the Qwen3-Next MoE architecture, quadrupling the expert pool from 128 to 512 experts, which reduces inference latency and provides a deeper pool of specialized experts for reasoning. A sketch of this routing mechanism follows.
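The article does not describe the router itself, but a generic top-k softmax router illustrates why a large expert pool keeps per-token compute low: each token activates only a few experts, which is how 397B total parameters can translate to roughly 17B active per token. NUM_EXPERTS follows the article; TOP_K and the tensor dimensions are illustrative assumptions.

```python
# Generic top-k MoE routing sketch (illustrative only; Qwen3.5's actual
# router design is not published in the article).
import numpy as np

NUM_EXPERTS = 512   # expert pool size per the article
TOP_K = 8           # assumed value; the real top-k is not published

def route(hidden, router_weights):
    """Pick TOP_K experts per token and compute their mixture weights."""
    logits = hidden @ router_weights                  # (tokens, NUM_EXPERTS)
    topk = np.argsort(logits, axis=-1)[:, -TOP_K:]    # ids of selected experts
    sel = np.take_along_axis(logits, topk, axis=-1)
    # Softmax over only the selected experts' logits.
    gates = np.exp(sel - sel.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)
    return topk, gates

tokens = np.random.randn(4, 1024)                     # 4 tokens, hidden size 1024
router = np.random.randn(1024, NUM_EXPERTS)
expert_ids, gates = route(tokens, router)
print(expert_ids.shape, gates.sum(axis=-1))           # (4, 8), each row sums to 1.0
```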
Availability: Qwen3.5-397B-A17B on Hugging Face under Qwen/Qwen3.5-397B-A17B, hosted Qwen3.5-Plus via Alibaba Cloud Model Studio, and Qwen Chat at chat.qwen.ai for evaluation.
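For the hosted option, a hedged sketch of calling the model through Model Studio's OpenAI-compatible DashScope endpoint: the base URL follows DashScope's existing compatible-mode convention, and the qwen3.5-plus model ID is an assumption inferred from the product name, not a confirmed identifier.

```python
# Sketch: query hosted Qwen3.5-Plus via Model Studio's OpenAI-compatible API.
# The "qwen3.5-plus" model ID is assumed from the article's naming.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DASHSCOPE_API_KEY",
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

resp = client.chat.completions.create(
    model="qwen3.5-plus",  # assumed ID for hosted Qwen3.5-Plus
    messages=[{"role": "user", "content": "Hello, Qwen3.5!"}],
)
print(resp.choices[0].message.content)
```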
The model expands multilingual support to 201 languages and dialects and uses a tokenizer with a 250k-entry vocabulary, lowering per-language tokenization costs and speeding inference for non-Latin scripts.
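A quick way to see the vocabulary effect is to compare token counts across scripts with the published tokenizer (assuming the Hugging Face repo ID above); fewer tokens per sentence means lower cost and faster decoding for that language.

```python
# Sketch: compare token counts across scripts. A larger vocabulary packs
# more whole words/characters per token, so non-Latin text should encode
# into fewer tokens. Sample sentences are illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-397B-A17B")

samples = {
    "English": "Large language models route tokens to experts.",
    "Chinese": "大型语言模型将词元路由到不同的专家。",
    "Arabic":  "توجه النماذج اللغوية الكبيرة الرموز إلى خبراء مختلفين.",
}
for lang, text in samples.items():
    print(lang, len(tokenizer.encode(text)))
```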
Benchmark highlights show competitive performance against GPT-5.2, Claude Opus 4.5, and Gemini 3 Pro on general reasoning and coding tasks, with Qwen3.5-397B-A17B outperforming Qwen3-Max across multiple tasks.
Qwen3.5 is natively multimodal, trained from scratch on text, images, and video, improving tightly coupled text-image reasoning without a separate vision encoder.
This marks the first release in the Qwen3.5 family, with smaller dense models and additional MoE configurations expected soon, signaling a broader rollout of frontier-capable, open-weight models for enterprise deployment without proprietary APIs.
Summary based on 1 source
Source

VentureBeat • Feb 18, 2026
Alibaba's new Qwen 3.5 beats its larger trillion-parameter model — at a fraction of the cost