Qwen3.6-35B Scores 73% on SWE-bench Under Apache 2.0, Closing Gap With Proprietary Coding Agents
Alibaba's Qwen team released Qwen3.6-35B-A3B on April 16 under Apache 2.0, posting a 73.4% score on SWE-bench Verified and running on consumer hardware — the strongest open-weight coding-agent result published this cycle.

Alibaba's Qwen team released Qwen3.6-35B-A3B on April 16, 2026, and the score it posted, 73.4 percent on SWE-bench Verified, sits above every comparable open-weight model published to date. It arrives under an Apache 2.0 license that imposes no commercial restrictions.
Architecture: Why 35B Feels Like 3B
The model is a sparse Mixture-of-Experts architecture. It carries 35 billion total parameters but activates only 3 billion of them per token, routing each token through a small subset of its 256 experts across 40 layers that alternate between Gated DeltaNet attention and standard Gated Attention. The practical consequence is that a model capable of frontier-level performance on agentic coding tasks runs on hardware that would be inadequate for a dense 35B model. Quantized GGUF versions are available via Unsloth and can run locally on a single 24GB consumer GPU.
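For readers who want to try the quantized build, the sketch below loads a 4-bit GGUF through llama-cpp-python. The repository and file names are illustrative, not confirmed; check Unsloth's Hugging Face page for the actual artifact names.

```python
# Minimal local-inference sketch using llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/Qwen3.6-35B-A3B-GGUF",  # hypothetical repo id
    filename="*Q4_K_M.gguf",                 # 4-bit quant, sized for a ~24 GB GPU
    n_gpu_layers=-1,                         # offload every layer to the GPU
    n_ctx=32768,                             # cap the context window to fit VRAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a function that reverses a linked list."}]
)
print(out["choices"][0]["message"]["content"])
```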
Context length is a strength. The model supports 262,144 tokens natively, with YaRN scaling extending that to roughly one million tokens — comparable to the context window that Llama 4 Maverick introduced as a headline feature. Qwen3.6 also supports multi-token prediction, which accelerates generation, and retains reasoning traces from prior turns when deployed in thinking mode — a feature that improves performance on multi-step agentic tasks where intermediate steps compound.
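Earlier Qwen long-context releases enable the extended window by adding a YaRN block to the checkpoint's config.json. Assuming Qwen3.6 follows the same convention, the change looks like the sketch below; the local path is illustrative, and the factor of 4.0 is what would take 262,144 native tokens to roughly one million.

```python
# Sketch: enabling YaRN context extension by editing the checkpoint's
# config.json, mirroring the mechanism earlier Qwen releases document.
import json
import pathlib

cfg_path = pathlib.Path("Qwen3.6-35B-A3B/config.json")  # illustrative local checkout
cfg = json.loads(cfg_path.read_text())
cfg["rope_scaling"] = {
    "rope_type": "yarn",
    "factor": 4.0,                              # 262,144 x 4 ~= 1M tokens
    "original_max_position_embeddings": 262144,
}
cfg_path.write_text(json.dumps(cfg, indent=2))
```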
Benchmarks in Context
The 73.4 percent on SWE-bench Verified is the number that has attracted the most attention. SWE-bench tests a model's ability to solve real GitHub issues on established Python repositories, an evaluation designed to resist the kind of benchmark contamination that inflates scores on synthetic coding problems. For reference, Gemma 4-31B lands well below that score on the same benchmark, and the frontier models that top the leaderboard are either closed-weight or carry non-commercial licensing terms.
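To make the evaluation concrete: each SWE-bench Verified instance pairs a repository snapshot with the GitHub issue the model must resolve. The dataset itself is public on Hugging Face, though scoring a model additionally requires the SWE-bench execution harness. A minimal look at one task:

```python
# Inspect a single SWE-bench Verified task instance.
from datasets import load_dataset

ds = load_dataset("princeton-nlp/SWE-bench_Verified", split="test")
task = ds[0]
print(task["repo"], task["instance_id"])
print(task["problem_statement"][:500])  # the GitHub issue text the model must resolve
```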
On SWE-bench Pro, a harder variant introduced in late 2025, Qwen3.6-35B-A3B scores 49.5 percent against Gemma 4-31B's 35.7 percent. Terminal-Bench 2.0, which evaluates autonomous shell-level task completion, puts Qwen3.6 at 51.5 percent versus Gemma's 42.9 percent. An MMLU-Pro score of 85.2 and an AIME 2026 score of 92.7 indicate that the coding gains do not come at the expense of general reasoning.
A companion dense model, Qwen3.6-27B, followed on April 22, targeting deployments where MoE routing overhead is a concern or where simpler serving infrastructure is preferred.
Availability and Ecosystem Integration
Both models are available on Hugging Face Hub, where the 35B-A3B variant has already accumulated nearly 2.4 million downloads in its first two weeks. Deployment is supported out of the box via vLLM, SGLang, and KTransformers, with the Qwen team recommending eight-GPU tensor parallelism for full-precision production workloads. Ollama users can pull the model with a single command. MCP tool-calling configuration is documented directly in the model card, making agent integration straightforward.
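Once served, the model speaks the OpenAI-compatible API that vLLM and Ollama both expose, so existing agent tooling can point at it with a base-URL change. The host, port, and model id below are placeholders; use whatever your server was launched with.

```python
# Sketch: querying a locally served instance through its OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen3.6-35B-A3B",  # must match the name the server was started with
    messages=[
        {"role": "user", "content": "Refactor this function to remove the global state."},
    ],
)
print(resp.choices[0].message.content)
```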
Why Apache 2.0 Matters Here
The license is not a footnote. Apache 2.0 permits commercial use, modification, and redistribution without royalty or restriction. That stands in contrast to several recent open-weight releases that carry custom licenses prohibiting commercial deployment above certain usage thresholds, or that exclude specific downstream applications.
For organizations that have been evaluating proprietary coding assistants for on-premises deployment — in regulated industries, air-gapped environments, or situations where data residency requirements prevent API calls to external services — Qwen3.6-35B-A3B represents a meaningful option that did not exist four weeks ago. The combination of a 73 percent SWE-bench score, a permissive license, and hardware requirements within reach of a single workstation GPU is the profile the open-source community has been working toward, and Alibaba's Qwen team has now delivered it.
The broader trajectory of the Qwen family — which now spans models from 0.8B edge deployments to 397B-A17B frontier variants, all under the same Apache 2.0 terms — suggests that Alibaba's strategy is to own the open-weight coding category at every scale tier, rather than compete at a single point on the capability curve.