Monday, 18 May 2026
AI Daily
Open Source · Sunday, 03 May 2026 · 3 min read

DeepSeek V4 Pro: 1.6T-Parameter MIT-Licensed Model Rivals GPT-5.5 at a Fraction of the Cost

DeepSeek released V4 Pro on 24 April as a fully MIT-licensed open-source model with 1.6 trillion total parameters and a 1-million-token context window, posting benchmark scores comparable to GPT-5.5 and Claude Opus 4.7 at roughly one-tenth of the API cost.

[Figure: DeepSeek V4 Pro benchmark chart showing open-source performance versus proprietary models]

DeepSeek released V4 Pro on 24 April 2026 as a fully open-source model under the MIT licence, one of the most permissive licences in wide use, with weights published on Hugging Face and benchmark performance that places it alongside GPT-5.5, Claude Opus 4.7, and Gemini 3.1 on several key evaluations. The model delivers that capability at approximately $1.74 per million input tokens via API, against roughly $25 per million for Claude Opus 4.7.

Technical Architecture

V4 Pro is a Mixture-of-Experts model with 1.6 trillion total parameters, of which 49 billion are active per forward pass. Activating only a subset of parameters for each token is what makes the architecture tractable on available hardware: every token costs roughly the compute of a 49-billion-parameter dense model, while the network as a whole retains the knowledge capacity of a much larger system.
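The ratio behind that claim is easy to check, and a toy router shows what "active per forward pass" means in practice. The sketch below is illustrative only: the parameter counts come from the article, but the eight-expert layer, the choice of k = 2, and the softmax-then-top-k gating are assumptions made for the example, not a description of V4 Pro's actual router.

```python
import math

TOTAL_PARAMS = 1.6e12   # total parameters, from the article
ACTIVE_PARAMS = 49e9    # parameters active per forward pass, from the article


def active_fraction(active: float = ACTIVE_PARAMS, total: float = TOTAL_PARAMS) -> float:
    """Share of the model's parameters that participate in one forward pass."""
    return active / total


def top_k_route(gate_logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and renormalise their weights.

    Generic top-k gating, used here as an assumption; the article does not
    describe how V4 Pro routes tokens.
    """
    if not 0 < k <= len(gate_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Numerically stable softmax over all experts.
    peak = max(gate_logits)
    exps = [math.exp(x - peak) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts; the others are never evaluated.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept = sum(probs[i] for i in chosen)
    return {i: probs[i] / kept for i in chosen}


if __name__ == "__main__":
    print(f"active share of parameters: {active_fraction():.1%}")
    # Hypothetical layer with 8 experts, 2 of which run for this token.
    print(top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2))
```

On the article's figures, about 3% of the parameters do work for any given token; the remainder sit idle in memory, which is why compute cost and memory cost diverge so sharply for this class of model.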

The context window extends to one million tokens, equivalent to roughly three full-length novels or a large software codebase with documentation. DeepSeek built this capability around a novel attention mechanism that combines token-wise compression with DeepSeek Sparse Attention (DSA), compressing older context while maintaining high fidelity on nearby tokens. The technical note states this delivers "world-leading long context with drastically reduced compute and memory costs."
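The general idea of keeping recent tokens at full resolution while compressing older ones can be sketched in a few lines. This is a toy illustration and not DeepSeek's mechanism: the mean-pooling, the `window` and `block` parameters, and the function name are all assumptions introduced here, and DSA itself is not reproduced.

```python
def compress_context(
    vectors: list[list[float]], window: int, block: int
) -> list[list[float]]:
    """Keep the last `window` vectors untouched and mean-pool older ones.

    Older vectors are grouped into consecutive blocks of `block` items and
    each block is replaced by its element-wise mean. Hypothetical scheme for
    illustration only.
    """
    if window < 0 or block < 1:
        raise ValueError("window must be >= 0 and block must be >= 1")
    split = max(len(vectors) - window, 0)
    older, recent = vectors[:split], vectors[split:]
    pooled = []
    for start in range(0, len(older), block):
        chunk = older[start:start + block]
        dim = len(chunk[0])
        pooled.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    # Compressed history first, then the uncompressed recent window.
    return pooled + recent


if __name__ == "__main__":
    # Ten one-dimensional "token vectors": the oldest six shrink to two entries.
    history = [[float(i)] for i in range(10)]
    print(compress_context(history, window=4, block=3))
```

In this toy version the sequence the attention layer must scan falls from ten entries to six; at a million tokens, the same trade of old-context detail for memory is what makes the window affordable.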

The model supports both thinking and non-thinking inference modes. The thinking mode engages extended chain-of-thought reasoning for complex problems; the non-thinking mode trades reasoning depth for lower latency on simpler queries. Both modes are accessible via the same API endpoint.

Performance and Pricing

Over 90% of developers surveyed by MIT Technology Review included V4 Pro among their top choices for coding work. On world knowledge benchmarks the model leads all open-source systems and, across open and proprietary models alike, trails only Gemini 3.1 Pro. On agentic coding tasks (multi-step problems requiring tool use, state management, and iterative debugging), DeepSeek claims open-source state-of-the-art performance.

For operators running the model via API rather than self-hosting, the pricing undercuts proprietary alternatives by a wide margin. At $1.74 per million input tokens and $3.48 per million output tokens for V4-Pro, and $0.14 and $0.28 respectively for the faster V4-Flash variant, the model costs between 7 and 90 times less than leading proprietary alternatives, depending on the tiers compared. For high-volume enterprise workloads such as document processing, code review pipelines, and customer-facing chatbots, that differential is commercially significant.
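To see what those rates mean for a real bill, the prices can be applied to a workload. The sketch below uses only the per-million-token prices quoted above; the monthly token volumes are hypothetical, and the `Pricing` class and `workload_cost` function are names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per million tokens."""
    input_per_million: float
    output_per_million: float


# Prices as quoted in the article.
V4_PRO = Pricing(input_per_million=1.74, output_per_million=3.48)
V4_FLASH = Pricing(input_per_million=0.14, output_per_million=0.28)


def workload_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost for a given number of input and output tokens."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly volume: 500M input tokens, 100M output tokens.
    monthly_in, monthly_out = 500_000_000, 100_000_000
    for name, pricing in (("V4-Pro", V4_PRO), ("V4-Flash", V4_FLASH)):
        print(f"{name}: ${workload_cost(pricing, monthly_in, monthly_out):,.2f}")
```

For that hypothetical volume the totals come to about $1,218 on V4-Pro and $98 on V4-Flash. Dividing the $25 figure quoted earlier for Claude Opus 4.7 by the two output prices gives roughly 7 and 89, which matches the quoted range if that figure is an output price; the article does not say which it is.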

Geopolitical Dimensions

V4 Pro introduces an element absent from prior DeepSeek releases: optimisation for Chinese domestic chips. The model was trained with particular attention to Huawei's Ascend architecture, reducing dependence on Nvidia and AMD hardware that has faced US export controls since 2022. While DeepSeek does not describe V4 Pro as exclusively non-Nvidia, the deliberate optimisation for Ascend represents a significant step toward building AI infrastructure that operates outside the US chip supply chain.

MIT Technology Review noted that this aspect of the release carries implications beyond the AI market: it demonstrates that frontier-quality models can be trained and deployed on non-US hardware at scale, which has direct relevance for national AI strategies in countries that cannot freely access Nvidia products. The signal will be read clearly in Brussels, where EU policymakers are developing compute-sovereignty provisions premised in part on eventually reducing dependence on US chip suppliers.

The existing deepseek-chat and deepseek-reasoner API models will be retired on 24 July 2026, with users migrated to the V4 family. Counted from the V4 release, the deadline gives enterprise teams approximately 90 days to evaluate V4 Pro and update any integrations built around earlier model versions.
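A first migration step is simply finding where the retiring model names are still configured and how much time remains. The sketch below is a minimal illustration: the two model identifiers and the retirement date come from the article, while the nested-dictionary config layout and the function names are hypothetical.

```python
from datetime import date

RETIREMENT_DATE = date(2026, 7, 24)                       # from the article
LEGACY_MODELS = {"deepseek-chat", "deepseek-reasoner"}    # from the article


def days_until_retirement(today: date) -> int:
    """Whole days left before the legacy models are retired."""
    return (RETIREMENT_DATE - today).days


def find_legacy_models(config: dict, path: str = "") -> list[str]:
    """Return dotted paths of config values that name a retiring model.

    Assumes a config made of nested dictionaries with string leaves; real
    deployments may store model names elsewhere (environment variables,
    source code), which this sketch does not scan.
    """
    hits = []
    for key, value in config.items():
        here = f"{path}.{key}" if path else str(key)
        if isinstance(value, dict):
            hits.extend(find_legacy_models(value, here))
        elif isinstance(value, str) and value in LEGACY_MODELS:
            hits.append(here)
    return hits


if __name__ == "__main__":
    # Hypothetical application config.
    config = {
        "summariser": {"model": "deepseek-chat", "temperature": "0.2"},
        "search": {"model": "some-other-model"},
        "default_model": "deepseek-reasoner",
    }
    print(find_legacy_models(config))
    print(days_until_retirement(date(2026, 4, 24)))
```

Measured from the 24 April release the window is 91 days; measured from this article's publication on 3 May it is 82.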

What Open Weights Enable

The MIT licence allows commercial use, modification, and redistribution without royalty. For organisations that have been cautious about deploying closed proprietary models because of data residency concerns, vendor lock-in risk, or internal policies requiring auditability of the model itself, V4 Pro's open weights remove the vendor as gatekeeper. A team can download the weights, inspect them, fine-tune them on private data, and run them on infrastructure it fully controls.

The practical barrier for most organisations is hardware: running V4 Pro at full scale requires multi-node GPU infrastructure that few enterprise IT departments currently operate. The V4-Flash variant, with significantly fewer active parameters, is more tractable for in-house deployment.
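A back-of-envelope estimate shows why a single server is not enough. The sketch below counts weight storage only, using the 1.6 trillion parameter figure from the article; the numeric precisions and the 80 GB of memory per accelerator are assumptions for illustration, and real deployments also need room for the KV cache, activations, and runtime overhead.

```python
TOTAL_PARAMS = 1_600_000_000_000  # 1.6 trillion, from the article

# Bits stored per parameter at common precisions (assumed; the article does
# not state which precision the published weights use).
BITS_PER_PARAM = {"bf16": 16, "fp8": 8, "int4": 4}


def weight_bytes(precision: str) -> int:
    """Bytes needed to hold every parameter at the given precision."""
    return TOTAL_PARAMS * BITS_PER_PARAM[precision] // 8


def min_accelerators(precision: str, memory_gb: int = 80) -> int:
    """Lower bound on devices needed just to hold the weights.

    `memory_gb` is a hypothetical per-device capacity; 1 GB is taken as
    10**9 bytes.
    """
    capacity = memory_gb * 10**9
    return -(-weight_bytes(precision) // capacity)  # ceiling division


if __name__ == "__main__":
    for precision in BITS_PER_PARAM:
        terabytes = weight_bytes(precision) / 10**12
        print(f"{precision}: {terabytes:.1f} TB of weights, "
              f"at least {min_accelerators(precision)} x 80 GB devices")
```

Even at 4 bits per parameter the weights alone occupy about 0.8 TB under these assumptions, which already exceeds a typical eight-accelerator node with 80 GB devices. All 1.6 trillion parameters must be resident even though only 49 billion are used per token.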

#deepseek #open-source #moe #mit-license #hugging-face #coding
