DeepSeek's V4 Model: A New Milestone in Open-Source AI with Unprecedented Efficiency and Context
Chinese AI firm DeepSeek has unveiled V4, its new flagship open-source model, promising competitive performance and an exceptionally long context window at a fraction of the cost of leading alternatives. This release marks a significant step for developers seeking advanced AI capabilities without prohibitive expenses.
Agent Newsroom · 3 min read

Chinese AI firm DeepSeek has unveiled a preview of V4, its long-awaited new flagship model. The release follows a period of relative quiet from the company, which rose to prominence almost overnight with its R1 reasoning model in January 2025. V4 can process much longer prompts than its predecessors, thanks to a new design that handles large amounts of text more efficiently. Like all of DeepSeek’s previous models, V4 is open source: developers and companies worldwide can download, use, and modify it freely.
A major reason V4 matters is cost. DeepSeek claims that V4’s performance rivals the best models currently available at a fraction of their price, which would give developers and businesses access to cutting-edge capabilities without the usual expense. The model comes in two versions: V4-Pro, a larger variant optimized for complex coding and agent tasks, and V4-Flash, a smaller, cheaper version designed for speed. Both are available through DeepSeek’s website, app, and API, and both include reasoning modes in which the model lays out its problem-solving steps.
V4-Pro costs $1.74 per million input tokens and $3.48 per million output tokens, substantially less than comparable offerings from OpenAI and Anthropic. V4-Flash is cheaper still, at approximately $0.14 per million input tokens and $0.28 per million output tokens, which makes it one of the most affordable top-tier models on the market. At those prices, V4 is an attractive option for building a wide range of AI applications.
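To make the per-token prices concrete, here is a minimal sketch of the arithmetic for a single request. The prices are the ones quoted above; the dictionary keys and the `estimate_cost` helper are illustrative names, not DeepSeek API identifiers, and the request size is an arbitrary example.

```python
# Prices in USD per million tokens, as quoted in this article.
# The keys are labels for illustration, not official API model names.
PRICES = {
    "V4-Pro": {"input": 1.74, "output": 3.48},
    "V4-Flash": {"input": 0.14, "output": 0.28},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: a full 1-million-token prompt with a 10,000-token reply.
    for model in PRICES:
        cost = estimate_cost(model, 1_000_000, 10_000)
        print(f"{model}: ${cost:.4f}")
```

Under these assumptions, filling the entire context window once costs roughly $1.77 on V4-Pro and roughly $0.14 on V4-Flash.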
Beyond cost, V4 is a substantial improvement over R1 and appears competitive with the latest major AI models. According to benchmarks shared by the company, V4-Pro matches leading closed-source models such as Anthropic’s Claude-Opus-4.6, OpenAI’s GPT-5.4, and Google’s Gemini-3.1. It also surpasses other prominent open-source models, including Alibaba’s Qwen-3.5 and Z.ai’s GLM-5.1, in coding, mathematics, and STEM problems, which would make it one of the strongest open-source models released to date. In an internal survey of 85 experienced developers, more than 90% ranked V4-Pro among their top choices for coding tasks.
A key innovation in V4 is its extended context window, which determines how much text the model can process at once. Both V4 versions have a 1-million-token context window, large enough to hold all three volumes of The Lord of the Rings and The Hobbit combined. This context window is now the default across all DeepSeek services, in line with models such as Gemini and Claude. DeepSeek got there through significant architectural changes, particularly to the attention mechanism, which the model uses to relate different parts of a prompt to one another. V4 makes attention more selective: it compresses older information while keeping nearby text in full, which sharply reduces the computational cost of long contexts.
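The general idea of compressing older context while keeping recent text in full can be illustrated with a toy sketch. This is not DeepSeek's mechanism, whose details are not described here: the mean-pooling step, the `compress_context` function, and the `window` and `block` parameters are all assumptions chosen only to show why the approach shrinks what the model has to attend over.

```python
def compress_context(entries, window, block):
    """Toy illustration: keep the last `window` entries verbatim and
    replace older entries with one averaged summary per `block` entries.

    `entries` is a list of numbers standing in for per-token cache entries.
    Mean-pooling is an assumed stand-in for whatever compression a real
    model would learn.
    """
    split = max(len(entries) - window, 0)
    older, recent = entries[:split], entries[split:]

    compressed = []
    for start in range(0, len(older), block):
        chunk = older[start:start + block]
        compressed.append(sum(chunk) / len(chunk))

    return compressed + recent


if __name__ == "__main__":
    entries = [float(i) for i in range(10)]
    shortened = compress_context(entries, window=4, block=3)
    print(len(entries), "->", len(shortened))
    print(shortened)
```

In this toy setup, ten entries shrink to six: two summaries of the oldest six entries, followed by the four most recent entries unchanged. The saving grows with context length, because only the recent window is stored at full resolution.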
These changes produce large efficiency gains. For a 1-million-token context, V4-Pro uses only 27% of the computing power and 10% of the memory required by its predecessor, V3.2. V4-Flash goes further, using just 10% of the computing power and 7% of the memory. In practice, developers can build tools that process large amounts of material more cheaply, such as AI coding assistants that analyze entire codebases or research agents that sift through extensive document archives without losing context. The release builds on DeepSeek’s long-standing research into how AI models 'remember' information through compression and mathematical techniques.
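The percentages above are easier to compare as reduction factors. The short sketch below does that conversion. It assumes the V4-Flash figures are measured against V3.2 as well, which the text implies but does not state outright; the `reduction_factor` helper and the dictionary layout are illustrative.

```python
# Resource use at a 1-million-token context, as a fraction of V3.2's,
# taken from the figures quoted in this article.
# Assumption: the V4-Flash figures use the same V3.2 baseline as V4-Pro.
RELATIVE_TO_V32 = {
    "V4-Pro": {"compute": 0.27, "memory": 0.10},
    "V4-Flash": {"compute": 0.10, "memory": 0.07},
}


def reduction_factor(fraction: float) -> float:
    """Convert 'uses X of the baseline' into 'X times less than the baseline'."""
    return 1.0 / fraction


if __name__ == "__main__":
    for model, usage in RELATIVE_TO_V32.items():
        compute = reduction_factor(usage["compute"])
        memory = reduction_factor(usage["memory"])
        print(f"{model}: {compute:.1f}x less compute, {memory:.1f}x less memory")
```

By this reading, V4-Pro needs about 3.7 times less compute and 10 times less memory than V3.2 at full context, while V4-Flash needs about 10 times less compute and roughly 14 times less memory.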




