DeepSeek launches V4 model with one-million-word context

DeepSeek has launched its new AI model, DeepSeek V4, which the company says delivers improved performance and is optimized for chips produced domestically in China. The model features an ultra-long context of one million words, which DeepSeek says enhances its agent capabilities, world knowledge, and reasoning performance.

DeepSeek V4 is available in two versions: DeepSeek V4-Pro and DeepSeek V4-Flash. The company describes the latter as the more efficient and economical option. According to DeepSeek, V4-Pro significantly outperforms other open-source models on world-knowledge benchmarks and is only slightly surpassed by Google’s closed-source model, Gemini-3.1-Pro.

The V4-Pro variant includes a “maximum reasoning effort mode” that DeepSeek says advances the knowledge capabilities of open-source models and makes it a top contender in that space. DeepSeek previously triggered a trillion-dollar stock market sell-off with its earlier model, R1, which rivaled AI systems such as OpenAI’s ChatGPT at a lower development cost.

Last year’s R1 release led to significant losses for major tech firms, with Nvidia losing more than $500 billion in market value in a single day. The launch also marked the first major challenge from a Chinese AI company to established US tech giants. The latest release comes amid US restrictions on semiconductor exports to China, particularly of the high-end GPUs essential for AI development.

The hardware used to train DeepSeek V4 has not been disclosed, but the firm said the model supports both Nvidia and Huawei chips. DeepSeek V4 can process up to 384,000 tokens, the basic units of data that AI models work with. This marks a significant improvement over its predecessor, V3, which handled only 128,000 tokens.

The upgrade enables multi-document reasoning, allowing the model to take in entire books and full codebases. The company claims this capability represents a “dramatic leap in computational efficiency” and ushers in a new era of large language models with one-million-length contexts.

DeepSeek V4-Pro outperforms Google’s Gemini-3.1-Pro but still trails Anthropic’s Claude Opus 4.6. DeepSeek aims to further improve the model’s intelligence, robustness, and usability across a range of tasks and scenarios.
