DeepSeek, a Chinese AI startup, has revealed a new model, “MODEL1,” through its FlashMLA code repository on GitHub, where the identifier appears 28 times across 114 files. The revelation coincides with the first anniversary of DeepSeek’s R1 release.
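For readers who want to check a figure like this against a local clone, a count of this kind can be approximated with a few lines of Python. This is only a sketch: the article does not say how the 28 occurrences were tallied, so the function name `count_identifier`, the case-sensitive substring match, and the choice to skip the `.git` directory are all assumptions made for illustration.

```python
from pathlib import Path


def count_identifier(root: str, needle: str = "MODEL1") -> tuple[int, int, int]:
    """Return (files_scanned, files_with_match, total_occurrences).

    Hypothetical helper: a plain, case-sensitive substring count over every
    regular file under `root`, skipping Git metadata.
    """
    base = Path(root)
    files_scanned = files_with_match = total = 0
    for path in base.rglob("*"):
        # Skip directories and anything inside the .git metadata folder.
        if not path.is_file() or ".git" in path.relative_to(base).parts:
            continue
        try:
            # Undecodable bytes are ignored so binary files do not abort the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: leave it out of the tally
        files_scanned += 1
        hits = text.count(needle)
        if hits:
            files_with_match += 1
            total += hits
    return files_scanned, files_with_match, total
```

Calling `count_identifier("path/to/FlashMLA")` on a local checkout returns the three tallies. The results may differ from the reported numbers depending on the commit inspected and on which files are included in the scan.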
MODEL1 appears to be a distinct architecture from DeepSeek-V3.2, which is internally codenamed “V32.” Code analysis by developers indicates changes in key-value cache layout, sparsity handling, and FP8 data format decoding. These alterations suggest targeted restructuring for memory optimization and computational efficiency.
The disclosure came through DeepSeek’s FlashMLA repository, which contains the company’s Multi-head Latent Attention (MLA) decoding kernel for Nvidia Hopper GPUs. Updates to the FlashMLA source code added support for MODEL1, including compatibility with Nvidia’s newer Blackwell architecture (SM100), according to posts on Reddit’s LocalLLaMA community. The code changes show MODEL1 reverting to a standard dimension of 512 and incorporating features described as “Value Vector Position Awareness,” along with what may be an implementation of DeepSeek’s “Engram” conditional memory system.
DeepSeek plans to release its next-generation V4 model around mid-February 2026, close to Lunar New Year on February 17, according to The Information, as cited by Reuters. Internal tests by DeepSeek employees reportedly suggest V4 could outperform rival models from Anthropic and OpenAI on coding benchmarks, particularly with long code prompts. V4 is expected to integrate DeepSeek’s Engram architecture, which is designed to allow efficient retrieval from contexts exceeding one million tokens by using a lookup system for foundational facts.
The MODEL1 revelation comes one year after DeepSeek’s R1 debut in January 2025. That release, termed an “AI Sputnik moment” by venture capitalist Marc Andreessen, was followed by a $593 billion drop in Nvidia’s market value in a single day, ITPro reported. DeepSeek’s R1 model reportedly cost under $6 million to train, yet matched or exceeded OpenAI’s o1 model on math and coding benchmarks. The company subsequently released V3.1 in August 2025 and V3.2 in December 2025, with V3.2 described as offering performance comparable to OpenAI’s GPT-5.