DeepSeek releases V3.2-exp model with sparse attention

By Kerem Gülen
30 September 2025

Researchers at DeepSeek on Monday released a new experimental model, V3.2‑exp, which is designed to have dramatically lower inference costs when used in long-context operations. DeepSeek announced the model in a post on Hugging Face and also published a linked academic paper on GitHub that provides details on its architecture and performance.

The model’s most important feature is DeepSeek Sparse Attention. The system uses a module referred to as a “lightning indexer” to prioritize specific excerpts from the context window; a separate “fine-grained token selection system” then chooses specific tokens from within those excerpts, and only those selected tokens are loaded into the module’s limited attention window. Together, these stages let the Sparse Attention model operate over long stretches of context with comparatively small server loads.
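DeepSeek’s actual modules are trained components of the model, but the two-stage shape of the idea can be sketched in a few lines. The Python sketch below is an illustrative approximation only: the chunk-scoring heuristic, the token budgets, and all names and shapes here are assumptions for demonstration, not DeepSeek’s implementation (the real lightning indexer is part of the network, not a mean-pooled dot product).

```python
# Toy two-stage sparse attention in the spirit of the description above.
# Everything here is an illustrative stand-in, not DeepSeek's code.
import torch
import torch.nn.functional as F

def sparse_attention(q, k, v, chunk_size=64, top_chunks=4, top_tokens=128):
    """Attend one query q (d,) over keys/values k, v (L, d),
    but only through a small selected subset of the L tokens."""
    L, d = k.shape
    # Stage 1 -- "lightning indexer" stand-in: cheaply score whole chunks
    # of the context (here, a dot product against each chunk's mean key).
    usable = L - L % chunk_size
    chunks = k[:usable].view(-1, chunk_size, d)          # (n_chunks, C, d)
    chunk_scores = chunks.mean(dim=1) @ q                # (n_chunks,)
    keep = chunk_scores.topk(min(top_chunks, chunks.shape[0])).indices

    # Stage 2 -- fine-grained token selection inside the kept chunks.
    token_idx = torch.cat(
        [torch.arange(c * chunk_size, (c + 1) * chunk_size) for c in keep.tolist()]
    )
    token_scores = k[token_idx] @ q
    best = token_scores.topk(min(top_tokens, token_idx.numel())).indices
    sel = token_idx[best]

    # Ordinary scaled dot-product attention, but only over selected tokens.
    weights = F.softmax((k[sel] @ q) / d ** 0.5, dim=0)
    return weights @ v[sel]

# A 4,096-token context, yet attention touches at most 128 tokens.
q = torch.randn(64)
k, v = torch.randn(4096, 64), torch.randn(4096, 64)
print(sparse_attention(q, k, v).shape)   # torch.Size([64])
```

The saving comes from the final attention step running over at most `top_tokens` keys rather than the full context, no matter how long that context grows.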

The benefits are most pronounced in long-context operations. In preliminary testing, DeepSeek found that the price of a simple API call could be cut by as much as half in these situations. More testing will be needed for a robust assessment of that claim, but because the model is open-weight and freely available on Hugging Face, third parties can run their own evaluations against the results presented in the paper.

DeepSeek’s new model is part of a string of recent breakthroughs that address the problem of inference costs: the server expense of operating a pre-trained AI model, as distinct from the cost of training it. DeepSeek’s researchers were looking for ways to make the fundamental transformer architecture run more efficiently, and found significant room for improvement.
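A back-of-the-envelope count shows why attention is the place to look. The numbers below are illustrative assumptions, not figures from DeepSeek’s paper:

```python
# Illustrative numbers only: dense attention scores every query token against
# every key, so cost grows with L squared; a fixed per-query budget of k
# selected tokens makes it grow linearly with L instead.
L = 128_000   # context length in tokens (hypothetical)
k = 2_048     # selected-token budget per query (hypothetical)

dense_scores = L * L      # full attention: every token vs. every token
sparse_scores = L * k     # sparse attention: every token vs. k tokens

print(f"dense:  {dense_scores:,}")    # 16,384,000,000
print(f"sparse: {sparse_scores:,}")   # 262,144,000
print(f"ratio:  {dense_scores // sparse_scores}x fewer score computations")
```

The quadratic term is exactly what makes long-context serving expensive, which is why a selection mechanism with a fixed budget can translate directly into cheaper API calls.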

Based in China, DeepSeek has been an unusual figure in the AI sector, particularly for those who view AI research as a nationalist struggle between the U.S. and China. The company gained attention at the beginning of the year with its R1 model, which was trained primarily using reinforcement learning at a far lower cost than its American competitors. However, the model did not spark the wholesale revolution in AI training that some predicted, and the company has receded from the spotlight in the months since.

The new “sparse attention” approach is unlikely to produce the same uproar as R1, but it could still teach U.S. providers some much-needed tricks to help keep inference costs low.

Tags: DeepSeek V3.2-exp, featured
Kerem Gülen

Kerem, from Turkey, has an insatiable curiosity for the latest advancements in tech gadgets and a knack for innovative thinking. With three years of editorial experience and a childhood dream of becoming a journalist, he is constantly seeking new ways to create. As a Master's student in Strategic Communications, Kerem is eager to learn more about the ever-evolving world of technology. His primary focuses are artificial intelligence and digital inclusion, and he delves into the most current and accurate information on these topics.
