TechBriefly
DeepSeek releases V3.2-exp model with sparse attention

By Kerem Gülen
30 September 2025, in AI

Researchers at DeepSeek on Monday released a new experimental model, V3.2-exp, designed to cut inference costs dramatically in long-context operations. DeepSeek announced the model in a post on Hugging Face and also published a linked academic paper on GitHub detailing its architecture and performance.

The model's headline feature is DeepSeek Sparse Attention. A module called the "lightning indexer" first scores and prioritizes specific excerpts from the context window; a second stage, a "fine-grained token selection system," then picks individual tokens from within those excerpts. Only the selected tokens are loaded into the module's limited attention window. This combination allows the Sparse Attention model to operate over long portions of context with comparatively small server loads.
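The index-then-select pattern described above can be sketched in a few lines of NumPy. This is an illustrative toy, not DeepSeek's implementation: the dot-product scorer here stands in for the learned lightning-indexer module, and `k_top` stands in for the model's selection budget.

```python
import numpy as np

def sparse_attention(q, keys, values, k_top):
    """Toy index-then-select sparse attention for a single query.

    A cheap indexer scores every context token, the top-k tokens are
    selected, and full softmax attention runs only over that subset.
    """
    # Indexer stage: a cheap relevance score for every context token
    # (a plain dot product here; DeepSeek uses a separate learned module).
    scores = keys @ q                      # shape (n_ctx,)

    # Token-selection stage: keep only the k_top highest-scoring tokens.
    idx = np.argsort(scores)[-k_top:]
    k_sel, v_sel = keys[idx], values[idx]

    # Standard softmax attention over the small selected window only:
    # this step now scales with k_top, not with the full context length.
    att = k_sel @ q / np.sqrt(q.shape[0])
    w = np.exp(att - att.max())
    w /= w.sum()
    return w @ v_sel

rng = np.random.default_rng(0)
d, n_ctx = 64, 10_000
out = sparse_attention(rng.normal(size=d),
                       rng.normal(size=(n_ctx, d)),
                       rng.normal(size=(n_ctx, d)),
                       k_top=256)
print(out.shape)  # (64,)
```

The expensive softmax attention touches 256 tokens instead of 10,000; only the cheap scoring pass sees the whole context.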

The system’s benefits are significant for long-context operations. Preliminary testing conducted by DeepSeek found that the price of a simple API call could be reduced by as much as half in these situations. Further testing will be required to build a more robust assessment of the claims. The model is open-weight and freely available on Hugging Face, which will allow for third-party tests to evaluate the results presented in the paper.
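Rough arithmetic shows why savings of this order are plausible. Assuming the full attention step's cost scales with the number of tokens it touches, and picking illustrative constants (not DeepSeek's published numbers) for the context length, selection budget, and relative indexer cost:

```python
# Back-of-envelope cost of attention per generated token: dense
# attention touches all n context tokens, while the sparse scheme pays
# a cheap indexer pass over n tokens plus full attention over k tokens.
# All constants below are illustrative assumptions.
n = 128_000          # context length in tokens
k = 2_048            # tokens kept by the selection stage
indexer_cost = 0.05  # assumed per-token cost of the indexer vs full attention

dense = n
sparse = indexer_cost * n + k
print(f"sparse/dense cost ratio: {sparse / dense:.3f}")
```

Under these assumptions the sparse path costs a small fraction of the dense one, so a halving of the overall API price in long-context settings is within reach even after the model's other, unchanged costs are included.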

DeepSeek’s new model is part of a string of recent breakthroughs that address the problem of inference costs. These costs represent the server expenses of operating a pre-trained AI model, which are distinct from the cost of training it. DeepSeek’s researchers were looking for ways to make the fundamental transformer architecture operate more efficiently, finding that there are significant improvements to be made.

Based in China, DeepSeek has been an unusual figure in the AI sector, particularly for those who view AI research as a nationalist struggle between the U.S. and China. The company gained attention at the beginning of the year with its R1 model, which was trained using primarily reinforcement learning at a far lower cost than its American competitors. However, the model did not spark a wholesale revolution in AI training as some predicted, and the company has receded from the spotlight in the months since.

The new “sparse attention” approach is unlikely to produce the same uproar as R1, but it could still teach U.S. providers some much-needed tricks to help keep inference costs low.

Tags: DeepSeek V3.2-exp, featured
Kerem Gülen

Kerem from Turkey has an insatiable curiosity for the latest advancements in tech gadgets and a knack for innovative thinking. With 3 years of experience in editorship and a childhood dream of becoming a journalist, he is constantly seeking new ways to create. As a Master's student in Strategic Communications, Kerem is eager to learn more about the ever-evolving world of technology. His primary focuses are artificial intelligence and digital inclusion, and he delves into the most current and accurate information on these topics.


© 2021 TechBriefly is a Linkmedya brand.