TechBriefly
Google details Ironwood TPU for large-scale inference

By Kerem Gülen
8 September 2025

Google shared new details about its Ironwood Tensor Processing Unit (TPU) at Hot Chips 2025, following the chip’s initial announcement at Google Cloud Next ’25 in April. Ironwood is Google’s seventh-generation TPU and is designed specifically for large-scale inference workloads, a shift from previous generations that focused on training.

Each Ironwood chip incorporates two compute dies and delivers 4,614 TFLOPS of FP8 performance. It carries eight stacks of HBM3e, providing 192 GB of memory per chip with 7.3 TB/s of bandwidth. The system architecture scales up to 9,216 chips per pod, linked by 1.2 TB/s of I/O bandwidth without the need for glue logic, for a total of 42.5 exaflops of performance per pod.
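The pod-level figure follows directly from the per-chip numbers quoted above. The short Python sketch below reproduces that arithmetic. It assumes peak throughput scales linearly with chip count and that memory is split evenly across the eight HBM3e stacks, neither of which the presentation summary states explicitly. The function names are illustrative, and the FLOPs-per-byte ratio is a derived back-of-the-envelope value, not a figure Google published.

```python
# Per-chip and per-pod figures as quoted in the article.
FP8_TFLOPS_PER_CHIP = 4_614
HBM_GB_PER_CHIP = 192
HBM_STACKS_PER_CHIP = 8
HBM_TBPS_PER_CHIP = 7.3
CHIPS_PER_POD = 9_216


def pod_fp8_exaflops(chips=CHIPS_PER_POD, tflops=FP8_TFLOPS_PER_CHIP):
    """Aggregate peak FP8 throughput in exaFLOPS (1 EFLOPS = 1e6 TFLOPS).

    Assumes ideal linear scaling across all chips in the pod.
    """
    return chips * tflops / 1e6


def hbm_gb_per_stack(gb=HBM_GB_PER_CHIP, stacks=HBM_STACKS_PER_CHIP):
    """Capacity per HBM3e stack, assuming an even split (an assumption)."""
    return gb / stacks


def peak_flops_per_byte(tflops=FP8_TFLOPS_PER_CHIP, tbps=HBM_TBPS_PER_CHIP):
    """Peak FP8 FLOPs available per byte of HBM traffic (derived ratio)."""
    return tflops / tbps


if __name__ == "__main__":
    print(f"Pod peak FP8: {pod_fp8_exaflops():.1f} EFLOPS")
    print(f"HBM per stack: {hbm_gb_per_stack():.0f} GB")
    print(f"Peak FLOPs per HBM byte: {peak_flops_per_byte():.0f}")
```

Multiplying 9,216 chips by 4,614 TFLOPS gives roughly 42.5 exaflops, which matches the total quoted for a full pod.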

A key highlight of Ironwood is its memory capacity. A single pod provides 1.77 PB of directly addressable HBM, which Google claims is a new world record for shared memory supercomputers. This extensive memory capacity is made possible by optical circuit switches that link racks together.
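The 1.77 PB figure can be checked against the per-chip numbers reported for Ironwood (192 GB of HBM per chip, 9,216 chips per pod). The sketch below is a quick sanity check, not Google's own accounting. It assumes the quoted capacity uses decimal units (1 PB = 1,000,000 GB), and the helper name `pod_hbm_petabytes` is hypothetical.

```python
# Figures reported for Ironwood: HBM per chip and chips per pod.
HBM_GB_PER_CHIP = 192
CHIPS_PER_POD = 9_216


def pod_hbm_petabytes(chips=CHIPS_PER_POD, gb_per_chip=HBM_GB_PER_CHIP,
                      decimal=True):
    """Total HBM in a pod, in petabytes.

    decimal=True  uses 1 PB = 1e6 GB (assumed to match the quoted figure).
    decimal=False uses binary scaling (1024 * 1024) for comparison.
    """
    total_gb = chips * gb_per_chip
    divisor = 1e6 if decimal else 1024 ** 2
    return total_gb / divisor


if __name__ == "__main__":
    print(f"Decimal units: {pod_hbm_petabytes():.2f} PB")
    print(f"Binary units:  {pod_hbm_petabytes(decimal=False):.4f}")
```

With decimal units the total comes to about 1.77 PB, consistent with the claim. The same capacity expressed with binary scaling is a little under 1.69.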

The Ironwood TPU also emphasizes reliability and resilience. The hardware can automatically reconfigure around failed nodes and restore workloads from checkpoints. Features include an on-chip root of trust, built-in self-test functions, silent data corruption mitigation, and logic repair functions to improve manufacturing yield. According to Google, an emphasis on RAS (reliability, availability, and serviceability) is visible throughout the architecture.

Cooling is handled by a cold-plate solution integrated with Google’s third-generation liquid-cooling infrastructure. Google claims that Ironwood achieves a twofold improvement in performance per watt compared to its predecessor, Trillium. Dynamic voltage and frequency scaling further improves efficiency under varied workloads.

AI techniques were also employed in the design of Ironwood to optimize ALU circuits and floor plans. A fourth-generation SparseCore has been added to accelerate embeddings and collective operations, supporting workloads such as recommendation engines.

Ironwood deployment is currently underway at hyperscale within Google Cloud data centers. However, the TPU remains an internal platform and is not directly available to Google Cloud customers.

Ryan Smith of ServeTheHome commented on Google’s presentation at Hot Chips 2025, stating, “This was an awesome presentation. Google saw the need to create high‑end AI compute many generations ago. Now the company is innovating at every level from the chips, to the interconnects, and to the physical infrastructure. Even as the last Hot Chips 2025 presentation this had the audience glued to the stage at what Google was showing.”
