Google AI releases EmbeddingGemma on-device model

Aytun Çelebi · September 8, 2025, 11:36 · 1 min read

Google AI has released EmbeddingGemma, a new on-device embedding model with 308 million parameters. According to Google, its compact size lets it run effectively on mobile devices and in offline settings. Google says the model achieves inference latency under 15 ms for 256 input tokens on EdgeTPU, making it suitable for real-time applications.

Trained on data spanning over 100 languages, EmbeddingGemma secured the top position on the Massive Text Embedding Benchmark (MTEB) among models with fewer than 500 million parameters. Google reports its performance rivals or surpasses that of embedding models almost twice its size, especially in cross-lingual retrieval and semantic search tasks.
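For readers unfamiliar with how an embedding model supports semantic search, here is a minimal sketch of the ranking step, assuming texts have already been converted to vectors. It does not call EmbeddingGemma: the three-dimensional vectors, the document names, and the helper functions `cosine_similarity` and `rank_documents` are made up for illustration, and a real model would produce much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
# The English and Turkish weather texts are placed close together to
# mimic what a multilingual embedding model aims for.
doc_vecs = {
    "weather_en": [0.9, 0.1, 0.0],
    "weather_tr": [0.8, 0.2, 0.1],
    "recipe_en": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.0, 0.0]  # stands in for an embedded weather query

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this toy setup both weather documents rank above the recipe regardless of language, which is the behavior cross-lingual retrieval depends on.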

More information is available in a full analysis, on the model's Hugging Face page, and in Google's technical details.


Written by

Aytun Çelebi

Aytun started coding on a Commodore 64 in elementary school and moved on to web programming in his teenage years. He has been around technology for over 30 years and has been a tech journalist for more than 20. He has worked at many major Turkish outlets (newspapers, magazines, TV channels and websites) and managed some of them. Besides journalism, he has worked at agencies as a copywriter and PR manager for Lenovo, HP and many other international brands. He founded his own agency, Linkmedya, in 2019 to produce content his own way. His recent interests are AI, automation and MarTech.
