Microsoft has launched “MAI-Transcribe-1”, an AI transcription model designed for accurate speech-to-text conversion across 25 widely spoken languages. The model is aimed at applications such as meetings, closed captioning, and dictation.
MAI-Transcribe-1 will be available on Microsoft Foundry alongside two other models, MAI-Voice-1 and MAI-Image-2. Microsoft stated this launch allows “MAI models [to] become broadly available for commercial use for the first time,” enabling customers to evaluate the models and build applications that use AI for transcription, voice, and image generation.
MAI-Voice-1 features hyper-realistic speech generation that maintains speaker identity and emotional nuance across extended content. It includes a voice-prompting feature that can develop custom brand voices from just one minute of recorded audio.
Meanwhile, MAI-Image-2 is a new text-to-image generation model that excels in rendering natural lighting, accurate skin tones, and clear text within images. This model has ranked among the top three on the Arena.ai text-to-image leaderboard.
Microsoft continues to reduce its reliance on OpenAI technology. The company has criticized GPT-4 for high costs and slow response times. As a result, Microsoft has begun developing its own in-house AI models and is assessing third-party models for its Copilot feature.
Mustafa Suleyman, CEO of Microsoft AI, confirmed the focus on developing “off-frontier” AI models, noting they will not reach the sophistication of OpenAI’s offerings. A recent restructuring of Microsoft’s Copilot leadership created four divisions: Copilot experience, Copilot platform, Microsoft 365 apps, and AI models. Jacob Andreou, a former Snap executive, will lead the Copilot experience division and report to Microsoft CEO Satya Nadella.
Salesforce CEO Marc Benioff previously stated that Microsoft would likely stop using OpenAI technology, pointing to challenges faced by OpenAI, including the abandonment of its $500 billion Stargate project aimed at building data centers across the U.S.








