Thinking Machines Lab has announced interaction models designed to process input and generate responses simultaneously, marking a shift in how AI can engage users. This capability, termed “full duplex,” makes interactions resemble a phone conversation rather than a turn-by-turn text exchange.
The model, TML-Interaction-Small, reportedly generates responses in 0.40 seconds, mirroring the speed of natural human conversation. The startup claims this performance is significantly faster than comparable models developed by OpenAI and Google.
TML-Interaction-Small is not yet available to the public. According to the company, a “limited research preview” is expected within the next few months, with a broader release planned later in the year.
Despite the impressive benchmarks, skepticism remains over whether the real-world experience will match the technical claims, a question that cannot be settled until users can test the model themselves. Mira Murati, the former CTO of OpenAI and founder of Thinking Machines Lab, stated that the company aims to make interactivity an inherent feature of AI models.