On May 10th, 2024, OpenAI co-founder and CEO Sam Altman teased the upcoming OpenAI Spring Update on X.
While rumors swirled about GPT-5 or a search engine, Altman hinted at “new stuff” that would feel “like magic”.
From GPT-4o to the ChatGPT desktop app, here is everything announced at the OpenAI Spring Update. Buckle up!
GPT-4o was the big deal of the OpenAI Spring Update
OpenAI unveiled GPT-4o, the latest advancement to the immensely popular ChatGPT, described by CTO Mira Murati as the company's "newest flagship model".
This iteration builds on GPT-4's capabilities, most notably adding the ability to reason across voice, text, and vision.
Murati emphasized their commitment to accessibility, aiming to offer advanced AI tools for free. This aligns with their mission of democratizing access to powerful language models.
Another OpenAI employee, William Fedus, stated on X: "GPT-4o is our new state-of-the-art frontier model. We've been testing a version on the LMSys arena as im-also-a-good-gpt2-chatbot," and shared the benchmark results the new model achieved on the LMSys arena:
But the ELO can ultimately become bounded by the difficulty of the prompts (i.e. can’t achieve arbitrarily high win rates on the prompt: “what’s up”). We find on harder prompt sets — and in particular coding — there is an even larger gap: GPT-4o achieves a +100 ELO over our prior… pic.twitter.com/ReJzcQdgC8
— William Fedus (@LiamFedus) May 13, 2024
GPT-4o is also being introduced in the OpenAI API, underscoring OpenAI's commitment to fostering innovation and empowering developers.
The future holds exciting possibilities as audio and video functionalities become more widely available, further expanding the potential for groundbreaking applications.
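For developers who want to try the new model right away, a minimal sketch of a text request using the official openai Python SDK might look like the following; the gpt-4o model identifier matches the announcement, while the prompt itself is just an illustrative placeholder:

```python
# Minimal sketch: calling GPT-4o through the OpenAI API (Python SDK v1.x).
# Assumes the OPENAI_API_KEY environment variable is set.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # model name from the announcement
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the OpenAI Spring Update in one sentence."},
    ],
)

print(response.choices[0].message.content)
```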
Users will get more out of the free version of ChatGPT
Previously limited to mobile devices, ChatGPT's Voice mode is now available on desktop through a dedicated Mac application – not the standalone voice assistant some had speculated about. According to Murati, this highlights a shift in how humans and machines collaborate. She explained that GPT-4o processes information across various modalities, allowing OpenAI to extend GPT-4-level intelligence to free users, something the team has been working on for months.
Over 100 million users rely on ChatGPT, and GPT-4o's improved resource efficiency allows OpenAI to offer customizable chatbots, known as Custom GPTs, within the free tier. Free users can also soon expect tools for data analysis, coding, and image analysis, eliminating the need for a paid subscription for basic visual tasks.
These features will be rolled out in the coming weeks.
The significant enhancements to the free tier raise questions about the value proposition of the $20-per-month ChatGPT Plus subscription. Murati clarified that the primary benefit remains a GPT-4o message limit up to five times higher than the free plan's.
When using GPT-4o, ChatGPT Free users will now have access to features such as:
- Experience GPT-4 level intelligence
- Get responses from both the model and the web
- Analyze data and create charts
- Chat about photos you take
- Upload files for assistance summarizing, writing or analyzing
- Discover and use GPTs and the GPT Store
- Build a more helpful experience with Memory
Real-time voice chat with ChatGPT
GPT-4o processes audio inputs directly, eliminating the need for text transcription. During the demonstration at the OpenAI Spring Update, an OpenAI staff member ran through a breathing exercise, and GPT-4o offered suggestions for improving his breathing technique and singing, and even gave mood-boosting advice.
Further innovation comes with GPT-4o’s ability to offer real-time assistance through livestreaming. This showcases the platform’s potential for interactive problem-solving and education. Imagine students or researchers presenting complex equations during a livestream, and ChatGPT instantly providing explanations and solutions – a revolutionary approach to learning.
Check out how Greg Brockman showcases this feature in the video below.
Introducing GPT-4o, our new model which can reason across text, audio, and video in real time.
It's extremely versatile, fun to play with, and is a step towards a much more natural form of human-computer interaction (and even human-computer-computer interaction): pic.twitter.com/VLG7TJ1JQx
— Greg Brockman (@gdb) May 13, 2024
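The live audio and video streams shown in the demo are not yet exposed to API users, but GPT-4o's vision can already be exercised by sending it individual images. A hedged sketch of asking the model to explain, say, a photographed equation – the image URL here is a hypothetical placeholder:

```python
# Sketch: asking GPT-4o to explain the contents of an image via the chat API.
# The image URL below is a placeholder, not a real resource.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Explain the equation in this image step by step."},
                {"type": "image_url", "image_url": {"url": "https://example.com/equation.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```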
ChatGPT desktop app arrives
The ChatGPT desktop application for Mac showcases a remarkably natural voice interface. During the presentation at the OpenAI Spring Update, we saw that it can observe code being written in real time, analyze it, and articulate its observations, including potential problems. The vision functionality extends beyond code, as demonstrated by the app's ability to examine and offer insights on a displayed graph.
ChatGPT as a real-time translator
And finally, at the OpenAI Spring Update, the OpenAI team showcased ChatGPT Voice as a live translation tool.
Sentences spoken in Italian by Murati were seamlessly translated into English, and ChatGPT's responses were translated back from English into Italian. Tom Warren captured the moment on X:
OpenAI has just demonstrated its new GPT-4o model doing real-time translations 🤯 pic.twitter.com/Cl0gp9v3kN
— Tom Warren (@tomwarren) May 13, 2024
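The demo ran through ChatGPT's Voice mode rather than the public API, but a similar translation behavior can be approximated in text with a simple system prompt. A rough sketch, where the prompt wording is my own illustrative assumption rather than anything OpenAI published:

```python
# Sketch: using GPT-4o as a text translator between English and Italian.
# The system prompt is an illustrative assumption, not part of the announcement.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a translator. When you receive Italian, reply with the "
                "English translation; when you receive English, reply in Italian."
            ),
        },
        {"role": "user", "content": "Ciao, come stai?"},
    ],
)

print(response.choices[0].message.content)  # e.g. "Hi, how are you?"
```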
These updates represent a significant step forward for OpenAI and the field of large language models. With a focus on accessibility, improved functionality, and real-time capabilities, OpenAI positions itself at the forefront of language processing technology. The implications of these advancements are vast, with the potential to revolutionize communication, education, and creative endeavors.
Read more about GPT-4o in the blog post by the OpenAI team here.
Featured image credit: OpenAI