OpenAI's New GPT-4o Model Lets You Talk and Watch in Real Time

OpenAI has introduced GPT-4o, a new AI model that supports real-time interaction through voice, video, and text. The model is available for free through the ChatGPT app and web interface. Users who pay for OpenAI's subscriptions, which start at $20 per month, can use the model more frequently.

OpenAI's Chief Technology Officer, Mira Murati, showcased the new model on May 13, a day before Google's big AI event on May 14.

Previously, GPT-4 offered similar features but relied on separate models to do so, which made responses slower and more expensive. GPT-4o combines these features into a single model, which OpenAI calls an "omnimodel," for faster and smoother interactions.

The new model works like a more advanced version of assistants such as Siri or Alexa, but it can handle more complex tasks. Murati believes GPT-4o represents the future of human-machine interaction and will make that interaction more natural.

Researchers Barret Zoph and Mark Chen demonstrated several uses of GPT-4o, highlighting its ability to handle live conversations. For example, the model can stop and adjust its response when interrupted, and it can change its tone, such as when reading a bedtime story in different voices.

GPT-4o can also work through visual problems in real time. In one demo, Zoph filmed himself writing an algebra equation, and the model guided him through it like a teacher, without giving the answer directly.

Like previous versions, GPT-4o can remember past interactions, which gives conversations a sense of continuity. New features include live translation, conversation search, and real-time information lookup.

The demo had some glitches, such as the model making unsolicited comments, but it recovered well. The model opens up to the public many powerful features that were previously available only to paying users. However, it is not yet clear how many free interactions users will get before needing to pay. Subscribers will still have more usage capacity than free users.
