November 22, 2024
Digital Media News

OpenAI debuts GPT-4o: here’s everything you need to know

OpenAI has just launched its latest flagship model, dubbed GPT-4o, calling it its fastest, most powerful and most human-like AI model so far.

“We’re very, very excited to bring GPT-4o to all of our free users out there,” Chief Technology Officer Mira Murati said at the highly anticipated launch event in San Francisco.

What is it?

According to the company, GPT-4o – the “o” standing for “omni” – is a revolutionary AI model that improves human-computer interaction and can understand commands and generate content in voice, text, or images.

“The new voice (and video) mode is the best computer interface I’ve ever used. It feels like AI from the movies,” OpenAI CEO Sam Altman said in a blog post. “Talking to a computer has never felt really natural for me; now it does.”

“With GPT-4o, we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations,” the company said in a blog post.

Based on the live demos, GPT-4o effectively turns ChatGPT into a digital personal assistant that can help users with a variety of tasks, from translation to real-time spoken conversation.

What are some limitations of the model?

Although ahead of its peers in many respects, OpenAI’s GPT-4o is not without potential concerns. On its blog, OpenAI said that GPT-4o is still in the early stages of exploring unified multimodal interaction, which means that certain features, such as audio outputs, are available only in a limited form with preset voices. Further development will be needed for the model to realize its full potential.

On safety, OpenAI said that GPT-4o ships with built-in safeguards, including “filtered training data, and refined model behaviour post training”. The company asserts that the new model has undergone extensive safety evaluations and external reviews.

Why does it matter?

The new model launches at a time when the AI race is intensifying, with tech giants like Meta and Google working to build more powerful large language models and bring them to their various products. It arrives just a day before Google is expected to make its own announcements about Gemini, the search giant’s AI model that competes with ChatGPT.

However, the announcement will likely amplify the AI debate surging across various industries, with many users and professionals arguing against the increased use of artificial intelligence on ethical and labor grounds.

When will the model be available?

GPT-4o will be made available to the public in stages. Text and image capabilities are already rolling out on ChatGPT, with some services available to free users. Meanwhile, audio and video features will arrive more gradually for developers and select partners.