ChatGPT gets even smarter after OpenAI reveals new ‘brain’ for free called GPT-4o – it can ‘see’ photos and talk to you
OPENAI is releasing a smarter version of its AI chatbot for free to users around the world – and it can see photos and even speak to you like a super-charged Siri.
The new GPT-4o model was announced at Spring Update, a special event live-streamed for free around the world.
It's being released on Monday and is expected to roll out over the next few weeks.
Regular ChatGPT runs on GPT-3.5, with an upgrade to the smarter GPT-4 requiring ChatGPT Plus, a paid-for subscription.
Now GPT-4o will be available on the ChatGPT app and website – and doesn't cost a penny.
"The special thing about GPT-4o is that it brings GPT-4 level intelligence to everyone, including our free users," said Mira Murati, OpenAI’s chief technology officer.
You can upload images and get feedback on those photos based on what the AI sees.
For instance, you could share a photo of a plant, a spider, or a landmark to find out more information about it.
Or you could ask it to translate the information on a photo of a foreign-language sign or menu.
You can even share a selfie with the app and ChatGPT will be able to describe the emotions it thinks you're feeling.
And it can remember things you've said before, even in different conversations.
ChatGPT can also use GPT-4o for voice conversations.
The Siri- or Alexa-style assistant will speak to you in natural language.
This sci-fi upgrade brings us closer to the movie Her, where humans can build what feel like close relationships with AI through actual spoken conversation.
The app can also now function as a live translator between two people who speak different languages.
A live demo of this at the Spring Update event showed ChatGPT impressively translating English to Italian and vice versa.
AI REVOLUTION
OpenAI shot to international fame after releasing its ChatGPT chatbot at the end of 2022.
The AI helper kicked off an AI boom and has attracted over 100 million users.
OpenAI – led by CEO Sam Altman – is valued at more than $80 billion and sells a premium version of its chatbot, called ChatGPT Plus.
The company is headquartered in San Francisco and has received billions of dollars in investment from Microsoft.
What is ChatGPT Plus?
Here's what you need to know...
- ChatGPT Plus is the premium version of OpenAI's chatbot
- It costs $20 a month and comes with additional benefits
- OpenAI says: "It offers availability even when demand is high, faster response speed, and priority access to new features."
- For instance, you'll gain access to the more powerful GPT-4 language model.
- You can browse, create and use GPTs
- You'll gain access to extra tools like DALL-E image generation
- You can get access to current information courtesy of search engines
- And you can have voice conversations with ChatGPT too
WHAT IS CHATGPT?
ChatGPT is a language model that can produce text.
So you feed it some kind of prompt – “Write a short poem about flowers,” for example – and it will create a chunk of text based on that.
If it works correctly, the text will read in natural language as if it was written by a human.
GPT stands for Generative Pre-Trained Transformer and describes the type of model that can create AI-generated content.
ChatGPT can hold conversations and even learn from things you’ve said.
It can handle very complicated prompts and is even being used by businesses to help with work.
But note that it might not always tell you the truth.
"These models were trained on vast amounts of data from the internet written by humans, including conversations, so the responses it provides may sound human-like," OpenAI explained.
"It is important to keep in mind that this is a direct result of the system's design (i.e. maximizing the similarity between outputs and the dataset the models were trained on).
"And that such outputs may be inaccurate, untruthful, and otherwise misleading at times."
OpenAI's GPT-4o makes voice assistants look 'primitive' – EXPERT ANALYSIS
Dr Andrew Rogoyski, from the University of Surrey's Institute for People-Centred AI, said...
"This was another 'Sora moment' for OpenAI. They made existing voice assistants look utterly primitive. Making GPT-4o available to free users is great news for users, but is an incredibly aggressive market move. How will they monetise this?
"OpenAI's conversational GPT-4o is impressive. The responses are very fast, a key feature in establishing engagement with the user. The responses were also pretty flawless in terms of a natural voice, including very human-like nuances and additions. The responses to unstructured questions were particularly lifelike. The overall demo was so good I was left wondering whether they'd rigged the demo and used a real human voice actor. I'm looking forward to trying GPT-4o.
"The applications for this technology are endless. Companies using this technology in a call centre may suffer from a new problem – customers failing to drop the call because they're enjoying their chat with the robot!
"An interesting benefit of using voice and video feeds is that OpenAI will acquire enormous volumes of completely new data, independent of Internet repositories, untainted by repetition and fakes, with metadata related to who the user is, why they were calling, who they were calling, and so on. This data will greatly enrich the data for training future AIs.
"One of the big challenges ahead for Big AI is sequestering vast sources of clean data for training future models, so avoiding copyright infringement, model collapse and other problems. It's not clear what OpenAI's big models, like Sora, are trained on, but it's equally clear that they are targeting voice and video to create new sources of training data. I'm waiting to see what they say about the privacy of conversations.
"The impact of AI on the future of work remains completely uncertain. The natural conversation demonstrated today opens up a host of business opportunities but also puts in danger a swathe of customer agent jobs, from help centres to sales calls.
"The pace of AI development over the last few months has meant practically every job and skill is going to change, probably within just a couple of years. Nobody really knows how this is going to pan out. It could turbo-boost economies, or it could dismantle them."