Google’s new Gemini AI beats GPT-4 in 30 of 32 tests

But will the difference be enough to matter in real life?

Tech giant Google has finally unveiled its much-hyped Gemini AI, a series of generative AI models it claims are its “largest and most capable” to date. 

“This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company,” said Google CEO Sundar Pichai. 

Multimodal AI: Generative AIs are algorithms trained to create original content in response to user prompts. OpenAI’s first iteration of ChatGPT, for example, can understand and produce human-like text, while its DALL-E 2 system can generate images based on text prompts. 

While those systems understand and generate just one type of content, a multimodal generative AI can work with several — in September, OpenAI announced a multimodal version of ChatGPT that could understand image, voice, and text inputs.

The Gemini era: According to Google, multimodal AIs are traditionally created by combining separate, specialized models into one program, but it took a different approach with its Gemini AI, training it to be multimodal from the start.

“This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state-of-the-art in nearly every domain,” wrote Demis Hassabis, CEO and cofounder of Google DeepMind.

In addition to being highly capable, Google says the Gemini AI is also its “most flexible” model. This has allowed the company to create three different sizes of the AI: Ultra, Nano, and Pro. 

  • Gemini Ultra is the most powerful model, designed for complex tasks. According to Google, it’s the first generative AI model to outperform human experts on the MMLU, a benchmark assessing knowledge across 57 subjects. Google is currently soliciting feedback on Ultra from select users, but expects to make it widely available in 2024.
  • Gemini Nano is the least capable model, but it’s small and efficient enough to run locally on smartphones. Google has already made it available on its Pixel 8 Pro — owners of that smartphone can use the AI to summarize audio recordings or generate responses to WhatsApp messages.
  • Gemini Pro, meanwhile, falls between Nano and Ultra in terms of capabilities and size. Google has integrated an English-language version of that model into its ChatGPT-like Bard, which will reportedly get an Ultra upgrade in 2024.

The big picture: Like the rest of the tech industry, Google has been racing to catch up with OpenAI in the generative AI space ever since the release of ChatGPT in 2022, and it’s been hyping the Gemini AI for months as the tech that will put it ahead. 

While Gemini did outperform OpenAI’s GPT-4 on 30 of the 32 benchmarks tested (including the MMLU), the difference was often just a percentage point or two. Google may be ahead, but only by a little, and only compared to an AI model that had already been out for nine months.

“It’s clear that Gemini is a very sophisticated AI system … [but] it’s not obvious to me that Gemini is actually substantially more capable than GPT-4,” Melanie Mitchell, an AI researcher at the Santa Fe Institute in New Mexico, told MIT Technology Review.
