OpenAI’s GPT-4 outperforms doctors in another new study

It may "know" more about treating eye problems than your own GP.
Sign up for the Freethink Weekly newsletter!
A collection of our favorite stories straight to your inbox

OpenAI’s most powerful AI model outperformed junior doctors in deciding how to treat patients with eye problems and came close to scoring as high as expert ophthalmologists — at least on this test.

The challenge: When doctors are in med school, they rotate across clinical areas, spending time with specialists in surgery, psychiatry, ophthalmology, and more to ensure they’ll have a basic knowledge of all the subjects by the time they get their medical license.

If they become a general practitioner (GP), though, they may rarely use the info they learned in some of those specialties and in treating less common conditions.

GPT-4 significantly outperformed the junior doctors, scoring 69% compared to their 43%.

The idea: Researchers at the University of Cambridge were curious to see whether large language models (LLMs) — AIs that can understand and generate conversational text — could help GPs treat patients with eye problems, something they might not be handling on a day-to-day basis.

For a study published in PLOS Digital Health, they presented GPT-4 — the LLM powering OpenAI’s ChatGPT Plus — with 87 scenarios of patients with a range of eye problems and asked it to choose the best diagnosis or treatment from four options.

They also gave the test to expert ophthalmologists, trainees working to become ophthalmologists, and unspecialized junior doctors, who have about as much knowledge of eye problems as general practitioners.

“The most important thing is to empower patients to decide whether they want computer systems to be involved or not.”

Arun Thirunavukarasu

GPT-4 significantly outperformed the junior doctors, scoring 69% on the test compared to their median score of 43%. It also scored higher than the trainees, who had a median score of 59%, and was pretty close to the median score of the expert ophthalmologists: 76%.

“What this work shows is that the knowledge and reasoning ability of these large language models in an eye health context is now almost indistinguishable from experts,” lead author Arun Thirunavukarasu told the Financial Times.

Looking ahead: The Cambridge team doesn’t think LLMs will replace doctors, but they do envision the systems being integrated into clinical workflows — a GP who is having trouble getting in touch with a specialist for advice on how to treat something they haven’t seen in a while (or ever) could query an AI, for example.

“The most important thing is to empower patients to decide whether they want computer systems to be involved or not,” said Thirunavukarasu. “That will be an individual decision for each patient to make.”

We’d love to hear from you! If you have a comment about this article or if you have a tip for a future Freethink story, please email us at [email protected].

Sign up for the Freethink Weekly newsletter!
A collection of our favorite stories straight to your inbox
Related
Should we turn the electricity grid over to AI?
AI could one day be woven throughout the grid management system — here are the pros and cons.
Has the US reached “peak obesity”?
A CDC survey suggests America’s obesity rate may be falling. Is this a turning point in the obesity epidemic? Or just a temporary plateau?
AI skeptic Gary Marcus on AI’s moral and technical shortcomings
From hallucinations to regulatory battles, Gary Marcus argues the AI status quo has failed us and it’s time citizens demand something more.
Flexport is using generative AI to create the “holy grail” of shipping
Flexport is using generative AI to read documents, talk to truckers, and create a “knowledge agent” that’s an expert in shipping.
The West needs more water. This Nobel winner may have the answer.
Paul Migrom has an Emmy, a Nobel, and a successful company. There’s one more big problem on the to-do list.
Up Next
A view of an orange and blue jet in flight, with desert terrain visible in the background.
Subscribe to Freethink for more great stories