In Harvard study, AI offered more accurate emergency room diagnoses than two human doctors

A new study examines how large language models perform in a variety of medical contexts, including real emergency room cases — where at least one model seemed to be more accurate than human doctors. The study was published this week in Science and comes from a research team led by physicians and computer scientists at Harvard Medical School and Beth Israel Deaconess Medical Center. The researchers said they conducted a variety of experiments to measure how OpenAI’s models compared to human physicians.

In one experiment, researchers focused on 76 patients who came into the Beth Israel emergency room, comparing the diagnoses offered by two internal medicine attending physicians to those generated by OpenAI’s o1 and 4o models. These diagnoses were assessed by two other attending physicians, who did not know which ones came from humans and which came from AI. In Harvard Medical School’s press release about the study, the researchers emphasized that they did not “pre-process the data at all”; the AI models were presented with the same information that was available in the electronic medical records at the time of each diagnosis.

With that information, the o1 model managed to offer “the exact or very close diagnosis” in 67% of triage cases, compared to one physician who had the exact or close diagnosis 55% of the time, and to the other who hit the mark 50% of the time. “We tested the AI model against virtually every benchmark, and it eclipsed both prior models and our physician baselines,” said Arjun Manrai, who heads an AI lab at Harvard Medical School and is one of the study’s lead authors, in the press release. To be clear, the study didn’t claim that AI is ready to make real life-or-death decisions in the emergency room.

In a post about the study, Kristen Panthagani, an emergency physician, said this is “an interesting AI study that has led to some very overhyped headlines,” especially since it was comparing AI diagnoses to those from internal medicine physicians, not ER physicians. “If we’re going to compare AI tools to physicians’ clinical ability, we should start by comparing to physicians who actually practice that specialty,” Panthagani said. She also argued, “As an ER doctor seeing a patient for a first time, my primary goal is not to guess your ultimate diagnosis.”

This post and headline have been updated to reflect the fact that the diagnoses in the study came from internal medicine attending physicians, and to include commentary from Kristen Panthagani.
