
The Pediatric AI That Outperformed Junior Doctors

August 19, 2019

by Aaron Saenz: Training a doctor takes years of grueling work in universities and hospitals. Building a doctor may be as easy as teaching an AI how to read…

Artificial intelligence has taken another step towards becoming an integral part of 21st-century medicine. New research out of Guangzhou, China, published February 11th as a Letter in Nature Medicine, has demonstrated a natural-language processing AI that is capable of outperforming rookie pediatricians in diagnosing common childhood ailments.

The massive study examined the electronic health records (EHRs) of nearly 600,000 patients over an 18-month period at the Guangzhou Women and Children’s Medical Center and then compared AI-generated diagnoses against new assessments from physicians with a range of experience.

The verdict? On average, the AI was noticeably more accurate than junior physicians and nearly as reliable as the more senior ones. These results are the latest demonstration that artificial intelligence is on the cusp of becoming a healthcare staple on a global scale.

Less Like a Computer, More Like a Person

To outshine human doctors, the AI first had to become more human. Like IBM’s Watson, the pediatric AI leverages natural language processing, in essence “reading” written notes from EHRs not unlike how a human doctor would review those same records. But the similarities to human doctors don’t end there. The AI is a machine learning classifier (MLC), capable of placing the information learned from the EHRs into categories to improve performance.

Like traditionally-trained pediatricians, the AI broke cases down into major organ groups and infection areas (upper/lower respiratory, gastrointestinal, etc.) before breaking them down even further into subcategories. It could then develop associations between various symptoms and organ groups and use those associations to improve its diagnoses. This hierarchical approach mimics the deductive reasoning human doctors employ.
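The two-stage idea described above can be sketched in a few lines of Python. This is a toy illustration only, not the study's model: simple keyword overlap stands in for the associations the real classifier learned from data, and every keyword set, the sub-level labels under "gastrointestinal", and the function names `best_label` and `classify` are hypothetical.

```python
# Toy hierarchical classifier: pick an organ group first, then a subcategory
# within that group. All keyword sets below are invented for illustration.
TOP_LEVEL = {
    "respiratory": {"cough", "wheezing", "congestion", "throat"},
    "gastrointestinal": {"vomiting", "diarrhea", "nausea"},
}
SUB_LEVEL = {
    "respiratory": {
        "upper respiratory": {"congestion", "throat", "sneezing"},
        "lower respiratory": {"wheezing", "crackles"},
    },
    "gastrointestinal": {
        "vomiting-dominant": {"vomiting", "nausea"},
        "diarrhea-dominant": {"diarrhea"},
    },
}


def best_label(tokens, keyword_sets):
    """Return the label whose keywords overlap the tokens most, or None."""
    scores = {label: len(tokens & kws) for label, kws in keyword_sets.items()}
    label = max(scores, key=scores.get)
    return label if scores[label] > 0 else None


def classify(note):
    """Classify a free-text note into (organ group, subcategory)."""
    cleaned = note.lower().replace(",", " ").replace(".", " ")
    tokens = set(cleaned.split())
    group = best_label(tokens, TOP_LEVEL)
    if group is None:
        return None, None
    # Only subcategories of the chosen group are considered in stage two.
    return group, best_label(tokens, SUB_LEVEL[group])


print(classify("Child has cough and wheezing at night."))
```

The point of the structure is that the second decision is only ever made among the subcategories of the group chosen first, which narrows the choices at each step much as a clinician would.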

Another key strength of the AI developed for this study was the enormous size of the dataset collected to teach it: 1,362,559 outpatient visits from 567,498 patients yielded some 101.6 million data points for the MLC to devour on its quest for pediatric dominance. This allowed the AI the depth of learning needed to distinguish and accurately select from the 55 different diagnosis codes across the various organ groups and subcategories.
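For a rough sense of scale, the figures quoted above can be turned into per-patient and per-visit averages. The short calculation below uses only the numbers in this article; the variable names are illustrative, and the averages are back-of-the-envelope values, not statistics reported by the study.

```python
# Figures as quoted in the article.
visits = 1_362_559
patients = 567_498
data_points = 101.6e6  # "some 101.6 million", so this is approximate

visits_per_patient = visits / patients
points_per_visit = data_points / visits
points_per_patient = data_points / patients

print(f"visits per patient:      {visits_per_patient:.1f}")
print(f"data points per visit:   {points_per_visit:.1f}")
print(f"data points per patient: {points_per_patient:.1f}")
```

In other words, each child was seen a little more than twice on average, and each visit contributed on the order of 75 data points to the training set.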

To compare the AI against human doctors, the study used 11,926 records from an unrelated group of children, giving both the MLC and the 20 physicians it was measured against an even playing field. The results were clear: while cohorts of senior pediatricians performed better than the AI, junior pediatricians (those with 3-15 years of experience) were outclassed.

Helping, Not Replacing

While the research used a competitive analysis to measure the success of the AI, the results should be seen as anything but hostile to human doctors. The near future of artificial intelligence in medicine will see these machine learning programs augment, not replace, human physicians. The authors of the study specifically call out augmentation as the key short-term application of their work. Triaging incoming patients via intake forms, performing massive metastudies using EHRs, providing rapid ‘second opinions’—the applications for an AI doctor that is better-but-not-the-best are as varied as the healthcare industry itself.

That’s only considering how artificial intelligence could make a positive impact immediately upon implementation. It’s easy to see how long-term use of a diagnostic assistant could reshape the way modern medical institutions approach their work.

Look at how the MLC results fit snugly between the junior and senior physician groups. Essentially, it took nearly 15 years before a physician could consistently out-diagnose the machine. That’s a decade and a half wherein an AI diagnostic assistant would be an invaluable partner—both as a training tool and a safety measure. Likewise, on the other side of the experience curve you have physicians whose performance could be continuously leveraged to improve the AI’s effectiveness. This is a clear opportunity for a symbiotic relationship, with humans and machines each assisting the other as they mature.

Closer to Us, But Still Dependent on Us

No matter the ultimate application, the AI doctors of the future are drawing nearer to us step by step. This latest research is a demonstration that artificial intelligence can mimic the results of human deductive reasoning even in some of the most complex and important decision-making processes. True, the MLC required input from humans to function; both the initial data points and the cases used to evaluate the AI depended on EHRs written by physicians. While every effort was made to design a test schema that removed any indication of the eventual diagnosis, some “data leakage” is bound to occur.

In other words, when AIs use human-created data, they inherit human insight to some degree. Yet the progress made in machine imaging, chatbots, sensors, and other fields suggests that this dependence on human input is more about where we are right now than where we could be in the near future.

Data, and More Data

That near future may also have some clear winners and losers. For now, those winners seem to be the institutions that can capture and apply the largest sets of data. With a rapidly digitizing society gathering incredible amounts of data, China has a clear advantage. Combined with its relatively relaxed approach to privacy, that advantage makes China likely to remain one of the driving forces behind machine learning and its applications. So too will Google/Alphabet, with its massive medical studies. Data is the uranium in this AI arms race, and everyone seems to be scrambling to collect more.

In a global community that seems increasingly aware of the potential problems arising from this need for and reliance on data, it’s nice to know there’ll be an upside as well. The technology behind AI medical assistants is looking more and more mature—even if we are still struggling to find exactly where, when, and how that technology should first become universal.

Yet wherever we see the next push to make AI a standard tool in a real-world medical setting, I have little doubt it will greatly improve the lives of human patients. Today Doctor AI is performing as well as a human colleague with more than 10 years of experience. Within a year or so, human doctors may need twice that experience to stay competitive. And in a decade, the combined medical knowledge of all human history may be a tool as common as a stethoscope in your doctor’s hands.