As the AI hype-cycle has built, we’ve been treated to a plethora of claims about what sorts of improvements and breakthroughs the technology can deliver. One of the most fundamental — and potentially important — has been the idea that we can use AI to find new medicines and treatments for existing conditions where current options have come up short.
That promise has itself now come up short. IBM has announced that it will stop selling its Watson AI system as a tool for drug discovery. It’s a high-profile retreat for the company, which has aggressively marketed AI as being useful for these purposes and which ran into problems last year when reports indicated its systems had made improper, dangerous recommendations for cancer patients (the system’s recommendations were never put into effect).
While IBM cites sluggish sales as a reason for its withdrawal, deeper problems are potentially responsible. A recent deep dive by IEEE Spectrum puts these issues in context. The upshot: After years of work and a number of moonshot projects, IBM has remarkably little to show for its efforts. And the company has generated a certain amount of ill will, IEEE Spectrum writes, because it took an aggressive, marketing-first approach to AI and Watson, promising grandiose achievements that didn’t accurately reflect what the system could reliably do.
Watson wowed the world with its performance on Jeopardy! and its ability to analyze the relationships between words rather than treating them like search terms. In theory, Watson could use its engine to sort through reams of medical data in a similar fashion, finding the hidden signal within a system stuffed with noise. Reality has not cooperated. Of the small amount of research conducted on using AI to improve patient outcomes, none has involved IBM’s Watson.
The IEEE Spectrum piece takes pains to note that IBM faced huge challenges in attempting to bring its AI program online and use it effectively for human medicine. Nothing like Watson (or what Watson was intended to be) had ever existed before. No one knew how to build it. Yoshua Bengio, a leading AI researcher at the University of Montreal, summarized the efforts to help AI understand medical texts and terminology this way: “We’re doing incredibly better with NLP than we were five years ago, yet we’re still incredibly worse than humans.”
A Vexing Problem
Watson’s problem wasn’t that it didn’t work. The problem was that Watson didn’t do the right things. While it quickly learned to ingest and process vast quantities of data, it had a great deal of trouble identifying the bits of information within a study that might lead doctors to actually change their process of care. This was particularly true when the relevant information was incidental to the main point of the research.
Because patient data wasn’t always properly formatted or even chronologically arranged, the software had trouble understanding patient histories. And the system could not be used to compare new cancer patients against databases of previous patients to discover hidden treatment patterns, because such practices would not be considered evidence-based. Making a strong recommendation in evidence-based medicine requires double-blind studies, meta-analyses, and systematic evidence reviews, not an AI system claiming to have found a similarity between different types of patients.
It’s not clear what’s next for Watson, if anything. The tool has had some success in narrow, tailored applications with less ambiguity. But despite dozens of planned initiatives, oceans of hype, and a great deal of investment, IBM’s Watson for Drug Discovery has clearly missed its own goals.