Welcome to the AI Irruption
Biopharma, like the rest of the world, appears to be on the threshold of profound, technology-induced change. Incredible advances in artificial intelligence, manifested most recently in GPT-4, are here.
This technology, Ezra Klein explains in the New York Times, “changes everything.” Bill Gates describes it as “the most important advance in technology since the graphical user interface,” and declares, “the age of AI has begun.” Similarly, Times columnist Thomas Friedman argues:
“This is a Promethean moment we’ve entered — one of those moments in history when certain new tools, ways of thinking or energy sources are introduced that are such a departure and advance on what existed before that you can’t just change one thing, you have to change everything. That is, how you create, how you compete, how you collaborate, how you work, how you learn, how you govern and, yes, how you cheat, commit crimes and fight wars.”
At an entrepreneurship salon at Harvard this week, I discussed GPT-4 with Dr. Zak Kohane, Chair of the Department of Biomedical Informatics at Harvard Medical School (disclosure: I’m a lecturer in the department), and Editor-in-Chief of the soon-to-be-launched NEJM-AI.
Kohane received early access to GPT-4. He has just completed a book, The AI Revolution in Medicine: GPT-4 and Beyond, to be published in mid-April, about the impact of emerging AI technology on healthcare. I read an advance draft of the book. Kohane’s co-authors are Peter Lee, Corporate Vice President and Head of Microsoft Research, and Carey Goldberg, a distinguished journalist.
From both the book and the salon, the three most striking features of GPT-4 seem to be:
- Its ability to reason;
- Its ability to communicate and engage with people in natural language;
- The fact that no one really understands how it works.
How did GPT-4 impress Kohane? For starters, it performs spectacularly on standardized exams like the medical boards, and seems to be able to reason thoughtfully, Kohane says.
For example, we discussed GPT-4’s response to an apparent paradox that Kohane says stumps the vast majority of non-physician data scientists. The question: why is a low white blood cell count between midnight and 8 a.m. associated with far worse outcomes than a low count between 8 a.m. and 4 p.m.?
GPT-4’s top suggestion, Kohane says, was the correct answer: the issue isn’t so much the low count itself but rather the fact that blood was drawn in the middle of the night at all. That signals the patient is experiencing some sort of medical crisis.
GPT-4 can also provide sophisticated differential diagnoses, Kohane says, and suggest relevant next steps.
He posed GPT-4 a question from his own specialty, pediatric endocrinology:
“I gave it a very complicated case of ambiguous genitalia that I was actually called for once back in my training. And it’s able to go through everything from the clinical presentation to the molecular biology. It had a disagreement with me and was able to cogently disagree with me, and it was also able to articulate concerns for the parents of this child and for the future engagement of the child in that discussion. So, on the surface it’s acting like one of the most sensitive, socially aware doctors I’ve ever met. But we have no guarantee that it is such.”
The ability of GPT-4 to engage in such a human-like fashion is one of the most striking, and disarming, characteristics of the technology. Many who engage with GPT-4 over time describe the sense of developing a close relationship with it, in a fashion that can feel dislocating. Kohane, Lee, Friedman and others all describe losing sleep after spending time with GPT-4. They are overwhelmed, it seems, by the power and possibility of what they’ve experienced.
“GPT-4’s abilities to do math, engage in conversation, write computer programs, tell jokes, and more were not programmed by humans,” wrote Lee, of Microsoft Research. These capacities emerged unexpectedly, he wrote, “as its neural network grew.”
This creates what Lee calls a “very big problem.” He writes: “Because we don’t understand where GPT-4’s capabilities in math, programming, and reasoning come from, we don’t have a good way of understanding when, why and how it makes mistakes or fails….”
The implications are dizzying. As Kohane writes, “I realized we had met an alien agent and it seemed to know a lot about us, but at the moment, I could not decide if it should be given the keys to our planet or sealed in a bunker until we figured it out.”
Implications for Healthcare
Some AI experts, like University of Toronto professor and author Avi Goldfarb, believe AI technologies will serve as a democratizing force in healthcare. In a podcast interview with Patrick O’Shaughnessy, Goldfarb suggests AI will be able to “automate diagnosis,” thereby “upskilling” the “millions of medical professionals” like nurses and pharmacists. The consequence, he suggests, is that:
“There’s hundreds of thousands of doctors in the U.S. and their special skill in diagnosis is going to go away. They’ll have to retool and figure out how to deal with that. But there’s millions of other medical professionals who are now going to be able to do their jobs much better, be more productive. And that upskilling provides a lot of what we see as the hope and opportunity for AI.”
I asked Kohane about this, and (to my surprise) he seemed to largely agree, albeit with a slightly different framing. He notes that we face a crisis resulting from a shortage of primary care doctors; Massachusetts, as the Boston Globe has recently reported, is being hit particularly hard. While we may have an idealized view of how the best primary care doctors treat patients, Kohane argues, this is generally not the lived reality. He suggests that a nurse practitioner or physician assistant, coupled with AI, could generally offer a higher level of care than a typical primary care doctor without AI. Given the shortage of trained medical professionals, AI can help improve both the quality and quantity of available care.
It’s also clear that patients will have access – and, through Bing (which works with GPT-4 when accessed from Microsoft’s Edge browser), already have access – to this knowledge and information.
Many patients and caregivers are eager for this capability, as Goldberg writes in the book. The problem is that GPT-4 (like surgeons and Harvard grads) tends to be frequently correct but rarely in doubt. It still suffers from the problem of “hallucinations,” making up information that sounds plausible but isn’t accurate. Like humans exhibiting the Dunning-Kruger effect, it can insist that it’s correct when it’s wrong. I saw this when GPT-4 tried to persuade me that Goose, not Merlin, uttered the line “That MiG really screwed him up” in the original Top Gun.
For now, everyone seems to acknowledge the hallucination problem and to call for “human in the loop” approaches. But what happens as we gain more confidence in GPT-4 and, motivated by both convenience and cost, are increasingly tempted to take the human out of the loop?
Stepping Back
What seems clear is that we are truly experiencing what economist Carlota Perez has described (see here) as the “irruption phase” of emerging technology. We recognize that there’s something promising and incredibly exciting, and now everyone is trying to figure out what to make of it, and how to apply it.
These days, it feels like every healthtech person I know who hasn’t started their own VC fund (as noted here) is either starting their own health+AI company or writing a book about health+AI, and in some cases both. Microsoft Office programs are about to be boosted by GPT-4, and other companies, such as the regulatory intelligence company Vivpro, are already offering GPT-powered tools. Every consultancy is offering executives frameworks and navigation guides to the new technology.
The truth, of course, is that no one has any idea how things are going to evolve. The pharma company of the future, the healthcare system of the future, the payor of the future, perhaps even the FDA of the future: all are likely to be profoundly changed by technologies like GPT-4.
The technology will likely first arrive as incremental point solutions, says Goldfarb, of the University of Toronto. Eventually, however, the real productivity gains will arise from more fundamental change. The classic example, as I’ve discussed and as Goldfarb also cites: dropping electric generators into factories built around steam power didn’t have much impact, but reconceptualizing the structure of factories from the ground up, in a fashion enabled by electricity, was transformative.
As Microsoft AI expert Sébastien Bubeck observes in Kohane’s book, “GPT-4 has randomized the future. There is now a thick fog even just one year into the future.”
What an amazing, terrifying, thrilling, and hopeful time to be alive. For those of us in medicine and biomedical science: what an opportunity, and profound responsibility, to be in the arena, actively shaping the future we hope to create and aspire to inhabit.