The promise and challenge of using AI for lung cancer detection
Announcer: This podcast is intended solely for educational purposes and presents information of a general nature. It is not intended to guide or determine any specific individual situation and persons should consult qualified professionals before taking specific action. The views expressed in this podcast are those of the speakers and not those of Milliman.
Rebecca Driskill: Hello and welcome to "Critical Point," a podcast brought to you by Milliman. I'm Rebecca Driskill, and I'll be your host. On today's episode we're talking about artificial intelligence and its potential to transform healthcare, including processes and patient outcomes. One area where we're beginning to see AI put to use is in lung cancer screening using CT scans. Lung cancer is the number-one cancer killer in the US, so methods to improve the screening process hold a lot of promise, but AI technology in this area is also not without its challenges. Joining me today is one of the foremost experts on the topic. Jim Mulshine is a thoracic medical oncologist who spent 25 years at the National Cancer Institute in Bethesda, Maryland. He's now at Rush University Medical Center. Jim, good morning.
James Mulshine: Good morning.
Rebecca Driskill: Jim has worked closely with Bruce Pyenson here at Milliman studying this topic. Bruce is a principal and consulting actuary with the firm. Morning, Bruce.
Bruce Pyenson: Good morning.
Rebecca Driskill: So I just want to jump in and have you guys start by providing a little bit of background information on lung cancer screening. Before we start talking about AI and its implications, how does a patient actually get screened for lung cancer, and how is lung cancer detected?
James Mulshine: Sure. Lung cancer has been the leading cause of cancer death on the planet for a long time, and it has been so because it is diagnosed late in the natural history of the disease. At that point there are symptoms, and the ability of the medical field to intervene is very limited, so screening is an effort to advance the diagnosis, to find lung cancer earlier. To do that, it has to be done at a point when an individual has no symptoms, so we look for individuals who are at very high risk for lung cancer, and currently that's done by looking at smoking history. If you have an individual who has been a smoker and is over age 50 or 55, those people who have smoked more than 20 to 30 pack-years are considered by various groups to be high-risk, and in addition to that they have to be without symptoms related to lung cancer and in otherwise good health. Those individuals undergo a low-dose CT scan, which is an X-ray study that's done in lots of places across the United States and the world. It takes about three seconds, and it's basically getting a three-dimensional X-ray of the chest. Then, based upon analysis of that, they either have a high degree of suspicion for lung cancer or, more often than not, no evidence of lung cancer. For the ones that have some suspicious findings, there's a systematized workup that they undergo. This approach has been validated by major studies in the New England Journal of Medicine, first from the United States in the National Lung Screening Trial and recently by the NELSON group, which is a research group in the Netherlands. It's associated with a 20% to 25% reduction in mortality based upon just doing two or three rounds of screening, with follow-up for both of those sites now up to 10 years, and so it's felt by the U.S. Preventive Services Task Force to be an effective early detection tool, and it is reimbursed by federal insurers.
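The high-risk criteria described above (age, cumulative smoking exposure, no symptoms) can be sketched in a few lines of Python. Smoking exposure is conventionally expressed in pack-years, meaning packs smoked per day multiplied by years of smoking. The `Person` class, the function names, and the default thresholds below are illustrative assumptions taken from the loosest figures mentioned in the conversation; they are not a clinical guideline, and real eligibility rules differ by organization.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical record used only for this illustration."""
    age: int
    packs_per_day: float
    years_smoked: float
    has_lung_symptoms: bool


def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Cumulative smoking exposure: packs per day times years smoked."""
    return packs_per_day * years_smoked


def is_screening_candidate(
    person: Person,
    min_age: int = 50,             # assumed threshold; speakers cite "50, 55"
    min_pack_years: float = 20.0,  # assumed threshold; speakers cite "20 to 30"
) -> bool:
    """Return True if the person meets the illustrative high-risk criteria."""
    exposure = pack_years(person.packs_per_day, person.years_smoked)
    return (
        person.age >= min_age
        and exposure >= min_pack_years
        and not person.has_lung_symptoms
    )


if __name__ == "__main__":
    example = Person(age=58, packs_per_day=1.0, years_smoked=30, has_lung_symptoms=False)
    print(pack_years(example.packs_per_day, example.years_smoked))
    print(is_screening_candidate(example))
```

Keeping the thresholds as parameters reflects the point made in the conversation that different groups draw the high-risk line in different places.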
Bruce Pyenson: But the 20% is really a gross underestimate of the impact if people are screened annually for years. And from an actuarial standpoint I was convinced of the value of lung cancer screening or its potential by seeing the huge difference in mortality between lung cancer detected at an early stage and lung cancer detected at a late stage. It's a much bigger difference than you see for other cancers, plus the ability to actually detect early-stage lung cancer, to find nodules in the lung and to follow them to see if they're growing at a cancerous rate when they're still very small, so that convinced me that from an actuarial standpoint there's something here, and of course the clinical trials validated that. For the actuaries who might be listening, I'd point out that back in the 1920s one of the first voices connecting tobacco to lung cancer was an actuary working at Prudential.
Rebecca Driskill: When somebody is screening for lung cancer are you looking at measuring the growth in nodules? And then how could AI actually be useful for that, or could it be useful for that?
James Mulshine: Yes. It's a very important question. There has been early detection with mammography for breast cancer for decades now, and it has been shown in larger studies, long-term studies to be beneficial in a significant way. Women that undergo annual breast mammography are very, very unlikely to die of breast cancer, and so that is a traditional radiological thing. It's actually a planar study, meaning that it looks at changes in the breast in two dimensions, flat, planar-type structures, and it measures that slice of image that's available. A CT scan is very different because it's three-dimensional. It gives you a full representation of the entire 360 degrees of the lung. Therefore there's lots of other structures there, a lot going on, lots of blood vessels, airways, bony structures, so it's much more complicated. So complicated data sets are the mainstream target of AI development, because in the current clinical situation a radiologist can look at the complex findings within the thorax and probably only action about one to two percent of the information that's available on that image, whereas computers, which have incredibly more capability for doing certain types of tasks, can potentially look at a much fuller recovery of information available from that evaluation of an individual's full thoracic contents.
Bruce Pyenson: Part of the attraction of AI for imaging and CT imaging is that it's inherently digital, and so you have this huge wealth of digital information. It's not a matter of going into medical records and figuring out what was captured there. It's inherently digital and structural, so you can identify what's actually going on physiologically or from the phenomenon that's observed.
I think the other aspect of this is that what's very important with lung cancer screening is connecting screens from one year to the next year or over an interval of three months or six months or perhaps annually over decades, and that's the sort of thing that AI excels at. So there's a number of reasons why I'm very optimistic about this application of AI and perhaps less optimistic about some other applications where the data that's being used is probably dirtier and less meaningful inherently.
I think there's some challenges in getting there. The quality of information coming in is always an issue in any kind of data analytics, and AI is no exception, and I don't think AI has changed the fundamental law of data analytics, which is "garbage in, garbage out." But I think there's another area that's very promising, which is identifying risk of individuals within a population, what the real characteristics of risk are so that we know socioeconomic drivers are very important. Perhaps other conditions and comorbidities are very important for an individual's risk of lung cancer or their risk of emphysema or COPD or their risk of heart disease, so having the access to that kind of information could help us identify really people who should get scanned who perhaps aren't falling into the very high-risk category that are being targeted today.
James Mulshine: Yeah. So, Bruce, I absolutely agree with you, and I think going back to what you just said there's two perspectives that I have. The first is that the use case of lung cancer screening as an exemplar for the value of AI is kind of exceptional. Related to what you were saying, the lung is an air-filled structure, and a small emerging tumor is a water-filled structure. All over the rest of the body for all of the rest of the cancers you're in an organ that is water-filled with a tumor that's water-filled so that the difference in the two structures is muted, whereas when you're looking for early lung cancer the lung is like a balloon, and so once you get into the lung with the high energy that you're using for the medical radiation you're using the resolution that's possible in the lung because it's an air-filled structure compared to the water-filled tumor-- the ratio is unbelievably strong so that you have very, very good resolution, which is something that you won't have in other settings, but it's true of all the structures within the lung. So the lung is just a particularly favorable place for imaging for this digital type of information in a way that reflects the anatomy and physiology of the individual.
Bruce Pyenson: I think the lung is also a huge contributor to bad outcomes in the population. Just to emphasize that, more women die of lung cancer than breast, ovarian and cervical cancers combined. It's by far the biggest cancer killer at probably close to 190,000 deaths per year according to the Global Burden of Disease Study, so it's an appropriate public health target, especially because the high-risk population can reasonably be determined based on smoking history and some other histories.
James Mulshine: Yes, and unfortunately just recently there was a publication that showed in well-developed nations where women have been smoking much more heavily than in the rest of the world there's a crossover now, so there's more young women developing lung cancer than men in the United States and about 12 other major advanced nations, so this is a tragedy. These are people who started smoking at a time of a big marketing approach to women and smoking, and it's just unfortunate. But the good news is that screening in women seems to be more favorable than screening in men, and so it's very important for us to get the message out that this is an important tool for sustaining one's health.
The one other thing I would say about the digital information in terms of characterizing valuable biomarkers that really tell you the risk and tell you what the state of the biology is – what you had mentioned is that you can look over time at changes in the progression of lung cancer or other diseases, and it turns out that that's much more reliable than just taking a one-time snapshot of what's going on, because with a snapshot you don't know where it's coming from. When you look across a time domain and you can measure that as a function of time, that's much more reliable. And because the lung has exquisite resolution and we have these really remarkable imaging tools that, in the face of advancing computer technology, have allowed us to do much more precise characterization, we have a very important, robust tool to guide who really has problems because the disease is advancing within the time of observation of that individual.
Bruce Pyenson: So, Jim, let me challenge you--
James Mulshine: Put that in English?
Bruce Pyenson: On the optimism there. All these tools don't matter if we don't get people screened.
James Mulshine: Well, I think that's right, but all these observations don't matter either if we don't get rid of the garbage you were talking about. So standardization of how you do the imaging and a really, really close attention to the technique so that you're getting the same kind of evaluation from time to time allows for a much more precise biomarker. That is kind of an important thing, because if people start complying with screening, and we're getting the wrong answers that's going to erode confidence in this new tool. We've got to do the tool superbly to make sure that we guide clinical management in a productive way in order to get people confident that this is a worthwhile application.
Bruce Pyenson: So can you imagine a day when Mount Sinai says, "Get your lung scan here because we use the best AI"?
Rebecca Driskill: Yeah, I mean, actually I want to jump in and take a step back first. What are some of the real-world challenges here? You know, you've got one doctor in New York City versus a radiologist in Chicago. Are there challenges to AI in the real world that have to be taken into consideration?
Bruce Pyenson: So the variation in healthcare between cities is notable, but I'd say the variation in performance from one office to down the hall in the next office is also there. We have a few major brand manufacturers of CT scans, software that is perhaps customized to particular machines and calibration that isn't where it could be, so I think there's a number of mechanisms for standardization here. Jim, your thoughts on the standardization, the quality control?
James Mulshine: Yeah, Bruce, I think you're exactly right. The issue about generalizing the benefit of this new approach depends upon a certain amount of standardization in terms of how you acquire the image, how you read the image, what software you use. And it's not that you need to use the same brand of tool as long as whatever tools are being used kind of have some kind of common standards for what's big, what's little, what's good, what's bad. And the kind of conspiracy of AI developers such that there is some kind of standardized approach does not yet exist, and so if everybody races out and gets their material, and in this case it happens to be cases with clinical outcome, and they develop their AI tools it may be that the AI tools are useful if you happen to be studying somebody who has the same characteristics of what was developed in the Southwest or developed in India or developed in France but may not work in Japan or Turkey or wherever. So there's got to be some understanding of how we do this in a way that we understand that these are going to be used as general tools, and so overcoming the balkanization is something that the radiology professional society called the RSNA, Radiological Society of North America, has been working on with a subset of imaging scientists and clinicians involved in screening and stuff like that to address the standardization issue. The FDA is very much involved with this. The National Institute of Standards is very much involved in this. All the major imaging manufacturers are involved in this because there's an understanding that if there isn't some common rubric holding this together it's just going to be chaos. Even if it's very sophisticated it's going to be chaos along the lines of what Bruce was saying.
So that group of people are in an organization called the Quantitative Imaging Biomarkers Alliance, and they've been working for the last decade on developing these standards to make it transparent, to make it robust and to try to get buy-in so that these kinds of balkanization issues do not erode the ability of these new, very, very exciting tools to help the public.
Bruce Pyenson: Now, listening to you describe the future there makes it sound very appealing. I think the current state of screening is that it is very effective today with the current technology and the current processes, and the mortality reductions from observational studies I've seen are on the order of 80%. From my standpoint it's not as though we must have the new tools in order to have effective lung cancer screening. We have that today with the existing technology. What I'm looking for with the new technology is a couple of things. One is that we make the process much more efficient and frankly lower-cost, and the other thing is that we use lung cancer screening as a paradigm for improving the efficiency of so many other things in healthcare. My main goal in health policy and health actuarial work is that we are spending way too much and getting pretty mediocre outcomes in the U.S. healthcare system, so I'm excited by the application of new technology to help fix that. And I think that's part of the potential you're describing, but I'd like to hear from you, Jim, on whether you think that's possible, whether we can make this all a lot more efficient and lower-cost.
James Mulshine: Yeah, so, Bruce, you're absolutely right that right now the results in many of these reports are remarkably favorable, and, as you point out, this is from looking at relatively few annual rounds of screening. And as you do this in a disease that has a very long risk profile for the individuals you’ve got to do it for 20 years or 30 years because these people – even if they're in their fifties they're going to be around for quite some time, and many of them are very healthy and stuff like that so that it's something that we can do now, but the efficiency is an issue because we have to scale it.
When these reports were done maybe one percent of the at-risk population was getting lung cancer screening, and they're doing it at centers of excellence in which people have committed their entire careers to being proficient at this task. Now we're gonna ask everybody to do it across the nation and not necessarily in academic medical centers but in community ambulatory screening centers, and so the standardization and quality control in that setting is a challenge, not because those people aren't good but because this is a fairly specialized thing. We're talking about looking at six-millimeter suspicious nodules in the lung and saying if they've grown half a millimeter over the previous six months. That's a very, very difficult determination to make, and so to the extent that we can develop software tools that facilitate the radiologists in doing these things it would be good. Partly because of accuracy, partly because of asking every radiologist to do lung cancer screening on top of everything else we're already asking them to do – and we have a limited pool of those types of professionals – it's an incredibly work-intensive thing, so to allow it to scale, to be economical, and to be equitably available in our society we have to develop these tools for efficiency. And I think a lot of people recognize that, and I think that those things can be done not just so they work in the United States but so they can work across the world. And so the big focus now has to be on scaling and figuring out approaches that are efficient and cost-attainable.
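To see why a half-millimeter change in a six-millimeter nodule matters, it helps to convert diameter into volume. The sketch below assumes the nodule is a perfect sphere, which is a simplification, and uses the standard volume doubling time formula, interval × ln 2 / ln(V2 / V1). The function names and the 180-day interval are illustrative assumptions, not a description of any particular screening software.

```python
import math


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume of a sphere from its diameter (simplifying assumption for a nodule)."""
    return math.pi * diameter_mm ** 3 / 6.0


def relative_volume_growth(d1_mm: float, d2_mm: float) -> float:
    """Fractional volume increase between two diameter measurements."""
    return sphere_volume_mm3(d2_mm) / sphere_volume_mm3(d1_mm) - 1.0


def volume_doubling_time_days(d1_mm: float, d2_mm: float, interval_days: float) -> float:
    """Days needed for the volume to double, assuming exponential growth."""
    v1 = sphere_volume_mm3(d1_mm)
    v2 = sphere_volume_mm3(d2_mm)
    if v2 <= v1:
        # No measurable growth: the doubling time is effectively unbounded.
        return math.inf
    return interval_days * math.log(2) / math.log(v2 / v1)


if __name__ == "__main__":
    # The example from the conversation: 6.0 mm growing to 6.5 mm in about six months.
    growth = relative_volume_growth(6.0, 6.5)
    vdt = volume_doubling_time_days(6.0, 6.5, interval_days=180)
    print(f"volume growth: {growth:.1%}")
    print(f"volume doubling time: {vdt:.0f} days")
```

Under these assumptions, a change of about 8% in diameter corresponds to roughly a 27% increase in volume and a doubling time of about 520 days. The measurement itself sits near the limit of what can be resolved by eye, which is the case being made for software assistance and tight acquisition standards.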
Rebecca Driskill: Can I ask "How would AI be applied to increase efficiency?"
James Mulshine: If a radiologist is asked to read a new CT scan of the chest, it could take 25, 30 minutes. If you have a computer program built to look at that alongside of the radiologist it could take five minutes or less, and this all has to be done with regulatory approval from the FDA and whatnot, but the computer can look at this digitized information radically more quickly than the human eye, and it can just deal with it in a much more disciplined fashion. And so this is something that-- in a sense it's essential that we develop these kinds of tools, because even though the eyeball and radiologists today can do it, to scale it to all the potential at-risk populations is just going to require dramatic expansion of that workforce or else development of these types of tools to facilitate the analysis.
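The scaling argument above is simple arithmetic, sketched here in Python. The reading times come from the conversation (about 25 to 30 minutes unassisted, five minutes or less with software alongside the radiologist); the one million scans per year is a purely hypothetical figure chosen to make the comparison concrete, and the function names are illustrative.

```python
def reads_per_hour(minutes_per_read: float) -> float:
    """How many scans one radiologist can read in an hour."""
    return 60.0 / minutes_per_read


def radiologist_hours_needed(num_scans: int, minutes_per_read: float) -> float:
    """Total radiologist hours required to read a given number of scans."""
    return num_scans * minutes_per_read / 60.0


if __name__ == "__main__":
    hypothetical_scans_per_year = 1_000_000  # assumption, not a figure from the podcast

    unassisted = radiologist_hours_needed(hypothetical_scans_per_year, minutes_per_read=30)
    assisted = radiologist_hours_needed(hypothetical_scans_per_year, minutes_per_read=5)

    print(f"unassisted: {unassisted:,.0f} hours")
    print(f"assisted:   {assisted:,.0f} hours")
    print(f"reduction factor: {unassisted / assisted:.0f}x")
```

With these inputs the reading workload falls by a factor of six. The actual savings would depend on how the software is validated and used in practice.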
Bruce Pyenson: Can you speculate on what you think that would mean for the cost of CT imaging? If we can compress the time needed to read an image, how do you see that affecting the underlying cost?
James Mulshine: Well, we've already seen it in breast cancer, because there have been computer-assisted detection tools out there approved now for 15, 20 years. And these tools, if they're used in a limited setting, are expensive. If they're used broadly across the population, the efficiency of these things can be radically improved. And I think that that's a reasonable model.
But you do bring up a very important point. Our traditional model for doing this is that we have a handful of vendors that can do this. They get access to images and follow-up, and they develop their tools. But it's a very, very difficult process. For us to really fully realize the benefit of AI in this setting, we're going to need hundreds of thousands of images along with what happened to those people, so that we know what outcomes are favorable versus what outcomes are not. In the past, people have donated clinical specimens that have been the subject of molecular biology research and whatnot to develop genomic tools. We need to develop these imaging tools, and we need to do it rapidly. And in order to do this, we need to have a dialogue with the public about how important it is to do this kind of donation.
Rebecca Driskill: So you're actually talking about people, patients donating their CT scans.
James Mulshine: Yes. Right now, in the collections that have been done by the NIH and others, high-quality images like this are available in maybe a few hundred cases or something like that. AI tools, you know, to be developed robustly need orders of magnitude more information than that: more cases, in order to develop tools that can operate at the level that we were just discussing.
Rebecca Driskill: So I'm curious, you know, we can't predict the future, unfortunately. But where would you like to see us five to ten years from now when it comes to AI and screening, lung cancer screening?
Bruce Pyenson: I'd like to see us in a place where lung cancer screening has been broadened in a couple of ways. One is that we think of the eligible population as broader than the population that was used for the clinical trials, because there are many people at high risk that we know about today who weren't included in clinical trials for a number of reasons. So I'd like to see the eligibility broadened to bring in a much bigger proportion of the population. I'd also like to see the use of the data broadened to include information available about the lungs and the heart that is not routinely being used today. Chronic obstructive pulmonary disease (COPD) has a much bigger prevalence than lung cancer. It is a potentially terminal condition, and it's a chronic condition. And identifying people earlier may help people change their lifestyle to be healthier, as may identification of coronary conditions, such as cardiac calcium, which is a very accurate predictor of future cardiovascular risk. So I'd like to see the practice broadened in those two ways, but of course, what's most important is, even with the existing technology, to get many more people screened. Screening rates are far below what they should be for lung cancer.
James Mulshine: Yeah, I agree, and I think that CT scanning in these people who have been heavy smokers allows us to find potential chronic diseases at a time when people are still healthy. And CT scanning may be a pivotal tool in helping us to understand that we want to do healthcare as opposed to disease care. CT scans, in addition to finding early lung cancer, could find calcium related to the coronary vessels that may be indicative of a risk for heart disease, and areas of injury of the lung that could indicate the beginnings of COPD. And it is published already in the literature that about one in four people who are scanned for lung cancer screening have early evidence of COPD or emphysema, and that about one in four people who are scanned, maybe slightly less depending on the population, have evidence of calcification in the coronary circulation, which, depending on how dense it is, could be a variable risk factor. So this is remarkably important, because the scan is already paid for. So the cost of determining these additional factors is going to be a software function, which if done in larger numbers of people would be exceedingly economical.
By the same token, osteoporosis can be looked at, as can a variety of other diseases of the aorta and other conditions within the chest, because the chest is incredibly valuable real estate in which all kinds of things go on which relate to the beginnings and evolution of major chronic diseases that you can find. And for many of these things, there are interventions that could be lifesaving. And so with this computational AI development, we can potentially start pulling these out in a much more disciplined fashion, so we can use this truly as a public health tool. It's a very exciting possibility that we're using lung cancer screening as a use case, developing this vast array of clinical images that could then be interrogated for these kinds of things to develop very robust informatics tools, AI tools, to allow us to economically manage health in a much more productive way.
Rebecca Driskill: One thing I'm curious about from a patient perspective is let's say that this becomes more regularly implemented. Patients don't necessarily live next to a lung cancer center. If a patient wants to go to their nearest ambulatory care center, how ideally does one sort of manage the standardization issue, or what kinds of things are being innovated right now that will help manage this in the future?
Bruce Pyenson: And maybe a way to ask the question is in terms of the diversity of healthcare around the world: there are clinics with nothing much in them in some places, rural health clinics, and of course the academic medical centers. And as we've seen in China, CT scans are pretty well available in much of the world. But the expertise may not be as high every place, and for sure there's diversity in software, differences in hardware, and differences in the quality of the hardware, the software and the expertise. There's people that are working on fixing that for imaging in ways that aren't being applied to other areas of medicine because it's all digital.
James Mulshine: Yes, and Bruce and I have had the privilege of doing a previous communication for Milliman in which we talked about CT scans as this Swiss Army knife. They're used across the world for all kinds of studies for all kinds of purposes, and so that's a good thing and it's a bad thing, because when you want to do a very specific type of scan for lung cancer screening, you want to make sure, absolutely sure, that that is what is being done, because this machine is used in so many different settings under different conditions to optimize certain types of performance. So a colleague, Rick Avila at Accumetra, has worked with RSNA to develop a cloud-based approach in which he has developed a very inexpensive phantom. It costs about $200 to make.
Bruce Pyenson: What is a phantom?
James Mulshine: A phantom is a tool that's used to measure the performance of an X-ray machine. And in this instance it's made with precision-engineered materials so that they know how well the image is being acquired.
Bruce Pyenson: So it's a fake body, it's a standardized fake body.
James Mulshine: Yeah, yeah, it's a surrogate of a body. Okay? And this is used to test, to make sure that the CT is set in such a way that it will obtain a very optimized scan reflecting the kinds of performance of the CT scanner that will allow a good reading to be done of exactly what's going on.
I would point out that one of the things that we have with this challenge with lung cancer screening is that we really aren't talking about patients. We're talking about people at risk. So these are healthy people who have a risk for lung cancer. And if you say they're patients, you know, it's very, very off-putting. And so for us to get better uptake with lung cancer screening, we have to make sure that we can allow people who think of themselves as healthy to avail themselves of screening at sites very close to them, very convenient to them. So more often than not, this is going to be an ambulatory imaging center. And these centers are very, very busy places, and they don't have some of the support from the medical physics community that the academic medical centers have. But we’ve got to ensure that they can acquire a scan of just as high quality as they do at these major medical centers. And so using these phantoms, these kinds of tools to evaluate how the scan is being acquired, can allow us to do that.
And so in doing this process, which is now available through the Radiological Society of North America, they can take a picture of this phantom, and it is uploaded to the site through the cloud-- through the web at RSNA. It is evaluated using machine vision and other things very rapidly to look at close to a hundred parameters of how high a quality that image is, and then it can send back to that site how well their machine is performing and, if it's not performing well enough, what's likely to be the problem. They can correct that very quickly just by changing the dial, and then 95% of the time they can get their scanner to a point where it's acquiring a very high-quality image. And that could be done in Samoa, or it could be done in Kansas, or it can be done at Mount Sinai, and very rapidly and at very low cost. That's the type of utility of these new informatics resources that can allow us to be much better in terms of rapid dissemination of very sophisticated image acquisition-types of approaches.
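The feedback loop described above (measure a phantom, compare many image-quality parameters against targets, report back what is likely wrong) can be sketched as follows. This is not the RSNA or Accumetra service or its API. The `Tolerance` class, the `evaluate_phantom` function, the parameter names, and every numeric range are hypothetical and chosen only to show the shape of the check.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Tolerance:
    """Acceptable range for one measured parameter, plus a corrective hint."""
    low: float
    high: float
    hint: str


def evaluate_phantom(
    measured: Dict[str, float],
    tolerances: Dict[str, Tolerance],
) -> dict:
    """Compare phantom measurements with tolerances and report failures."""
    failures = {}
    for name, tol in tolerances.items():
        value = measured.get(name)
        if value is None:
            failures[name] = "parameter was not measured"
        elif not (tol.low <= value <= tol.high):
            failures[name] = tol.hint
    return {"passed": not failures, "failures": failures}


if __name__ == "__main__":
    # Hypothetical parameters and ranges, for illustration only.
    tolerances = {
        "slice_thickness_mm": Tolerance(0.5, 1.25, "select a thinner reconstruction"),
        "noise_hu_std": Tolerance(0.0, 50.0, "increase dose or change the kernel"),
    }
    measured = {"slice_thickness_mm": 2.5, "noise_hu_std": 32.0}
    report = evaluate_phantom(measured, tolerances)
    print(report)
```

A real service of the kind described would derive the measurements from the uploaded phantom image itself. The sketch starts from already extracted numbers to keep the comparison step visible.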
Bruce Pyenson: From my standpoint, it's hard to picture other areas of medicine that could adapt as quickly, and would be as amenable to the application of A.I. to change practice. So it's very exciting to see that emerge in imaging and radiology and if that works, and if it takes off, I'd expect we would use this Swiss army knife of imaging for more and more applications just because it's a way to create standards and to disseminate new advances more quickly than what we see in other areas of medicine.
Rebecca Driskill: Well, I will be really interested to see how this develops, and I would love for you guys to come back so that hopefully we can keep talking about it. Jim, Bruce, thank you! You've been listening to Critical Point, presented by Milliman. To listen to other episodes of our podcast, visit us at www.milliman.com, or you can find us on iTunes, Google Play, Spotify or Stitcher. See you next time.