Deeper Dive: A Plethora of Prediction Models

Risk Assessment and Reduction
Sep 06, 2020
By Annette McWilliams, MBBS, FRACP, MD, FRCPSC

In reference to: Nemesure B, Clouston S, Albano D, Kuperberg S, Bilfinger TV. Will that pulmonary nodule become cancerous? A risk prediction model for incident lung cancer. Cancer Prev Res (Phila). 2019;12(7):463-470.

Advances in CT technology during the past 20 years have led to the frequent detection of noncalcified pulmonary nodules on chest imaging. The majority of these nodules are benign. Discussion regarding the best management has stimulated numerous publications and guidelines.1-3 The interpretation of risk in a pulmonary nodule detected by chest CT depends on multiple factors and on the clinical context (i.e., screening or clinical scenario). The estimated probability of malignancy then determines the next step for the clinician (i.e., when to simply repeat imaging and when to actively investigate). When two or more scans are available to detect growth, management is more straightforward. However, deciding on nodule management from a single chest CT scan is more challenging.


Screening Risk Prediction Models

Research using low-dose CT for lung cancer screening has contributed longitudinal data on the behavior of small pulmonary nodules in high-risk current or former smokers. This has led to the development of probabilistic models such as the Pan-Canadian (PanCan) risk model, which can be incorporated into nodule management algorithms.4,5 Of particular note, the patients included in the development and validation cohorts of the PanCan nodule risk model were all ≥50 years of age, had no history of lung cancer or other metastatic cancer, and were all current or former smokers. This risk model was designed to be used at the baseline screening CT scan to simplify downstream management using a probabilistic approach and to minimize the need for repeat chest CT and further investigation. The model has been validated in external screening cohorts and is undergoing prospective validation in a large multicenter screening study, the International Lung Screen Trial (ILST).6 It appears to perform similarly whether nodule size is entered as radiologist-measured maximal axial diameter, computer-assisted detection (CAD)-based mean axial diameter, or baseline nodule volume.7 The further development of CAD software and the use of three-dimensional volumetric analysis for early growth detection on longitudinal surveillance are also likely to contribute to improved nodule management.8,9

Other prediction models have been developed to identify patients at high risk for lung cancer who will most benefit from screening.10-12 These models do not include radiologic characteristics, as they act as a “gateway” for eligibility for low-dose CT screening. They can be used to further define screening eligibility, improve cost-effectiveness, and avoid screening low-risk participants where the risks will outweigh benefits. Examples include the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO) model and the Liverpool Lung Project (LLP) model.10-12 An earlier PLCO model was prospectively validated in the PanCan Early Detection of Lung Cancer Study, and the more recent PLCOm2012 is undergoing prospective validation in the multicenter ILST.6,10

The appropriate application of these two types of screening risk prediction models is important: one for patient selection and one for nodule management. These models have been developed and validated in an asymptomatic screening cohort rather than a clinical setting, and use outside of this setting will likely result in different performance. 


Clinical Risk Prediction Models

In the clinical setting, an incidental nodule is often found during the investigation of another condition. Multiple factors, such as intercurrent infection, other acute illnesses, the presence or history of other malignancies, and risk factors for lung cancer, influence the assessment of a pulmonary nodule. Multiple clinical nodule prediction models have been published, and many were recently summarized and discussed in a review article by Choi and colleagues.13 These models have been developed from a variety of clinical cohorts in which the prevalence of malignancy is much greater than that seen in screening cohorts. Some of these are listed in the accompanying Table, although this is not a comprehensive list, and more models continue to be published. Their application is often limited by small cohort size and limited external validation in other clinical cohorts. Some of the models are based on chest x-ray rather than CT or include PET scan rather than CT alone.

A clinical prediction model is most likely to benefit clinical decision-making for indeterminate nodules, where it may reduce unnecessary downstream investigations or surgery. Management of a lesion with a high or low likelihood of cancer is relatively straightforward. Clinical-impact studies of how these models affect decision-making would be of great value.14 In addition, the impact of a risk prediction model may differ between expert and non-expert clinicians.

Extrapolation and use of prediction models outside of the development population may lead to inaccuracy, and caution is needed. Models derived from populations with a high prevalence of malignancy will likely overestimate the risk when used in low-prevalence populations and vice versa. The use of a screening nodule prediction model in a clinical setting has not been extensively assessed. The PanCan nodule risk model has been retrospectively evaluated in a small clinical cohort of 244 patients (76% current/former smokers) in the United Kingdom published by Al-Ameri and colleagues.15 In this cohort, 41% of patients had a malignant nodule (33% lung cancer, 7% metastatic disease). The PanCan model performed similarly in this clinical cohort to how it had in a screening cohort, with an area under the curve (AUC) varying between 0.852 and 0.902 (depending on the use of model exclusion criteria). The PanCan model has also been assessed in a larger clinical cohort from two centers in the Netherlands.16 Chung and colleagues retrospectively evaluated a clinically heterogeneous population of 912 patients (1846 nodules) with 441 lung cancers. The model performed well (AUC of 0.901-0.911), although not as well as it had in a more homogeneous screening population.16
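
The effect of prevalence on predicted risk described above can be illustrated with a simple prior-shift calculation. The following Python sketch is not part of any of the cited models. It assumes a hypothetical model whose output is well calibrated at its development prevalence and whose likelihood ratios carry over unchanged to a new population, and it rescales the predicted odds by the ratio of the prevalence odds. The function name and the example prevalences (40% and 5%) are illustrative assumptions, not values taken from a specific study.

```python
def shift_prevalence(p_model: float, prev_dev: float, prev_new: float) -> float:
    """Rescale a predicted probability from the development prevalence
    to a new prevalence.

    Assumption (hypothetical): the model's likelihood ratios transfer
    unchanged between populations, so only the prior odds differ.
    """
    odds = p_model / (1 - p_model)          # predicted odds of malignancy
    prior_dev = prev_dev / (1 - prev_dev)   # prior odds in development cohort
    prior_new = prev_new / (1 - prev_new)   # prior odds in target population
    adjusted = odds * prior_new / prior_dev
    return adjusted / (1 + adjusted)


# Hypothetical example: a model built where 40% of nodules were malignant
# is applied where only 5% are malignant.
raw = 0.50
adjusted = shift_prevalence(raw, prev_dev=0.40, prev_new=0.05)
print(f"unadjusted risk {raw:.2f}, prevalence-adjusted risk {adjusted:.3f}")
```

Under these assumptions, an unadjusted prediction of 50% corresponds to a risk of roughly 7% in the low-prevalence population, which is the direction of overestimation the text describes. Real models may also differ in case mix and predictor effects, so this adjustment alone is not a substitute for external validation.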


The Nemesure Approach 

In the recent publication by Nemesure and colleagues, another clinical nodule prediction model is discussed. The authors approached their cohort differently from previously published clinical models by excluding patients diagnosed with lung cancer at initial presentation or within 6 months of presentation, thereby excluding high-probability nodules and focusing on indeterminate nodules where clinical decision-making is more difficult. The authors developed their own clinical prediction model from a retrospectively selected cohort of patients referred to their lung cancer clinic with a known CT-detected pulmonary nodule. This cohort of 2,924 patients with 171 lung cancers was defined as those who had not been diagnosed with lung cancer within 6 months of the initial visit, had no prior history of lung cancer, and were not found to have metastatic disease. The data were retrieved from a clinical database collected between 2002 and 2015. Dividing their cohort equally into a development set and a validation set, they evaluated multiple patient and nodule characteristics, although only a relatively low number of nodules appeared to be included in the analysis (mean 1.2-1.4 per patient, total number not stated). It is unknown whether the subsequent lung cancer was related to a pre-existing pulmonary nodule that prompted the referral. Variables in their model included age, smoking history, presence of chronic obstructive pulmonary disease, personal history of other cancers, and nodules of nonsolid/spiculated appearance. They classified patients as high or low risk according to a calculated risk score (cutoff ≥10.17), resulting in a 5-year risk of lung cancer of 17% versus 1.1%, respectively, with a sensitivity of 81% and specificity of 79%. The model has not yet been validated in an external cohort. Despite some limitations, the question posed is clinically relevant. It would be useful to comparatively assess the performance of other published models in their clinical cohort. In addition, external validation and comparison to other models in an independent clinical dataset would add to the assessment of utility.
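
As a rough illustration of how sensitivity, specificity, and prevalence combine into the risk seen in each group, the following Python sketch applies Bayes' rule. It uses the reported sensitivity (81%) and specificity (79%) and assumes, for illustration only, a prevalence equal to the whole-cohort proportion of 171 lung cancers in 2,924 patients. The published 5-year risks came from the authors' own analysis, so this back-of-envelope calculation is not expected to reproduce them exactly. The function name is hypothetical.

```python
def group_risks(sensitivity: float, specificity: float, prevalence: float):
    """Return (risk in test-positive group, risk in test-negative group)
    using Bayes' rule for a binary high/low-risk classification."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    risk_high = true_pos / (true_pos + false_pos)   # positive predictive value
    risk_low = false_neg / (false_neg + true_neg)   # 1 - negative predictive value
    return risk_high, risk_low


# Assumption: whole-cohort proportion used as the prevalence.
prevalence = 171 / 2924
risk_high, risk_low = group_risks(0.81, 0.79, prevalence)
print(f"high-risk group: {risk_high:.1%}, low-risk group: {risk_low:.1%}")
```

The same function shows how strongly the risk in the high-risk group depends on prevalence: with the same sensitivity and specificity, a higher-prevalence clinical cohort would yield a much higher positive predictive value.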

The plethora of risk prediction models can be confusing. The use of a model in a setting outside the population from which it was derived may lead to different performance. Prospective comparison of the best available models and of their impact on clinical decision-making is needed to incorporate these models into clinical practice. The addition of other tests, such as PET imaging or biomarkers, to prediction models may also prove to have additional discriminatory value, but the extra costs must be considered.17,18 Access to large cohorts with the datasets required is difficult to obtain, and global cooperation is needed to achieve this outcome. Deep learning and artificial intelligence are the latest techniques that hold promise to further improve nodule risk prediction and simplify nodule management, but large datasets are also required for their development.19 The challenge for clinicians remains, but we continue to make progress.


Table 1. Nodule Risk Prediction Models

Columns: Model and year; Cohort + modality; Lung cancer; External retrospective validation; Prospective validation

PanCan4 2013: Screening CT
UKLS20 2019: Screening CT
Gurney21,22 1993: Clinical CXR
Mayo23 1997: Clinical CXR
Veterans Affairs24 2007: Clinical CXR
Herder25 2005: Clinical Mayo+PET
BIMC26 2015: Clinical CT+PET
PKUPH27 2012: Surgical CT
TREAT28 2014: Surgical CT+PET
Nemesure29 2019: Clinical CT (Not stated)


Abbreviations: PanCan, Pan-Canadian Early Detection of Lung Cancer; UKLS, UK Lung Cancer Screening Trial; BIMC, Bayesian Inference Malignancy Calculator; PKUPH, Peking University People's Hospital; TREAT, Thoracic Research Evaluation And Treatment; CT, computed tomography; CXR, chest x-ray; PET, positron emission tomography.



  1. MacMahon H, Naidich D, Goo J, et al. Guidelines for management of incidental pulmonary nodules detected on CT images: from the Fleischner Society 2017. Radiology. 2017;284(1):228-243.
  2. Callister M, Baldwin D, Akram A, et al. British Thoracic Society guidelines for the investigation and management of pulmonary nodules. Thorax. 2015;70(Suppl. 2):ii1-ii54.
  3. Gould M, Donington J, Lynch W, et al. Evaluation of individuals with pulmonary nodules: when is it lung cancer? Chest. 2013;143(5):e93S-e120S.
  4. McWilliams A, Tammemagi M, Mayo JR, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med. 2013;369(10):910-919.
  5. Tammemagi M, Lam S. Screening for lung cancer using low dose computed tomography. BMJ. 2014;348:g2253.
  6. International Lung Screen Trial (ILST). Accessed December 7, 2019.
  7. Tammemagi M, Ritchie A, Atkar-Khattra S, et al. Predicting malignancy risk of screen-detected lung nodules-mean diameter or volume. J Thorac Oncol. 2018;14(2):203-211.
  8. Horeweg N, Rosmalen J, Heuvelmans M, et al. Lung cancer probability in patients with CT-detected pulmonary nodules: a prespecified analysis of data from the NELSON trial of low-dose CT screening. Lancet Oncol. 2014;15(12):1332-1341.
  9. Shariaty F, Mousavi M. Application of CAD systems for the automatic detection of lung nodules. Informatics in Medicine Unlocked. 2019;15:100173.
  10. Tammemagi M, Schmidt H, Martel S, et al. Participant selection for lung cancer screening by risk modeling (the Pan-Canadian Early Detection of Lung Cancer [PanCan] study): a single-arm, prospective study. Lancet Oncol. 2017;18(11):1523-1531.
  11. Marcus M, Chen Y, Raji O, Duffy S, Field J. LLPi: Liverpool Lung Project Risk Prediction for lung cancer incidence. Cancer Prev Res (Phila). 2015;8(6):570-575.
  12. Ten Haaf K, Jeon J, Tammemagi M, et al. Risk prediction models for selection of lung cancer screening candidates: a retrospective validation study. PLoS Med. 2017;14(4):e1002277.
  13. Choi H, Ghobrial M, Mazzone P. Models to estimate the probability of malignancy in patients with pulmonary nodules. Annals ATS. 2018;15(10):1117-1126.
  14. Kappen T, van Klei W, van Wolfswinkel L, Kalkman C, Vergouwe Y, Moons K. Evaluating the impact of prediction models: lessons learned, challenges and recommendations. Diagn Progn Res. 2018;2:11.
  15. Al-Ameri A, Malhotra P, Thygesen H, et al. Risk of malignancy in pulmonary nodules: a validation study of four prediction models. Lung Cancer. 2015;89:27-30.
  16. Chung K, Mets O, Gerke P, et al. Brock malignancy risk calculator for pulmonary nodules; validation outside a lung cancer screening population. Thorax. 2018;73(9):857-863.
  17. Silvestri G, Tanner N, Kearney P, et al. Assessment of plasma proteomics biomarker's ability to distinguish benign from malignant lung nodules: results of the PANOPTIC (Pulmonary Nodule Plasma Proteomic Classifier) Trial. Chest. 2018;154(3):491-500.
  18. Perandini S, Soardi G, Larici A, et al. Multicenter external validation of two malignancy risk prediction models in patients undergoing 18F-FDG-PET for solitary pulmonary nodule evaluation. Eur Radiol. 2017;27(5):2042-2046.
  19. Huang P, Lin C, Li Y, et al. Prediction of lung cancer risk at follow-up screening with low-dose CT: a training and validation study of a deep learning method. Lancet Digital Health. 2019;1(7):e353-e362.
  20. Marcus M, Duffy S, Devaraj A, et al. Probability of cancer in lung nodules using sequential volumetric screening up to 12 months: the UKLS trial. Thorax. 2019;74(8):761-767.
  21. Gurney JW. Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian analysis. Part I. Theory. Radiology. 1993;186(2):405-413.
  22. Gurney JW, Lyddon DM, McKay JA. Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian analysis. Part II. Application. Radiology. 1993;186(2):415-422.
  23. Swensen S, Silverstein M, Ilstrup D, Schleck C, Edell E. The probability of malignancy in solitary pulmonary nodules: application to small radiologically indeterminate nodules. Arch Intern Med. 1997;157(8):849-855.
  24. Gould M, Ananth L, Barnett P. Veterans Affairs SNAP Cooperative Study Group. A clinical model to estimate the pretest probability of lung cancer in patients with pulmonary nodules. Chest. 2007;131(2):383-388.
  25. Herder G, van Tinteren H, Golding R, et al. Clinical prediction model to characterize pulmonary nodules. Chest. 2005;128(4):2490-2496.
  26. Soardi G, Perandini S, Motton M, Montemezzi S. Assessing probability of malignancy in solid solitary pulmonary nodules with a new Bayesian calculator: improving diagnostic accuracy by means of expanded and updated features. Eur Radiol. 2015;25(1):155-162.
  27. Li Y, Wang J. A mathematical model for predicting malignancy of solitary pulmonary nodules. World J Surg. 2012;36(4):830-835.
  28. Deppen S, Blume J, Aldrich M, et al. Predicting lung cancer prior to surgical resection in patients with lung nodules. J Thorac Oncol. 2014;9(10):1477-1484.
  29. Nemesure B, Clouston S, Albano D, Kuperberg S, Bilfinger T. Will that pulmonary nodule become cancerous? A risk prediction model for incident lung cancer. Cancer Prev Res (Phila). 2019;12(7):463-470.

About the Author:
Dr. McWilliams is with the Fiona Stanley Hospital and University of Western Australia, Perth, Australia.

