International Journal of Radiation Oncology • Biology • Physics
www.redjournal.org

EDITORIAL

Embracing Phenomenological Approaches to Normal Tissue Complication Probability Modeling: A Question of Method

Arjen van der Schaaf, PhD,* Johannes Albertus Langendijk, MD, PhD,* Claudio Fiorino, PhD,† and Tiziana Rancati, PhD‡

*Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, The Netherlands; †Medical Physics, San Raffaele Scientific Institute, and ‡Prostate Cancer Program, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy

Received Sep 30, 2014, and in revised form Oct 7, 2014. Accepted for publication Oct 9, 2014.

In the field of normal tissue complication probability (NTCP) modeling, it is common practice to build models using a combination of biological knowledge and clinical data. Currently, in attempts to increase the accuracy of these models, it is often presumed that complications are caused by a multitude of factors, of which adequate biological descriptions may be lacking, but that can be inferred from clinical, dosimetric, or genetic data. Thus, the resulting multivariable NTCP models are based increasingly on available data and less on existing biological knowledge. These models can be described as phenomenological, meaning that they are consistent with existing data but that they are not fully explained by existing biological knowledge (Fig. 1). Although these models are often successful in accurately describing complication probabilities, in practice it is important to discern under which conditions we can rely on them. The underlying problem of phenomenological models, especially multivariable models, is that many different conceivable models may exist that are consistent with the existing data. Despite their ability to describe the present data well, many of these models may later turn out to be inconsistent with subsequent data.
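To make the notion of a multivariable phenomenological NTCP model concrete, the following minimal Python sketch fits a logistic regression to synthetic data. All covariate names and values are hypothetical illustrations, not taken from any study cited here.

```python
# Minimal sketch of a phenomenological multivariable NTCP model:
# logistic regression on hypothetical dosimetric and clinical covariates.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
# Hypothetical predictors: mean organ dose (Gy), V40 (%), age (y), baseline grade.
X = np.column_stack([
    rng.normal(30, 8, n),    # mean dose
    rng.uniform(0, 60, n),   # V40
    rng.normal(62, 10, n),   # age
    rng.integers(0, 3, n),   # baseline toxicity grade
])
# Simulate outcomes with an assumed "true" dose effect, for illustration only.
logit = -6.0 + 0.12 * X[:, 0] + 0.02 * X[:, 1]
y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression(max_iter=1000).fit(X, y)
ntcp = model.predict_proba(X)[:, 1]  # predicted complication probabilities
print("Fitted coefficients:", model.coef_.round(3))
```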

Reprint requests to: Claudio Fiorino, PhD, Medical Physics, San Raffaele Scientific Institute, Via Olgettina 60, 20132 Milano, Italy. Tel: +39 02 2643 2278; E-mail: [email protected]

Conflict of interest: none.

Int J Radiation Oncol Biol Phys, Vol. 91, No. 3, pp. 468-471, 2015. © 2015 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ijrobp.2014.10.017

Overfitting

Let us first consider the simple case in which subsequent data originate from the same patient population, under identical conditions, as the training data from which the model was built (Fig. 2A). In that case, the phenomenon that the model describes the training data well but the subsequent data poorly can be attributed solely to overfitting. Overfitting occurs when a model is fitted to a dataset in such specific detail that the resulting model loses its general validity (1) (Fig. 2B). Fortunately, many model learning methods are available that reduce this effect (2), resulting in a high probability that the produced model describes subsequent data almost as well as the training data.
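As a hedged illustration of how such learning methods can limit overfitting (the class of methods surveyed in reference 2), the sketch below contrasts goodness of fit with cross-validated prediction performance for a nearly unpenalized and a regularized logistic model. The data are synthetic and the settings arbitrary.

```python
# Sketch: detecting and reducing overfitting with cross-validation and
# L2 penalization. The pattern, not the exact numbers, is what matters.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 10))                           # few patients, many covariates
y = rng.random(120) < 1 / (1 + np.exp(-1.5 * X[:, 0]))   # only one real signal

for C, label in [(1e6, "nearly unpenalized"), (0.1, "L2 regularized")]:
    model = make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=2, include_bias=False),  # extra model freedom
        LogisticRegression(C=C, max_iter=5000),
    )
    cv_auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    train_auc = roc_auc_score(y, model.fit(X, y).predict_proba(X)[:, 1])
    # Goodness of fit (train AUC) rises with model freedom;
    # prediction performance (cross-validated AUC) need not (cf. Fig. 2B).
    print(f"{label}: train AUC={train_auc:.2f}, cross-validated AUC={cv_auc:.2f}")
```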

Fig. 1. Characteristic method of phenomenological model learning: a learning method learns associations, with limited theoretical motivation, from a dataset of observations of potential risk factors and outcomes; the resulting model predicts outcome as a function of the risk factors, and its predictions are consistent with the available observations.

Model Selection Uncertainty (Instability)

It is important to note, however, that multiple models may exist that are not overfitted and that describe the data almost equally well, but that make different predictions (Fig. 3). In that case, multiple alternative phenomenological models exist for which we may have very few arguments about which one should be preferred. The range of different alternative models may be appreciable, especially when multivariable models are considered (3, 4). Although phenomenological models cannot be demonstrated to be uniquely correct, each of the alternative models would make good predictions from a statistical point of view, as measured, for instance, in terms of explained variance, likelihood, discrimination, or calibration.
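The following sketch illustrates this instability phenomenon under assumed synthetic data with correlated covariates: a variable-selecting learner returns different models from resamples of the same dataset while discriminating almost equally well. It is an illustration, not a prescribed procedure.

```python
# Sketch: model-selection instability. L1-penalized logistic models fitted to
# bootstrap resamples of one dataset can select different covariate subsets
# while achieving similar discrimination.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
n, p = 200, 8
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.3 * rng.normal(size=n)   # two strongly correlated covariates
y = rng.random(n) < 1 / (1 + np.exp(-(X[:, 0] + X[:, 1])))

for b in range(5):
    idx = rng.integers(0, n, n)                # bootstrap resample
    m = LogisticRegression(penalty="l1", solver="liblinear", C=0.3)
    m.fit(X[idx], y[idx])
    selected = np.flatnonzero(m.coef_[0])      # covariates kept by this resample
    auc = roc_auc_score(y, m.predict_proba(X)[:, 1])
    print(f"resample {b}: selected covariates {selected}, AUC={auc:.2f}")
```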

Fig. 2. (A) A phenomenological model fitted to training data and subsequently used to predict data of new patients. Both datasets are drawn independently from the same population. Model freedom is restricted to avoid overfitting. (B) Goodness of fit and prediction performance as a function of model freedom, showing that goodness of fit increases monotonically whereas prediction performance peaks at a specific value.

Generalizability

In the more general case, when predicting for patient populations or conditions that differ from the training data (Fig. 4), it is impossible to guarantee how good the model predictions will be without making further assumptions. Phenomenological models rely on mimicking the variations and relations of the training data, and therefore prediction accuracy cannot be guaranteed for subsequent datasets with a different statistical structure (5). Statistical differences between datasets may be observable from the variation and correlation of the independent variables, but they may also arise unpredictably from hidden factors. Patient populations or conditions may differ between hospitals, between treatment methods, between patient subgroups, or over time (ie, possibly always).
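A minimal sketch of this failure mode, assuming a hidden factor that shifts baseline risk between two synthetic populations: the model remains accurate on population A yet miscalibrates on population B.

```python
# Sketch: a model trained on population A may miscalibrate on population B
# when B differs through a hidden factor that is not among the predictors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def simulate(n, hidden_offset):
    dose = rng.normal(30, 8, n)
    logit = -5.0 + 0.15 * dose + hidden_offset  # hidden factor not in the model
    y = rng.random(n) < 1 / (1 + np.exp(-logit))
    return dose.reshape(-1, 1), y

X_a, y_a = simulate(1000, hidden_offset=0.0)    # population A (training)
X_b, y_b = simulate(1000, hidden_offset=1.0)    # population B (shifted)

model = LogisticRegression().fit(X_a, y_a)
for name, X, y in [("A", X_a, y_a), ("B", X_b, y_b)]:
    pred = model.predict_proba(X)[:, 1].mean()
    print(f"population {name}: mean predicted NTCP={pred:.2f}, "
          f"observed rate={y.mean():.2f}")
```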

Causality

Moreover, observation of a correlation never implies that the relationship is causal. Phenomenological models are therefore not strictly guaranteed to describe causal relationships. This means that if treatment is purposely changed to reduce the NTCP on the basis of a phenomenological model, the predicted and actual risk reductions may differ. So we are faced with the problem that phenomenological models provide rich and valuable information, but that their general validity is inherently difficult to guarantee. Is it possible to apply phenomenological models safely, and how can we rely on them?
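A small simulated example of this pitfall, under the assumption of a hidden confounder: a model built on a correlated but non-causal covariate predicts a risk reduction that an intervention on that covariate would not actually produce.

```python
# Sketch: a non-causal predictor misleads when treatment is changed. A proxy
# variable correlates with toxicity only through a hidden confounder, so the
# model predicts a risk reduction that intervening on the proxy cannot deliver.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 5000
frailty = rng.normal(size=n)                  # hidden confounder
proxy = frailty + 0.1 * rng.normal(size=n)    # observed, non-causal covariate
y = rng.random(n) < 1 / (1 + np.exp(-(frailty - 1)))  # toxicity driven by frailty

model = LogisticRegression().fit(proxy.reshape(-1, 1), y)

# Predicted NTCP if we could "set" the proxy one unit lower for everyone:
pred_after = model.predict_proba((proxy - 1).reshape(-1, 1))[:, 1].mean()
# The actual risk is unchanged, because the proxy does not cause the outcome.
print(f"predicted NTCP after intervention={pred_after:.2f}, "
      f"actual rate={y.mean():.2f}")
```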

Fig. 3. Phenomenological model uncertainty depending on cohorts and learning methods. The resulting models may produce very different predictions while being consistent with the available data and having equal prediction performance.

Fig. 4. Phenomenological model derived from data of population A and subsequent data from a different population B. Although the prediction performance can be estimated for population A from the training data, there is no guarantee of how well the model predicts for population B before external validation is performed.

A Practical Approach

Above all, we must accept that phenomenological models are conjectures (ie, hypotheses that appear to be correct but that are not fully and generally proven). Conjectures are an essential part of the normal scientific process: we accept simple conjectures, test them, and when they are falsified replace them with more complex ones. We can translate this idea into a practical approach for the use of phenomenological models (Fig. 5). If we accept a model and apply it to the treatment of a specific patient group, we must test the model under these conditions by conducting prospective follow-up and verifying whether the resulting outcomes are consistent with the model. This can be defined as quality assurance of a model (6). After sufficient subsequent data are collected, we may be able either to accept the existing model as validated or to update the model on the basis of the combined data. This learning cycle can be repeated indefinitely, so that each iteration may add to the accuracy of the model.

This approach has a few restrictions. It solves the problem only for the specific patient population and condition to which the model was applied. Although after validation we will have better confidence in the model, and the model will be more generally applicable after updating, it remains uncertain how wide the applicable range of patient populations and conditions really is. Also, the problem is solved only after an initial period, once sufficient follow-up data are available to validate or falsify the model. In this initial period we must be cautious about replacing existing safety constraints with the model. Furthermore, it is unknown how many iterations of the learning cycle are required to converge to an accurate model.

A consequence of the proposed approach is that the use of phenomenological models must be accompanied by standardized prospective data registration programs, which may increase the clinical workload. On the other hand, the approach is conceptually simple to implement and ensures that phenomenological models can be relied upon. Furthermore, it enables clinical experience to be accumulated automatically into an accessible form.
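One way such quality assurance might look in code, as a rough sketch only: the acceptance rule (a 2-sigma check on calibration-in-the-large) and all data below are assumptions chosen for illustration, not a validated QA protocol.

```python
# Sketch of the quality-assurance step: compare outcomes observed during
# prospective follow-up with the model's predictions, then update the model
# on the combined data if it is falsified.
import numpy as np
from sklearn.linear_model import LogisticRegression

def qa_check(model, X_new, y_new):
    """Calibration-in-the-large: is the observed complication count compatible
    with the sum of predicted probabilities (normal approximation)?"""
    p = model.predict_proba(X_new)[:, 1]
    expected, observed = p.sum(), y_new.sum()
    z = (observed - expected) / np.sqrt((p * (1 - p)).sum())
    return abs(z) < 2.0, z          # crude 2-sigma acceptance rule (assumption)

# One iteration of the learning cycle, with hypothetical cohorts: (X_old, y_old)
# from conventional treatment, (X_new, y_new) observed under optimized treatment.
rng = np.random.default_rng(4)
X_old, y_old = rng.normal(size=(300, 3)), rng.random(300) < 0.2
X_new, y_new = rng.normal(size=(150, 3)), rng.random(150) < 0.3

model = LogisticRegression().fit(X_old, y_old)
ok, z = qa_check(model, X_new, y_new)
if not ok:  # model falsified: refit on the combined data
    model = LogisticRegression().fit(np.vstack([X_old, X_new]),
                                     np.concatenate([y_old, y_new]))
print(f"QA z-score={z:.2f}; model {'accepted' if ok else 'updated'}")
```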

Fig. 5. A practical approach to using phenomenological modeling to optimize treatment. Based on data from conventional treatment, an initial model is created. This model is regarded as a conjecture and is used to optimize the treatment of subsequent patients. Because these patients receive different treatment from that received by the previous patients, they must be considered a separate population. After prospective data registration, the observed outcomes are compared with the model predictions, enabling model validation. If the model is falsified, it is replaced by an updated model that includes the new data. This repeated process constitutes a learning circle in which experience from old patients is accumulated and used to optimize the treatment of new patients.

External Validation

A possibility to speed up the learning circle just described is to externally validate the model on already existing independent populations that have been prospectively followed up, thus immediately testing the model's generalizability. Of course, it is of paramount importance that equally defined endpoints and predictors are scored in each population. If this is not the case, further uncertainty, arising from incompatible scoring of the variables, will jeopardize the merit of the external validation. From this point of view, using internationally standardized methods to score toxicities will greatly increase the probability that external validations can be performed successfully. For many late effects, for which prospective validation could take many years, well-conducted external validations may hugely reduce the time necessary to close the circle and allow confidence in the model to be gained within a more reasonable time.
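A sketch of what such an external validation could compute, assuming hypothetical development and external cohorts: discrimination via the area under the ROC curve and the calibration slope of the linear predictor, measures of the kind discussed in reference (5).

```python
# Sketch: external validation of a fitted model on an independent cohort,
# reporting discrimination (AUC) and the calibration slope.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
X_dev = rng.normal(size=(400, 3))                                # development cohort
y_dev = rng.random(400) < 1 / (1 + np.exp(-X_dev[:, 0]))
X_ext = rng.normal(size=(250, 3))                                # external cohort
y_ext = rng.random(250) < 1 / (1 + np.exp(-0.6 * X_ext[:, 0]))   # weaker effect

model = LogisticRegression().fit(X_dev, y_dev)
lp = model.decision_function(X_ext)       # linear predictor in the external cohort

auc = roc_auc_score(y_ext, lp)
recal = LogisticRegression().fit(lp.reshape(-1, 1), y_ext)
slope = recal.coef_[0, 0]                 # 1.0 would indicate ideal calibration slope
print(f"external AUC={auc:.2f}, calibration slope={slope:.2f}")
```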

Alternative Validation Paradigms

The approaches just proposed are not unique solutions. Alternative approaches may aim to validate the causal interpretation of phenomenological models. For example, randomized controlled trials can compare the outcomes of treatments that are optimized using different alternative models, or biological studies may aim to uncover the underlying biological mechanism. In several cases, phenomenological models may be an important source of hypothesis-generating information, guiding and inspiring new research. In view of this two-way influence, phenomenological modeling and radiation biology are separate but complementary entities, oriented, respectively, top-down and bottom-up.

Conclusions

In conclusion, phenomenological models should not be regarded as general abstractions of truth but as conjectures that originate from accumulated experience. When applied with caution, accompanied by prospective data registration programs, and combined with external validation, they are potentially powerful tools to learn from patients treated in the past and to optimize the treatment of patients in the future.

References

1. Pitt MA, Myung IJ. When a good fit can be bad. Trends Cogn Sci 2002;6:421-425.
2. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer; 2009.
3. El Naqa I, Bradley J, Blanco AI, et al. Multivariable modeling of radiotherapy outcomes, including dose-volume and clinical factors. Int J Radiat Oncol Biol Phys 2006;64:1275-1286.
4. Austin PC, Tu JV. Automated variable selection methods for logistic regression produced unstable models for predicting acute myocardial infarction mortality. J Clin Epidemiol 2004;57:1138-1146.
5. Steyerberg EW. Clinical Prediction Models. New York: Springer; 2009.
6. Lambin P, van Stiphout RG, Starmans MH, et al. Predicting outcomes in radiation oncology: Multifactorial decision support systems. Nat Rev Clin Oncol 2013;10:27-40.
