Original Cardiovascular
Clinical Performance of the EuroSCORE II Compared with the Previous EuroSCORE Iterations
Lazar Velicki1,2, Nada Cemerlic-Adjic1,2, Bogoljub Mihajlovic1,2, Miklos Fabri2, Katica Pavlovic1,2, Bojan B. Mihajlovic2, Dragic Bankovic3
1 Department of Surgery, Medical Faculty, University of Novi Sad, Novi Sad, Serbia
2 Department of Cardiovascular Surgery, Institute of Cardiovascular Diseases Vojvodina, Sremska Kamenica, Serbia
3 Department of Mathematics, State University of Novi Pazar, Novi Pazar, Serbia
Address for correspondence Lazar Velicki, MD, PhD, Institute of Cardiovascular Diseases Vojvodina, Clinic of Cardiovascular Surgery, Put Doktora Goldmana 4, 21204 Sremska Kamenica, Serbia (e-mail: [email protected]).
Thorac Cardiovasc Surg 2014;62:288–297.
Keywords: cardiac surgery, risk assessment, EuroSCORE, outcome, mortality, prediction
Abstract
Background The European System for Cardiac Operative Risk Evaluation (EuroSCORE) II has recently been introduced as an update to the previous versions. We sought to evaluate the predictive performance of the EuroSCORE II model against the original additive and logistic EuroSCORE models.
Patients and Methods The study included 1,247 consecutive patients who underwent cardiac surgery procedures during a 14-month period starting from the beginning of 2012. The original additive and logistic EuroSCORE models were compared with the EuroSCORE II, focusing on the accuracy of predicting hospital mortality.
Results The overall hospital mortality rate was 3.45%. The discriminative power of the EuroSCORE II was modest and similar to that of the other algorithms (C-statistics 0.754 for additive EuroSCORE; 0.759 for logistic EuroSCORE; and 0.743 for EuroSCORE II). The EuroSCORE II significantly underestimated the all-patient hospital mortality (3.45% observed vs. 2.12% predicted), as well as the mortality in the valvular (3.74% observed vs. 2% predicted) and combined surgery cohorts (6.87% observed vs. 3.64% predicted). The EuroSCORE II predicted mortality differed significantly from the observed mortality in the third and fourth quartiles of patients stratified according to the EuroSCORE II mortality risk (p < 0.05). The calibration of the EuroSCORE II was generally good for the entire patient population (Hosmer-Lemeshow [HL] p = 0.139), for the valvular surgery subset (HL p = 0.485), and for the combined surgery subset (HL p = 0.639).
Conclusion The EuroSCORE II might be considered a solid predictive tool for hospital mortality. Although the EuroSCORE II employs more sophisticated calculation methods regarding the number and definition of risk factors included, it does not seem to significantly improve on the performance of previous iterations.
Introduction An important component of modern cardiac surgery practice is that of data recording, collection, and analysis for the purpose of assessing and improving the quality of service,
received August 28, 2013 accepted after revision November 30, 2013 published online April 21, 2014
surgical decision-making, and preoperative patient education (informed consent).1 Accurate risk stratification is critical in this endeavor. When using models for this purpose it is vital that the clinician should have formally derived the underlying prediction, know the extent to which their own performance
© 2014 Georg Thieme Verlag KG Stuttgart · New York
DOI http://dx.doi.org/ 10.1055/s-0034-1367734. ISSN 0171-6425.
This document was downloaded for personal use only. Unauthorized distribution is strictly prohibited.
EuroSCORE II Validation Study
Patients and Methods
Study Design This study examines the data of 1,247 consecutive patients who underwent major cardiac surgery at our institute over a 14-month period starting at the beginning of 2012. The patients included underwent exclusively one of the following:
1. Isolated coronary surgery (coronary artery bypass grafting and off-pump coronary artery bypass).
2. Isolated valvular surgery (mitral, aortic, and tricuspid surgery, with prosthetic replacements or repairs, through traditional or minimally invasive techniques).
3. Combined coronary and valvular surgery.
All cases were included irrespective of the priority level (elective, urgent, emergency, salvage). The study was approved by the Institutional Review Board.
Operative risks were calculated before surgery with the EuroSCORE II outcome prediction model7 as well as with both the additive and logistic EuroSCORE.3,8 The score of each patient was calculated by a cardiologist and a cardiac surgeon at our center by means of the EuroSCORE calculator embedded in the institution database system. The calculator used the appropriate official logistic regression coefficients and was thoroughly tested.
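To illustrate how a logistic risk score of this kind turns risk factors into a predicted mortality, a minimal sketch follows. The intercept, coefficient values, and factor names are placeholders invented for illustration; they are not the official EuroSCORE coefficients used by the institutional calculator.

```python
import math

# Hypothetical values for illustration only; NOT the official EuroSCORE coefficients.
INTERCEPT = -5.0
COEFFICIENTS = {
    "years_over_60": 0.03,  # assumed encoding: years of age above 60
    "female": 0.25,
    "recent_mi": 0.15,
    "urgent": 0.30,
}


def predicted_mortality(factors):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in factors.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical patient: 68 years old, female, no recent MI, elective surgery.
patient = {"years_over_60": 8, "female": 1, "recent_mi": 0, "urgent": 0}
risk = predicted_mortality(patient)
print(f"predicted hospital mortality: {risk:.2%}")
```

The additive score, by contrast, sums integer weights per risk factor; the logistic form shown here is what allows the output to be read directly as a probability.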
Outcome Events The outcome event under consideration was the all-cause hospital mortality—defined as death after cardiac procedure during the index hospitalization irrespective of the mortality cause (cardiac or noncardiac).
Statistical Analysis Descriptive statistical data are presented for categorical variables as frequencies (percentages) and were compared between the groups by using either the Pearson χ2 test or Fisher exact test. Continuous variables, expressed as mean ± standard deviation, were compared between the groups by using the unpaired Student t-test or the Wilcoxon rank-sum test (depending on the normality of the distribution). A p-value of less than 0.05 was considered significant. The original EuroSCORE models and the EuroSCORE II were evaluated by comparing the observed and expected hospital mortality. The calibration of the models was assessed by using the Hosmer-Lemeshow (HL) test. A well-calibrated model gives a p-value greater than 0.05. Model discrimination was tested by means of receiver operating characteristic (ROC) curves, calculating the area under the curve (AUC) and its 95% confidence interval (CI); this index was used to assess how well the model could discriminate between survivors and nonsurvivors. Cumulative sum (CUSUM) control charts were constructed for a visual analysis of the models. The accuracy and clinical performance of all EuroSCORE versions were tested in patient subgroups based on the type of cardiac operation, and in subgroups based on risk categorization (quartiles of distribution according to the EuroSCORE II). The statistical analyses were performed with SPSS version 19.0 (SPSS Inc., Chicago, Illinois, United States) and MedCalc for Windows, version 12.2.1 (MedCalc Software, Mariakerke, Belgium).
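The calibration test described above can be sketched in a few lines. This is not the SPSS or MedCalc implementation used in the study. It assumes the common default of 10 risk-ordered groups, and therefore 8 degrees of freedom (the paper does not state the grouping), and it assumes every predicted probability lies strictly between 0 and 1. The function names are illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(predicted, died, groups=10):
    """Return (chi2, p) comparing observed and expected deaths across risk-ordered groups."""
    n = len(predicted)
    if n < groups:
        raise ValueError("need at least one patient per group")
    # Stable sort by predicted risk only, so ties keep their original order.
    pairs = sorted(zip(predicted, died), key=lambda pair: pair[0])
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(d for _, d in chunk)
        # Contributions from deaths and from survivors in this group.
        chi2 += (observed - expected) ** 2 / expected
        chi2 += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return chi2, chi2_sf_even_df(chi2, groups - 2)


# HL statistics reported for the EuroSCORE II in Table 3 (all patients, coronary surgery).
for stat in (12.295, 16.577):
    print(f"chi2 = {stat}: p = {chi2_sf_even_df(stat, 8):.3f}")
```

Under the 8-degrees-of-freedom assumption, the two statistics above map to p-values of roughly 0.139 and 0.035, which agrees with the values tabulated in the paper.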
Results A total of 1,247 cardiac procedures were performed at our institution between January 2012 and March 2013: 718 myocardial revascularizations (57.58%), 294 isolated valve surgeries (23.58%), and 233 combined valve and coronary procedures (18.68%). The mean values ± standard deviations of the additive EuroSCORE, logistic EuroSCORE, and EuroSCORE II of the patient population were 4.43 ± 2.92, 5.27 ± 6.58, and 2.12 ± 2.78, respectively. The patient-related and surgery-related data are summarized in ►Table 1. The overall hospital mortality rate was 3.45%, giving a EuroSCORE II observed-to-expected (O:E) ratio
Thoracic and Cardiovascular Surgeon
Vol. 62
No. 4/2014
is reflected in the prediction, and adjust the estimate up or down for important risk factors not captured in the prediction model.1–3 Numerous risk models for predicting postoperative mortality following major cardiac surgery are in common use, one of the more popular being the European System for Cardiac Operative Risk Evaluation (EuroSCORE).3,4 The EuroSCORE, in both its additive and logistic forms, has been extensively used over the last decade for outcome prediction and hospital performance benchmarking.5 The general consensus is that the model shows a good level of accuracy, with a C-statistic of around 0.75 to 0.80, but could benefit from improvement or recalibration, especially for high-risk patients.6 Recently, a new iteration of the EuroSCORE has been presented: the EuroSCORE II.7 The EuroSCORE II was designed on the basis of the preoperative data of more than 22,000 patients (mostly European), the type of surgery performed, and the corresponding outcome. The model incorporates additional factors and removes or clarifies several existing ones, but it is mainly a recalibration intended to reflect today’s outcomes in cardiac surgery. The internal validation shows an improved C-statistic compared with the previous logistic EuroSCORE model (C-index 0.81 for the EuroSCORE II vs. 0.78 for the EuroSCORE) and good calibration (Hosmer–Lemeshow χ2 = 15.48; p = 0.0505).7 To comprehensively assess the role of the EuroSCORE II and to confirm its applicability in contemporary cardiac surgery practice, external validation is mandated. Moreover, external validation is needed to assess the service provided by the specific hospital, which should be aligned with the “gold standard,” that is, the EuroSCORE II. The EuroSCORE II, being a model developed from a large multinational cohort, might therefore well be considered a reference group incorporating different levels of outcomes.
Before the EuroSCORE II can be accepted as an appropriate risk model it requires validation in the population in which it is intended to be used.2 We sought to evaluate the predictive performance of the EuroSCORE II model against the original additive and logistic EuroSCORE risk models in a contemporary set of patients undergoing heart surgery at our center.
Velicki et al.
Table 1 Patient profile
Variable                              All patients          Coronary surgery      Valvular surgery      Combined surgery
                                      (n = 1,247)           (n = 718)             (n = 294)             (n = 233)
Age (y)                               64.00 (57–70)         63.00 (57–69)         63.00 (55–71)         67.00 (61–73)
Gender (female)                       399 (32)              198 (27.58)           132 (44.90)           67 (28.75)
Body mass index (kg/m2)               27.69 (25.03–30.47)   27.72 (25.18–30.44)   27.68 (24.49–30.96)   27.68 (25.22–30.30)
Body surface area (m2)                1.94 (1.81–2.09)      1.97 (1.82–2.10)      1.91 (1.76–2.07)      1.94 (1.81–2.07)
IDDM                                  130 (10.43)           85 (11.83)            13 (4.42)             32 (13.73)
NIDDM                                 181 (14.51)           117 (16.29)           34 (11.57)            30 (12.88)
Left ventricle ejection fraction (%)  55.00 (46–60)         55.00 (46–60)         60.00 (50–63)         54 (40–60)
Unstable angina                       112 (8.98)            102 (14.21)           8 (2.72)              2 (0.86)
Recent MI (< 90 d)                    203 (16.28)           169 (23.54)           3 (1.02)              30 (12.88)
Renal dysfunction                     13 (1.04)             5 (0.70)              4 (1.36)              4 (1.72)
Previous heart surgery                25 (2.00)             10 (1.39)             11 (3.74)             4 (1.72)
COPD                                  80 (6.42)             40 (5.57)             22 (7.48)             18 (7.73)
Peripheral vascular disease           175 (14.03)           96 (13.37)            24 (8.16)             54 (23.18)
LMCA stenosis                         125 (10.02)           106 (14.76)           0 (0)                 19 (8.15)
3VD                                   522 (41.86)           435 (60.58)           0 (0)                 87 (37.34)
Previous PCI                          131 (10.51)           101 (14.07)           9 (3.06)              21 (9.01)
Urgent surgery                        29 (2.33)             24 (3.34)             2 (0.68)              3 (1.29)
OPCAB                                 15 (1.2)              12 (1.7)              1 (0.3)               2 (0.9)
Number of distal anastomoses          2 (1–3)               2 (2–3)               0 (0)                 2 (1–2)
Aortic X-clamp time (min)             60 (44–81)            50 (36–65)            60 (56–86)            54 (72–110)
CPB time (min)                        71 (54–94)            59 (47–74)            81 (67–102)           108 (86–126)
Hospital mortality                    43 (3.45)             16 (2.23)             11 (3.74)             16 (6.87)
Abbreviations: 3VD, three vessel disease; COPD, chronic obstructive pulmonary disease; CPB, cardiopulmonary bypass; d, days; IDDM, insulin-dependent diabetes mellitus; LMCA, left main coronary artery; MI, myocardial infarction; min, minutes; NIDDM, noninsulin-dependent diabetes mellitus; OPCAB, off-pump coronary artery bypass; PCI, percutaneous coronary intervention; y, years.
Note: Categorical variables are shown as n (%); continuous variables are shown as median (25th percentile–75th percentile).
of 1.63. The mortality rates in coronary, valvular, and combined surgery were 2.23, 3.74, and 6.87%, respectively. ►Table 2 summarizes the comparison between the predicted and observed mortality rates across the different types of surgery. There was a significant difference between the observed and predicted all-patient hospital mortality with the logistic EuroSCORE (overestimation) and the EuroSCORE II (underestimation). The logistic EuroSCORE significantly overestimated the mortality risk in the coronary surgery cohort. In the valvular and combined surgery cohorts, the EuroSCORE II significantly underestimated the mortality risk. The performance of the models with respect to discriminative power and calibration is presented in ►Table 3. The accuracy of the models is comparable, with the highest AUC variation in the valvular surgery subset. The EuroSCORE II was outperformed in every surgery-type subset except combined surgery, where it demonstrated a marginally improved AUC (0.678 vs. 0.649). The lower limit of the 95% CI of the AUC for the combined surgery group was well below 0.6, indicating poor discriminative capacity. No improvement in discriminative capacity was observed with the EuroSCORE II for the entire sample or for the remaining surgical subgroups (coronary and valvular). ►Table 3 also shows the comparison of the χ2 HL test for the pathology subgroups. The HL statistics demonstrated a significant overall lack of calibration of the EuroSCORE II model in coronary surgery (p = 0.035). ROC curves and CUSUM curves are presented in ►Figs. 1–4. The clinical performance of the EuroSCORE II was tested in different populations of predicted mortality risk patients (►Fig. 5). The observed population was divided into quartiles according to the EuroSCORE II, and the observed and predicted mortality were compared for each of the models considered. With low-risk patients (1st quartile, predicted risk 0–0.79%), all the models performed well. With mild-risk patients (2nd quartile, predicted risk 0.8–1.27%), only the EuroSCORE II performed well, while the additive and logistic EuroSCORE significantly overestimated the mortality risk (p = 0.013 and 0.029, respectively). In moderate-risk patient
Table 2 Predicted and observed hospital mortality rate through different types of surgery
Model                     PM (%)    MR      p-Value
All patients (OM = 3.45%)
  Additive EuroSCORE      4.43      0.78    0.092
  Logistic EuroSCORE      5.27      0.65    0.002
  EuroSCORE II            2.13      1.63    0.004
Coronary surgery (OM = 2.23%)
  Additive EuroSCORE      3.48      0.67    0.067
  Logistic EuroSCORE      3.70      0.63    0.037
  EuroSCORE II            1.67      1.39    0.243
Valvular surgery (OM = 3.74%)
  Additive EuroSCORE      4.59      0.81    0.487
  Logistic EuroSCORE      4.89      0.76    0.361
  EuroSCORE II            2.00      1.87    0.033
Combined surgery (OM = 6.87%)
  Additive EuroSCORE      7.13      0.96    0.876
  Logistic EuroSCORE      10.53     0.65    0.069
  EuroSCORE II            3.65      1.89    0.009
Abbreviations: EuroSCORE, European System for Cardiac Operative Risk Evaluation; MR, mortality risk (observed-to-predicted ratio); OM, observed mortality; PM, predicted mortality.
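As a worked example of the observed-to-predicted comparison, the sketch below recomputes the all-patient ratios from the death count in Table 1 and the predicted mortalities in Table 2. The function name is illustrative, and because the inputs are rounded to two decimals, the second decimal of a ratio may differ slightly from the tabulated value.

```python
def oe_ratio(observed_pct, predicted_pct):
    """Observed-to-expected mortality ratio; > 1 means the model underestimates risk."""
    return observed_pct / predicted_pct


# All-patient figures as reported in Tables 1 and 2 of this study.
deaths, patients = 43, 1247
predicted = {"Additive EuroSCORE": 4.43, "Logistic EuroSCORE": 5.27, "EuroSCORE II": 2.13}

observed_pct = 100.0 * deaths / patients
for model, pm in predicted.items():
    ratio = oe_ratio(observed_pct, pm)
    direction = "underestimates" if ratio > 1 else "overestimates"
    print(f"{model}: O:E = {ratio:.2f} ({direction} mortality)")
```

A ratio below 1 for the older models and above 1 for the EuroSCORE II is the pattern the text describes as overestimation and underestimation, respectively.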
groups (3rd quartile, predicted risk 1.28–2.34%), the additive and logistic EuroSCORE performed reasonably well, but the EuroSCORE II significantly underestimated the mortality (p = 0.013). With high-risk patients (4th quartile, predicted risk > 2.35%), the EuroSCORE II again significantly underestimated the mortality (p = 0.024), with the other scores performing well, although given the small numbers in the high-risk group it might be difficult to draw such a conclusion.
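The quartile analysis can be sketched as follows: patients are ordered by predicted risk, split into four equal-sized groups, and the mean predicted risk is set against the observed mortality in each group. The data below are invented toy values, not study data, and the function name is illustrative.

```python
def quartile_summary(predicted, died):
    """Return (mean predicted risk, observed mortality) for each risk quartile."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    summary = []
    for q in range(4):
        idx = order[q * n // 4:(q + 1) * n // 4]
        mean_pred = sum(predicted[i] for i in idx) / len(idx)
        observed = sum(died[i] for i in idx) / len(idx)
        summary.append((mean_pred, observed))
    return summary


# Toy data (invented): predicted risk and outcome (1 = died) for eight patients.
predicted = [0.005, 0.007, 0.009, 0.012, 0.015, 0.022, 0.030, 0.080]
died = [0, 0, 0, 0, 0, 1, 0, 1]
for q, (mean_pred, observed) in enumerate(quartile_summary(predicted, died), start=1):
    print(f"Quartile {q}: predicted {mean_pred:.1%}, observed {observed:.1%}")
```

With real data, each quartile's observed and predicted rates would then be compared with a significance test, as reported in the text.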
Discussion Preoperative risk prediction models have a critical role in current cardiac surgical practice.9 In recent years, the additive and logistic EuroSCORE models have been widely used as risk prediction tools for adult cardiac surgery, especially in Europe.10 Owing to ongoing improvements in surgical practice and perioperative management, and to a changing patient profile, both the discriminative power and the calibration of the EuroSCORE have been observed to decrease.11–13 This is especially the case for the EuroSCORE calibration, which consistently leads to high-grade overprediction, as clearly demonstrated in a recent systematic review.14 Another observation that must be considered is that surgical mortality has progressively declined in all risk categories over time. Therefore, when assessing the performance of the model in any single surgical practice, one must consider
Table 3 Model performance
                          All patients            Coronary surgery        Valvular surgery        Combined surgery
                          (n = 1,247)             (n = 718)               (n = 294)               (n = 233)
Additive EuroSCORE
  AUC (95% CI)            0.754 (0.684–0.823)     0.735 (0.619–0.850)     0.781 (0.641–0.921)     0.649 (0.516–0.783)
  HL test (χ2, p-value)   4.822, 0.903            7.832, 0.251            4.859, 0.562            8.633, 0.280
Logistic EuroSCORE
  AUC (95% CI)            0.759 (0.688–0.830)     0.742 (0.627–0.857)     0.789 (0.647–0.932)     0.651 (0.520–0.782)
  HL test (χ2, p-value)   18.954, 0.015           12.919, 0.115           10.923, 0.206           6.227, 0.622
EuroSCORE II
  AUC (95% CI)            0.743 (0.666–0.820)     0.721 (0.578–0.864)     0.730 (0.570–0.890)     0.678 (0.531–0.824)
  HL test (χ2, p-value)   12.295, 0.139           16.577, 0.035           7.487, 0.485            6.070, 0.639
Abbreviations: AUC, area under the curve; CI, confidence interval; EuroSCORE, European System for Cardiac Operative Risk Evaluation; HL, Hosmer-Lemeshow.
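The AUC values in Table 3 can be read as the probability that a randomly chosen nonsurvivor was assigned a higher predicted risk than a randomly chosen survivor. The sketch below computes that quantity directly from pairwise comparisons. It is an illustration on invented toy data rather than the MedCalc routine used in the study, and it does not compute the confidence interval.

```python
def auc(predicted, died):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Ties between a nonsurvivor and a survivor count as one half.
    """
    pos = [p for p, d in zip(predicted, died) if d == 1]
    neg = [p for p, d in zip(predicted, died) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented): one survivor outranks one nonsurvivor, so 3 of 4 pairs are ordered correctly.
predicted = [0.01, 0.04, 0.03, 0.10]
died = [0, 0, 1, 1]
print(f"AUC = {auc(predicted, died):.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking, which is why a confidence interval whose lower limit falls below 0.6 is read as weak discrimination.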
Fig. 1 ROC curve analysis (A) and CUSUM chart (B) for the entire sample regardless of the type of surgery. Panel A plots sensitivity against 1 − specificity for the additive EuroSCORE, logistic EuroSCORE, and EuroSCORE II, with a reference line; panel B plots hospital mortality (patient number) against patient sequence number for the study cohort and the three models. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.
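The curves in a chart of this kind can be sketched as running totals. The example assumes, based on the axis labels, that panel B plots the cumulative number of observed deaths alongside the cumulative sum of each model's predicted risks, in order of operation; the paper does not spell out the construction. The values are invented and the function name is illustrative.

```python
from itertools import accumulate


def cumulative_curves(died, predicted):
    """Running totals of observed deaths and model-expected deaths, in operation order."""
    observed = list(accumulate(died))
    expected = list(accumulate(predicted))
    return observed, expected


# Toy sequence (invented values): outcome (1 = died) and predicted risk per patient.
died = [0, 0, 1, 0, 0, 1]
predicted = [0.01, 0.02, 0.10, 0.01, 0.03, 0.05]
observed, expected = cumulative_curves(died, predicted)
for k, (o, e) in enumerate(zip(observed, expected), start=1):
    print(f"patient {k}: observed deaths {o}, expected deaths {e:.2f}")
```

A model curve that runs below the observed curve indicates underestimation of mortality over the sequence, and one that runs above it indicates overestimation.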
not only the performance of the model but the performance of the surgical team as well. The EuroSCORE II is the latest iteration of the original EuroSCORE model. The model is based on logistic regression and incorporates additional factors, but it also removes or more clearly defines previously used ones. It was devised around a more recent dataset with the intent of closing the gap between the observed and predicted mortality. The EuroSCORE II was computed on a consecutive cohort of 16,828 patients, and its validity was estimated in another cohort of 5,553 patients, demonstrating excellent discriminative power (AUC 0.81, 95% CI 0.78–0.83).7 However, the internal validation did not demonstrate significantly better calibration. In this study, we tried to validate the EuroSCORE II model on a cohort of recent patients who underwent major cardiac surgery in a single center in Serbia. Compared with the population used to derive the EuroSCORE II, our sample comprised more patients with diabetes requiring insulin (10.4 vs. 7.6%) and fewer patients with pulmonary diseases (6.4 vs. 10.7%). Our study confirmed the unsatisfactory calibration of old and new versions of the EuroSCORE (HL statistics p-values less than 0.05), with a diverse pattern of miscalibration as highlighted in ►Table 3. Although the EuroSCORE overestimated and the EuroSCORE II underestimated the mortality in our data sample, both were sufficiently accurate to warrant their application, perhaps with some caution. The conclusions drawn from our data sample analysis are:
1. The EuroSCORE II has accuracy similar to its previous versions, with the older EuroSCORE potentially outperforming the EuroSCORE II in isolated coronary and isolated valvular patients.
2. Calibration of the EuroSCORE II was generally good for the entire patient population and for the valvular and combined surgery subsets in particular.
3. In moderate- and high-risk patients, the EuroSCORE II may actually underestimate the hospital mortality risk, although no definite conclusion can be reached.
4. The EuroSCORE II shows clear benefit over previous iterations when used for risk assessment in combined surgery patients, or when evaluating the risk for the entire group of patients regardless of the surgery type.
5. In brief, our sample data indicate that the EuroSCORE overestimates while the EuroSCORE II underestimates the hospital mortality. Neither model shows perfect alignment with the real-patient data in our hospital, although the EuroSCORE II should preferably be used, keeping in mind the constraints and limitations mentioned above.
The San Donato group has already reported similar EuroSCORE II validation results.6 In their study, which included a total of 1,090 consecutive patients, the authors found that the accuracy of the EuroSCORE II was good (AUC 0.81) but not significantly higher than that of the other scores. In patients at low, mild to moderate, and high mortality risk, the EuroSCORE II provided a risk prediction not significantly different from the observed mortality rate, whereas in very high-risk patients (observed mortality rate 11%), it significantly underestimated (6.5%) the mortality risk. The authors concluded that the accuracy of the EuroSCORE II was acceptable in isolated coronary surgery, and good or excellent with other surgical procedure types. The Liverpool group focused on the assessment of the clinical performance of the EuroSCORE II in different surgery subsets of patients.15 The authors found that the EuroSCORE II is a reasonable risk model for hospital mortality from isolated coronary surgery (AUC 0.79; HL p = 0.052) and aortic procedures (AUC 0.81; HL p = 0.43), and excellent for mitral valve surgery (AUC 0.87; HL p = 0.6). However, the EuroSCORE II failed to improve on the original EuroSCORE model for isolated aortic valve replacements (AUC 0.69; HL
Fig. 2 ROC curve analysis (A) and CUSUM chart (B) for the coronary surgery subset. Panel A plots sensitivity against 1 − specificity; panel B plots hospital mortality (patient number) against patient sequence number. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.
Fig. 3 ROC curve analysis (A) and CUSUM chart (B) for the valvular surgery subset. Panel A plots sensitivity against 1 − specificity; panel B plots hospital mortality (patient number) against patient sequence number. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.
p = 0.07). Another EuroSCORE II external validation study, including 4,342 patients, has recently been published.16 The AUC for the EuroSCORE (0.82, 95% CI 0.79–0.85) was lower than that for the EuroSCORE II (0.85, 95% CI 0.83–0.87), but the difference was not statistically significant (p = 0.056). The two models showed poor calibration in the sample: EuroSCORE (χ2 = 39.3, HL p < 0.001) and EuroSCORE II (χ2 = 86.69, HL p < 0.001). The calibration of the EuroSCORE was poor in the groups of patients undergoing coronary (HL p = 0.01), valve (HL p = 0.01), and combined coronary and valve surgery (HL p = 0.012), and that of the EuroSCORE II was poor in the coronary (HL p = 0.001) and valve surgery (HL p < 0.001) groups. Another recent study focused on EuroSCORE II validation in Chinese patients undergoing isolated valve surgery.5 The discriminative power of the EuroSCORE II model was good for the single valve surgery group (AUC 0.792) and poor for the multiple valve surgery group (AUC 0.605). The EuroSCORE II model showed good calibration in predicting hospital mortality for patients undergoing single valve surgery (HL p = 0.103) and poor calibration for patients undergoing multiple valve surgery (HL p < 0.0001). One of the largest multicenter studies evaluating the accuracy and performance of the EuroSCORE II published to date involved 12,325 consecutive patients in Italy.10 The authors reported a hospital mortality rate of 2.2%. The discriminatory power was high and similar in all models (AUC 0.82, 95% CI 0.79–0.84 for additive EuroSCORE; 0.82, 95% CI 0.79–0.84 for logistic EuroSCORE; 0.82, 95% CI 0.80–0.85 for EuroSCORE II). The EuroSCORE II had fair calibration up to predicted values of 30% and overpredicted beyond that. Finally, they concluded that the EuroSCORE II does not seem to significantly improve on the performance of older versions in the higher tertiles of risk. A study specifically designed to assess the accuracy of the EuroSCORE II in outcome prediction in high-risk patients was also recently published.17 The authors identified a cohort totaling 933 patients with a preoperative logistic EuroSCORE
Fig. 4 ROC curve analysis (A) and CUSUM chart (B) for the combined surgery subset. Panel A plots sensitivity against 1 − specificity; panel B plots hospital mortality (patient number) against patient sequence number. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.
10 from two European institutions. The hospital mortality rate was 9.7%. None of the EuroSCORE models performed well, with an AUC of 0.67 for the additive EuroSCORE and the EuroSCORE II, and 0.66 for the logistic EuroSCORE. Model calibration was poor for the EuroSCORE II (χ2 = 16.5; p = 0.035). The authors concluded that the key problem of risk stratification in high-risk patients has not been successfully addressed by the EuroSCORE II. Several recently published articles compared the EuroSCORE II with another widely accepted model, the Society of Thoracic Surgeons (STS) risk assessment tool. In a large-scale retrospective study by Kirmani et al,18 the authors reported that the EuroSCORE II and the STS score provide equivalent discrimination in predicting mortality (AUC 0.818 vs. 0.805, respectively, p = 0.343), as well as good calibration for patients with low to moderate risk, with divergence from 15% predicted risk. However, other reports suggest adequate calibration of both the EuroSCORE II and STS models with unsatisfactory discriminative power.19,20 It is natural to assume that a procedure-specific risk model, which allows the incorporation of risk factors specific to individual procedures, may be more appropriate for risk prediction. This is especially the case in the subset of patients undergoing valvular surgery. Superb model performance is critical when deciding on the treatment strategy. Based on the risk level, patients might be referred to either surgical or transcatheter aortic valve implantation. Given the importance of these issues, and the fact that the EuroSCORE II was developed on a dataset consisting mainly of coronary procedures, one might argue that it may be less well adapted to aortic procedures than a specific score. The German aortic valve score was recently introduced for the specific purpose of mortality prediction related to aortic valve procedures in adults.21 A total of 11,794 cases were included in model development, which resulted in the identification of 15 risk factors
Fig. 5 The observed versus predicted mortality rate for the EuroSCORE II and the other scores considered, according to the quartile distribution. The chart plots the mortality rate in quartiles 1 to 4 for the additive EuroSCORE, logistic EuroSCORE, EuroSCORE II, and observed mortality. The EuroSCORE II predicted mortality significantly differed from observed mortality in the third and the fourth quartile (p < 0.05). EuroSCORE, European System for Cardiac Operative Risk Evaluation.
based on multiple logistic regression. After the internal validation of the model, good calibration was observed (p = 0.776) with acceptable discrimination (AUC 0.808). It is clear from our study, as well as from the studies cited above, that every EuroSCORE algorithm has intrinsic limitations that affect its ability to accurately predict the outcome. Demographic, institutional, and individual variations in practice may contribute to the divergence of observed versus expected mortality.22 The EuroSCORE II was designed based on data compiled from more than 150 institutions, mostly European centers. Such a large-scale project has to account for demographic diversity and for natural variations in both clinical practice and technological advancement. It is not surprising that the performance of individual regional institutions such as ours is not entirely aligned with a model generated using international data.15 It is for these reasons that considerable variations in the clinical performance of the EuroSCORE II have been reported. The EuroSCORE II is a reflection of continuous improvement in everyday clinical practice and is mainly a recalibration aimed at replicating today’s outcomes in cardiac surgery. In our two previous articles,23,24 we reported on the constant improvement of the service provided in our hospital, with a coronary surgery mortality of 2.7% in 2001 and 1.5% in 2008. These results were achieved despite a constant increase in the average number of risk factors present as well as in the average EuroSCORE. We believe that the EuroSCORE II should be externally validated in every hospital where it is applied, to gain insight into the patient subpopulations in which reasonable results can be achieved. This is especially important for calibration, more so than for discriminative power.
A predictive model that overestimates the level of risk inspires false confidence.25 Correspondingly, when a model underestimates the risk, differences between observed and expected mortality might be seen as a result of the strength or weakness of the reporting hospital rather than of deficits in the score.
The authors could not find any published studies evaluating the EuroSCORE II in predicting significant complications or long-term survival, an equally important consideration. It would also be interesting to see how well the EuroSCORE II performs in outcome prediction for patients undergoing percutaneous coronary intervention. Several limitations of the current study should be noted. The main limitation is the relatively small sample size, which in turn further limits the groups available for any subset analysis. As such, other (external) and possibly larger series would be warranted. Although designed as a consecutive observational study, it still reflects the experience of a single healthcare institution and may not represent national and international practice and outcomes. This may introduce bias, and the results require further examination in a larger number of patients across a multicenter database.
In conclusion, the EuroSCORE II predicts hospital mortality with satisfactory results. Although a more sophisticated approach was used regarding the number and definitions of risk factors included, it does not seem to significantly improve on the performance of previous iterations. The results of this study show that, despite having solid discriminative capacity, the calibration of the EuroSCORE II is potentially troublesome for isolated coronary surgery.
Note This paper was supported by the Provincial Secretariat for Science and Technological Development of the Autonomous Province of Vojvodina (Serbia) (Grant number 114–451–2131/2011).
Conflict of Interest None.
This document was downloaded for personal use only. Unauthorized distribution is strictly prohibited.
References
1 Takkenberg JJ, Kappetein AP, Steyerberg EW. The role of EuroSCORE II in 21st century cardiac surgery practice. Eur J Cardiothorac Surg 2013;43(1):32–33
2 Grant SW, Hickey GL, Dimarakis I, et al. How does EuroSCORE II perform in UK cardiac surgery; an analysis of 23 740 patients from the Society for Cardiothoracic Surgery in Great Britain and Ireland National Database. Heart 2012;98(21):1568–1572
3 Roques F, Michel P, Goldstone AR, Nashef SA. The logistic EuroSCORE. Eur Heart J 2003;24(9):881–882
4 Roques F, Nashef SA, Michel P, et al. Risk factors and outcome in European cardiac surgery: analysis of the EuroSCORE multinational database of 19030 patients. Eur J Cardiothorac Surg 1999;15(6):816–822, discussion 822–823
5 Zhang GX, Wang C, Wang L, et al. Validation of EuroSCORE II in Chinese patients undergoing heart valve surgery. Heart Lung Circ 2013;22(8):606–611
6 Di Dedda U, Pelissero G, Agnelli B, De Vincentiis C, Castelvecchio S, Ranucci M. Accuracy, calibration and clinical performance of the new EuroSCORE II risk stratification system. Eur J Cardiothorac Surg 2013;43(1):27–32
7 Nashef SA, Roques F, Sharples LD, et al. EuroSCORE II. Eur J Cardiothorac Surg 2012;41(4):734–744, discussion 744–745
8 Nashef SA, Roques F, Michel P, Gauducheau E, Lemeshow S, Salamon R. European system for cardiac operative risk evaluation (EuroSCORE). Eur J Cardiothorac Surg 1999;16(1):9–13
9 Nashef SA. The current role of EuroSCORE. Semin Thorac Cardiovasc Surg 2012;24(1):11–12
10 Barili F, Pacini D, Capo A, et al. Does EuroSCORE II perform better than its original versions? A multicentre validation study. Eur Heart J 2013;34(1):22–29
11 Gogbashian A, Sedrakyan A, Treasure T. EuroSCORE: a systematic review of international performance. Eur J Cardiothorac Surg 2004;25(5):695–700
12 Nashef SA, Roques F, Hammill BG, et al; EuroSCORE Project Group. Validation of European System for Cardiac Operative Risk Evaluation (EuroSCORE) in North American cardiac surgery. Eur J Cardiothorac Surg 2002;22(1):101–105
13 Ranucci M, Castelvecchio S, Menicanti L, Frigiola A, Pelissero G. Accuracy, calibration and clinical performance of the EuroSCORE: can we reduce the number of variables? Eur J Cardiothorac Surg 2010;37(3):724–729
14 Siregar S, Groenwold RH, de Heer F, Bots ML, van der Graaf Y, van Herwerden LA. Performance of the original EuroSCORE. Eur J Cardiothorac Surg 2012;41(4):746–754
15 Chalmers J, Pullan M, Fabri B, et al. Validation of EuroSCORE II in a modern cohort of patients undergoing cardiac surgery. Eur J Cardiothorac Surg 2013;43(4):688–694
16 Carnero-Alcázar M, Silva Guisasola JA, Reguillo Lacruz FJ, et al. Validation of EuroSCORE II on a single-centre 3800 patient cohort. Interact Cardiovasc Thorac Surg 2013;16(3):293–300
17 Howell NJ, Head SJ, Freemantle N, et al. The new EuroSCORE II does not improve prediction of mortality in high-risk patients undergoing cardiac surgery: a collaborative analysis of two European centres. Eur J Cardiothorac Surg 2013;44(6):1006–1011
18 Kirmani BH, Mazhar K, Fabri BM, Pullan DM. Comparison of the EuroSCORE II and Society of Thoracic Surgeons 2008 risk tools. Eur J Cardiothorac Surg 2013;44(6):999–1005
19 Kunt AG, Kurtcephe M, Hidiroglu M, et al. Comparison of original EuroSCORE, EuroSCORE II and STS risk models in a Turkish cardiac surgical cohort. Interact Cardiovasc Thorac Surg 2013;16(5):625–629
20 Borde D, Gandhe U, Hargave N, Pandey K, Khullar V. The application of European system for cardiac operative risk evaluation II (EuroSCORE II) and Society of Thoracic Surgeons (STS) risk-score for risk stratification in Indian patients undergoing cardiac surgery. Ann Card Anaesth 2013;16(3):163–166
21 Kötting J, Schiller W, Beckmann A, et al. German Aortic Valve Score: a new scoring system for prediction of mortality related to aortic valve procedures in adults. Eur J Cardiothorac Surg 2013;43(5):971–977
22 Poullis M. Introducing change (science into the operating room): quality improvement versus experimentation. J Extra Corpor Technol 2009;41(4):11–15
23 Mihajlović B, Nićin S, Kovacević P, et al. Evaluation of results in coronary surgery using EuroSCORE [in Serbian]. Srp Arh Celok Lek 2011;139(1-2):25–29
24 Mihajlović B, Nićin S, Cemerlić-Adjić N, et al. Trends of risk factors in coronary surgery [in Serbian]. Srp Arh Celok Lek 2010;138(9-10):570–576
25 Grant SW, Grayson AD, Jackson M, et al. Does the choice of risk-adjustment model influence the outcome of surgeon-specific mortality analysis? A retrospective analysis of 14,637 patients under 31 surgeons. Heart 2008;94(8):1044–1049
Copyright of Thoracic & Cardiovascular Surgeon is the property of Georg Thieme Verlag Stuttgart and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.