Original Cardiovascular

Clinical Performance of the EuroSCORE II Compared with the Previous EuroSCORE Iterations

Lazar Velicki1,2, Nada Cemerlic-Adjic1,2, Bogoljub Mihajlovic1,2, Miklos Fabri2, Katica Pavlovic1,2, Bojan B. Mihajlovic2, Dragic Bankovic3

1 Department of Surgery, Medical Faculty, University of Novi Sad, Novi Sad, Serbia
2 Department of Cardiovascular Surgery, Institute of Cardiovascular Diseases Vojvodina, Sremska Kamenica, Serbia
3 Department of Mathematics, State University of Novi Pazar, Novi Pazar, Serbia

Address for correspondence Lazar Velicki, MD, PhD, Institute of Cardiovascular Diseases Vojvodina, Clinic of Cardiovascular Surgery, Put Doktora Goldmana 4, 21204 Sremska Kamenica, Serbia (e-mail: [email protected]).

Thorac Cardiovasc Surg 2014;62:288–297.

Abstract

Background The European System for Cardiac Operative Risk Evaluation (EuroSCORE) II has recently been introduced as an update to the previous versions. We sought to evaluate the predictive performance of the EuroSCORE II model against the original additive and logistic EuroSCORE models.

Patients and Methods The study included 1,247 consecutive patients who underwent cardiac surgery during a 14-month period starting at the beginning of 2012. The original additive and logistic EuroSCORE models were compared with the EuroSCORE II, focusing on the accuracy of predicting hospital mortality.

Results The overall hospital mortality rate was 3.45%. The discriminative power of the EuroSCORE II was modest and similar to that of the other algorithms (C-statistics 0.754 for additive EuroSCORE; 0.759 for logistic EuroSCORE; and 0.743 for EuroSCORE II). The EuroSCORE II significantly underestimated the all-patient hospital mortality (3.45% observed vs. 2.12% predicted), as well as mortality in the valvular (3.74% observed vs. 2% predicted) and combined surgery cohorts (6.87% observed vs. 3.64% predicted). The EuroSCORE II predicted mortality differed significantly from the observed mortality in the third and fourth quartiles of patients stratified according to the EuroSCORE II mortality risk (p < 0.05). The calibration of the EuroSCORE II was generally good for the entire patient population (Hosmer-Lemeshow [HL] p = 0.139), for the valvular surgery subset (HL p = 0.485), and for the combined surgery subset (HL p = 0.639).

Conclusion The EuroSCORE II might be considered a solid predictive tool for hospital mortality. Although the EuroSCORE II employs more sophisticated calculation methods regarding the number and definition of risk factors included, it does not seem to significantly improve on the performance of previous iterations.

Keywords
► cardiac surgery
► risk assessment
► EuroSCORE
► outcome
► mortality
► prediction

Introduction

An important component of modern cardiac surgery practice is that of data recording, collection, and analysis for the purpose of assessing and improving the quality of service, surgical decision-making, and preoperative patient education (informed consent).1 Accurate risk stratification is critical in this endeavor. When using models for this purpose it is vital that the clinician should have formally derived the underlying prediction, know the extent to which their own performance

received August 28, 2013; accepted after revision November 30, 2013; published online April 21, 2014

© 2014 Georg Thieme Verlag KG Stuttgart · New York

DOI http://dx.doi.org/10.1055/s-0034-1367734. ISSN 0171-6425.

This document was downloaded for personal use only. Unauthorized distribution is strictly prohibited.


EuroSCORE II Validation Study

Patients and Methods

Study Design

This study examines the data of 1,247 consecutive patients who underwent major cardiac surgery at our institute over a 14-month period starting at the beginning of 2012. The patients included underwent exclusively one of the following:
1. Isolated coronary surgery (coronary artery bypass grafting and off-pump coronary artery bypass).
2. Isolated valvular surgery (mitral, aortic, and tricuspid surgery, with prosthetic replacements or repairs, through traditional or minimally invasive techniques).
3. Combined coronary and valvular surgery.

All cases were included irrespective of the priority level (elective, urgent, emergency, salvage). The study was approved by the Institutional Review Board.


Operative risks were calculated before surgery with the EuroSCORE II outcome prediction model7 as well as with both the additive and logistic EuroSCORE.3,8 The score of each patient was calculated by a cardiologist and a cardiac surgeon at our center by means of the EuroSCORE calculator embedded in the institution database system. The calculator was implemented with the official logistic regression coefficients and was thoroughly tested.
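As an illustration of how a logistic risk calculator of this kind turns risk factors into a predicted mortality, the following minimal sketch may help. The intercept, the coefficient values, and the risk factor names are hypothetical placeholders chosen for the example; they are not the official EuroSCORE coefficients and this is not the calculator used at our center.

```python
import math


def logistic_risk(intercept, coefficients, risk_factors):
    """Return predicted mortality 1 / (1 + exp(-z)), z = intercept + sum(beta_i * x_i)."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in risk_factors.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values for illustration only (NOT the published coefficients).
HYPOTHETICAL_INTERCEPT = -5.0
HYPOTHETICAL_COEFFICIENTS = {"female": 0.2, "recent_mi": 0.15, "urgent": 0.3}

# One hypothetical patient: female, no recent MI, urgent surgery.
patient = {"female": 1, "recent_mi": 0, "urgent": 1}
risk = logistic_risk(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFFICIENTS, patient)
print(f"Predicted mortality: {100 * risk:.2f}%")
```

With a real model, only the intercept and coefficient table would change; the transformation from the linear predictor to a probability stays the same.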

Outcome Events

The outcome event under consideration was all-cause hospital mortality, defined as death after the cardiac procedure during the index hospitalization irrespective of the cause (cardiac or noncardiac).

Statistical Analysis

Descriptive statistical data are presented for categorical variables as frequencies (percentages) and were compared between the groups by using either the Pearson χ2 test or the Fisher exact test. Continuous variables, expressed as means ± standard deviation, were compared between the groups by using the unpaired Student t-test or the Wilcoxon rank-sum test (depending on the normality of the distribution). A p-value of less than 0.05 was considered significant.

The original EuroSCORE models and the EuroSCORE II were evaluated by comparing the observed and expected hospital mortality. The calibration of the models was assessed by using the Hosmer-Lemeshow (HL) test; a well-calibrated model gives a p-value greater than 0.05. Model discrimination was tested by means of receiver operating characteristic (ROC) curves, calculating the area under the curve (AUC) and its 95% confidence interval (CI). The AUC is an index of how well the model discriminates between survivors and nonsurvivors. Cumulative sum (CUSUM) control charts were constructed for a visual analysis of the models.

The accuracy and clinical performance of all EuroSCORE versions were tested in patient subgroups based on the type of cardiac operation and in subgroups based on risk categorization (quartiles of distribution according to the EuroSCORE II). The statistical analyses were performed with SPSS version 19.0 (SPSS Inc., Chicago, Illinois, United States) and MedCalc for Windows, version 12.2.1 (MedCalc Software, Mariakerke, Belgium).
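The discrimination index described above can be sketched in a few lines. The AUC equals the probability that a randomly chosen nonsurvivor was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name and the toy data below are assumptions made for illustration; the analyses in this study were run in SPSS and MedCalc, not with this code.

```python
def auc_by_pairs(predicted, died):
    """AUC as the share of (nonsurvivor, survivor) pairs ranked correctly; ties count 0.5."""
    nonsurvivors = [p for p, d in zip(predicted, died) if d == 1]
    survivors = [p for p, d in zip(predicted, died) if d == 0]
    if not nonsurvivors or not survivors:
        raise ValueError("need at least one nonsurvivor and one survivor")
    score = 0.0
    for risk_dead in nonsurvivors:
        for risk_alive in survivors:
            if risk_dead > risk_alive:
                score += 1.0
            elif risk_dead == risk_alive:
                score += 0.5
    return score / (len(nonsurvivors) * len(survivors))


# Toy data (hypothetical): predicted risks and outcomes (1 = died in hospital).
toy_predicted = [0.01, 0.02, 0.02, 0.05, 0.10, 0.30]
toy_died = [0, 0, 1, 0, 1, 1]
toy_auc = auc_by_pairs(toy_predicted, toy_died)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to the reference line of the ROC plot (no discrimination), and 1.0 to perfect separation of survivors and nonsurvivors.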

Results

A total of 1,247 cardiac procedures were performed at our institution between January 2012 and March 2013, including 718 myocardial revascularizations (57.58%), 294 isolated valve surgeries (23.58%), and 233 combined valve and coronary procedures (18.68%). The mean values and standard deviations of the additive EuroSCORE, logistic EuroSCORE, and EuroSCORE II of the patient population were 4.43 ± 2.92, 5.27 ± 6.58, and 2.12 ± 2.78, respectively. The patient-related and surgery-related data are summarized in ►Table 1. The overall hospital mortality rate was 3.45%, giving an observed to expected (O:E) EuroSCORE II ratio

Thoracic and Cardiovascular Surgeon Vol. 62 No. 4/2014


is reflected in the prediction, and adjust the estimate up or down for important risk factors not captured in the prediction model.1–3

Numerous risk models for predicting postoperative mortality following major cardiac surgery are in common use, one of the more popular being the European System for Cardiac Operative Risk Evaluation (EuroSCORE).3,4 The EuroSCORE, in both its additive and logistic forms, has been used extensively over the last decade for outcome prediction and hospital performance benchmarking.5 The general consensus is that the model shows a good level of accuracy, with a C-statistic of around 0.75 to 0.80, but could benefit from improvement or recalibration, especially in high-risk patients.6

Recently, a new iteration of the EuroSCORE has been presented: the EuroSCORE II.7 The EuroSCORE II was designed on the basis of the preoperative data of more than 22,000 patients (mostly European), the type of surgery performed, and the corresponding outcome. The model incorporates additional factors and removes or clarifies several existing ones, but it mainly represents a recalibration intended to reflect today's outcomes in cardiac surgery. The internal validation shows an improved C-statistic compared with the previous logistic EuroSCORE model (C-index 0.81 for the EuroSCORE II vs. 0.78 for the EuroSCORE) and good calibration (Hosmer-Lemeshow χ2 = 15.48; p = 0.0505).7

To comprehensively assess the role of the EuroSCORE II and to confirm its applicability in contemporary cardiac surgery practice, external validation is mandated. Moreover, external validation is needed to assess the service provided by a specific hospital, which should be aligned with the "gold standard," that is, the EuroSCORE II. The EuroSCORE II, being a model developed from a large multinational cohort, might therefore well be considered a reference group incorporating different levels of outcomes.

Before the EuroSCORE II can be accepted as an appropriate risk model, it requires validation in the population in which it is intended to be used.2 We sought to evaluate the predictive performance of the EuroSCORE II model against the original additive and logistic EuroSCORE risk models in a contemporary set of patients undergoing heart surgery at our center.

Velicki et al.


Table 1 Patient profile

Variable | All patients (n = 1,247) | Coronary surgery (n = 718) | Valvular surgery (n = 294) | Combined surgery (n = 233)
Age (y) | 64.00 (57–70) | 63.00 (57–69) | 63.00 (55–71) | 67.00 (61–73)
Gender (female) | 399 (32) | 198 (27.58) | 132 (44.90) | 67 (28.75)
Body mass index (kg/m2) | 27.69 (25.03–30.47) | 27.72 (25.18–30.44) | 27.68 (24.49–30.96) | 27.68 (25.22–30.30)
Body surface area (m2) | 1.94 (1.81–2.09) | 1.97 (1.82–2.10) | 1.91 (1.76–2.07) | 1.94 (1.81–2.07)
IDDM | 130 (10.43) | 85 (11.83) | 13 (4.42) | 32 (13.73)
NIDDM | 181 (14.51) | 117 (16.29) | 34 (11.57) | 30 (12.88)
Left ventricle ejection fraction (%) | 55.00 (46–60) | 55.00 (46–60) | 60.00 (50–63) | 54 (40–60)
Unstable angina | 112 (8.98) | 102 (14.21) | 8 (2.72) | 2 (0.86)
Recent MI (< 90 d) | 203 (16.28) | 169 (23.54) | 3 (1.02) | 30 (12.88)
Renal dysfunction | 13 (1.04) | 5 (0.70) | 4 (1.36) | 4 (1.72)
Previous heart surgery | 25 (2.00) | 10 (1.39) | 11 (3.74) | 4 (1.72)
COPD | 80 (6.42) | 40 (5.57) | 22 (7.48) | 18 (7.73)
Peripheral vascular disease | 175 (14.03) | 96 (13.37) | 24 (8.16) | 54 (23.18)
LMCA stenosis | 125 (10.02) | 106 (14.76) | 0 (0) | 19 (8.15)
3VD | 522 (41.86) | 435 (60.58) | 0 (0) | 87 (37.34)
Previous PCI | 131 (10.51) | 101 (14.07) | 9 (3.06) | 21 (9.01)
Urgent surgery | 29 (2.33) | 24 (3.34) | 2 (0.68) | 3 (1.29)
OPCAB | 15 (1.2) | 12 (1.7) | 1 (0.3) | 2 (0.9)
Number of distal anastomoses | 2 (1–3) | 2 (2–3) | 0 (0) | 2 (1–2)
Aortic X-clamp time (min) | 60 (44–81) | 50 (36–65) | 60 (56–86) | 54 (72–110)
CPB time (min) | 71 (54–94) | 59 (47–74) | 81 (67–102) | 108 (86–126)
Hospital mortality | 43 (3.45) | 16 (2.23) | 11 (3.74) | 16 (6.87)

Abbreviations: 3VD, three vessel disease; COPD, chronic obstructive pulmonary disease; CPB, cardiopulmonary bypass; d, days; IDDM, insulin-dependent diabetes mellitus; LMCA, left main coronary artery; MI, myocardial infarction; min, minutes; NIDDM, noninsulin-dependent diabetes mellitus; OPCAB, off-pump coronary artery bypass; PCI, percutaneous coronary intervention; y, years.
Note: Categorical variables are shown as n (%); continuous variables are shown as median (25th–75th percentile).

of 1.63. The mortality rates in coronary, valvular, and combined surgery were 2.23, 3.74, and 6.87%, respectively. ►Table 2 summarizes the comparison between the predicted and observed mortality rates across the different types of surgery. There was a significant difference between the observed and predicted all-patient hospital mortality with the logistic EuroSCORE (overestimation) and the EuroSCORE II (underestimation). The logistic EuroSCORE significantly overestimated the mortality risk in the coronary surgery cohort. In the valvular and combined surgery cohorts, the EuroSCORE II significantly underestimated the mortality risk.

The performance of the models with respect to discriminative power and calibration is presented in ►Table 3. The accuracy of the models is comparable, with the highest AUC variation in the valvular surgery subset. The EuroSCORE II was outperformed in every surgery-type subset except combined surgery, where it demonstrated a marginally improved AUC (0.678 vs. 0.649). The lower limit of the AUC 95% CI of the ROC curves for the combined surgery group was well below 0.6, indicating poor discriminative capacity. No improvement in discriminative capacity was observed with the EuroSCORE II for the entire sample or for the remaining surgical subgroups (coronary and valvular). ►Table 3 also shows the comparison of the χ2 HL test for the pathology subgroups. The HL statistics demonstrated a significant overall lack of calibration of the EuroSCORE II model in coronary surgery (p = 0.035). ROC curves and CUSUM curves are presented in ►Figs. 1–4.

The clinical performance of the EuroSCORE II was tested in patient populations at different levels of predicted mortality risk (►Fig. 5). The observed population was divided into quartiles according to the EuroSCORE II, and the observed and predicted mortality were compared for each of the models considered. In low-risk patients (1st quartile, predicted risk 0–0.79%), all the models performed well. In mild-risk patients (2nd quartile, predicted risk 0.8–1.27%), only the EuroSCORE II performed well, while the additive and logistic EuroSCORE significantly overestimated the mortality risk (p = 0.013 and 0.029, respectively). In moderate-risk patient


Table 2 Predicted and observed hospital mortality rate through different types of surgery

Model | PM (%) | MR | p-Value
All patients (OM = 3.45%)
Additive EuroSCORE | 4.43 | 0.78 | 0.092
Logistic EuroSCORE | 5.27 | 0.65 | 0.002
EuroSCORE II | 2.13 | 1.63 | 0.004
Coronary surgery (OM = 2.23%)
Additive EuroSCORE | 3.48 | 0.67 | 0.067
Logistic EuroSCORE | 3.70 | 0.63 | 0.037
EuroSCORE II | 1.67 | 1.39 | 0.243
Valvular surgery (OM = 3.74%)
Additive EuroSCORE | 4.59 | 0.81 | 0.487
Logistic EuroSCORE | 4.89 | 0.76 | 0.361
EuroSCORE II | 2.00 | 1.87 | 0.033
Combined surgery (OM = 6.87%)
Additive EuroSCORE | 7.13 | 0.96 | 0.876
Logistic EuroSCORE | 10.53 | 0.65 | 0.069
EuroSCORE II | 3.65 | 1.89 | 0.009

Abbreviations: EuroSCORE, European System for Cardiac Operative Risk Evaluation; MR, mortality risk; OM, observed mortality; PM, predicted mortality.
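The MR column of Table 2 appears to correspond to the observed mortality divided by the predicted mortality (the O:E ratio mentioned in the Results). The short sketch below recomputes it for the valvular surgery rows under that assumption; the function and variable names are illustrative only.

```python
def observed_to_expected(observed_pct, predicted_pct):
    """O:E ratio; values above 1 mean the model underestimates mortality."""
    if predicted_pct <= 0:
        raise ValueError("predicted mortality must be positive")
    return observed_pct / predicted_pct


# Valvular surgery rows of Table 2 (OM = 3.74%).
valvular_observed = 3.74
valvular_predicted = {
    "Additive EuroSCORE": 4.59,
    "Logistic EuroSCORE": 4.89,
    "EuroSCORE II": 2.00,
}
valvular_ratios = {
    model: round(observed_to_expected(valvular_observed, predicted), 2)
    for model, predicted in valvular_predicted.items()
}
for model, ratio in valvular_ratios.items():
    print(f"{model}: O:E = {ratio}")
```

A ratio below 1 indicates overestimation of risk by the model, and a ratio above 1 indicates underestimation.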

groups (3rd quartile, predicted risk 1.28–2.34%), the additive and logistic EuroSCORE performed reasonably well, but the EuroSCORE II significantly underestimated the mortality (p = 0.013). In high-risk patients (4th quartile, predicted risk > 2.35%), the EuroSCORE II again significantly underestimated the mortality (p = 0.024), with the other scores performing well, although given the small numbers in the high-risk group it might be difficult to draw such a conclusion.
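The quartile stratification described above can be sketched as follows: patients are split at the quartile cut points of the predicted risk, and the mean predicted and observed mortality are compared within each group. The toy data, the function name, and the cut point method (the default of Python's `statistics.quantiles`) are assumptions for illustration; the cut points reported in this study come from our own cohort and from the software named in the Statistical Analysis section.

```python
import statistics


def quartile_summary(predicted, died):
    """Return one (n, mean predicted %, observed %) tuple per risk quartile."""
    cuts = statistics.quantiles(predicted, n=4)  # three cut points
    buckets = [[] for _ in range(4)]
    for risk, outcome in zip(predicted, died):
        index = sum(risk > cut for cut in cuts)  # quartile index 0..3
        buckets[index].append((risk, outcome))
    summary = []
    for bucket in buckets:
        n = len(bucket)
        if n == 0:
            summary.append((0, None, None))
            continue
        mean_predicted = 100.0 * sum(risk for risk, _ in bucket) / n
        observed = 100.0 * sum(outcome for _, outcome in bucket) / n
        summary.append((n, mean_predicted, observed))
    return summary


# Toy cohort (hypothetical): predicted risks and outcomes (1 = died in hospital).
toy_predicted = [0.005, 0.007, 0.009, 0.012, 0.015, 0.020, 0.030, 0.080]
toy_died = [0, 0, 0, 0, 0, 1, 0, 1]
toy_summary = quartile_summary(toy_predicted, toy_died)
for quartile, (n, predicted_pct, observed_pct) in enumerate(toy_summary, start=1):
    print(quartile, n, predicted_pct, observed_pct)
```

In a real cohort each quartile would then be tested for a difference between observed and predicted mortality; the sketch stops at the descriptive comparison.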

Discussion

Preoperative risk prediction models have a critical role in current cardiac surgical practice.9 In recent years, the additive and logistic EuroSCORE models have been widely used as risk prediction tools for adult cardiac surgery, especially in Europe.10 Owing to ongoing improvements in surgical practice and perioperative management, and to a changing patient profile, both the discriminative power and the calibration of the EuroSCORE have been observed to decline.11–13 This is especially the case for EuroSCORE calibration, which consistently leads to high-grade overprediction, as clearly demonstrated in a recent systematic review.14 The other observation that must be considered is that surgical mortality has progressively declined in all risk categories over time. Therefore, when assessing the performance of the model in any single surgical practice, one must consider

Table 3 Model performance

Model and measure | All patients (n = 1,247) | Coronary surgery (n = 718) | Valvular surgery (n = 294) | Combined surgery (n = 233)
Additive EuroSCORE
AUC (95% CI) | 0.754 (0.684–0.823) | 0.735 (0.619–0.850) | 0.781 (0.641–0.921) | 0.649 (0.516–0.783)
HL test (χ2, p-value) | 4.822, 0.903 | 7.832, 0.251 | 4.859, 0.562 | 8.633, 0.280
Logistic EuroSCORE
AUC (95% CI) | 0.759 (0.688–0.830) | 0.742 (0.627–0.857) | 0.789 (0.647–0.932) | 0.651 (0.520–0.782)
HL test (χ2, p-value) | 18.954, 0.015 | 12.919, 0.115 | 10.923, 0.206 | 6.227, 0.622
EuroSCORE II
AUC (95% CI) | 0.743 (0.666–0.820) | 0.721 (0.578–0.864) | 0.730 (0.570–0.890) | 0.678 (0.531–0.824)
HL test (χ2, p-value) | 12.295, 0.139 | 16.577, 0.035 | 7.487, 0.485 | 6.070, 0.639

Abbreviations: AUC, area under the curve; CI, confidence interval; EuroSCORE, European System for Cardiac Operative Risk Evaluation; HL, Hosmer-Lemeshow.
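For readers who wish to see how an HL statistic and its p-value relate, a minimal sketch is given below. It assumes equal-sized groups of patients ordered by predicted risk and a χ2 reference distribution with (groups − 2) degrees of freedom; the grouping used by SPSS may differ in how ties and group boundaries are handled, and the function names are illustrative. With 8 degrees of freedom, the sketch reproduces the EuroSCORE II p-values of Table 3 for all patients and for coronary surgery from their χ2 values.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow_statistic(predicted, died, groups=10):
    """Sum over risk-ordered groups of (O - E)^2 / (n * p * (1 - p))."""
    pairs = sorted(zip(predicted, died))
    n_total = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n_total // groups:(g + 1) * n_total // groups]
        if not chunk:
            continue
        expected = sum(risk for risk, _ in chunk)
        observed = sum(outcome for _, outcome in chunk)
        mean_risk = expected / len(chunk)
        if 0.0 < mean_risk < 1.0:
            statistic += (observed - expected) ** 2 / (
                len(chunk) * mean_risk * (1.0 - mean_risk))
    return statistic


# Table 3, EuroSCORE II: chi-square 12.295 (all patients) and 16.577 (coronary).
p_all = chi2_upper_tail_even_df(12.295, 8)
p_coronary = chi2_upper_tail_even_df(16.577, 8)
print(round(p_all, 3), round(p_coronary, 3))
```

The additive EuroSCORE rows of Table 3 do not fit 8 degrees of freedom, which is consistent with fewer distinct risk groups being formed for a score with few distinct values; this is an inference from the numbers, not a statement of the grouping actually used.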


Coronary surgery (OM = 2.23%)

Fig. 1 ROC curve analysis (A) and CUSUM chart (B) for the entire sample regardless of the type of surgery. Panel A plots sensitivity against 1 − specificity for the additive EuroSCORE, logistic EuroSCORE, and EuroSCORE II, with a reference line. Panel B plots hospital mortality (number of patients) against patient sequence number for the study cohort and the three models. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.
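A chart of the kind shown in panel B can be approximated by accumulating, in patient sequence order, the observed deaths and the deaths expected under each model. The sketch below assumes that construction (the exact charting procedure is not detailed in the text) and uses a hypothetical five-patient sequence.

```python
from itertools import accumulate


def cusum_curves(died, predicted):
    """Return cumulative observed deaths and cumulative expected deaths per patient."""
    observed_curve = list(accumulate(died))
    expected_curve = list(accumulate(predicted))
    return observed_curve, expected_curve


# Hypothetical sequence of five consecutive patients (1 = died in hospital).
toy_died = [0, 0, 1, 0, 1]
toy_predicted = [0.02, 0.05, 0.10, 0.03, 0.20]
toy_observed_curve, toy_expected_curve = cusum_curves(toy_died, toy_predicted)

# A positive final gap means more deaths occurred than the model expected.
toy_gap = toy_observed_curve[-1] - toy_expected_curve[-1]
print(toy_observed_curve, round(toy_gap, 2))
```

Plotting both curves against the patient sequence number shows where along the series a model starts to drift above or below the observed mortality.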

not only the performance of the model but the performance of the surgical team as well.

The EuroSCORE II is the latest iteration of the original EuroSCORE model. The model is based on logistic regression and incorporates additional factors, but it also removes or more clearly defines previously used ones. It was devised around a more recent dataset with the intent of closing the gap between the observed and predicted mortality. The EuroSCORE II was developed on a consecutive cohort of 16,828 patients, and its validity was estimated in another cohort of 5,553 patients, demonstrating excellent discriminative power (AUC 0.81, 95% CI 0.78–0.83).7 However, the internal validation did not demonstrate significantly better calibration.

In this study, we tried to validate the EuroSCORE II model on a cohort of recent patients referred for major cardiac surgery in a single center in Serbia. Compared with the population used to derive the EuroSCORE II, our sample comprised more patients with diabetes requiring insulin (10.4 vs. 7.6%) and fewer patients with pulmonary diseases (6.4 vs. 10.7%). Our study confirmed the unsatisfactory calibration of old and new versions of the EuroSCORE (HL statistics p-values less than 0.05) with a diverse pattern of miscalibration, as highlighted in ►Table 3. Although the EuroSCORE overestimated and the EuroSCORE II underestimated the mortality in our data sample, both were sufficiently accurate to warrant their application, perhaps with some caution. The conclusions drawn from our data sample analysis are:
1. The EuroSCORE II has accuracy similar to that of its previous versions, with the older EuroSCORE potentially outperforming the EuroSCORE II in isolated coronary and isolated valvular patients.
2. The calibration of the EuroSCORE II was generally good for the entire patient population and for the valvular and combined surgery subsets in particular.
3. In moderate- and high-risk patients, the EuroSCORE II may actually underestimate the hospital mortality risk, although no definite conclusion can be reached.
4. The EuroSCORE II shows clear benefit over previous iterations when used for risk assessment in combined surgery groups of patients, or when evaluating the risk for the entire group of patients regardless of the surgery type.
5. In brief, our sample data indicate that the EuroSCORE overestimates while the EuroSCORE II underestimates the hospital mortality. Neither model shows perfect alignment with the real-patient data in our hospital, although the EuroSCORE II should preferably be used, keeping in mind the constraints and limitations mentioned above.

The San Donato group has already reported similar EuroSCORE II validation results.6 In their study, which included a total of 1,090 consecutive patients, the authors found that the accuracy of the EuroSCORE II was good (AUC 0.81) but not significantly higher than that of the other scores. In patients at low, mild to moderate, and high mortality risk, the EuroSCORE II provided a risk prediction not significantly different from the observed mortality rate, whereas in very high-risk patients (observed mortality rate 11%), it significantly underestimated (6.5%) the mortality risk. The authors concluded that the accuracy of the EuroSCORE II was acceptable in isolated coronary surgery, and good or excellent with other surgical procedure types.

The Liverpool group focused on the assessment of the clinical performance of the EuroSCORE II in different surgery subsets of patients.15 The authors found that the EuroSCORE II is a reasonable risk model for hospital mortality from isolated coronary surgery (AUC 0.79; HL p = 0.052) and aortic procedures (AUC 0.81; HL p = 0.43), and excellent for mitral valve surgery (AUC 0.87; HL p = 0.6). However, the EuroSCORE II failed to improve on the original EuroSCORE model for isolated aortic valve replacements (AUC 0.69; HL

Fig. 2 ROC curve analysis (A) and CUSUM chart (B) for the coronary surgery subset. Panel A plots sensitivity against 1 − specificity for the additive EuroSCORE, logistic EuroSCORE, and EuroSCORE II, with a reference line. Panel B plots hospital mortality (number of patients) against patient sequence number for the study cohort and the three models. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.


Fig. 3 ROC curve analysis (A) and CUSUM chart (B) for the valvular surgery subset. Panel A plots sensitivity against 1 − specificity for the additive EuroSCORE, logistic EuroSCORE, and EuroSCORE II, with a reference line. Panel B plots hospital mortality (number of patients) against patient sequence number for the study cohort and the three models. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.

p = 0.07).

Another EuroSCORE II external validation study, including 4,342 patients, has recently been published.16 The AUC for the EuroSCORE (0.82, 95% CI 0.79–0.85) was lower than that for the EuroSCORE II (0.85, 95% CI 0.83–0.87), but the difference was not statistically significant (p = 0.056). The two models showed poor calibration in the sample: EuroSCORE (χ2 = 39.3, HL p < 0.001) and EuroSCORE II (χ2 = 86.69, HL p < 0.001). The calibration of the EuroSCORE was poor in the groups of patients undergoing coronary (HL p = 0.01), valve (HL p = 0.01), and combined coronary valve surgery (HL p = 0.012), and that of the EuroSCORE II in the groups of coronary (HL p = 0.001) and valve surgery (HL p < 0.001) patients.

Another recent study focused on EuroSCORE II validation in Chinese patients undergoing isolated valve surgery.5 The discriminative power of the EuroSCORE II model was good for the single valve surgery group (AUC 0.792) and poor for the multiple valve surgery group (AUC 0.605). The EuroSCORE II model showed good calibration in predicting hospital mortality for patients undergoing single valve surgery (HL p = 0.103) and poor calibration for patients undergoing multiple valve surgery (HL p < 0.0001).

One of the largest multicenter studies evaluating the accuracy and performance of the EuroSCORE II published to date involved 12,325 consecutive patients in Italy.10 The authors reported a hospital mortality rate of 2.2%. The discriminatory power was high and similar in all models (AUC 0.82, 95% CI 0.79–0.84 for additive EuroSCORE; 0.82, 95% CI 0.79–0.84 for logistic EuroSCORE; 0.82, 95% CI 0.80–0.85 for EuroSCORE II). The EuroSCORE II had fair calibration up to 30% predicted risk and overpredicted beyond that. Finally, they concluded that the EuroSCORE II does not seem to significantly improve on the performance of older versions in the higher tertiles of risk.

A study specifically designed to assess the accuracy of the EuroSCORE II in outcome prediction in high-risk patients was also recently published.17 The authors identified a cohort totaling 933 patients with a preoperative logistic EuroSCORE


Fig. 4 ROC curve analysis (A) and CUSUM chart (B) for the combined surgery subset. Panel A plots sensitivity against 1 − specificity for the additive EuroSCORE, logistic EuroSCORE, and EuroSCORE II, with a reference line. Panel B plots hospital mortality (number of patients) against patient sequence number for the study cohort and the three models. CUSUM, cumulative sum control chart; ROC, receiver operating characteristic.

≥ 10 from two European institutions. The hospital mortality rate was 9.7%. None of the EuroSCORE models performed well, with an AUC of 0.67 for the additive EuroSCORE and the EuroSCORE II, and 0.66 for the logistic EuroSCORE. Model calibration was poor for the EuroSCORE II (χ2 = 16.5; p = 0.035). The authors concluded that the key problem of risk stratification in high-risk patients has not been successfully addressed by the EuroSCORE II.

Several recently published articles compared the EuroSCORE II with another widely accepted model, the Society of Thoracic Surgeons (STS) risk assessment tool. In a large-scale retrospective study, Kirmani et al18 reported that the EuroSCORE II and the STS provide equivalent discrimination in predicting mortality (AUC 0.818 vs. 0.805, respectively, p = 0.343), as well as good calibration for patients with low to moderate risk, with divergence from 15% predicted risk. However, other reports suggest adequate calibration of both the EuroSCORE II and the STS model with unsatisfactory discriminative power.19,20

It is natural to assume that a procedure-specific risk model, which allows the incorporation of risk factors specific to individual procedures, may be more appropriate for risk prediction. This is especially the case in the subset of patients undergoing valvular surgery. Superb model performance is critical when deciding on the treatment strategy. Based on the risk level, patients might be referred for either surgical or transcatheter aortic valve implantation. Given the importance of these issues, and the fact that the EuroSCORE II was developed on a dataset consisting mainly of coronary procedures, one might argue that it may be less well adapted to aortic procedures than a specific score. The German aortic valve score was recently introduced for the specific purpose of mortality prediction related to aortic valve procedures in adults.21 A total of 11,794 cases were included in model development, which resulted in the identification of 15 risk factors


Fig. 5 The observed versus predicted mortality rate for the EuroSCORE II and the other scores considered, according to the quartile distribution. The chart shows the mortality rate in quartiles 1 to 4 for the additive EuroSCORE, logistic EuroSCORE, EuroSCORE II, and observed mortality. The EuroSCORE II predicted mortality significantly differed from observed mortality in the third and the fourth quartile (p < 0.05). EuroSCORE, European System for Cardiac Operative Risk Evaluation.

based on multiple logistic regression. After the internal validation of the model, good calibration was observed (p = 0.776) with rather acceptable discrimination (AUC 0.808).

It is clear from our study, as well as from the studies cited above, that every EuroSCORE algorithm has intrinsic limitations that affect its ability to accurately predict the outcome. Demographic, institutional, and individual variations in practice may contribute to the divergence of observed versus expected mortality.22 The EuroSCORE II was designed based on data compiled from more than 150 institutions, mostly European centers. Such a large-scale project has to account for demographic diversity and natural variations in both clinical practice and technological advancement. It is not surprising that the performance of individual regional institutions such as ours is not entirely aligned with a model generated using international data.15 It is for these reasons that considerable variations in the clinical performance of the EuroSCORE II have been reported.

The EuroSCORE II is a reflection of continuous improvement in everyday clinical practice and mainly represents a recalibration aimed at replicating today's outcomes in cardiac surgery. In our two previous articles,23,24 we reported on the constant improvement of the service provided in our hospital, with a coronary surgery mortality of 2.7% in 2001 and of 1.5% in 2008. These results were achieved despite the trend of a constant increase in the average number of risk factors present as well as an increase in the average EuroSCORE. We believe that the EuroSCORE II should be externally validated in every hospital where it is applied, to gain important insights into the patient subpopulations in which reasonable results can be achieved. This is especially important for calibration, more so than for discriminative power.

A predictive model which overestimates the level of risk falsely inspires confidence.25 In a corresponding manner, with a model which underestimates risk, differences between observed and expected mortalities might rather be seen as a result of the strength or weakness of the reporting hospital instead of score deficits.

The authors could not find any published studies evaluating the EuroSCORE II in predicting significant complications or long-term survival, which is an equally important consideration. It would also be interesting to see how well the EuroSCORE II performs in outcome prediction for patients undergoing percutaneous coronary intervention.

Several limitations of the current study should be noted. The main limitation is the relatively small sample size, which in turn further limits the groups available for any subset analysis. As such, other (external) and possibly larger series would be warranted. Although designed as a consecutive observational study, it still reflects the experience of a single healthcare institution and may not represent national and international practice and outcomes. This may introduce bias, and the results require further examination in a larger number of patients from a multicenter database.

In conclusion, the EuroSCORE II predicts hospital mortality with satisfactory results. Although a more sophisticated approach was used regarding the number and definitions of risk factors included, it does not seem to significantly improve on the performance of previous iterations. The results of this study show that, despite solid discriminative capacity, the calibration of the EuroSCORE II is potentially troublesome for isolated coronary surgery.

Note This paper was supported by the Provincial Secretariat for Science and Technological Development of the Autonomous Province of Vojvodina (Serbia) (Grant number 114–451–2131/2011).

Conflict of Interest None.


References

1 Takkenberg JJ, Kappetein AP, Steyerberg EW. The role of EuroSCORE II in 21st century cardiac surgery practice. Eur J Cardiothorac Surg 2013;43(1):32–33
2 Grant SW, Hickey GL, Dimarakis I, et al. How does EuroSCORE II perform in UK cardiac surgery; an analysis of 23 740 patients from the Society for Cardiothoracic Surgery in Great Britain and Ireland National Database. Heart 2012;98(21):1568–1572
3 Roques F, Michel P, Goldstone AR, Nashef SA. The logistic EuroSCORE. Eur Heart J 2003;24(9):881–882
4 Roques F, Nashef SA, Michel P, et al. Risk factors and outcome in European cardiac surgery: analysis of the EuroSCORE multinational database of 19030 patients. Eur J Cardiothorac Surg 1999;15(6):816–822, discussion 822–823
5 Zhang GX, Wang C, Wang L, et al. Validation of EuroSCORE II in Chinese patients undergoing heart valve surgery. Heart Lung Circ 2013;22(8):606–611
6 Di Dedda U, Pelissero G, Agnelli B, De Vincentiis C, Castelvecchio S, Ranucci M. Accuracy, calibration and clinical performance of the new EuroSCORE II risk stratification system. Eur J Cardiothorac Surg 2013;43(1):27–32
7 Nashef SA, Roques F, Sharples LD, et al. EuroSCORE II. Eur J Cardiothorac Surg 2012;41(4):734–744, discussion 744–745
8 Nashef SA, Roques F, Michel P, Gauducheau E, Lemeshow S, Salamon R. European system for cardiac operative risk evaluation (EuroSCORE). Eur J Cardiothorac Surg 1999;16(1):9–13
9 Nashef SA. The current role of EuroSCORE. Semin Thorac Cardiovasc Surg 2012;24(1):11–12
10 Barili F, Pacini D, Capo A, et al. Does EuroSCORE II perform better than its original versions? A multicentre validation study. Eur Heart J 2013;34(1):22–29
11 Gogbashian A, Sedrakyan A, Treasure T. EuroSCORE: a systematic review of international performance. Eur J Cardiothorac Surg 2004;25(5):695–700
12 Nashef SA, Roques F, Hammill BG, et al; EuroSCORE Project Group. Validation of European System for Cardiac Operative Risk Evaluation (EuroSCORE) in North American cardiac surgery. Eur J Cardiothorac Surg 2002;22(1):101–105
13 Ranucci M, Castelvecchio S, Menicanti L, Frigiola A, Pelissero G. Accuracy, calibration and clinical performance of the EuroSCORE: can we reduce the number of variables? Eur J Cardiothorac Surg 2010;37(3):724–729
14 Siregar S, Groenwold RH, de Heer F, Bots ML, van der Graaf Y, van Herwerden LA. Performance of the original EuroSCORE. Eur J Cardiothorac Surg 2012;41(4):746–754
15 Chalmers J, Pullan M, Fabri B, et al. Validation of EuroSCORE II in a modern cohort of patients undergoing cardiac surgery. Eur J Cardiothorac Surg 2013;43(4):688–694
16 Carnero-Alcázar M, Silva Guisasola JA, Reguillo Lacruz FJ, et al. Validation of EuroSCORE II on a single-centre 3800 patient cohort. Interact Cardiovasc Thorac Surg 2013;16(3):293–300
17 Howell NJ, Head SJ, Freemantle N, et al. The new EuroSCORE II does not improve prediction of mortality in high-risk patients undergoing cardiac surgery: a collaborative analysis of two European centres. Eur J Cardiothorac Surg 2013;44(6):1006–1011
18 Kirmani BH, Mazhar K, Fabri BM, Pullan DM. Comparison of the EuroSCORE II and Society of Thoracic Surgeons 2008 risk tools. Eur J Cardiothorac Surg 2013;44(6):999–1005
19 Kunt AG, Kurtcephe M, Hidiroglu M, et al. Comparison of original EuroSCORE, EuroSCORE II and STS risk models in a Turkish cardiac surgical cohort. Interact Cardiovasc Thorac Surg 2013;16(5):625–629
20 Borde D, Gandhe U, Hargave N, Pandey K, Khullar V. The application of European system for cardiac operative risk evaluation II (EuroSCORE II) and Society of Thoracic Surgeons (STS) risk-score for risk stratification in Indian patients undergoing cardiac surgery. Ann Card Anaesth 2013;16(3):163–166
21 Kötting J, Schiller W, Beckmann A, et al. German Aortic Valve Score: a new scoring system for prediction of mortality related to aortic valve procedures in adults. Eur J Cardiothorac Surg 2013;43(5):971–977
22 Poullis M. Introducing change (science into the operating room): quality improvement versus experimentation. J Extra Corpor Technol 2009;41(4):11–15
23 Mihajlović B, Nićin S, Kovacević P, et al. Evaluation of results in coronary surgery using EuroSCORE [in Serbian]. Srp Arh Celok Lek 2011;139(1-2):25–29
24 Mihajlović B, Nićin S, Cemerlić-Adjić N, et al. Trends of risk factors in coronary surgery [in Serbian]. Srp Arh Celok Lek 2010;138(9-10):570–576
25 Grant SW, Grayson AD, Jackson M, et al. Does the choice of risk-adjustment model influence the outcome of surgeon-specific mortality analysis? A retrospective analysis of 14,637 patients under 31 surgeons. Heart 2008;94(8):1044–1049

Copyright of Thoracic & Cardiovascular Surgeon is the property of Georg Thieme Verlag Stuttgart and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.
