J Clin Monit Comput DOI 10.1007/s10877-014-9629-8

ORIGINAL RESEARCH

Effect of concurrent oxygen therapy on accuracy of forecasting imminent postoperative desaturation

Hisham ElMoaqet • Dawn M. Tilbury • Satya Krishna Ramachandran



Received: 21 May 2014 / Accepted: 12 October 2014 © Springer Science+Business Media New York 2014

Abstract Episodic postoperative desaturation occurs predominantly from respiratory depression or airway obstruction. Monitor display of desaturation is typically delayed by over 30 s after these dynamic inciting events, due to perfusion delays, signal capture and averaging. Prediction of imminent critical desaturation could aid development of dynamic high-fidelity response systems that reduce or prevent the inciting event from occurring. Oxygen therapy is known to influence the depth and duration of desaturation epochs, thereby potentially influencing the accuracy of forecasting of desaturation. In this study, postoperative pulse oximetry data were retrospectively modeled using autoregressive methods to create prediction models for SpO2 and imminent critical desaturation in the postoperative period. The accuracy of these models in predicting near-future SpO2 values was tested using root mean square error. The model accuracy for prediction of critical desaturation (SpO2 ≤ 89 %) was evaluated using meta-analytical methods (sensitivity, specificity, likelihood ratios, diagnostic odds ratios and area under summary receiver operating characteristic curves). Between-study heterogeneity was used as a measure of reliability of the model across different patients and evaluated using the tau-squared statistic. Model performance was evaluated in 20 patients who received postoperative oxygen supplementation and 20 patients who did not receive oxygen. Our results show that model accuracy was high, with root mean square errors between 0.2 and 2.8 %. Prediction accuracy as defined by area under the curve for critical desaturation events was observed to be greater in patients receiving oxygen in the 60-s horizon (0.95 ± 0.04 vs. 0.76 ± 0.16). This was likely related to the higher frequency of events in this group (median [IQR] 133.0 [31.5, 508.2]) than in patients who were not treated with oxygen (0 [0, 110]; p < 0.001). Model reliability was reflected by the homogeneity of the prediction models, which were homogeneous across both prediction horizons and oxygen treatment groups. In conclusion, we report the use of autoregressive models to predict SpO2 and forecast imminent critical desaturation events in the postoperative period with a high degree of accuracy. These models reliably predict critical desaturation in patients receiving supplemental oxygen therapy. While high-fidelity prophylactic interventions that could modify these inciting events are in development, our current study offers proof of concept that the afferent limb of such a system can be modeled with a high degree of accuracy.

Keywords Pulse oximetry · Oxygen therapy · Time series modeling and prediction · Autoregressive models · Prediction evaluation metrics

H. ElMoaqet (corresponding author) · D. M. Tilbury
Mechanical Engineering Department, University of Michigan, Ann Arbor, MI 48109, USA

S. K. Ramachandran
Department of Anesthesiology, Medical School, University of Michigan, Ann Arbor, MI 48109, USA

1 Introduction

The postoperative period is marked by an increased risk of respiratory depression and episodic airway obstruction in the setting of opioid analgesic therapy. Both these dynamic events cause episodic reductions in alveolar ventilation


resulting in hypoxemia and an increased risk of postoperative respiratory failure [1, 2]. Pulse oximetry monitoring (POM) reports retrospective changes caused by these inciting events that result in critical alveolar hypoventilation. This POM desaturation display delay could exceed 30–60 s due to perfusion delays, interference, measurement and signal averaging [3–7]. All of the clinical responses to episodic desaturation are, therefore, reactive in nature. Although little is currently known of the clinical value of forecasting imminent critical desaturation, it is possible to foresee high-fidelity interventional systems that may prophylactically modify these inciting events. For example, automated inspired oxygen fraction adjustment based on near-past SpO2 values reduced hyperoxia in infants at risk of oxygen toxicity [8]. As a step towards defining the afferent limb of such a system, we have previously reported a highly accurate method of forecasting near-future peripheral oxygen saturation of hemoglobin (SpO2) levels and desaturation episodes [9, 10]. The autoregressive model for forecasting near-future SpO2 was derived and validated uniquely for each patient, with root mean square errors ranging from 0.33 to 2.89 %. The sensitivity and positive predictive values of the predicted critical desaturation epochs over the 20-s-ahead prediction horizon ranged from 88.3 to 100 % and from 91.9 to 100 %, respectively. At the 60-s-ahead prediction horizon, the sensitivity (51.7–77.7 %) and positive predictive values (85.8–100 %) were lower. Similar linear models have been previously developed and successfully validated for prediction of near-future glucose levels [11–18]. The use of supplemental oxygen therapy is thought to reduce the reliability of POM to identify ongoing alveolar hypoventilation [19]. This shortcoming has serious implications for clinical practice and safe monitoring standards.
Although oxygen therapy influences the duration and depth of desaturation, it is unknown whether it influences forecasting of critical postoperative desaturation. In our recent work [9, 10], we did not specifically examine the confounding influence of supplemental oxygen. Thus, in this paper, we evaluate the influence of postoperative oxygen therapy on our ability to accurately forecast near-future SpO2 levels, and on the ability of this model to correctly predict impending critical desaturation events. Using the predictive SpO2 dynamic model and the performance metric developed in our recent studies [9, 10], we investigate the effect of oxygen therapy on the ability to predict critical events. The main contribution of this paper is demonstrating that supplemental oxygen therapy can significantly influence the accuracy of predicting critical desaturation events. Our results show that the prediction accuracy of critical desaturation is significantly better for patients receiving oxygen therapy.


2 Methods

2.1 Data acquisition

This study was approved by the Institutional Review Board (IRB) at the University of Michigan (IRB#HUM00069035). The data used for this study were collected as part of a Quality Improvement study (IRB#HUM00027189) looking at the reliability of, and nurse response times to, a postoperative oximetry-based paging alert system [20]. The study was conducted on 119 postoperative adults following orthopedic surgery over a 3-month period in 2009. All patients were placed on postoperative POM (MASIMO RAD-8, Irvine CA) on arrival to the patient care unit per institutional policy. Immediately after termination of the patients’ monitoring period, device ASCII data consisting of SpO2 measurements were downloaded using PROFOX Oximetry Software (version PO Standard; Escondido, CA). The data obtained are discrete-time signals sampled every 2 s and quantized to a resolution of 1 %. We defined a critical oxygen desaturation event by an SpO2 ≤ 89 % [10, 21]. This study is a sub-analysis of 40 patients’ postoperative data, among which 20 patients received supplemental oxygen therapy. In our recent work [9, 10], we developed a dynamic model to characterize the dynamics of SpO2 and to perform short-term predictions of near-future SpO2 levels. We also developed a performance metric that evaluates the SpO2 predictive models for their ability to predict critical desaturation events.

2.2 Missing measurements in raw time series

The zero-amplitude instances of the raw POM signal were managed in two distinct ways. When zero measurements appeared in short intervals of no more than 6 sampling steps (12 s), we assumed that they represent anomalous sensor measurements that cannot happen clinically. Based on the zero-order-hold principle, these zeros were replaced with the most recent non-zero amplitude [22]. Longer time intervals of zero amplitude typically occur when the POM sensor falls off the patient’s finger.
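The gap handling in this subsection (short zero runs held at the last valid reading, longer runs used as split points) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fill_and_split` is hypothetical, a reading of exactly 0 is taken to mean "missing", and a short zero run with no earlier valid reading is simply dropped.

```python
import numpy as np

MAX_HOLD_STEPS = 6  # gaps of up to 6 samples (12 s at 2-s sampling) are filled


def fill_and_split(spo2, max_hold=MAX_HOLD_STEPS):
    """Fill short zero runs by zero-order hold; split the series at longer runs.

    Returns a list of 1-D float arrays, each a gap-free segment.
    (Hypothetical helper; name and return type are illustrative assumptions.)
    """
    spo2 = np.asarray(spo2, dtype=float)
    segments, current = [], []
    i, n = 0, len(spo2)
    while i < n:
        if spo2[i] != 0:
            current.append(spo2[i])
            i += 1
            continue
        # Measure the length of this run of zero readings.
        j = i
        while j < n and spo2[j] == 0:
            j += 1
        run = j - i
        if run <= max_hold and current:
            # Zero-order hold: repeat the most recent non-zero reading.
            current.extend([current[-1]] * run)
        else:
            # Long gap (or nothing to hold yet): close the current segment.
            if current:
                segments.append(np.array(current))
            current = []
        i = j
    if current:
        segments.append(np.array(current))
    return segments
```

Each returned segment would then be smoothed, modeled and analyzed on its own.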
We handled intervals lasting more than 6 sampling steps by partitioning the time series into smaller pieces that were modeled and analyzed separately.

2.3 Raw signal smoothing using Tikhonov regularization

We addressed potential errors from signal discretization effects and noise by smoothing the raw data before computing the autoregressive model (AR model) coefficients [13, 14]. In our previous work [9], we showed that


regularization is the best smoothing method for SpO2 time series. Therefore, in this paper, we used the Tikhonov regularization approach, which yielded the smoothed signal by computing $\tilde{y} = U_d w$, where $U_d$ denotes the integral operator and $w$ denotes estimates of the first derivative of the SpO2 signal. The derivative estimates yielded reliable data smoothing without lag. We chose the first derivative, i.e., the rate of change of SpO2 over time, to impose smoothness constraints on the SpO2 measurements. To estimate the signal derivatives $w$, we minimized the functional $f(w)$, given by

$$f(w) = \| y - U_d w \|^2 + \lambda_d^2 \| L_d w \|^2 \qquad (1)$$

where $y$ denotes the $N \times 1$ vector of the raw POM time series signal, $U_d$ denotes the $N \times N$ integral operator, $w$ represents the $N \times 1$ vector of first-order differences (the rate of change in SpO2 with time), $\lambda_d$ represents the data regularization parameter, and $L_d$ denotes a well-conditioned matrix chosen to impose smoothness constraints on the derivative of the POM signal. For a chosen $L_d$, the quality of smoothing is determined by $\lambda_d$. When $\lambda_d = 0$, no regularization is performed, resulting in the original raw POM data $y$. As $\lambda_d$ increases, the solution $w$ (and hence $\tilde{y}$) increasingly satisfies the imposed smoothness constraint, resulting at the same time in larger deviations from the raw data. For this study, $\lambda_d = 20$ was selected empirically and $L_d$ was chosen to be the identity matrix.

3 Data analysis methods

3.1 AR modeling for regularized SpO2

First, we used 7,500 data points of the regularized time series of each patient to build the AR models by finding the coefficients $\theta$ that best describe the dependencies in the entire time series $y$ based on the formula $y = U\theta$, where $U$ denotes the design matrix representing delayed values of $y$. The coefficients were computed based on the least squares fit, such that the functional $\| y - U\theta \|^2$ is minimized [23]. Using AR models, we calculated the modeled signal $\hat{y}_n$ at time $n$ ($n = m + 1, \ldots, N$, where $N$ denotes the total number of data samples available for modeling) as a linear combination of previously observed signals $y_{n-i}$:

$$\hat{y}_n = \sum_{i=1}^{m} \theta_i \, y_{n-i} \qquad (2)$$

where $\theta$ denotes the vector of AR coefficients, and $m$ denotes the order of the model, i.e., the number of previously observed SpO2 measurements $y_{n-i}$ used to predict a future SpO2 value $\hat{y}_n$. In our recent work [9, 10], we showed that the dynamics of SpO2 can be captured by an AR-10 model, which is used in this study. More extensive details about the predictive model development and selection can be found in our previous work [9, 10].

3.2 Time series prediction

Future values of the discrete-time signal were forecasted $k$ steps ahead using historical data, such that the predicted output $\hat{y}_{N+k}$ of a time series $y$ was derived as a function of previously available measurements of this time series, $\hat{y}_{N+k} = f(y_N, y_{N-1}, \ldots, y_0)$. The prediction horizon $k$ corresponds to the predicted SpO2 at time $kT_s$, where $T_s$ is the sampling interval of the discrete signal (which is 2 s for our POM data). We applied this principle to the AR model in order to predict future values of SpO2 signals. For example, for a one-step-ahead (2 s ahead) prediction, the AR model used the following equation:

$$\hat{y}_{N+1} = \sum_{i=1}^{m} \theta_i \, \tilde{y}_{N-i+1} \qquad (3)$$

where $\tilde{y}$ represents the smoothed time series and $\theta$ represents the vector of AR coefficients. Equation 3 was applied recursively for a variety of prediction horizons $k$ to obtain predicted values $\hat{y}_{N+k}$. Using the pre-processing techniques presented earlier, we were able to obtain 15,000 continuous data points (approximately 8 h of data) from the POM signals. We used the first 7,500 data points of the signal to build the AR model and the second 7,500 data points for evaluating SpO2 predictions.

3.3 Prediction evaluation metrics

Recent literature in modeling of physiological signals uses several metrics to evaluate the accuracy of identified AR models and the quality of predictions [13, 15, 16, 24]. In this paper, we use a two-fold precision and error assessment strategy. First, the Root Mean Square Error (RMSE), a metric that has been used previously in validation of forecasting models [13, 15, 16, 25, 26], was used as a measure of the overall precision of prediction of near-future SpO2 values. Additionally, we computed summary measures of accuracy for binary prediction of threshold-based critical desaturation epochs [27–32].

3.3.1 Root mean square error

The AR models were fitted using the least-squares error principle to minimize the RMSE between the predicted and reference measurements of the time series. RMSE can be expressed as follows:

$$\mathrm{RMSE}(\hat{y}, \tilde{y}) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - \tilde{y}_i \right)^2} \qquad (4)$$


where $\hat{y}$ is the predicted signal, $\tilde{y}$ is the reference data signal used for identifying the model, and $n$ is the number of predicted output measurements. Given the published sensor precision levels (1.5 to 2.5 %) described by the device manufacturer, we considered an RMSE value of < 2.5 % as acceptable, 1–2 % as average and < 1 % as excellent.

3.3.2 Prediction accuracy of imminent critical desaturation events

Additionally, SpO2 data were classified based on the occurrence of critical desaturation events (SpO2 values ≤ 89 %). We then evaluated the ability of the predictive dynamic models to accurately predict critical desaturations within the time-series data using the SpO2 prediction grid [9, 10]. This grid is a performance metric that we developed earlier to evaluate the SpO2 predictive models based on whether or not critical desaturations are correctly predicted. Accordingly, the predictive capability of the AR models in the POM signals was assessed by looking k steps ahead at each data point of the validation time series and identifying critical desaturation events occurring in this interval. Subsequently, the AR predictor was run on the same interval to determine if the AR model accurately captured this event in a binary fashion. This is best illustrated by Fig. 1, which shows the 4 possible regions of the SpO2 grid at each prediction instance. To construct the grid, the prediction results were evaluated progressively at all time steps of the validation data set, composed of 7,500 points (4.17 h) of SpO2 data distinct from the ones used for model estimation. The prediction grid used for assessing the quality of the prediction process is shown in Table 1, where points in Regions A and D on the main diagonal represent points of good prediction, points in off-diagonal Region C represent the ones at which the model fails to detect critical events, and points in off-diagonal Region B represent false prediction points.
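The evaluation loop described above can be sketched in Python as follows: fit an AR model by ordinary least squares, apply the one-step predictor recursively k times, and assign each validation point to region A, B, C or D. This is a minimal sketch under stated assumptions, not the authors' Matlab code. The function names (`fit_ar`, `forecast`, `grid_counts`, `rmse`) are hypothetical, NumPy is assumed to be available, and the grid is read as "does the reference, or the forecast, reach 89 % or below anywhere in the next k samples", which is one plausible interpretation of the text.

```python
import numpy as np

THRESHOLD = 89.0  # critical desaturation: SpO2 <= 89 %


def fit_ar(y, m=10):
    """Least-squares AR(m) coefficients: y[n] ~ sum_i theta[i-1] * y[n-i]."""
    y = np.asarray(y, dtype=float)
    # Column i holds the lag-i values y[n-i] for n = m, ..., len(y)-1.
    design = np.column_stack([y[m - i:len(y) - i] for i in range(1, m + 1)])
    theta, *_ = np.linalg.lstsq(design, y[m:], rcond=None)
    return theta


def forecast(history, theta, k):
    """Apply the one-step AR predictor recursively k times."""
    m = len(theta)
    window = list(history[-m:])  # ordered oldest ... newest
    out = []
    for _ in range(k):
        nxt = float(np.dot(theta, window[::-1]))  # theta[0] pairs with newest
        out.append(nxt)
        window = window[1:] + [nxt]
    return np.array(out)


def rmse(pred, ref):
    """Root mean square error between predicted and reference values."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


def grid_counts(y_val, theta, k):
    """Tally prediction-grid regions A-D over a validation series.

    Assumed reading: an event is 'present' if any of the next k samples
    is at or below THRESHOLD, for the reference and the forecast alike.
    """
    y_val = np.asarray(y_val, dtype=float)
    m = len(theta)
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for n in range(m, len(y_val) - k + 1):
        pred_low = forecast(y_val[:n], theta, k).min() <= THRESHOLD
        ref_low = y_val[n:n + k].min() <= THRESHOLD
        if pred_low and ref_low:
            counts["A"] += 1  # event correctly predicted
        elif pred_low:
            counts["B"] += 1  # false prediction
        elif ref_low:
            counts["C"] += 1  # missed event
        else:
            counts["D"] += 1  # correctly predicted absence of an event
    return counts
```

With 2-s sampling, `k = 10` corresponds to the 20-s horizon and `k = 30` to the 60-s horizon; `theta` would be fitted on the first half of a smoothed series and `grid_counts` run on the second half.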
Table 1 Prediction grid for proposed metric

                Ref. ≤ 89 %    Ref. > 89 %
Pred. ≤ 89 %    A              B
Pred. > 89 %    C              D

Accordingly, true positive, true negative, false positive and false negative data were compiled based on the threshold critical desaturation value of ≤ 89 %. Sensitivity, specificity, and positive predictive values were used as statistical measures to evaluate the performance of the SpO2 prediction grid over the validation data set of each patient [10, 27, 28]. Data from all patients were then combined for meta-analyses to determine homogeneity of these prediction metrics [29]. Pooled summary measures of accuracy (pooled sensitivity, specificity, and diagnostic odds ratios, DOR) were computed using previously described methods [33]. Values of DOR in excess of 81 were considered to define an excellent prediction model. The tau-squared statistic was used as a measure of between-study (i.e., inter-individual) variance in prediction accuracy of the diagnostic odds ratios [34, 35]. Summary receiver operating characteristic (SROC) curves were developed and the area under the curve (AUC) was used as an estimate of the accuracy of the prediction model [32, 33, 36, 37]. The presence or absence of threshold effect determined the use of asymmetric or symmetric SROC curves, respectively. Moses-Shapiro-Littenberg weighted regression (inverse variance) models were used to evaluate the presence of threshold effect [32]. The area under the curve was computed by numeric integration of the curve equation using the trapezoidal method [32]. All data are presented with 95 % confidence intervals [38]. Comparisons between groups of continuous data were performed by the Mann-Whitney U test and significance was set at 0.05. All data were analyzed in Matlab (version 2011b; MathWorks, Natick, MA, USA) and Meta-Disc (version 1.4; Ramon y Cajal Hospital, Madrid, Spain).
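For a single patient, the accuracy measures named above follow directly from the four grid counts. The sketch below is a hedged illustration rather than the Meta-Disc procedure: the function name `accuracy_measures` is hypothetical, regions A, B, C and D are mapped to true positives, false positives, false negatives and true negatives, and the 0.5 continuity correction for grids containing a zero cell is an assumption (a common convention in diagnostic meta-analysis; the text does not state whether it was applied).

```python
def accuracy_measures(tp, fp, fn, tn, correction=0.5):
    """Per-patient accuracy measures from prediction-grid counts.

    tp, fp, fn, tn correspond to regions A, B, C and D of the grid.
    If any cell is zero, `correction` is added to all four cells
    (an assumed convention) so that the ratios stay finite.
    """
    cells = [float(tp), float(fp), float(fn), float(tn)]
    if min(cells) == 0:
        cells = [c + correction for c in cells]
    tp, fp, fn, tn = cells
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_pos_rate = fp / (fp + tn)  # 1 - specificity, without cancellation
    false_neg_rate = fn / (fn + tp)  # 1 - sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "lr_pos": sensitivity / false_pos_rate,
        "lr_neg": false_neg_rate / specificity,
        "dor": (tp * tn) / (fp * fn),  # equals lr_pos / lr_neg
    }
```

As a worked check, `accuracy_measures(90, 10, 10, 90)` gives a sensitivity and specificity of 0.9 each and a DOR of (90 × 90) / (10 × 10) = 81, the value used above as the cut-off for an excellent model.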

4 Results

Data from 40 patients were analyzed. Patients receiving oxygen therapy had significantly more critical desaturation events (median [IQR] 133.0 [31.5, 508.2]) than patients who were not treated with oxygen (0 [0, 110]; p < 0.001) over the 4.17 h of model validation. Table 2 shows the RMSE performance for 20 patients of each group over two prediction horizons (20- and 60-s). The ranges of RMSE values are similar between the two groups, and RMSE values are uniformly excellent with the 20-s-ahead prediction and acceptable to excellent in the 60-s-ahead prediction results. For analyses to evaluate the accuracy of forecasting critical desaturation events, only patients with at least one event were included (16 patients on oxygen and 8 patients without oxygen treatment).

Fig. 1 Examples of prediction grid regions (blue: actual; green dotted: prediction). [Plot not reproduced; recoverable panel title: Region A; horizontal axis: Time (sec)]

Table 2 RMSE performance for the 20 patients of each group at the 20-s and 60-s prediction horizons

O2 therapy                      No O2 therapy
Pat.#   k = 20 s   k = 60 s     Pat.#   k = 20 s   k = 60 s
14      0.414      1.438        18      0.493      1.775
19      0.353      0.959        24      0.227      0.55
21      0.754      2.497        30      0.315      0.984
22      0.317      1.033        31      0.244      0.514
26      0.329      0.769        38      0.296      1.146
37      0.318      0.947        44      0.433      1.101
70      1.129      2.99         48      0.292      0.756
73      0.466      1.393        59      0.399      1.07
80      0.284      0.724        61      0.242      0.703
81      0.41       1.218        63      0.592      1.56
90      0.198      0.679        72      0.488      1.659
94      0.579      2.221        79      0.299      0.662
98      0.42       1.652        86      0.472      1.357
103     0.462      1.424        93      0.996      3.365
108     0.236      0.84         96      0.264      0.744
113     0.798      2.874        102     0.461      1.468
116     0.449      1.807        105     0.231      0.616
118     0.571      1.522        106     0.325      0.863
132     0.337      1.088        111     0.748      1.671
136     0.884      2.596        133     0.361      1.08

4.1 Patients treated with supplemental oxygen

Table 3 shows summary measures of accuracy of prediction of critical desaturation events over forecasting intervals of 20 s for 20 patients subjected to oxygen therapy. The pooled sensitivity was 0.93 [0.92, 0.93], pooled specificity 0.99 [0.99, 0.99], positive likelihood ratio 559.2 [306.6, 1020.1], negative likelihood ratio 0.09 [0.08, 0.13] and DOR 4689.5 [3022.5, 7275.9]. The positive predictive value was 1 [0.99, 1] in these patients. The tau-squared statistic was not significant for between-study variance (p = 0.52). The AUC for prediction accuracy in this analysis was 0.99 ± 0.00 (Fig. 2). Table 4 provides summary data on the accuracy of 60-s-ahead prediction of critical desaturation events in these patients: pooled sensitivity 0.66 [0.65, 0.67], pooled specificity 0.99 [0.99, 0.99], positive likelihood ratio 187.9 [146.4, 241.2], negative likelihood ratio 0.46 [0.38, 0.57] and DOR 454.9 [317.3, 652.2]. The positive predictive value of the AR model was 0.96 [0.88, 1]. There was no evidence of inter-study variance (tau-squared p = 0.39). The AUC for the SROC curve was 0.95 ± 0.04 (Fig. 2).

4.2 Patients not treated with supplemental oxygen

Tables 5 and 6 display the summary measures of prediction accuracy of the AR model in patients not receiving supplemental oxygen. The pooled estimates [95 % confidence intervals] for 20-s-ahead vs. 60-s-ahead prediction models were as follows: sensitivity 0.892 [0.871, 0.911] vs. 0.442 [0.421, 0.464]; specificity 0.998 [0.998, 0.997] vs. 0.997 [0.998, 0.997]; positive likelihood ratio 458.35 [277.282, 757.661] vs. 184.16 [108.846, 311.585]; negative likelihood ratio 0.115 [0.082, 0.161] vs. 0.564 [0.513, 0.619]; and DOR 4,476.8 [2,491.2, 8,044.8] vs. 339.99 [193.00, 598.90]. The pooled positive predictive value was 1 [0.9, 1.0] for the 20-s horizon and 0.87 [0.83, 0.9] for the 60-s horizon. Between-study variance was not significant, with tau-squared p-values of 0.43 and 0.51, respectively, for the 20-s-ahead and 60-s-ahead models. Furthermore, Fig. 2 shows summary receiver operating characteristic (SROC) curves for accuracy of prediction of critical desaturation events over the two prediction horizons across oxygen treatment groups, where Q refers to the Q statistic and SE refers to standard error. The AUCs for prediction accuracy were 0.99 ± 0.01 and 0.76 ± 0.16, respectively, as shown in Fig. 2.

5 Discussion

In this study, we investigated the effect of oxygen therapy on the ability to predict critical desaturations during the postoperative period. We compared the prediction results for imminent critical desaturation epochs in postoperative adults receiving oxygen therapy with those in adults not receiving this therapy. Accordingly, we showed that the performance


Table 3 Summary measures of accuracy of 20-s-ahead prediction of critical desaturation events in patients receiving supplemental oxygen (pooled estimates are in the last row)

        Sensitivity [95 % CI] Specificity [95 % CI] Positive LR [95 % CI]         Negative LR [95 % CI] DOR [95 % CI]
        0.881 [0.811, 0.932]  0.999 [0.998, 1.000]  926.76 [440.75, 1,948.7]      0.119 [0.074, 0.192]  7,777.4 [3,110.2, 19,448.0]
        0.906 [0.750, 0.980]  1 [1.000, 1.000]      13,335.8 [832.13, 213,719.6]  0.106 [0.039, 0.286]  125,729 [6,353.2, 2,488,171]
        0.886 [0.821, 0.933]  0.998 [0.996, 0.999]  361.67 [227.12, 575.91]       0.115 [0.072, 0.182]  3,156.8 [1,573.2, 6,334.4]
        0.853 [0.765, 0.917]  1 [1.000, 1.000]      12,557.8 [784.53, 201,008.6]  0.151 [0.094, 0.243]  83,135.6 [4917.7, 1,405,450]
        0.878 [0.823, 0.921]  0.997 [0.996, 0.998]  327.56 [203.16, 528.13]       0.122 [0.083, 0.179]  2,684.4 [1,407.6, 5,119.4]
        0.871 [0.841, 0.897]  0.996 [0.994, 0.997]  207.49 [144.10, 298.75]       0.13 [0.105, 0.160]   1,600.6 [1,032.9, 2,480.3]
        0.837 [0.693, 0.932]  1 [0.999, 1.000]      1,910.8 [611.72, 5,968.6]     0.163 [0.083, 0.321]  11,732.6 [2,917.6, 47,179.9]
        0.895 [0.870, 0.917]  0.997 [0.995, 0.998]  264.41 [175.69, 397.93]       0.105 [0.085, 0.131]  2,512.4 [1,561.2, 4,043.2]
        0.933 [0.779, 0.992]  0.999 [0.997, 0.999]  690.77 [357.08, 1,336.3]      0.067 [0.017, 0.255]  10,347.6 [2,138.8, 50,061.3]
        0.94 [0.915, 0.960]   0.997 [0.995, 0.998]  274.43 [183.95, 409.41]       0.06 [0.042, 0.086]   4,564.4 [2,635.8, 7,904.2]
        0.953 [0.945, 0.960]  0.985 [0.981, 0.989]  63.536 [49.523, 81.514]       0.048 [0.041, 0.056]  1,330.6 [987.41, 1,793.0]
        0.929 [0.910, 0.944]  0.997 [0.995, 0.998]  289.69 [188.94, 444.17]       0.072 [0.057, 0.090]  4047 [2,466.4, 6,640.7]
        0.871 [0.831, 0.905]  0.998 [0.997, 0.999]  566.05 [313.18, 1,023.1]      0.129 [0.098, 0.170]  4,380.1 [2,239.5, 8,566.9]
        0.884 [0.749, 0.961]  1 [0.999, 1.000]      2,848.7 [709.59, 11,436.2]    0.116 [0.051, 0.265]  24,491 [4,607.8, 130,171.4]
        0.923 [0.640, 0.998]  1 [0.999, 1.000]      12,158.9 [756.10, 195,529.6]  0.107 [0.024, 0.486]  113,475 [4407.1, 2,921,760]
        0.93 [0.874, 0.966]   0.998 [0.996, 0.999]  426.91 [261.13, 697.93]       0.071 [0.039, 0.128]  6,048.9 [2,694.6, 13,578.6]
Pooled  0.926 [0.919, 0.931]  0.998 [0.998, 0.997]  559.24 [306.588, 1,020.1]     0.099 [0.077, 0.126]  4,689.5 [3,022.5, 7275.9]

of these models was reliable and reproducible in patients receiving supplemental oxygen treatment. The ability to predict critical events, as reflected by the pooled sensitivity of the predictions, was observed to be significantly greater in patients receiving oxygen over 60-s prediction intervals (0.66 ± 0.01 vs. 0.44 ± 0.02). Furthermore, prediction accuracy as defined by area under the curve for critical desaturation events was observed to be greater in patients receiving oxygen in the 60-s horizon (0.95 ± 0.04 vs. 0.76 ± 0.16).

Postoperative monitoring has increasingly been viewed by experts as having significant patient value. Continuous


POM linked to threshold-based paging alerts has previously been shown to significantly improve patient outcomes [39]. Counterbalancing this benefit of improved outcomes is the risk of technological intensification. Increasing levels of monitoring potentially increase the alarm burden for caregivers. This could divert their attention from other more important clinical crises, or paradoxically reduce rescue effectiveness due to desensitization. In Taenzer’s study [39], it is important to note that adverse outcomes still occurred in monitored patients, but at a lower rate than in patients undergoing usual care. This may indicate other etiologies of events or unavoidable delays in monitor capture of inciting events,

Table 4 Summary measures of accuracy of 60-s-ahead prediction of critical desaturation events in patients receiving supplemental oxygen (pooled estimates are in the last row)

        Sensitivity [95 % CI] Specificity [95 % CI] Positive LR [95 % CI]         Negative LR [95 % CI] DOR [95 % CI]
        0.48 [0.416, 0.544]   0.997 [0.995, 0.998]  150.66 [98.174, 231.21]       0.522 [0.463, 0.589]  288.63 [178.65, 466.31]
        0.403 [0.289, 0.525]  1 [1.000, 1.000]      5980 [368.89, 96941.5]        0.596 [0.493, 0.720]  10034.7 [603.51, 166851.8]
        0.507 [0.446, 0.568]  0.996 [0.994, 0.997]  130.43 [88.502, 192.21]       0.495 [0.438, 0.558]  263.72 [169.72, 409.77]
        0.469 [0.393, 0.545]  0.998 [0.997, 0.999]  244.16 [141.35, 421.75]       0.532 [0.463, 0.612]  458.56 [251.02, 837.68]
        0.489 [0.435, 0.542]  0.997 [0.995, 0.998]  143.26 [92.254, 222.48]       0.513 [0.463, 0.568]  279.2 [173.34, 449.72]
        0.51 [0.479, 0.540]   0.994 [0.992, 0.996]  85.998 [62.295, 118.72]       0.493 [0.464, 0.524]  174.43 [124.05, 245.27]
        0.494 [0.382, 0.606]  1 [0.999, 1.000]      6,707.2 [416.04, 108,129.5]   0.506 [0.410, 0.625]  13,255.6 [802.27, 219,016.0]
        0.537 [0.509, 0.566]  0.996 [0.994, 0.997]  129.77 [88.111, 191.13]       0.465 [0.437, 0.494]  279.35 [186.94, 417.43]
        0.414 [0.298, 0.538]  0.998 [0.996, 0.999]  182.31 [102.37, 324.69]       0.587 [0.482, 0.715]  310.56 [155.01, 622.19]
        0.66 [0.624, 0.694]   0.997 [0.995, 0.998]  193.54 [128.27, 292.01]       0.341 [0.308, 0.378]  566.87 [366.06, 877.84]
        0.866 [0.855, 0.877]  0.996 [0.993, 0.997]  193.06 [118.39, 314.84]       0.135 [0.124, 0.146]  1434.3 [870.26, 2,364.0]
        0.593 [0.567, 0.618]  0.998 [0.996, 0.999]  252.39 [149.32, 426.61]       0.408 [0.384, 0.434]  618.02 [362.17, 1,054.6]
        0.473 [0.434, 0.512]  0.998 [0.997, 0.999]  230.55 [135.76, 391.54]       0.528 [0.491, 0.568]  436.4 [252.60, 753.93]
        0.458 [0.348, 0.571]  0.999 [0.997, 0.999]  324.91 [162.39, 650.09]       0.543 [0.446, 0.662]  598.43 [273.36, 1310.1]
        0.394 [0.229, 0.579]  1 [0.999, 1.000]      5375.4 [326.06, 88616.9]      0.603 [0.459, 0.792]  8914.6 [512.69, 155,007.1]
        0.471 [0.414, 0.528]  0.998 [0.996, 0.999]  198.31 [121.56, 323.53]       0.531 [0.477, 0.590]  373.7 [220.80, 632.49]
Pooled  0.657 [0.648, 0.666]  0.998 [0.998, 0.997]  187.9 [146.348, 241.239]      0.464 [0.378, 0.570]  454.97 [317.34, 652.28]

rescue or response to therapy. Decision support systems are intuitively expected to aid clinical care, and monitoring systems of the future should be capable of enabling automated prophylactic modifications of therapy. Critical respiratory events in particular are complex and dynamic, making modeling and interventions based on dynamic models challenging. Inherently, POM displays retrospective desaturation events, delayed by 8–15 s due to delays in monitor capture and signal processing. Since these events are transient, high-fidelity response systems based on real-time POM data are unlikely to be successful in preventing them. This is because the inciting cause for desaturation, i.e.,

airway obstruction or central apnea, likely occurred at least 30–60 s prior to the desaturation epoch, and arousals related to hypoxemia are likely to intervene and self-terminate the inciting event. Forecasting imminent desaturation has the potential to convert our current retrospective POM paradigm into one that is more real-time. Auto-control systems based on retrospective SpO2 values have been shown to be more effective than human responses in reducing risk of hyperoxia by adjusting inspired oxygen fraction [8]. While high-fidelity prophylactic interventions that could modify these inciting events are as yet untested, our current study offers proof of concept that the afferent limb of such a system can be modeled with a


Table 5 Summary measures of accuracy of 20-s-ahead prediction of critical desaturation events in patients not treated with oxygen (pooled estimates are in the last row)

        Sensitivity [95 % CI] Specificity [95 % CI] Positive LR [95 % CI]         Negative LR [95 % CI] DOR [95 % CI]
        0.934 [0.875, 0.971]  0.998 [0.997, 0.999]  491.78 [290.79, 831.69]       0.066 [0.034, 0.128]  7,485.3 [3,079.6, 18,193.9]
        0.846 [0.546, 0.981]  1 [1.000, 1.000]      12,285.3 [760.29, 198,515.2]  0.179 [0.058, 0.549]  68,793 [3,127.2, 1,513,340]
        0.824 [0.655, 0.932]  1 [0.999, 1.000]      2,046.7 [653.27, 6,412.6]     0.177 [0.085, 0.365]  11,593.6 [2761.3, 48,675.7]
        0.938 [0.893, 0.967]  0.997 [0.995, 0.998]  310.99 [204.58, 472.77]       0.063 [0.036, 0.108]  4,960.9 [2,417.7, 10,179.2]
        0.901 [0.862, 0.931]  0.995 [0.993, 0.996]  170.13 [123.63, 234.11]       0.1 [0.072, 0.140]    1,703.2 [1,044.3, 2,777.6]
        0.88 [0.812, 0.930]   0.997 [0.995, 0.998]  281.39 [186.21, 425.21]       0.121 [0.076, 0.191]  2,331.7 [1,200.7, 4,528.1]
        0.894 [0.769, 0.965]  0.997 [0.995, 0.998]  289.18 [190.04, 440.04]       0.107 [0.047, 0.244]  2,709.9 [983.50, 7,466.9]
        0.798 [0.719, 0.864]  1 [0.999, 1.000]      1,959.1 [629.91, 6,093.2]     0.202 [0.143, 0.284]  9,716.3 [2,895.0, 32,610.0]
Pooled  0.892 [0.871, 0.911]  0.998 [0.998, 0.997]  458.35 [277.282, 757.661]     0.115 [0.082, 0.161]  4,476.8 [2491.2, 8,044.8]

Table 6 Summary measures of accuracy of 60-s ahead prediction of critical desaturations in patients not treated with oxygen (pooled estimates are in the final row)

Sensitivity [95 % CI] | Specificity [95 % CI] | Positive LR [95 % CI] | Negative LR [95 % CI] | DOR [95 % CI]
0.604 [0.533, 0.672] | 0.998 [0.996, 0.999] | 274.35 [166.06, 453.24] | 0.397 [0.335, 0.471] | 691.21 [392.54, 1217.1]
0.333 [0.180, 0.518] | 1 [1.000, 1.000] | 5031.6 [302.53, 83684.4] | 0.662 [0.520, 0.842] | 7602.8 [434.74, 132957.1]
0.378 [0.268, 0.499] | 0.999 [0.998, 1.000] | 559.7 [222.25, 1409.5] | 0.622 [0.521, 0.743] | 899.77 [332.75, 2433.0]
0.392 [0.348, 0.438] | 0.996 [0.994, 0.997] | 97.959 [66.566, 144.16] | 0.61 [0.568, 0.656] | 160.46 [106.00, 242.90]
0.474 [0.434, 0.514] | 0.993 [0.991, 0.995] | 68.95 [51.257, 92.751] | 0.53 [0.492, 0.571] | 130.15 [93.877, 180.43]
0.447 [0.387, 0.508] | 0.995 [0.993, 0.996] | 86.925 [61.413, 123.04] | 0.556 [0.500, 0.619] | 156.35 [104.64, 233.62]
0.467 [0.370, 0.566] | 0.997 [0.995, 0.998] | 156.39 [98.364, 248.66] | 0.534 [0.447, 0.638] | 292.7 [166.34, 515.06]
0.356 [0.301, 0.415] | 0.999 [0.998, 1.000] | 511.86 [210.24, 1246.2] | 0.644 [0.591, 0.702] | 794.76 [320.14, 1973.0]
0.442 [0.421, 0.464] | 0.997 [0.998, 0.997] | 184.16 [108.846, 311.585] | 0.564 [0.513, 0.619] | 339.99 [193.00, 598.90]
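The five columns of Table 6 follow from the standard definitions for a 2x2 table of predicted versus observed events. The sketch below is illustrative only: the function name, the example counts and the 0.5 continuity correction for zero cells are assumptions, not the authors' software or data. As a consistency check on the reported values, the first row of Table 6 gives positive LR / negative LR = 274.35 / 0.397, which is approximately 691 and agrees with the reported DOR of 691.21.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticSummary:
    sensitivity: float
    specificity: float
    positive_lr: float
    negative_lr: float
    dor: float


def summarize(tp, fp, fn, tn, correction=0.5):
    """Summary accuracy measures from 2x2 counts (hypothetical helper).

    tp / fn: critical desaturations that were / were not predicted.
    fp / tn: non-events that were / were not flagged as events.
    """
    # Sensitivity and specificity use the raw counts.
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Assumption: if any cell is zero, add a continuity correction to every
    # cell before forming ratios, so the likelihood ratios and DOR stay finite.
    a, b, c, d = float(tp), float(fp), float(fn), float(tn)
    if min(a, b, c, d) == 0:
        a, b, c, d = (v + correction for v in (a, b, c, d))

    positive_lr = (a / (a + c)) / (b / (b + d))  # sens / (1 - spec)
    negative_lr = (c / (a + c)) / (d / (b + d))  # (1 - sens) / spec
    dor = positive_lr / negative_lr              # equals (a * d) / (b * c)
    return DiagnosticSummary(sensitivity, specificity,
                             positive_lr, negative_lr, dor)
```

For example, `summarize(tp=90, fp=5, fn=10, tn=895)` uses made-up counts to show how a specificity close to 1 drives the positive likelihood ratio and the DOR to large values, as seen throughout the tables.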

high degree of accuracy. More importantly, we are able to predict imminent critical desaturation in patients receiving concurrent oxygen therapy. This finding enhances the potential value of the models in clinical care.
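As a rough illustration of the kind of forecasting discussed here, the following sketch fits an AR model by ordinary least squares, iterates it forward to the prediction horizon, and scores the forecast with RMSE. Several details are assumptions rather than the authors' method: the intercept term, the iterated (recursive) multi-step scheme, the function names, and the use of "at or below 89 %" as the critical threshold (the abstract gives 89 % but the comparison operator is not legible there). A 2-s sample spacing is inferred from the statement that AR-10 spans 20 s of history, so a 60-s horizon corresponds to 30 steps.

```python
import numpy as np


def fit_ar(signal, order=10):
    """Least-squares AR fit: x[t] = c + sum_i a[i] * x[t-1-i] (intercept assumed)."""
    x = np.asarray(signal, dtype=float)
    if len(x) <= order:
        raise ValueError("need more samples than the model order")
    # Each row holds the lags for one target: x[t-1], x[t-2], ..., x[t-order].
    lags = np.array([x[t - order:t][::-1] for t in range(order, len(x))])
    design = np.column_stack([np.ones(len(lags)), lags])
    coeffs, *_ = np.linalg.lstsq(design, x[order:], rcond=None)
    return coeffs  # coeffs[0] is the intercept, coeffs[1:] the lag weights


def forecast(history, coeffs, steps):
    """Iterate the fitted model forward, feeding predictions back in as inputs."""
    order = len(coeffs) - 1
    window = list(np.asarray(history, dtype=float)[-order:])  # oldest first
    preds = []
    for _ in range(steps):
        recent_first = window[::-1]
        nxt = coeffs[0] + float(np.dot(coeffs[1:], recent_first))
        preds.append(nxt)
        window = window[1:] + [nxt]
    return np.array(preds)


def rmse(predicted, observed):
    """Root mean square error; squaring weights large errors more heavily."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def is_critical(predicted_spo2, threshold=89.0):
    """Flag a predicted value as critical (the <= comparison is an assumption)."""
    return bool(predicted_spo2 <= threshold)
```

Under the inferred 2-s spacing, `forecast(history, fit_ar(history, order=10), steps=30)[-1]` would be the 60-s ahead estimate, and `is_critical` applied to that value gives the event prediction that the tables evaluate.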

Fig. 2 Summary receiver operating characteristic curves for accuracy of prediction of critical desaturation events over the 20-s and 60-s prediction horizons across oxygen treatment groups

Our findings also validate the use of software codes to model dynamic signals over different prediction horizons. We also showed that regularized data can be used to predict events with a high degree of accuracy through the use of 10th order AR models for both categories of patients. We have also shown in our previous work [9] that a model order higher than AR-10 did not significantly improve prediction results. For example, negligible improvement in prediction performance is obtained when increasing the historical time duration used for 60-s ahead predictions from 20 s (AR-10) to 60 s (AR-30).

The RMSE metric gives a relatively high weight to large errors since the errors are squared before they are averaged. This supports the use of the RMSE as the primary error statistic, since large errors in prediction of SpO2 are particularly undesirable. However, the RMSE metric cannot distinguish phase difference errors from magnitude difference errors. Additionally, it is arguably more important to be able to predict critical desaturation patterns within these

signals. In this context, our analyses of the accuracy of prediction of critical desaturation epochs are pertinent. The pooled values of specificity are close to 100 % for both prediction horizons, with or without oxygen therapy. This also reflects the extremely low number of false positives, which reduces the risk of unnecessary interventions. Also, the high DOR values with or without oxygen therapy across both prediction intervals highlight the benefit of the AR-10 models in practical applications. As one would expect, the sensitivity of the current model decreases as the prediction interval increases from 20 s to 60 s. Despite this, the AR-10 model is still able to capture the majority of critical desaturation events. A significant finding of this study is the homogeneity of prediction accuracy across several


patients. This increases the likelihood that our model accuracy is generalizable to adult patients who are recovering from surgery.

The difference in prediction accuracy between patients with and without oxygen administration, for both the 20-s and 60-s windows, is of significant interest. The two study cohorts differ primarily in their propensity for desaturation, with a greater frequency of critical desaturation events among patients receiving supplemental oxygen. The greater number of events in this group likely allows the model to learn patterns more effectively during the derivation period, making it more accurate in the validation period. Two things are apparent from this analysis. First, desaturations continue unabated despite oxygen therapy in postoperative patients. Second, the high number of events in these patients is concerning and may point to an unmet need for more effective prophylactic response systems that prevent the causes of episodic hypoxemia.

This study has inherent limitations. The data used for the study were collected from a convenience sample of patients recovering from orthopedic surgery. Although the data collection sheets made available to nurses were filled in prospectively, it is possible that some of the patients who were considered not to have received oxygen were in fact treated with oxygen therapy. The desaturation data reflect real-world care, with loss of data from sensor dislodgement. Our zero-handling methodology has inherent limitations, as its assumptions may not have held through all episodes of POM. The depth and length of desaturation epochs may have been affected by spontaneous arousals, monitor alarms or nurse stimulation. Because these influences were not uniform across all patients and desaturation epochs, it is possible that some patients had longer and deeper desaturations than others. None of the patients had a major respiratory event needing intubation or positive pressure ventilation. Thus the ability of the prediction models to help predict adverse events has not been tested. Further studies are currently underway to evaluate the models in patients with adverse outcomes. The signal averaging time of the POM devices was pre-set at 6 s. This limits the granularity of the observed and predicted SpO2 data, and further work is needed to test the AR model in higher resolution data. Finally, the learning period for the derivation of the AR models was 4.17 h in this study. Further, data management and analyses required fairly high-level computing. These requirements are currently beyond the scope of a bedside device, but it is anticipated that the determination of specific patterns of desaturation will inform the development of simpler systems in the near future. Despite these limitations, the study findings provide proof of concept that dynamic POM data can be modeled prospectively with a high degree of precision, opening opportunities for research into smart


systems that could prevent or reduce the occurrence of adverse respiratory events.

In summary, we report the use of autoregressive models to predict postoperative SpO2 and forecast imminent critical desaturation events with a high degree of accuracy. These models reliably predict critical desaturation in patients receiving supplemental oxygen therapy.

Acknowledgments Dr. Ramachandran has received honoraria from Merck for consulting and research support in the current year. This research was supported in part by the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number 2UL1TR000433. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

References

1. Ramachandran SK, Haider N, Saran KA, Mathis M, Kim J, Morris M, O’Reilly M. Life-threatening critical respiratory events: a retrospective study of postoperative patients found unresponsive during analgesic therapy. J Clin Anesth. 2011;23(3):207–13.
2. Taylor S, Kirton OC, Staff I, Kozol RA. Postoperative day one: a high risk period for respiratory events. Am J Surg. 2005;190(5):752–6.
3. Hamber EA, Bailey PL, James SW, Wells DT, Lu JK, Pace NL. Delays in the detection of hypoxemia due to site of pulse oximetry probe placement. J Clin Anesth. 1999;11(2):113–8.
4. Rheineck-Leyssius AT, Kalkman CJ. Advanced pulse oximeter signal processing technology compared to simple averaging II. Effect on frequency of alarms in the postanesthesia care unit. J Clin Anesth. 1999;11(3):196–200.
5. Rheineck-Leyssius AT, Kalkman CJ. Advanced pulse oximeter signal processing technology compared to simple averaging I. Effect on frequency of alarms in the operating room. J Clin Anesth. 1999;11(3):192–5.
6. Trivedi NS, Ghouri AF, Shah NK, Lai E, Barker SJ. Pulse oximeter performance during desaturation and resaturation: a comparison of seven models. J Clin Anesth. 1997;9(3):184–8.
7. Trivedi NS, Ghouri AF, Shah NK, Lai E, Barker SJ. Effects of motion, ambient light, and hypoperfusion on pulse oximeter function. J Clin Anesth. 1997;9(3):179–83.
8. Claure N, Bancalari E, D’Ugard C, Nelin L, Stein M, Ramanathan R, Hernandez R, Donn SM, Becker M, Bachman T. Multicenter crossover study of automated control of inspired oxygen in ventilated preterm infants. Pediatrics. 2011;127(1):e76–83.
9. ElMoaqet H, Tilbury DM, Ramachandran S-K. Predicting oxygen saturation levels in blood using autoregressive models: a threshold metric for evaluating predictive models. Proceedings of American control conference, 2013. p. 734–739.
10. ElMoaqet H, Tilbury D, Ramachandran SK. Evaluating predictions of critical oxygen desaturation events. Physiol Meas. 2014;35(4):639–55.
11. Bremer T, Gough DA. Is blood glucose predictable from previous values? A solicitation for data. Diabetes. 1999;48(1):403–22.
12. Dua P, Doyle FJ, Pistikopoulos EN. Model-based blood glucose control for type 1 diabetes via parametric programming. IEEE Trans Biomed Eng. 2006;53(8):1478–91.
13. Gani A, Gribok A, Rajaraman S, Ward W, Reifman J. Predicting subcutaneous glucose concentration in humans: data-driven glucose modeling. IEEE Trans Biomed Eng. 2009;56(2):246–54.
14. Gani A, Gribok A, Lu Y, Ward W, Vigersky RA, Reifman J. Universal glucose models for predicting subcutaneous glucose concentration in humans. IEEE Trans Inf Technol Biomed. 2010;14(1):157–65.
15. Reifman J, Rajaraman S, Gribok A, Ward W. Predictive monitoring for improved management of glucose levels. Diabetes Sci Technol. 2007;1(4):478–86.
16. Sparacino G, Zanderigo F, Corazza S, Maran A, Facchinetti A, Cobelli C. Glucose concentration can be predicted ahead in time from continuous glucose monitoring sensor time-series. IEEE Trans Biomed Eng. 2007;54(5):931–7.
17. Sparacino G, Facchinetti A, Cobelli C. Smart continuous glucose monitoring sensors: on-line signal processing issues. Sensors. 2010;10(7):6751–72.
18. Trajanoski Z, Regittnig W, Wach P. Simulation studies on neural predictive control of glucose using the subcutaneous route. Comput Methods Programs Biomed. 1998;56(2):133–9.
19. Fu ES, Downs JB, Schweiger JW, Miguel RV, Smith RA. Supplemental oxygen impairs detection of hypoventilation by pulse oximetry. CHEST J. 2004;126(5):1552–8.
20. Voepel-Lewis T, Parker ML, Burke CN, Hemberg J, Perlin L, Kai S, Ramachandran SK. Pulse oximetry desaturation alarms on a general postoperative adult unit: a prospective observational study of nurse response time. Int J Nurs Stud. 2013;50(10):1351–8.
21. Centers for Medicare and Medicaid Services. National coverage determination (NCD) for home use of oxygen. 1993.
22. Astrom K, Wittenmark B. Computer controlled systems: theory and design. Upper Saddle River, NJ: Prentice Hall; 1997.
23. Ljung L. System identification: theory for the user. Upper Saddle River, NJ: Prentice-Hall; 1987.
24. Clarke WL. The original Clarke error grid analysis (EGA). Diabetes Technol Therapeutics. 2005;7(5):776–9.
25. Armstrong JS, Collopy F. Error measures for generalizing about forecasting methods: empirical comparisons. Int J Forecast. 1992;8(1):69–80.
26. Hyndman RJ, Koehler AB. Another look at measures of forecast accuracy. Int J Forecast. 2006;22:679–88.
27. Altman DG, Bland JM. Diagnostic tests 2: predictive values. Br Med J. 1994;309(6947):102.
28. Altman DG, Bland JM. Diagnostic tests 1: sensitivity and specificity. Br Med J. 1994;308(6943):1552.
29. Deeks JJ. Systematic reviews of evaluations of diagnostic and screening tests. BMJ. 2001;323(7305):157–62.
30. Glas AS, Lijmer JG, Prins MH, Bonsel GJ, Bossuyt PM. The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol. 2003;56(11):1129–35.
31. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29–36.
32. Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Stat Med. 1993;12(14):1293–316.
33. Ramachandran SK, Josephs LA. A meta-analysis of clinical screening tests for obstructive sleep apnea. Anesthesiology. 2009;110(4):928–39.
34. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. Br Med J. 2003;327(7414):557.
35. Higgins JP. Commentary: heterogeneity in meta-analysis should be expected and appropriately quantified. Int J Epidemiol. 2008;37(5):1158–60.
36. Mitchell MD. Validation of the summary ROC for diagnostic test meta-analysis: a Monte Carlo simulation. Acad Radiol. 2003;10(1):25–31.
37. Walter S. Properties of the summary receiver operating characteristic (SROC) curve for diagnostic test data. Stat Med. 2002;21(9):1237–56.
38. Leemis LM, Trivedi KS. A comparison of approximate interval estimators for the Bernoulli parameter. Am Stat. 1996;50(1):63–8.
39. Taenzer AH, Pyke J, McGrath S, Blike G. Impact of pulse oximetry surveillance on rescue events and intensive care unit transfers: a before-and-after concurrence study. Anesthesiology. 2010;112(1):282–7.
