Accepted Manuscript
Developing a stroke severity index based on administrative data was feasible using data mining techniques
Sheng-Feng Sung, MD, Cheng-Yang Hsieh, MD, Yea-Huei Kao Yang, BPharm, Huey-Juan Lin, MD, MPH, Chih-Hung Chen, MD, Yu-Wei Chen, MD, Ya-Han Hu, PhD
PII: S0895-4356(15)00017-7
DOI: 10.1016/j.jclinepi.2015.01.009
Reference: JCE 8793
To appear in: Journal of Clinical Epidemiology
Received Date: 17 September 2014
Revised Date: 16 December 2014
Accepted Date: 16 January 2015
Please cite this article as: Sung S-F, Hsieh C-Y, Kao Yang Y-H, Lin H-J, Chen C-H, Chen Y-W, Hu Y-H, Developing a stroke severity index based on administrative data was feasible using data mining techniques, Journal of Clinical Epidemiology (2015), doi: 10.1016/j.jclinepi.2015.01.009. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Title
Developing a stroke severity index based on administrative data was feasible using data mining techniques
Authors
Sheng-Feng Sung, MD
Division of Neurology, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
[email protected]
Cheng-Yang Hsieh, MD
Department of Neurology, Tainan Sin Lau Hospital, Tainan, Taiwan
Institute of Clinical Pharmacy and Pharmaceutical Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan
[email protected]
Yea-Huei Kao Yang, BPharm
Institute of Clinical Pharmacy and Pharmaceutical Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan
[email protected]
Huey-Juan Lin, MD, MPH
Department of Neurology, Chi Mei Medical Center, Tainan, Taiwan
[email protected]
Chih-Hung Chen, MD
Department of Neurology, College of Medicine, National Cheng Kung University, Tainan, Taiwan
[email protected]
Yu-Wei Chen, MD
Department of Neurology, Landseed Hospital, Tao-Yuan County, Taiwan
Department of Neurology, National Taiwan University Hospital, Taipei, Taiwan
[email protected]
Ya-Han Hu, PhD
Department of Information Management and Institute of Healthcare Information Management, National Chung Cheng University, Chiayi County, Taiwan
[email protected]
* Sheng-Feng Sung and Cheng-Yang Hsieh contributed equally to this work.
Corresponding author & contact information
Ya-Han Hu, PhD
Department of Information Management and Institute of Healthcare Information Management, National Chung Cheng University, 168 University Road, Min-Hsiung, Chiayi County 62102, Taiwan
Tel: +886-5-2720411 Ext 34613; Fax: +886-5-2721501
Email: [email protected]
Abstract
Objective: Case-mix adjustment is difficult for stroke outcome studies using administrative data. However, relevant prescription, laboratory, procedure, and service claims might be surrogates for stroke severity. This study proposes a method for developing a stroke severity index (SSI) by using administrative data.
Study Design and Setting: We identified 3,577 patients with acute ischemic stroke from a hospital-based registry and analyzed claims data containing a large number of features. Stroke severity was measured using the National Institutes of Health Stroke Scale (NIHSS). We used two data mining methods and conventional multiple linear regression to develop prediction models, comparing the model performance according to the Pearson correlation coefficient between the SSI and the NIHSS. We validated these models in four independent cohorts by using hospital-based registry data linked to a nationwide administrative database.
Results: We identified seven predictive features and developed three models. The k-nearest neighbor model (correlation coefficient, 0.743; 95% confidence interval, 0.737-0.749) performed slightly better than the multiple linear regression model (0.742; 0.736-0.747), followed by the regression tree model (0.737; 0.731-0.742). In the validation cohorts, the correlation coefficients were between 0.677 and 0.725 for all three models.
Conclusion: The claims-based SSI enables adjusting for disease severity in stroke studies using administrative data.
Keywords
Acute ischemic stroke, disease severity, administrative data, data mining, prediction model, outcomes research
Running title
Stroke severity index for administrative data research
Word count
4067
What is new?
Key findings
This study identified seven predictive features from administrative claims data. It also developed and validated three models for predicting the neurological deficit severity of patients hospitalized for acute ischemic stroke.
The stroke severity indices provided by applying these models were highly correlated with clinical stroke severity, as measured using the National Institutes of Health Stroke Scale.
What this adds to what is known?
Although administrative data typically contain inadequate clinical information, stroke severity for each patient can be easily determined from hospitalization claims.
What is the implication, what should change now?
Stroke outcome studies using claims data have often lacked adequate adjustment for disease severity. Researchers may use the stroke severity indices presented herein to adjust for disease severity in future ischemic stroke studies using Taiwan’s National Health Insurance Research Database.
Replication of this study’s approach to developing stroke severity indices in other large administrative databases that contain inadequate clinical information is encouraged.
1. Introduction
Stroke is among the leading causes of death, disability, and hospitalizations and is ranked third in disease burden worldwide [1]. Research on stroke outcomes is essential for both clinical care and policy development. Numerous stroke outcome studies have been conducted on the basis of administrative data [2–4]. However, because clinical information was insufficient [5], these studies share the limitation of inadequate adjustment for case-mix or stroke severity. This limitation is particularly critical and might diminish the value of the research findings; stroke is heterogeneous, and thus, stroke severity varies greatly among patients. Nevertheless, administrative claims data reflect routine clinical practice and might contain information that is highly associated with stroke severity.
Most case-mix adjustment models for stroke comprise markers of stroke severity, prestroke function, and comorbidities [6]. Inadequate adjustment for patient case-mix has been underscored as a major factor limiting the comparisons of patient outcomes between health care providers [7]. A previous study that used the administrative claims data for Medicare beneficiaries with acute ischemic stroke (AIS) found that a hospital risk model for 30-day mortality based on claims data alone, without adjustment for stroke severity, had substantially worse discrimination than a model that adjusted for stroke severity using the National Institutes of Health Stroke Scale (NIHSS) according to a linkage with the Get With The Guidelines–stroke registry [8]. Because stroke severity is required for risk adjustment of ischemic stroke outcomes, and because standardized stroke scales are excluded from claims data and are not routinely recorded in clinical practice at all hospitals, exploring valid surrogates for stroke severity merits high priority [9].
Most neurological and medical complications that develop in patients with AIS within 48 hours to a few weeks in the course of stroke progression correlate with stroke severity [10,11]. Managing and treating these complications are essential in stroke care. For example, stroke patients with dysphagia generally require the placement of a nasogastric tube for feeding [10], and mannitol osmotherapy is recommended for treating stroke patients with brain edema [11]. Therefore, relevant prescription, laboratory, procedure, and service claims in administrative data might be potential surrogates for stroke severity. Nevertheless, prioritizing surrogates from numerous candidate attributes in claims data is challenging.
Research using administrative data has the advantages of describing a large sample of geographically dispersed patients with serial longitudinal data, relative efficiency, and the potential to avoid the possible selection bias inherent in institution-specific studies [12,13]. Administrative data are thus frequently used to conduct population-based studies or performance monitoring research. Taiwan’s National Health Insurance (NHI) is a single-payer, compulsory enrollment health care program launched on March 1, 1995. The NHI program covers virtually the entire population in Taiwan [14], and provides universal coverage for hospital and physician services, thus creating a large repository of health administrative data. To test the hypothesis that stroke severity could be determined according to the patterns of hospitalization claims in the NHI data set, we combined data mining techniques with conventional statistical methods to develop stroke severity models from claims data in a single hospital. Stroke severity was represented as the admission NIHSS score in the hospital-based stroke registry. We then validated the models in four independent cohorts of stroke patients using the claims data of the NHI linked to the individual hospital-based registry.
2. Methods
2.1 Derivation cohort
We identified patients with AIS from the stroke registry of Chia-Yi Christian Hospital, which prospectively registered all stroke patients admitted within ten days of symptom onset according to the design of the nationwide Taiwan Stroke Registry [15]. Ischemic stroke is defined as an acute onset of neurologic deficits persisting longer than 24 hours with no hemorrhage on the first brain computed tomography or with acute corresponding ischemic lesion(s) on diffusion weighted magnetic resonance imaging. The study hospital is a 1,000-bed regional hospital serving a city and its adjoining rural area of approximately 1,000,000 inhabitants. The Chia-Yi Christian Hospital Institutional Review Board approved the study protocol.
We included patients aged 18 years or older who were admitted between September 2007 and December 2013. Patients with in-hospital stroke were excluded. Stroke severity, as assessed by the NIHSS during patient admission, was obtained from the stroke registry. From the hospital data warehouse, we collected each patient’s claims that had been submitted to the NHI, including billing data for medications, laboratory tests, imaging studies, procedures, clinical services, and supplies. Patients whose claims data were missing were excluded from the study.
2.2 Feature selection
All of the billing codes related to medications, laboratory tests, imaging studies, procedures, and clinical services from patient claims data were considered potential predictive features. For a given patient, a feature (billing code) was scored present (1) or absent (0) according to whether the billing code was listed on his/her claims, thus forming a large binary feature set. A three-step feature selection process was then used. The first step involved a frequency cutoff operation in which a feature was filtered out if it appeared (i.e., was coded as 1) in less than 1% or more than 99% of the patient claims. In the second step, a correlation-based feature selection (CFS) method was used to evaluate the correlations between the feature subsets and the dependent variable. Under this method, optimal feature subsets contain features that are highly correlated with the outcome, yet uncorrelated with each other [16]. In this study, the Weka 3.6.11 open-source data mining software (www.cs.waikato.ac.nz/ml/weka) was used to perform the CFS procedure. A subset of features that were both highly correlated with the outcome (NIHSS score) and weakly correlated with each other was identified using the CfsSubsetEval module with the greedy stepwise algorithm implemented in Weka. The third step was performed manually based on expert opinion. A panel of stroke neurologists (SFS, CYH, HJL, CHC, and YWC) determined the final subset of features by consensus. Redundant features and those with low face validity were eliminated, whereas features with similar properties were aggregated.
2.3 Data mining techniques for regression
The k-nearest neighbor (KNN), which is one of the most fundamental supervised learning algorithms, is a nonparametric method for predicting the outcome of an instance based on its nearest instances (neighbors) in a training set. Each instance is represented as a vector in a multidimensional feature space and the distance between two vectors can be easily measured. Generally, the distance measures for continuous and categorical variables are the Euclidean and Hamming distances, respectively. Given a testing instance t and a user-specified constant k, the KNN algorithm computes the distance between t and all other instances, denoted as dist(t, s_i), and then selects the top k nearest instances, i.e., those with the lowest dist(t, s_i), as neighbors for t. The weight of each neighbor s_i for the testing instance t, denoted as w_i, is calculated using w_i = 1/dist(t, s_i). Finally, the estimated outcome of t can be determined according to the weighted or unweighted average of the outcome of s_i (i = 1, . . . , k).
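The KNN prediction step described above can be sketched as follows. This is a minimal illustration rather than the Weka implementation used in the study; the function names, the small epsilon that guards against division by zero, and the tie-breaking rule are assumptions made for this sketch.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of positions at which two binary feature vectors differ."""
    return int(np.sum(np.asarray(a) != np.asarray(b)))

def knn_predict(train_X, train_y, t, k=3, weighted=False):
    """Predict the outcome of test instance t from its k nearest training instances."""
    train_X = np.asarray(train_X)
    train_y = np.asarray(train_y, dtype=float)
    dists = np.array([hamming_distance(t, s) for s in train_X])
    # Stable sort: ties are broken by training-set order (an assumption of this sketch).
    nearest = np.argsort(dists, kind="stable")[:k]
    if not weighted:
        return float(train_y[nearest].mean())
    # w_i = 1 / dist(t, s_i); the epsilon is an assumption that avoids
    # division by zero when a neighbor matches t exactly.
    weights = 1.0 / (dists[nearest] + 1e-9)
    return float(np.sum(weights * train_y[nearest]) / np.sum(weights))
```

With binary billing-code features, the Hamming distance simply counts the codes on which two patients differ, so the unweighted prediction is the mean outcome of the k most similar patients.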
The regression tree (RT), a decision tree algorithm, is a hierarchical structure consisting of branches and nodes. The internal node represents a selected predictor, and the branch of an internal node represents a subset of predictor values. Each instance falls into one leaf node at the end of the tree. Each leaf node can be considered a set of instances satisfying a specific set of decision rules in the tree. Generally, the decision tree is easy to interpret and the rules in a decision tree can be applied at the bedside. Numerous decision tree-based supervised learning algorithms have been developed. To generate an RT, the algorithm recursively performs a binary split for a given set of instances based on different predictor values. The optimal split is then determined over all predictors at all possible split points according to the reduction in variance. The aforementioned process is recursively applied to each child node after each split. The tree growing process continues until no further improvement on the decreasing variance or other stopping criteria, such as the minimum number of instances per leaf and maximum tree depth, are met. Each leaf stores a class value that represents the average value of instances that reach the leaf.
2.4 Model development
The NIHSS is a 15-item neurological examination scale used to assess neurological impairment in stroke patients. Interrater reliability and test–retest reliability for the scale are high, and the scale validity was demonstrated by its strong correlations with patient infarct size on computed tomography obtained one week after the stroke and patient functional outcome assessed three months after the stroke [17]. The NIHSS was also shown to be valid for patients treated with intravenous thrombolysis because it corresponded closely to other measures of clinical outcomes [18]. The total NIHSS scores range from 0 to 42, with higher values representing greater stroke severity. Because the distribution of the NIHSS scores was skewed, we used nonparametric methods of data mining to develop the models.
The KNN and RT algorithms implemented in Weka, i.e., the IBK module and the REPTree module, were used to train models that predict the NIHSS score, which was treated as a continuous variable. Because the internal parameter settings substantially influence the prediction performance of the two algorithms, the CVParameterSelection metalearner module implemented in Weka was used to optimize the parameters. A base regressor was specifically selected and an arbitrary number of parameter combinations were defined in the CVParameterSelection metalearner module. This metalearner automatically executed the base regressor with all possible parameter combinations and determined the optimal parameter settings based on the best prediction results using cross-validation. For the IBK module, we used the unweighted average of the nearest neighbors to determine the value of the testing instance, and the number of nearest neighbors, k, was set to 11 by using the CVParameterSelection module. The REPTree module creates an RT by using variance reduction as the splitting criterion and prunes the tree using the reduced-error technique. The minimum total number of instances in a leaf was set to 11 by using the CVParameterSelection metalearner module. Finally, a conventional multiple linear regression model was built using Weka’s LinearRegression module, and all of the features from the final subset were entered into the model.
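The variance-reduction splitting criterion mentioned above can be illustrated with a short sketch. It is not the REPTree code in Weka; the function names are hypothetical, and the sketch covers only the choice of a single split over binary features, without recursion or pruning.

```python
import numpy as np

def variance_reduction(y, mask):
    """Reduction in variance obtained by splitting y into y[mask] and y[~mask]."""
    y = np.asarray(y, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    left, right = y[mask], y[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0  # a split that leaves one side empty gains nothing
    weighted_child_var = (len(left) * left.var() + len(right) * right.var()) / len(y)
    return float(y.var() - weighted_child_var)

def best_binary_split(X, y):
    """Return (feature index, variance reduction) of the best split on binary features."""
    X = np.asarray(X)
    gains = [variance_reduction(y, X[:, j] == 1) for j in range(X.shape[1])]
    best = int(np.argmax(gains))
    return best, gains[best]
```

A full regression tree would apply `best_binary_split` recursively to each child node until a stopping rule, such as a minimum number of instances per leaf, is met.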
2.4 Validation cohorts
The stroke severity index models were externally validated by using patients in stroke registries from four hospitals of varying size. The Chi Mei Medical Center (CMMC) is a medical center with 1,300 beds, and the National Cheng Kung University Hospital (NCKUH) is a university-affiliated medical center with 1,200 beds. The Landseed Hospital (LH) and the Tainan Sin Lau Hospital (TSLH) are regional hospitals with 600 and 450 beds, respectively. Patients aged 18 years or older with a discharge diagnosis of AIS were included and those with in-hospital stroke were excluded. To assure the anonymity of patients, we retrieved only the sex, date of birth, date of admission, date of discharge, and NIHSS score at admission of each patient from the registry databases.
The National Health Insurance Research Database (NHIRD), derived from the NHI claims data, is maintained and made available for research by the National Health Research Institutes of Taiwan. The NHIRD contains the medical care utilization records of the NHI beneficiaries and enables population-based research. The population of ischemic stroke in the NHIRD was identified by extracting data on all hospitalizations for which a diagnosis of ischemic stroke was recorded (International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM] diagnosis code 433.x or 434.x) from a stroke-specific data set of the NHIRD between 2006 and 2010. Because hospital and patient identifiers are encrypted in the NHIRD, the registry data were linked to the NHIRD based on four non-unique patient characteristics: sex, date of birth, date of admission, and date of discharge [19,20]. A deviation of ±1 day for the date of admission and date of discharge was allowed [21]. After establishing a hospital “crosswalk” by matching hospitalizations from each validation hospital’s registry with the NHIRD hospitalizations using the four patient characteristics, the hospital with the highest number of matching records for a given validation hospital was assumed to be the correct link for that validation hospital [19]. Successfully linked cases were included in the validation cohorts.
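The linkage logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run on the NHIRD: the record layout (dictionaries with `sex`, `birth`, `admit`, `discharge`, and `hospital_id` keys) is hypothetical, and the rule of keeping only registry records with exactly one matching claim is an assumption of this sketch.

```python
from collections import Counter
from datetime import date, timedelta

def is_match(reg, claim, tolerance_days=1):
    """True if a registry record and a claims record agree on the four characteristics."""
    tol = timedelta(days=tolerance_days)
    return (
        reg["sex"] == claim["sex"]
        and reg["birth"] == claim["birth"]
        and abs(reg["admit"] - claim["admit"]) <= tol
        and abs(reg["discharge"] - claim["discharge"]) <= tol
    )

def find_hospital_link(registry, claims):
    """Return the encrypted hospital ID that matches the most registry records."""
    counts = Counter(
        claim["hospital_id"]
        for reg in registry
        for claim in claims
        if is_match(reg, claim)
    )
    return counts.most_common(1)[0][0] if counts else None

def link_records(registry, claims):
    """Pair each registry record with its matching claim from the linked hospital."""
    hospital = find_hospital_link(registry, claims)
    linked = []
    for reg in registry:
        hits = [c for c in claims if c["hospital_id"] == hospital and is_match(reg, c)]
        if len(hits) == 1:  # keep only unambiguous links (assumption of this sketch)
            linked.append((reg, hits[0]))
    return linked
```

The pairwise comparison is quadratic in the number of records; for a nationwide database, an index on sex and date of birth would be a natural first refinement.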
For each case in the validation cohorts, we obtained billing data for medications, laboratory tests, imaging studies, procedures, and clinical services from the hospitalization claims files. We determined the presence or absence of each predictive feature, determined during the aforementioned feature selection process, according to the billing codes listed on patient claims and computed the predicted stroke severity indices by using the prediction models.
2.5 Model evaluation and statistical analysis
All models were trained to minimize the mean square error between the predictions (stroke severity indices) and targets (NIHSS scores). The Pearson correlation coefficients between the predictions and targets were reported. A tenfold cross-validation was used to evaluate the internal validity of these models. This validation strategy partitions the original data set into ten subsamples, reserving one for testing and using the remaining nine as training data. In this way, it ensures that the model is tested on unobserved data, thus mitigating the chance of overfitting. By varying the seed values for the random splitting of the original data set, the tenfold cross-validation process was repeated ten times and the results were averaged based on the 100 test subsamples to produce a single estimate of the Pearson correlation coefficient. Paired t-tests were used to compare the mean correlation coefficients between the prediction models. The generalizability of the models was tested by assessing their performance in the validation cohorts, and the Pearson correlation coefficients between the predicted stroke severity indices and the NIHSS scores were reported. The Stata procedure, corcor, was used to compare the performance between the models in the validation cohorts for dependent correlation coefficients [22].
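The repeated tenfold cross-validation described above can be sketched as follows. This is a minimal illustration, not the Weka procedure used in the study; the function names are hypothetical, ordinary least squares stands in for any of the three models, and the seeding scheme is an assumption of this sketch.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply the fitted coefficients (intercept first) to new rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def repeated_cv_pearson(X, y, n_folds=10, n_repeats=10, seed=0):
    """Mean Pearson r between predictions and targets over all test folds."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    correlations = []
    for _ in range(n_repeats):
        order = rng.permutation(len(y))  # a new random split for each repeat
        for test_idx in np.array_split(order, n_folds):
            train_idx = np.setdiff1d(order, test_idx)
            coef = fit_linear(X[train_idx], y[train_idx])
            pred = predict_linear(coef, X[test_idx])
            correlations.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(correlations)), len(correlations)
```

With the default settings the function averages over 10 × 10 = 100 test subsamples, which matches the count given in the text.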
General statistical analyses were performed using Stata 13.1 (StataCorp, College Station, Texas). Continuous variables were summarized as mean ± standard deviation or median (interquartile range), and categorical variables as counts and percentages. Chi-squared tests were used to compare categorical variables, and t-tests or Mann–Whitney U tests were used for continuous variables, whichever was appropriate. Two-tailed P < 0.05 was considered statistically significant.
3. Results
3.1 Patient characteristics
We enrolled 3,577 patients in the derivation cohort after excluding 105 patients with in-hospital stroke and 16 patients with missing claims data. Table 1 presents the characteristics of the patients. Approximately one-third of the patients had a prior stroke. The NIHSS scores at admission ranged from 0 to 40, with a median of 5. Nearly 88% of the patients were discharged within two weeks, with a median length of stay of seven days.
A total of 4,438, 1,694, 1,148, and 318 patients from the stroke registries of four validation hospitals were eligible for this study. After linking to the NHIRD, 3,816 (CMMC), 1,563 (NCKUH), 962 (LH), and 276 (TSLH) patients were included in the validation cohorts (Table 2). The median NIHSS score of the derivation cohort was significantly different from those in the CMMC and TSLH cohorts. The median length of stay of the derivation cohort also differed from those in the CMMC, LH, and TSLH cohorts.
< insert Table 2. Characteristics of the patients in the validation cohorts.>
3.2 Feature selection results
Of the 1,634 features retrieved from the claims data of the derivation cohort, 494 passed the frequency cutoff filter. Correlation-based feature selection then identified a subset of 67 features (see Supplemental Table 1), which was manually reviewed and reduced to a subset of seven features (Table 3 and see Supplemental Table 2). Although half of the features (billing codes) selected by the correlation-based feature selection represent medications, all medications, except for mannitol, were excluded from the final subset because the use of medications is complex and may vary considerably between physicians and hospitals. In addition, rehabilitation service claims were eliminated because the use of rehabilitation may be affected by the varied availability of rehabilitation providers among different hospitals and the significant disparity in rehabilitation use between neurologists and non-neurologists [23]. The distribution of the features differed among the derivation and validation cohorts (see Supplemental Table 3).
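The first two automated steps of the feature selection (Section 2.2) can be sketched as follows. This is an illustration only: the function names are hypothetical, the merit formula is the standard CFS heuristic, and the use of absolute Pearson correlations is an assumption of this sketch, so the correlation measure may differ from the one applied by Weka's CfsSubsetEval module.

```python
import numpy as np

def frequency_filter(X, low=0.01, high=0.99):
    """Indices of binary features present in at least `low` and at most `high` of rows."""
    prevalence = np.asarray(X).mean(axis=0)
    return np.where((prevalence >= low) & (prevalence <= high))[0]

def cfs_merit(X, y, subset):
    """CFS merit: rewards feature-outcome correlation, penalizes feature-feature correlation."""
    X = np.asarray(X, dtype=float)
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return float(r_cf)
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

def greedy_forward_cfs(X, y, candidates):
    """Add the feature that most improves the merit until no addition helps."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        merit, j = max((cfs_merit(X, y, selected + [f]), f) for f in remaining)
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected
```

The third step of the study, manual review by a panel of neurologists, has no counterpart in this sketch.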
3.3 Evaluation of models
In the derivation cohort, the Pearson correlation coefficients between the stroke severity indices and the NIHSS scores were 0.743 (95% confidence interval [CI], 0.737-0.749) for the KNN model, 0.737 (95% CI, 0.731-0.742) for the RT model, and 0.742 (95% CI, 0.736-0.747) for the MLR model. Based on comparing the mean Pearson correlation coefficients with the paired t-test, the KNN model performed slightly better than did the MLR model (P = 0.046), whereas the RT model performed worse than did the KNN model (P < 0.001) and MLR model (P < 0.001). In the validation cohorts, the stroke severity indices and NIHSS scores exhibited strong correlations (Pearson correlation coefficients were between 0.677 and 0.725) for all models (Fig. 1 and see Supplemental Table 4). The MLR model exhibited higher Pearson correlation coefficients (P < 0.001 for CMMC cohort, P < 0.001 for NCKUH, P < 0.001 for LH, and P = 0.272 for TSLH) than the KNN model, and the KNN model’s performance was similar to that of the RT model (P = 0.474 for CMMC, P = 0.848 for NCKUH, P = 0.937 for LH, and P = 0.917 for TSLH).
4. Discussion
We developed and validated three models that predicted the NIHSS score at admission for AIS patients. These models might provide an index to represent the stroke severity of patients for research using administrative data that typically lack detailed clinical information about disease severity. Among the models, the KNN model and the MLR model performed almost equally well on the internal validation in terms of the correlation between the stroke severity indices and the clinical NIHSS scores. During the external validation, the MLR model outperformed the other two models.
Although the three models performed reasonably well, each model has its own advantages and limitations. The KNN algorithm is analogous to clinical reasoning and is probably readily accepted by clinicians [24]; however, it might be considerably difficult to deploy and disseminate to other researchers. To overcome this problem, we set up a website (http://hdmlab.twbbs.org:508/SSI/hdmlab/ssi2.jsp) that enables computing the stroke severity indices online. The RT model was preferred for its transparency, and the tree structure could be converted into a set of classification rules (Fig. 2). The decision tree-based methods do not require specifications regarding the parametric nature of the relationship between the predictive features and target [25], and could facilitate the identification and interpretation of the interactions among the predictors [26]. Nevertheless, the decision tree could be too complex to be understood easily when a data set contains many predictive features [27]. The MLR model is probably the simplest to implement, and it can demonstrate the relative strength of the various features within the model. The stroke severity index can be easily obtained by using the regression equation (Table 4). Although the linear model is limited by the assumption that linear combinations of the predictive variables can effectively describe the behavior of the response, the MLR model seemed suitable and performed quite well in this study. This is perhaps because the selected predictors are binary and correlate with the stroke severity in a generally linear fashion.
< insert figure. Caption (partial): ... Institutes of Health Stroke Scale (NIHSS) score at admission by the features. ICU, intensive care ...>
Several stroke scales, including the NIHSS [17], the Canadian Neurological Scale [28], and the Scandinavian Stroke Scale [29], have been designed for use in clinical trials. These scales generally correlate well with clinical outcomes and have been widely used in routine practice and stroke studies to evaluate stroke severity [30]. Of the scales, the NIHSS was determined to yield the most prognostic information [31]. However, standardized stroke scales are not recorded in administrative data.
Length of stay was used as a proxy of stroke severity and was associated with increased readmission risk in a study using administrative data [32]. However, this method is not ideal because length of stay typically increases with stroke severity for mild strokes and decreases with stroke severity for severe strokes [33]. Other studies have constructed proxy indicators to represent high stroke severity based on the ICD-9-CM diagnosis codes, ICD-9-CM procedure codes, and Current Procedural Terminology codes [23,34]. The commonly used indicators include use of mechanical ventilation, surgical procedures (such as gastrostomy, craniotomy, and tracheostomy), and neurological deficits (such as hemiplegia and hemiparesis, aphasia, and epilepsy). However, these approaches might be limited by the inaccurate coding and under-reporting of certain diagnoses or minor procedures [35,36]. Moreover, the relative weights of these proxy indicators were not understood; therefore, it was impossible to establish a composite index of stroke severity. In particular, these proxy indicators have not been validated.
Composite disease severity indices have been developed for administrative data research on other diseases. Ting et al [37] established a claims-based rheumatoid arthritis severity index comprising type and number of laboratory tests (inflammatory markers, chemistry panels, platelet counts) used, number of outpatient visits (rehabilitation and rheumatology), and the presence of a specific diagnosis (Felty’s syndrome). Chang et al [38] tested the predictive validity of a diabetes complications severity index incorporating seven diabetic complications based only on the ICD-9-CM codes from claims data. This index predicted the number of hospitalizations in the succeeding four years. Ananthakrishnan et al [39] extracted prespecified potential predictive variables, such as anemia, malnutrition, and requirement of blood transfusion, from up to 15 discharge ICD-9-CM diagnosis codes and 15 ICD-9-CM procedure codes in hospitalization records. After identifying independent predictors by multivariate logistic regression, they constructed a risk score to stratify the severity of Crohn’s disease hospitalizations.
In contrast to the aforementioned approaches, the proposed approach can be used to identify critical predictive features and develop models for estimating stroke severity by exploring detailed billing codes in claims data instead of using diagnosis and procedure codes; only up to five ICD-9-CM diagnosis codes and five ICD-9-CM procedure codes were recorded in each hospitalization claim submitted to the NHI. Therefore, by not using the ICD-9-CM diagnosis codes and procedure codes, we reduced the possibility of coding errors and under-coding, which could undermine the performance of prediction models. By contrast, the use of medications, procedures, diagnostic tests, and services was faithfully represented as billing codes in the claims data because Taiwan’s NHI provides universal coverage for hospitalizations, and stroke hospitalizations are reimbursed on a fee-for-service basis. By applying data mining techniques, we were able to effectively exploit the high-dimensional administrative data through feature selection algorithms. Without the feature selection process, developing a model from high-dimensional data, i.e., data with a large number of features, might be difficult using conventional regression techniques.
Our study has limitations. First, stroke patients were primarily managed by neurologists in the five participating hospitals and the diagnosis of AIS had been ascertained by stroke neurologists before patients were entered into the stroke registry. Previous studies using the NHIRD have revealed that around 45% of stroke patients were admitted to neurology service in Taiwan [23,32]. Because the stroke severity index we developed was based on the prescriptions, laboratory tests, procedures, and services during hospitalization, the differences in practice patterns between neurologists and non-neurologists might affect its performance. Second, we included patients with prior stroke who could have residual neurological deficits. Hence, the admission NIHSS score might not be fully accounted for by the stroke leading to the current admission. In addition, the NIHSS score was evaluated at admission, whereas the stroke severity index was based on data from the process of care during the entire hospital stay. Therefore, the stroke severity index should be considered a global measure of neurological deficit severity for each hospitalization.
5. Conclusion
Developing claims-based stroke severity indices is feasible by using data mining and statistical learning techniques, such as exploring billing codes in high-dimensional claims data, using a feature selection process to identify the optimal predictive features, and applying various regression methods to develop models for estimating stroke severity. Among the models, the KNN model outperformed the other two models in the derivation cohort, whereas the MLR model performed best in the validation cohorts. This study established a novel method by using NHIRD data to develop models for generating stroke severity indices, which represent proxy measures of neurological impairment that can be used for adjusting disease severity in ischemic stroke studies. This approach can be followed to develop stroke severity indices in other large administrative databases that typically lack clinical data. Through adjustment for disease severity, the stroke severity index can improve future stroke outcome studies using administrative data.
Acknowledgments
This study is based in part on data from the National Health Insurance Research Database provided by the Bureau of National Health Insurance, Department of Health and managed by National Health Research Institutes. The interpretation and conclusions contained herein do not represent those of Bureau of National Health Insurance, Department of Health or National Health Research Institutes.
Sources of funding: This research was supported in part by the National Cheng Kung University (grant number NCKUH-10206008) and the National Science Council of the Republic of China (grant number NSC 102-2410-H-194-104-MY2).
References
[1] Lopez AD, Mathers CD, Ezzati M, Jamison DT, Murray CJL. Global and regional burden of disease and risk factors, 2001: systematic analysis of population health data. Lancet 2006;367:1747–57.
[2] Lichtman JH, Leifheit-Limson EC, Jones SB, Wang Y, Goldstein LB. 30-Day risk-standardized mortality and readmission rates after ischemic stroke in critical access hospitals. Stroke 2012;43:2741–2747.
[3] Lichtman JH, Jones SB, Wang Y, Leifheit-Limson EC, Goldstein LB. Seasonal variation in 30-day mortality after stroke: teaching versus nonteaching hospitals. Stroke 2013;44:531–533.
[4] Tamm A, Siddiqui M, Shuaib A, Butcher K, Jassal R, Muratoglu M, et al. Impact of stroke care unit on patient outcomes in a community hospital. Stroke 2014;45:211–216.
[5] Virnig BA, McBean M. Administrative data for public health surveillance and planning. Annu Rev Public Health 2001;22:213–230.
[6] Teale EA, Forster A, Munyombwe T, Young JB. A systematic review of case-mix adjustment models for stroke. Clin Rehabil 2012;26:771–86.
[7] Lilford R, Mohammed MA, Spiegelhalter D, Thomson R. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. Lancet 2004;363:1147–54.
[8] Fonarow GC, Pan W, Saver JL, Smith EE, Reeves MJ, Broderick JP, et al. Comparison of 30-day mortality models for profiling hospital performance in acute ischemic stroke with vs without adjustment for stroke severity. JAMA 2012;308:257–64.
[9] Katzan IL, Spertus J, Bettger JP, Bravata DM, Reeves MJ, Smith EE, et al. Risk adjustment of ischemic stroke outcomes for comparing hospital performance: a statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke 2014;45:918–44.
[10] Kumar S, Selim MH, Caplan LR. Medical complications after stroke. Lancet Neurol 2010;9:105–18.
[11] Balami JS, Chen R-L, Grunwald IQ, Buchan AM. Neurological complications of acute ischaemic stroke. Lancet Neurol 2011;10:357–71.
[12] Lohr KN. Use of insurance claims data in measuring quality of care. Int J Technol Assess Health Care 1990;6:263–71.
[13] Fisher ES, Whaley FS, Krushat WM, Malenka DJ, Fleming C, Baron JA, et al. The accuracy of Medicare's hospital claims data: progress has been made, but problems remain. Am J Public Health 1992;82:243–8.
[14] Davis K, Huang AT. Learning from Taiwan: experience with universal health insurance. Ann Intern Med 2008;148:313–4.
[15] Hsieh F-I, Lien L-M, Chen S-T, Bai C-H, Sun M-C, Tseng H-P, et al. Get With the Guidelines-Stroke performance indicators: surveillance of stroke care in the Taiwan Stroke Registry: Get With the Guidelines-Stroke in Taiwan. Circulation 2010;122:1116–23.
[16] Hall MA. Correlation-based feature selection for machine learning. PhD dissertation, The University of Waikato, 1999. http://www.cms.waikato.ac.nz/~ml/publications/1999/99MH-Thesis.pdf (accessed 24 August 2014).
[17] Brott T, Adams HP, Olinger CP, Marler JR, Barsan WG, Biller J, et al. Measurements of acute cerebral infarction: a clinical examination scale. Stroke 1989;20:864–70.
[18] Lyden P, Lu M, Jackson C, Marler J, Kothari R, Brott T, et al. Underlying structure of the National Institutes of Health Stroke Scale: results of a factor analysis. NINDS tPA Stroke Trial Investigators. Stroke 1999;30:2347–2354.
[19] Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA, Curtis LH. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. Am Heart J 2009;157:995–1000.
[20] Cheng C-L, Kao Y-HY, Lin S-J, Lee C-H, Lai M-L. Validation of the National Health Insurance Research Database with ischemic stroke cases in Taiwan. Pharmacoepidemiol Drug Saf 2011;20:236–42.
[21] Pasquali SK, Jacobs JP, Shook GJ, O'Brien SM, Hall M, Jacobs ML, et al. Linking clinical registry data with administrative data using indirect identifiers: implementation and validation in the congenital heart surgery population. Am Heart J 2010;160:1099–104.
[22] Goldstein R. Testing dependent correlation coefficients. Stata Technical Bulletin 1996;32:18.
[23] Lee H-C, Chang K-C, Huang Y-C, Lan C-F, Chen J-J, Wei S-H. Inpatient rehabilitation utilization for acute stroke under a universal health insurance system. Am J Manag Care 2010;16:e67–e74.
[24] Zhu M, Chen W, Hirdes JP, Stolee P. The K-nearest neighbor algorithm predicted rehabilitation potential better than current Clinical Assessment Protocol. J Clin Epidemiol 2007;60:1015–21.
[25] Austin PC, Tu JV, Ho JE, Levy D, Lee DS. Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. J Clin Epidemiol 2013;66:398–407.
[26] Allore H, Tinetti ME, Araujo KLB, Hardy S, Peduzzi P. A case study found that a regression tree outperformed multiple linear regression in predicting the relationship between impairments and Social and Productive Activities scores. J Clin Epidemiol 2005;58:154–61.
[27] Yoo I, Alafaireet P, Marinov M, Pena-Hernandez K, Gopidi R, Chang J-F, et al. Data mining in healthcare and biomedicine: a survey of the literature. J Med Syst 2012;36:2431–48.
[28] Cote R, Battista RN, Wolfson C, Boucher J, Adam J, Hachinski V. The Canadian Neurological Scale: validation and reliability assessment. Neurology 1989;39:638–643.
[29] Multicenter trial of hemodilution in ischemic stroke--background and study protocol. Scandinavian Stroke Study Group. Stroke 1985;16:885–890.
[30] Faraji F, Ghasami K, Talaie-Zanjani A, Mohammadbeigi A. Prognostic factors in acute stroke, regarding to stroke severity by Canadian Neurological Stroke Scale: A hospital-based study. Asian J Neurosurg 2013;8:78–82.
[31] Muir KW, Weir CJ, Murray GD, Povey C, Lees KR. Comparison of neurological scales and scoring systems for acute stroke prognosis. Stroke 1996;27:1817–1820.
[32] Tseng M-C, Lin H-J. Readmission after hospitalization for stroke in Taiwan: results from a national sample. J Neurol Sci 2009;284:52–5.
[33] Chang K-C, Tseng M-C, Weng H-H, Lin Y-H, Liou C-W, Tan T-Y. Prediction of length of stay of first-ever ischemic stroke. Stroke 2002;33:2670–2674.
[34] Smith MA, Frytak JR, Liou J-I, Finch MD. Rehospitalization and survival for stroke patients in managed care and traditional Medicare plans. Med Care 2005;43:902–10.
[35] Quan H, Parsons GA, Ghali WA. Validity of information on comorbidity derived rom ICD-9-CCM administrative data. Med Care 2002;40:675–85.
[36] Quan H, Parsons GA, Ghali WA. Validity of procedure codes in International Classification of Diseases, 9th revision, clinical modification administrative data. Med Care 2004;42:801–9.
[37] Ting G, Schneeweiss S, Scranton R, Katz JN, Weinblatt ME, Young M, et al. Development of a health care utilisation data-based index for rheumatoid arthritis severity: a preliminary study. Arthritis Res Ther 2008;10:R95.
[38] Chang H-Y, Weiner JP, Richards TM, Bleich SN, Segal JB. Validating the adapted Diabetes Complications Severity Index in claims data. Am J Manag Care 2012;18:721–6.
[39] Ananthakrishnan AN, McGinley EL, Binion DG, Saeian K. A novel risk score to stratify severity of Crohn's disease hospitalizations. Am J Gastroenterol 2010;105:1799–807.
Table 1. Characteristics of the patients in the derivation cohort.
Characteristic                           n = 3,577
Demographics
  Age, mean (SD)                         69 (12)
  Female                                 1,463 (41)
Risk factors
  Hypertension                           2,896 (81)
  Diabetes mellitus                      1,579 (44)
  Hyperlipidemia                         2,009 (56)
  Prior stroke
  Coronary artery disease
  Congestive heart failure
  Atrial fibrillation
  Current smoker
  605 (17)  1,085 (30)  467 (13)  199 (6)  839 (23)
Clinical data
  Admission NIHSS score, median (IQR)    5 (3-10)
  Intravenous thrombolysis               205 (6)
  Length of stay, median (IQR)           7 (4-10)
Data are numbers (percentage) unless specified otherwise.
IQR, interquartile range; SD, standard deviation.
Table 2. Characteristics of the patients in the validation cohorts.
Characteristic         CMMC (n = 3,816)         NCKUH (n = 1,563)        LH (n = 962)             TSLH (n = 276)
Enrollment period      Aug, 2006 - Dec, 2010    Aug, 2006 - Dec, 2009    Aug, 2006 - Dec, 2010    Aug, 2009 - Dec, 2010
Age, mean (SD)         67 (13)**                68 (13)**                68 (12)*                 68 (13)
Female, n (%)          1,526 (40)               643 (41)                 383 (40)                 101 (37)
NIHSS, median (IQR)    4 (2-9)**                5 (3-10)                 5 (2-12)                 3 (1-6)**
LOS, median (IQR)      5 (4-10)**               6 (5-11)                 6 (4-9)**                5 (3-7)**
* P < 0.05 as compared with the derivation cohort; ** P < 0.01.
IQR, interquartile range; LOS, length of stay; NIHSS, National Institutes of Health Stroke Scale; SD, standard deviation.
Table 3. Final set of features after the three-step feature selection procedure.
Feature                       Explanation
Airway suctioning             Patient having undergone airway suctioning
Bacterial sensitivity test    Bacteria isolated by culture and antibiotic sensitivity test having been performed
General ward stay             Patient having stayed in the general ward
ICU stay                      Patient having stayed in the ICU
Nasogastric intubation        Patient having undergone nasogastric intubation
Osmotherapy                   Patient having received osmotherapy (mannitol or glycerol infusion)
Urinary catheterization       Patient having undergone (indwelling) urinary catheterization
ICU, intensive care unit.
Table 4. The multiple linear regression model for the stroke severity index.
Feature                       Coefficient
Airway suctioning             3.5083*
Bacterial sensitivity test    1.3642*
General ward stay             -5.5761*
ICU stay                      4.1770*
Nasogastric intubation        4.5809*
Osmotherapy                   2.1448*
Urinary catheterization       1.6569*
Constant                      9.6804
* P < 0.001
ICU, intensive care unit.
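As an illustration of how the Table 4 model could be applied, the following minimal Python sketch computes a stroke severity index as the constant plus the sum of the coefficients of the features present in a hospitalization. It assumes that each feature is coded 1 if present and 0 if absent, as the explanations in Table 3 suggest; the dictionary keys and the function name `stroke_severity_index` are hypothetical labels chosen for this sketch and do not appear in the manuscript.

```python
# Coefficients and constant copied from Table 4 (multiple linear regression model).
# The snake_case feature names are hypothetical labels for this sketch.
SSI_COEFFICIENTS = {
    "airway_suctioning": 3.5083,
    "bacterial_sensitivity_test": 1.3642,
    "general_ward_stay": -5.5761,
    "icu_stay": 4.1770,
    "nasogastric_intubation": 4.5809,
    "osmotherapy": 2.1448,
    "urinary_catheterization": 1.6569,
}
SSI_CONSTANT = 9.6804


def stroke_severity_index(features):
    """Return the index for one hospitalization.

    `features` maps feature names to truthy (present) or falsy (absent) values.
    Features missing from the mapping are treated as absent (an assumption).
    """
    unknown = set(features) - set(SSI_COEFFICIENTS)
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    score = SSI_CONSTANT
    for name, coefficient in SSI_COEFFICIENTS.items():
        if features.get(name):
            score += coefficient
    return score


if __name__ == "__main__":
    # A stay in the general ward with none of the other six features.
    print(round(stroke_severity_index({"general_ward_stay": 1}), 4))
```

Under these assumptions, a hospitalization with a general ward stay and no other feature yields 9.6804 - 5.5761 = 4.1043, whereas a hospitalization with none of the features yields the constant alone.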
Supplemental Material (Online Only)
Tables
Supplemental Table 1. Features (billing codes) selected by the correlation-based feature selection filter.
Billing code    Item
00201A          physician fee, ED, triage category 1
00201B          physician fee, ED, triage category 1
02007A          daily physician fee, general ward
02012A          daily physician fee, ICU
03002A          daily ward fee, general ward
03027A          daily nursing fee, general ward
09005C          blood glucose
09031C          gamma-glutamyl transferase
09037C          blood ammonia
09041B          blood gas analysis
10502B          diphenylhydantoin
12053B          antinuclear antibody
13009B          bacterial sensitivity test
18006B          doppler echocardiography
20013B          dopscan
30028B          anticardiolipin antibody, IgM
39016B          daily IV pump
42007A          PT, moderate level
47006C          acid enema
47013C          urinary catheterization
47014C          indwelling urinary catheterization
47017C          nasogastric intubation
47018C          daily nasogastric feeding
47031C          endotracheal intubation
47041C          airway suctioning
47042C          daily airway suctioning
48011C          dressing change, small wound
A000480209      epinephrine
A0016461G0      diazepam
A004951209      atropine
A009633266      mannitol
A0103581G0      diltiazem
A011552277      dextrose
A0150781G0      diphenidol
A017712209      cefazolin
A022274100      loperamide
A022682265      sodium chloride, potassium acetate, sodium acetate, magnesium chloride, hexahydrate, dextrose, potassium biphosphate
A037697100      sennoside A+B
A039385100      methocarbamol
A039604277      sodium chloride
A042435100      benzonatate
A044136100      cilostazol
A0449771G0      aluminum dihydroxyallantoinate
AC195031G0      diphenidol
B009554100      digoxin
B011315100      flupentixol, melitracen
B014861216      amiodarone
B015398265      dopamine
B017530212      ranitidine
B019457216      midazolam
B021914100      losartan
B022013219      norepinephrine
B022559199      polystyrene sulfonate
B023306245      pantoprazole
B023619100      aspirin
B023728255      levofloxacin
B024720212      famotidine
K000739299      insulin
N000817100      phenytoin
N004463265      dextrose
N012916100      meclizine
OT2             OT - passive ROM
PTC1            PT - facilitation techniques
PTC6            PT - ambulation training
PTM5            PT - passive ROM
PTM8            PT - tilting table training
ST1             ST - auditory comprehension training
ED, emergency department; ICU, intensive care unit; IV, intravenous; OT, occupational therapy; PT, physical therapy; ROM, range of motion; ST, speech therapy.
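Supplemental Table 2. Features determined by the 3-step feature selection procedure and their corresponding billing codes.
Feature: Airway suctioning
  Billing codes identified in the derivation cohort: 47041C, 47042C
  Similar billing codes in Taiwan’s National Health Insurance fee schedule: (none)
Feature: Bacterial sensitivity test
  Billing codes identified in the derivation cohort: 13009B
  Similar billing codes in Taiwan’s National Health Insurance fee schedule: 13010B, 13011B
Feature: General ward stay
  Billing codes identified in the derivation cohort: 03002A
  Similar billing codes in Taiwan’s National Health Insurance fee schedule: 03001K, 03005K, 03006A, 03008B, 03002AB, 02006K, 02007A, 02008B, 03026K, 03027A
Feature: ICU stay
  Billing codes identified in the derivation cohort: 02012A
  Similar billing codes in Taiwan’s National Health Insurance fee schedule: 02011K, 02013B, 03010E, 03011F, 03012G, 03047E, 03048F, 03049G
Feature: Nasogastric intubation
  Billing codes identified in the derivation cohort: 47017C, 47018C
  Similar billing codes in Taiwan’s National Health Insurance fee schedule: (none)
Feature: Osmotherapy (mannitol or glycerol infusion)
  Billing codes identified in the derivation cohort: A009633266
  Similar billing codes in Taiwan’s National Health Insurance fee schedule: A009633255, A009633277, A009745277, A013354277, A015561255, A015561266, A015561277, A016476238, A016476266, A016476277, A031387238, A033425266, A042601238, B014379277, B020322265, B020322277, N012343266, A023733263, A023733265, A023733266, A023733277, A024986209, A024986265, A024986266, A024986277, A025104266, A025104277, A025355266, A025355277, A026793265, A026793266, A026793277, A028475265, A028475277, A029475265, A029475266, A029475277, A034722277, AC23733263, AC23733266, AC23733277, AC24986265, AC24986266, AC24986277, AC28475277, AC29475265, B006604277, B017082263, B017082277, B017728265, B017728277
Feature: Urinary catheterization
  Billing codes identified in the derivation cohort: 47013C, 47014C
  Similar billing codes in Taiwan’s National Health Insurance fee schedule: (none)
ICU, intensive care unit.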
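The mapping in Supplemental Table 2 can be read as a lookup from billing codes to binary features. The sketch below shows one possible way to derive the seven features from the billing codes of a single hospitalization. It is an illustration only: it lists just the codes identified in the derivation cohort (the similar fee-schedule codes could be added to each set in the same way), and the names `FEATURE_CODES` and `extract_features` are hypothetical.

```python
# Billing codes identified in the derivation cohort (Supplemental Table 2).
# Similar codes from the fee schedule are omitted here for brevity and could
# be added to the corresponding sets. Feature names are hypothetical labels.
FEATURE_CODES = {
    "airway_suctioning": {"47041C", "47042C"},
    "bacterial_sensitivity_test": {"13009B"},
    "general_ward_stay": {"03002A"},
    "icu_stay": {"02012A"},
    "nasogastric_intubation": {"47017C", "47018C"},
    "osmotherapy": {"A009633266"},
    "urinary_catheterization": {"47013C", "47014C"},
}


def extract_features(billing_codes):
    """Map the billing codes of one hospitalization to seven 0/1 features.

    A feature is 1 when at least one of its codes was billed, otherwise 0.
    Codes are normalized by stripping whitespace and upper-casing
    (an assumption about how raw claims might be formatted).
    """
    billed = {code.strip().upper() for code in billing_codes}
    return {
        name: int(bool(billed & codes))
        for name, codes in FEATURE_CODES.items()
    }


if __name__ == "__main__":
    # Hypothetical claim: general ward stay, nasogastric intubation,
    # and a blood glucose test (09005C), which is not one of the features.
    print(extract_features(["03002A", "47017C", "09005C"]))
```

Codes that do not belong to any of the seven features, such as 09005C (blood glucose), are simply ignored.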
Supplemental Table 3. Distribution of features among the derivation and validation cohorts.
                                               Derivation cohort    Validation cohorts
Feature                                        CYCH (n = 3,577)     CMMC (n = 3,816)    NCKUH (n = 1,563)    LH (n = 962)    TSLH (n = 276)
Airway suctioning                              664 (18.6)           551 (14.4)b         258 (16.5)           117 (12.2)b     29 (10.5)b
Bacterial sensitivity test                     327 (9.1)            643 (16.9)b         121 (7.7)            50 (5.2)b       27 (9.8)
General ward stay                              3,465 (96.9)         3,746 (98.2)b       1,551 (99.2)b        932 (96.9)      276 (100)b
ICU stay                                       846 (23.7)           387 (10.1)b         129 (8.3)b           144 (15.0)b     23 (8.3)b
Nasogastric intubation                         1,012 (28.3)         848 (22.2)b         450 (28.8)           232 (24.1)a     42 (15.2)b
Osmotherapy (mannitol or glycerol infusion)    357 (10.0)           187 (4.9)b          116 (7.4)a           35 (3.6)b       6 (2.2)b
Urinary catheterization                        865 (24.2)           775 (20.3)b         357 (22.8)           194 (20.2)b     36 (13.0)b
Data are numbers (percentage).
a P < 0.05 as compared with the derivation cohort; b P < 0.01.
ICU, intensive care unit.
Supplemental Table 4. Performance of various models evaluated by using the Pearson correlation coefficients between the stroke severity indices and the real NIHSS scores.
         Derivation cohort      Validation cohorts
Model    CYCH                   CMMC                   NCKUH                  LH                     TSLH
KNN      0.743 (0.737-0.749)    0.700 (0.684-0.716)    0.677 (0.649-0.703)    0.706 (0.673-0.737)    0.698 (0.632-0.754)
RT       0.737 (0.731-0.742)    0.699 (0.682-0.715)    0.677 (0.650-0.703)    0.706 (0.673-0.736)    0.697 (0.631-0.753)
MLR      0.742 (0.736-0.747)    0.708 (0.692-0.724)    0.691 (0.664-0.716)    0.725 (0.694-0.754)    0.705 (0.641-0.760)
Data are the Pearson correlation coefficients (95% confidence interval).
KNN, k-nearest neighbor; MLR, multiple linear regression; RT, regression tree.
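For readers who wish to compute comparable statistics on their own data, the sketch below calculates a Pearson correlation coefficient and an approximate 95% confidence interval using the Fisher z-transformation. This text does not state how the intervals in Supplemental Table 4 were obtained, so the Fisher method is an assumption of this sketch, and the function names are hypothetical. For the validation cohorts the approximation gives intervals close to the tabulated ones (for example, r = 0.700 with n = 3,816 gives roughly 0.683 to 0.716), whereas the derivation-cohort intervals in the table are narrower than this method would produce and were presumably computed differently.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def fisher_confidence_interval(r, n, z_critical=1.959964):
    """Approximate confidence interval for r via the Fisher z-transformation.

    The default z_critical corresponds to a two-sided 95% interval.
    """
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("requires n > 3 and -1 < r < 1")
    z = math.atanh(r)
    half_width = z_critical / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


if __name__ == "__main__":
    # Correlation and sample size taken from the KNN/CMMC cell of the table.
    low, high = fisher_confidence_interval(0.700, 3816)
    print(f"{low:.3f}-{high:.3f}")
```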