Accepted Manuscript

Developing a stroke severity index based on administrative data was feasible using data mining techniques

Sheng-Feng Sung, MD, Cheng-Yang Hsieh, MD, Yea-Huei Kao Yang, BPharm, Huey-Juan Lin, MD, MPH, Chih-Hung Chen, MD, Yu-Wei Chen, MD, Ya-Han Hu, PhD

PII: S0895-4356(15)00017-7
DOI: 10.1016/j.jclinepi.2015.01.009
Reference: JCE 8793
To appear in: Journal of Clinical Epidemiology
Received Date: 17 September 2014
Revised Date: 16 December 2014
Accepted Date: 16 January 2015

Please cite this article as: Sung S-F, Hsieh C-Y, Kao Yang Y-H, Lin H-J, Chen C-H, Chen Y-W, Hu Y-H, Developing a stroke severity index based on administrative data was feasible using data mining techniques, Journal of Clinical Epidemiology (2015), doi: 10.1016/j.jclinepi.2015.01.009. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Title
Developing a stroke severity index based on administrative data was feasible using data mining techniques

Authors
Sheng-Feng Sung, MD
Division of Neurology, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan

Cheng-Yang Hsieh, MD
Department of Neurology, Tainan Sin Lau Hospital, Tainan, Taiwan
Institute of Clinical Pharmacy and Pharmaceutical Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan

Yea-Huei Kao Yang, BPharm
Institute of Clinical Pharmacy and Pharmaceutical Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan

Huey-Juan Lin, MD, MPH
Department of Neurology, Chi Mei Medical Center, Tainan, Taiwan

Chih-Hung Chen, MD
Department of Neurology, College of Medicine, National Cheng Kung University, Tainan, Taiwan

Yu-Wei Chen, MD
Department of Neurology, Landseed Hospital, Tao-Yuan County, Taiwan
Department of Neurology, National Taiwan University Hospital, Taipei, Taiwan

Ya-Han Hu, PhD
Department of Information Management and Institute of Healthcare Information Management, National Chung Cheng University, Chiayi County, Taiwan

* Sheng-Feng Sung and Cheng-Yang Hsieh contributed equally to this work.


Corresponding author & contact information
Ya-Han Hu, PhD
Department of Information Management and Institute of Healthcare Information Management, National Chung Cheng University, 168 University Road, Min-Hsiung, Chiayi County 62102, Taiwan
Tel: +886-5-2720411 Ext 34613; Fax: +886-5-2721501

Abstract

Objective: Case-mix adjustment is difficult for stroke outcome studies using administrative data. However, relevant prescription, laboratory, procedure, and service claims might be surrogates for stroke severity. This study proposes a method for developing a stroke severity index (SSI) by using administrative data.

Study Design and Setting: We identified 3,577 patients with acute ischemic stroke from a hospital-based registry and analyzed their claims data, which contained a large number of candidate features. Stroke severity was measured using the National Institutes of Health Stroke Scale (NIHSS). We used two data mining methods and conventional multiple linear regression to develop prediction models, comparing the model performance according to the Pearson correlation coefficient between the SSI and the NIHSS. We validated these models in four independent cohorts by using hospital-based registry data linked to a nationwide administrative database.

Results: We identified seven predictive features and developed three models. The k-nearest neighbor model (correlation coefficient, 0.743; 95% confidence interval, 0.737-0.749) performed slightly better than the multiple linear regression model (0.742; 0.736-0.747), followed by the regression tree model (0.737; 0.731-0.742). In the validation cohorts, the correlation coefficients were between 0.677 and 0.725 for all three models.

Conclusion: The claims-based SSI enables adjusting for disease severity in stroke studies using administrative data.

Keywords
Acute ischemic stroke, disease severity, administrative data, data mining, prediction model, outcomes research

Running title
Stroke severity index for administrative data research

Word count
4067

What is new?

Key findings

This study identified seven predictive features from administrative claims data. It also developed and validated three models for predicting the neurological deficit severity of patients hospitalized for acute ischemic stroke.

The stroke severity indices provided by applying these models were highly correlated with clinical stroke severity, as measured using the National Institutes of Health Stroke Scale.

What this adds to what is known?

Although administrative data typically contain inadequate clinical information, stroke severity for each patient can be easily determined from hospitalization claims.

What is the implication, what should change now?

Stroke outcome studies using claims data have often lacked adequate adjustment for disease severity. Researchers may use the stroke severity indices presented herein to adjust for disease severity in future ischemic stroke studies using Taiwan’s National Health Insurance Research Database.

Replication of this study’s approach to developing stroke severity indices in other large administrative databases that contain inadequate clinical information is encouraged.

1. Introduction

Stroke is among the leading causes of death, disability, and hospitalizations and is ranked third in disease burden worldwide [1]. Research on stroke outcomes is essential for both clinical care and policy development. Numerous stroke outcome studies have been conducted on the basis of administrative data [2–4]. However, because clinical information was insufficient [5], these studies share the limitation of inadequate adjustment for case-mix or stroke severity. This limitation is particularly critical and might diminish the value of the research findings; stroke is heterogeneous, and thus, stroke severity varies greatly among patients. Nevertheless, administrative claims data reflect routine clinical practice and might contain information that is highly associated with stroke severity.

Most case-mix adjustment models for stroke comprise markers of stroke severity, prestroke function, and comorbidities [6]. Inadequate adjustment for patient case-mix has been underscored as a major factor limiting the comparisons of patient outcomes between health care providers [7]. A previous study that used the administrative claims data for Medicare beneficiaries with acute ischemic stroke (AIS) found that a hospital risk model for 30-day mortality based on claims data alone, without adjustment for stroke severity, had substantially worse discrimination than a model that adjusted for stroke severity using the National Institutes of Health Stroke Scale (NIHSS), obtained through a linkage with the Get With The Guidelines–Stroke registry [8]. Because stroke severity is required for risk adjustment of ischemic stroke outcomes, and because standardized stroke scales are excluded from claims data and are not routinely recorded in clinical practice at all hospitals, exploring valid surrogates for stroke severity merits high priority [9].

Most neurological and medical complications that develop in patients with AIS, from within 48 hours up to a few weeks in the course of stroke progression, correlate with stroke severity [10,11]. Managing and treating these complications are essential in stroke care. For example, stroke patients with dysphagia generally require the placement of a nasogastric tube for feeding [10], and mannitol osmotherapy is recommended for treating stroke patients with brain edema [11]. Therefore, relevant prescription, laboratory, procedure, and service claims in administrative data might be potential surrogates for stroke severity. Nevertheless, prioritizing surrogates from numerous candidate attributes in claims data is challenging.

Research using administrative data has the advantages of describing a large sample of geographically dispersed patients with serial longitudinal data, relative efficiency, and the potential to avoid the possible selection bias inherent in institution-specific studies [12,13]. Administrative data are thus frequently used to conduct population-based studies or performance monitoring research. Taiwan’s National Health Insurance (NHI) is a single-payer, compulsory enrollment health care program launched on March 1, 1995. The NHI program covers virtually the entire population in Taiwan [14] and provides universal coverage for hospital and physician services, thus creating a large repository of health administrative data. To test the hypothesis that stroke severity could be determined according to the patterns of hospitalization claims in the NHI data set, we combined data mining techniques with conventional statistical methods to develop stroke severity models from claims data in a single hospital. Stroke severity was represented as the admission NIHSS score in the hospital-based stroke registry. We then validated the models in four independent cohorts of stroke patients using the claims data of the NHI linked to the individual hospital-based registry.

2. Methods

2.1 Derivation cohort

We identified patients with AIS from the stroke registry of Chia-Yi Christian Hospital, which prospectively registered all stroke patients admitted within ten days of symptom onset according to the design of the nationwide Taiwan Stroke Registry [15]. Ischemic stroke is defined as an acute onset of neurologic deficits persisting longer than 24 hours with no hemorrhage on the first brain computed tomography or with acute corresponding ischemic lesion(s) on diffusion weighted magnetic resonance imaging. The study hospital is a 1,000-bed regional hospital serving a city and its adjoining rural area of approximately 1,000,000 inhabitants. The Chia-Yi Christian Hospital Institutional Review Board approved the study protocol.

We included patients aged 18 years or older who were admitted between September 2007 and December 2013. Patients with in-hospital stroke were excluded. Stroke severity, as assessed by the NIHSS during patient admission, was obtained from the stroke registry. From the hospital data warehouse, we collected each patient’s claims that had been submitted to the NHI, including billing data for medications, laboratory tests, imaging studies, procedures, clinical services, and supplies. Patients whose claims data were missing were excluded from the study.

2.2 Feature selection

All of the billing codes related to medications, laboratory tests, imaging studies, procedures, and clinical services from patient claims data were considered potential predictive features. For a given patient, a feature (billing code) was scored present (1) or absent (0) according to whether the billing code was listed on his/her claims, thus forming a large binary feature set. A three-step feature selection process was then used. The first step involved a frequency cutoff operation in which a feature was filtered out if it appeared (i.e., was coded as 1) in less than 1% or more than 99% of the patient claims. A correlation-based feature selection (CFS) method was used in the second step to evaluate the correlations between the feature subsets and the dependent variable. Under this method, optimal feature subsets contain features that are highly correlated with the outcome, yet uncorrelated with each other [16]. In this study, the Weka 3.6.11 open-source data mining software (www.cs.waikato.ac.nz/ml/weka) was used to perform the CFS procedure. A subset of features that were both highly correlated with the outcome (NIHSS score) and weakly correlated with each other was identified using the CfsSubsetEval module with the greedy stepwise algorithm implemented in Weka. The third step was performed manually based on expert opinion. A panel of stroke neurologists (SFS, CYH, HJL, CHC, and YWC) determined the final subset of features by consensus. Redundant features and those with low face validity were eliminated, whereas features with similar properties were aggregated.

2.3 Data mining techniques for regression

The k-nearest neighbor (KNN) algorithm, one of the most fundamental supervised learning algorithms, is a nonparametric method for predicting the outcome of an instance based on its nearest instances (neighbors) in a training set. Each instance is represented as a vector in a multidimensional feature space, and the distance between two vectors can be easily measured. Generally, the distance measures for continuous and categorical variables are the Euclidean and Hamming distances, respectively. Given a testing instance t and a user-specified constant k, the KNN algorithm computes the distance between t and every training instance s_i, denoted as dist(t, s_i), and then selects the k nearest instances, i.e., those with the lowest dist(t, s_i), as neighbors for t. The weight of each neighbor s_i for the testing instance t, denoted as w_i, is calculated using w_i = 1/dist(t, s_i). Finally, the estimated outcome of t is determined according to the weighted or unweighted average of the outcomes of s_i (i = 1, ..., k).
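The KNN prediction step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Weka implementation used in the study: the function names (`hamming`, `knn_predict`), the toy binary feature vectors, and the toy outcome values are hypothetical. The handling of exact matches (distance 0) in the weighted variant is also an assumption, added only to avoid division by zero in w_i = 1/dist(t, s_i).

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two binary feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_x: List[Sequence[int]], train_y: List[float],
                t: Sequence[int], k: int, weighted: bool = False) -> float:
    """Estimate the outcome of test instance t from its k nearest neighbors."""
    # Sort training indices by distance to t (the sort is stable, so ties
    # keep their original training order), then keep the k nearest.
    order = sorted(range(len(train_x)), key=lambda i: hamming(t, train_x[i]))
    neighbors = order[:k]
    if not weighted:
        return sum(train_y[i] for i in neighbors) / len(neighbors)
    dists = [hamming(t, train_x[i]) for i in neighbors]
    # Assumption: if any neighbor matches t exactly, average those matches,
    # because w_i = 1/dist is undefined at distance 0.
    exact = [train_y[i] for i, d in zip(neighbors, dists) if d == 0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / d for d in dists]
    return sum(w * train_y[i] for w, i in zip(weights, neighbors)) / sum(weights)


# Hypothetical toy data: three binary features, made-up outcome values.
train_x = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
train_y = [2.0, 4.0, 10.0, 20.0]
print(knn_predict(train_x, train_y, (1, 1, 0), k=3))
```

With binary features, many training instances share the same distance to t, so the tie-breaking rule affects which neighbors are chosen; the stable sort used here is only one possible convention.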

The regression tree (RT), a decision tree algorithm, is a hierarchical structure consisting of branches and nodes. Each internal node represents a selected predictor, and each branch of an internal node represents a subset of predictor values. Each instance falls into one leaf node at the end of the tree. Each leaf node can be considered a set of instances satisfying a specific set of decision rules in the tree. Generally, the decision tree is easy to interpret and the rules in a decision tree can be applied at the bedside. Numerous decision tree-based supervised learning algorithms have been developed. To generate a RT, the algorithm recursively performs a binary split for a given set of instances based on different predictor values. The optimal split is determined over all predictors at all possible split points according to the reduction in variance. The aforementioned process is recursively applied to each child node after each split. The tree growing process continues until the variance can no longer be reduced or another stopping criterion, such as the minimum number of instances per leaf or the maximum tree depth, is met. Each leaf stores a predicted value, namely the average outcome of the instances that reach the leaf.

2.4 Model development

The NIHSS is a 15-item neurological examination scale used to assess neurological impairment in stroke patients. Interrater reliability and test–retest reliability for the scale are high, and the scale validity was demonstrated by its strong correlations with patient infarct size on computed tomography obtained one week after the stroke and patient functional outcome assessed three months after the stroke [17]. The NIHSS has also been shown to be valid for patients treated with intravenous thrombolysis because it corresponded closely to other measures of clinical outcomes [18]. The total NIHSS scores range from 0 to 42, with higher values representing greater stroke severity. Because the distribution of the NIHSS scores was skewed, we used nonparametric methods of data mining to develop the models.

The KNN and RT algorithms implemented in Weka, i.e., the IBK module and the REPTree module, were used to train models that predict the NIHSS score, which was treated as a continuous variable. Because the internal parameter settings substantially influence the prediction performance of the two algorithms, the CVParameterSelection metalearner module implemented in Weka was used to optimize the parameters. A base regressor was selected and an arbitrary number of parameter combinations was defined in the CVParameterSelection metalearner module. This metalearner automatically executed the base regressor with all of the defined parameter combinations and determined the optimal parameter settings based on the best prediction results using cross-validation. For the IBK module, we used the unweighted average of the nearest neighbors to determine the value of the testing instance, and the number of nearest neighbors, k, was set to 11 by using the CVParameterSelection module. The REPTree module creates a RT by using variance reduction as the splitting criterion and prunes the tree using the reduced-error technique. The minimum total number of instances in a leaf was set to 11 by using the CVParameterSelection metalearner module. Finally, a conventional multiple linear regression (MLR) model was built using Weka’s LinearRegression module, and all of the features from the final subset were entered into the model.
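The variance-reduction splitting criterion mentioned above can be illustrated with a short Python sketch. It is not the REPTree implementation: it finds only a single best split on binary features and omits recursion and reduced-error pruning. The function names (`sse`, `best_binary_split`) and the toy data are hypothetical. The sketch ranks splits by the reduction in the sum of squared errors, which for a fixed parent node orders candidate splits the same way as the size-weighted reduction in variance.

```python
from typing import List, Optional, Sequence, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_binary_split(x: List[Sequence[int]], y: List[float],
                      min_leaf: int = 1) -> Optional[Tuple[int, float]]:
    """Return (feature index, SSE reduction) of the best split, or None.

    A split on feature j sends instances with x[j] == 0 to the left child
    and instances with x[j] == 1 to the right child. Splits that would
    leave fewer than min_leaf instances in a child are skipped.
    """
    parent = sse(y)
    best: Optional[Tuple[int, float]] = None
    for j in range(len(x[0])):
        left = [yi for xi, yi in zip(x, y) if xi[j] == 0]
        right = [yi for xi, yi in zip(x, y) if xi[j] == 1]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        gain = parent - (sse(left) + sse(right))
        if best is None or gain > best[1]:
            best = (j, gain)
    return best


# Hypothetical toy data: feature 0 separates low from high outcomes.
toy_x = [(0, 0), (0, 1), (1, 0), (1, 1)]
toy_y = [2.0, 3.0, 12.0, 13.0]
print(best_binary_split(toy_x, toy_y))
```

A full tree would apply `best_binary_split` recursively to each child and store the mean outcome in each leaf; the `min_leaf` argument plays the role of the minimum number of instances per leaf described in the text.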

2.5 Validation cohorts

The stroke severity index models were externally validated by using patients in stroke registries from four hospitals of varying size. The Chi Mei Medical Center (CMMC) is a medical center with 1,300 beds, and the National Cheng Kung University Hospital (NCKUH) is a university-affiliated medical center with 1,200 beds. The Landseed Hospital (LH) and the Tainan Sin Lau Hospital (TSLH) are regional hospitals with 600 and 450 beds, respectively. Patients aged 18 years or older with a discharge diagnosis of AIS were included and those with in-hospital stroke were excluded. To assure the anonymity of patients, we retrieved only the sex, date of birth, date of admission, date of discharge, and NIHSS score at admission of each patient from the registry databases.

The National Health Insurance Research Database (NHIRD), derived from the NHI claims data, is maintained and made available for research by the National Health Research Institutes of Taiwan. The NHIRD contains the medical care utilization records of the NHI beneficiaries and enables population-based research. The population of patients with ischemic stroke in the NHIRD was identified by extracting data on all hospitalizations for which a diagnosis of ischemic stroke was recorded (International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM] diagnosis code 433.x or 434.x) from a stroke-specific data set of the NHIRD between 2006 and 2010. Because hospital and patient identifiers are encrypted in the NHIRD, the registry data were linked to the NHIRD based on four non-unique patient characteristics: sex, date of birth, date of admission, and date of discharge [19,20]. A deviation of ±1 day for the date of admission and date of discharge was allowed [21]. After establishing a hospital “crosswalk” by matching hospitalizations from each validation hospital’s registry with the NHIRD hospitalizations using the four patient characteristics, the NHIRD hospital with the highest number of matching records for a given validation hospital was assumed to be the correct link for that validation hospital [19]. Successfully linked cases were included in the validation cohorts.
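The linkage rule described above (exact match on sex and date of birth, with a tolerance of ±1 day on the admission and discharge dates) might be sketched as follows. This is a simplified illustration, not the authors’ linkage code: the `Stay` record, the function names, and the toy records are hypothetical, and keeping only registry records with exactly one candidate match is an assumption, because the text does not state how ambiguous matches were handled.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Stay:
    sex: str
    birth: date
    admit: date
    discharge: date
    hospital: str = ""  # encrypted hospital identifier (claims side only)


def is_match(reg: Stay, claim: Stay, tol_days: int = 1) -> bool:
    """Exact match on sex and birth date; dates of stay may differ by tol_days."""
    tol = timedelta(days=tol_days)
    return (reg.sex == claim.sex
            and reg.birth == claim.birth
            and abs(reg.admit - claim.admit) <= tol
            and abs(reg.discharge - claim.discharge) <= tol)


def link(registry: List[Stay], claims: List[Stay]) -> Dict[int, int]:
    """Map registry index to claim index for unambiguous matches only."""
    links: Dict[int, int] = {}
    for i, reg in enumerate(registry):
        hits = [j for j, claim in enumerate(claims) if is_match(reg, claim)]
        if len(hits) == 1:  # assumption: drop ambiguous or unmatched records
            links[i] = hits[0]
    return links


def crosswalk(registry: List[Stay], claims: List[Stay]) -> Optional[str]:
    """Encrypted hospital identifier with the most linked hospitalizations."""
    counts = Counter(claims[j].hospital for j in link(registry, claims).values())
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical toy records.
registry = [
    Stay("F", date(1940, 5, 1), date(2009, 3, 10), date(2009, 3, 18)),
    Stay("M", date(1952, 11, 23), date(2010, 7, 2), date(2010, 7, 9)),
]
claims = [
    Stay("F", date(1940, 5, 1), date(2009, 3, 11), date(2009, 3, 18), "H_x"),
    Stay("M", date(1952, 11, 23), date(2010, 7, 2), date(2010, 7, 8), "H_x"),
    Stay("M", date(1952, 11, 23), date(2010, 7, 5), date(2010, 7, 9), "H_y"),
]
print(link(registry, claims), crosswalk(registry, claims))
```

The nested loop is quadratic in the number of records; for a nationwide database, indexing the claims by sex and date of birth first would likely be needed.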

For each case in the validation cohorts, we obtained billing data for medications, laboratory tests, imaging studies, procedures, and clinical services from the hospitalization claims files. We determined the presence or absence of each predictive feature identified during the aforementioned feature selection process according to the billing codes listed on patient claims, and computed the predicted stroke severity indices by using the prediction models.

2.6 Model evaluation and statistical analysis

All models were trained to minimize the mean square error between the predictions (stroke severity indices) and targets (NIHSS scores). The Pearson correlation coefficients between the predictions and targets were reported. A tenfold cross-validation was used to evaluate the internal validity of these models. This validation strategy partitions the original data set into ten subsamples, reserving each one in turn for testing and using the remaining nine as training data. In this way, it ensures that the model is tested on unobserved data, thus mitigating the chance of overfitting. By varying the seed values for the random splitting of the original data set, the tenfold cross-validation process was repeated ten times and the results were averaged over the 100 test subsamples to produce a single estimate of the Pearson correlation coefficient. Paired t-tests were used to compare the mean correlation coefficients between the prediction models. The generalizability of the models was tested by assessing their performance in the validation cohorts, and the Pearson correlation coefficients between the predicted stroke severity indices and the NIHSS scores were reported. The Stata procedure corcor, which compares dependent correlation coefficients, was used to compare the performance of the models in the validation cohorts [22].
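The repeated tenfold cross-validation described above can be sketched in plain Python. This is an illustrative outline, not the Weka procedure used in the study: the function names, the `fit_predict` callback signature, and the use of the repeat index as the random seed are all assumptions.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kfold_indices(n: int, k: int, seed: int) -> List[List[int]]:
    """Shuffle the indices 0..n-1 with the given seed and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv_correlation(
    x: List, y: List[float],
    fit_predict: Callable[[List, List[float], List], List[float]],
    k: int = 10, repeats: int = 10,
) -> Tuple[float, List[float]]:
    """Mean test-fold Pearson r over `repeats` runs of k-fold cross-validation."""
    rs: List[float] = []
    for seed in range(repeats):  # a different seed gives a different partition
        for test_idx in kfold_indices(len(x), k, seed):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(x)) if i not in held_out]
            preds = fit_predict([x[i] for i in train_idx],
                                [y[i] for i in train_idx],
                                [x[i] for i in test_idx])
            rs.append(pearson(preds, [y[i] for i in test_idx]))
    return sum(rs) / len(rs), rs
```

Here `fit_predict(train_x, train_y, test_x)` stands for any of the three models: it trains on the nine training folds and returns predictions for the held-out fold. With k = 10 and ten repeats, `rs` holds the 100 per-fold coefficients that are averaged into the single estimate. Note that `pearson` is undefined when a fold's predictions or targets are constant.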

General statistical analyses were performed using Stata 13.1 (StataCorp, College Station, Texas). Continuous variables were summarized as mean ± standard deviation or median (interquartile range), and categorical variables as counts and percentages. Chi-squared tests were used to compare categorical variables, and t-tests or Mann–Whitney U tests were used for continuous variables, as appropriate. Two-tailed P < 0.05 was considered statistically significant.

3. Results

3.1 Patient characteristics

We enrolled 3,577 patients in the derivation cohort after excluding 105 and 16 patients with in-hospital stroke and missing claims data, respectively. Table 1 shows the characteristics of the patients. Approximately one-third of the patients had a prior stroke. The NIHSS scores at admission ranged from 0 to 40, with a median of 5. Nearly 88% of the patients were discharged within two weeks, with a median length of stay of seven days.

A total of 4,438, 1,694, 1,148, and 318 patients from the stroke registries of four validation hospitals were eligible for this study. After linking to the NHIRD, 3,816 (CMMC), 1,563 (NCKUH), 962 (LH), and 276 (TSLH) patients were included in the validation cohorts (Table 2). The median NIHSS score of the derivation cohort was significantly different from those in the CMMC and TSLH cohorts. The median length of stay of the derivation cohort also differed from those in the CMMC, LH, and TSLH cohorts.

<insert Table 2. Characteristics of the patients in the validation cohorts.>

3.2 Feature selection results

Of the 1,634 features retrieved from the claims data of the derivation cohort, 494 passed the frequency cutoff filter. Correlation-based feature selection then identified a subset of 67 features (see Supplemental Table 1), which was manually reviewed and reduced to a subset of seven features (Table 3; see also Supplemental Table 2). Although half of the features (billing codes) selected by the correlation-based feature selection represented medications, all medications except mannitol were excluded from the final subset because the use of medications is complex and may vary considerably between physicians and hospitals. In addition, rehabilitation service claims were eliminated because the use of rehabilitation may be affected by the varied availability of rehabilitation providers among different hospitals and the significant disparity in rehabilitation use between neurologists and non-neurologists [23]. The distribution of the features differed among the derivation and validation cohorts (see Supplemental Table 3).
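The frequency cutoff step from the Methods, which keeps a billing code only if it appears in at least 1% and at most 99% of patients, can be sketched as follows. The function names, the representation of each patient's claims as a set of billing codes, and the toy codes (`CODE_A` to `CODE_D`) are hypothetical; the correlation-based selection and the expert review that followed are not reproduced here.

```python
from collections import Counter
from typing import List, Set


def frequency_filter(claims: List[Set[str]], low: float = 0.01,
                     high: float = 0.99) -> List[str]:
    """Keep billing codes present in at least `low` and at most `high` of patients."""
    n = len(claims)
    counts = Counter(code for patient in claims for code in patient)
    return sorted(code for code, m in counts.items() if low <= m / n <= high)


def binary_matrix(claims: List[Set[str]], codes: List[str]) -> List[List[int]]:
    """Score each retained code as present (1) or absent (0) for each patient."""
    return [[1 if code in patient else 0 for code in codes] for patient in claims]


# Hypothetical toy data: 200 patients.
toy_claims: List[Set[str]] = [{"CODE_A"} for _ in range(200)]  # in 100% of patients
toy_claims[0].add("CODE_B")                                    # in 0.5% of patients
for i in range(50):
    toy_claims[i].add("CODE_C")                                # in 25% of patients
for i in range(4):
    toy_claims[i].add("CODE_D")                                # in 2% of patients

kept = frequency_filter(toy_claims)
features = binary_matrix(toy_claims, kept)
print(kept, features[0], features[199])
```

In this toy example the code billed for every patient and the code billed for a single patient are both dropped, because neither can discriminate between patients with mild and severe strokes.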


3.3 Evaluation of models

In the derivation cohort, the Pearson correlation coefficients between the stroke severity indices and the NIHSS scores were 0.743 (95% confidence interval [CI], 0.737-0.749) for the KNN model, 0.737 (95% CI, 0.731-0.742) for the RT model, and 0.742 (95% CI, 0.736-0.747) for the MLR model. Based on comparing the mean Pearson correlation coefficients with the paired t-test, the KNN model performed slightly better than did the MLR model (P = 0.046), whereas the RT model performed worse than did the KNN model (P < 0.001) and MLR model (P < 0.001). In the validation cohorts, the stroke severity indices and NIHSS scores exhibited strong correlations (Pearson correlation coefficients were between 0.677 and 0.725) for all models (Fig. 1 and see Supplemental Table 4). The MLR model exhibited higher Pearson correlation coefficients (P < 0.001 for CMMC, P < 0.001 for NCKUH, P < 0.001 for LH, and P = 0.272 for TSLH) than the KNN model, and the KNN model’s performance was similar to that of the RT model (P = 0.474 for CMMC, P = 0.848 for NCKUH, P = 0.937 for LH, and P = 0.917 for TSLH).




4. Discussion

We developed and validated three models that predicted the NIHSS score at admission for AIS patients. These models might provide an index to represent the stroke severity of patients for research using administrative data that typically lack detailed clinical information about disease severity. Among the models, the KNN model and the MLR model performed almost equally well on the internal validation in terms of the correlation between the stroke severity indices and the clinical NIHSS scores. During the external validation, the MLR model outperformed the other two models.

Although the three models performed reasonably well, each model has its own advantages and limitations. The KNN algorithm is analogous to clinical reasoning and is probably readily accepted by clinicians [24]; however, it might be difficult to deploy and disseminate to other researchers. To overcome this problem, we set up a website (http://hdmlab.twbbs.org:508/SSI/hdmlab/ssi2.jsp) that enables computing the stroke severity indices online. The RT model is attractive for its transparency, and the tree structure can be converted into a set of classification rules (Fig. 2). The decision tree-based methods do not require specifications regarding the parametric nature of the relationship between the predictive features and target [25], and could facilitate the identification and interpretation of the interactions among the predictors [26]. Nevertheless, the decision tree could be too complex to be understood easily when a data set contains many predictive features [27]. The MLR model is probably the simplest to implement, and it can demonstrate the relative strength of the various features within the model. The stroke severity index can be easily obtained by using the regression equation (Table 4). Although the linear model is limited by the assumption that linear combinations of the predictive variables can effectively describe the behavior of the response, the MLR model seemed suitable and performed quite well in this study. This is perhaps because the selected predictors are binary and correlate with the stroke severity in a generally linear fashion.



... Institutes of Health Stroke Scale (NIHSS) score at admission by the features. ICU, intensive care ...



Several stroke scales, including the NIHSS [17], the Canadian Neurological Scale [28], and the Scandinavian Stroke Scale [29], have been designed for use in clinical trials. These scales generally correlate well with clinical outcomes and have been widely used in routine practice and stroke studies to evaluate stroke severity [30]. Of the scales, the NIHSS was determined to yield the most prognostic information [31]. However, standardized stroke scales are not recorded in administrative data.

Length of stay was used as a proxy for stroke severity and was associated with increased readmission risk in a study using administrative data [32]. However, this method is not ideal because length of stay typically increases with stroke severity for mild strokes and decreases with stroke severity for severe strokes [33]. Other studies have constructed proxy indicators to represent high stroke severity based on the ICD-9-CM diagnosis codes, ICD-9-CM procedure codes, and Current Procedural Terminology codes [23,34]. The commonly used indicators include use of mechanical ventilation, surgical procedures (such as gastrostomy, craniotomy, and tracheostomy), and neurological deficits (such as hemiplegia and hemiparesis, aphasia, and epilepsy). However, these approaches might be limited by the inaccurate coding and under-reporting of certain diagnoses or minor procedures [35,36]. Moreover, the relative weights of these proxy indicators were not understood; therefore, it was impossible to establish a composite index of stroke severity. Notably, these proxy indicators have not been validated.

Composite disease severity indices have been developed for administrative data research on other diseases. Ting et al [37] established a claims-based rheumatoid arthritis severity index comprising type and number of laboratory tests (inflammatory markers, chemistry panels, platelet counts) used, number of outpatient visits (rehabilitation and rheumatology), and the presence of a specific diagnosis (Felty’s syndrome). Chang et al [38] tested the predictive validity of a diabetes complications severity index incorporating seven diabetic complications based only on the ICD-9-CM codes from claims data. This index predicted the number of hospitalizations in the succeeding four years. Ananthakrishnan et al [39] extracted prespecified potential predictive variables, such as anemia, malnutrition, and requirement of blood transfusion, from up to 15 discharge ICD-9-CM diagnosis codes and 15 ICD-9-CM procedure codes in hospitalization records. After identifying independent predictors by multivariate logistic regression, they constructed a risk score to stratify the severity of Crohn’s disease hospitalizations.

In contrast to the aforementioned approaches, the proposed approach can be used to identify critical predictive features and develop models for estimating stroke severity by exploring detailed billing codes in claims data instead of using diagnosis and procedure codes; only up to five ICD-9-CM diagnosis codes and five ICD-9-CM procedure codes were recorded in each hospitalization claim submitted to the NHI. Therefore, by not using the ICD-9-CM diagnosis codes and procedure codes, we reduced the possibility of coding errors and under-coding, which could undermine the performance of prediction models. By contrast, the use of medications, procedures, diagnostic tests, and services was faithfully represented as billing codes in the claims data because Taiwan’s NHI provides universal coverage for hospitalizations, and stroke hospitalizations are reimbursed on a fee-for-service basis. By applying data mining techniques, we were able to effectively exploit the high-dimensional administrative data through feature selection algorithms. Without the feature selection process, developing a model from high-dimensional data, i.e., data with a large number of features, might be difficult using conventional regression techniques.

Our study has limitations. First, stroke patients were primarily managed by neurologists in the five participating hospitals and the diagnosis of AIS had been ascertained by stroke neurologists before patients entered into the stroke registry. Previous studies using the NHIRD have revealed that approximately 45% of stroke patients were admitted to a neurology service in Taiwan [23,32]. Because the stroke severity index we developed was based on the prescriptions, laboratory tests, procedures, and services during hospitalization, the differences in practice patterns between neurologists and non-neurologists might affect its performance. Second, we included patients with prior stroke who could have residual neurological deficits. Hence, the admission NIHSS score might not be fully accounted for by the stroke leading to the current admission. In addition, the NIHSS score was evaluated at admission, whereas the stroke severity index was based on data from the process of care during the entire hospital stay. Therefore, the stroke severity index should be considered a global measure of neurological deficit severity for each hospitalization.

5. Conclusion

Developing claims-based stroke severity indices is feasible by using data mining and statistical learning techniques, such as exploring billing codes in high-dimensional claims data, using a feature selection process to identify the optimal predictive features, and applying various regression methods to develop models for estimating stroke severity. Among the models, the KNN model outperformed the other two models in the derivation cohort, whereas the MLR model performed best in the validation cohorts. This study established a novel method by using NHIRD data to develop models for generating stroke severity indices, which represent proxy measures of neurological impairment that can be used for adjusting disease severity in ischemic stroke studies. This approach can be followed to develop stroke severity indices in other large administrative databases that typically lack clinical data. Through adjustment for disease severity, the stroke severity index can improve future stroke outcome studies using administrative data.
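As a rough illustration of the KNN idea mentioned above, the following sketch predicts a severity score as the mean score of the k most similar hospitalizations. It is only a sketch under stated assumptions: the Hamming distance, the value of k, the feature order, and the toy training data are all hypothetical choices made for this example and are not taken from the study.

```python
def hamming(a, b):
    """Number of positions at which two binary feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=3):
    """Predict a score as the mean of the k nearest training scores.

    Ties in distance are broken by training order, so results are deterministic.
    """
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training rows")
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: (hamming(train_x[i], query), i),
    )
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / k


# Toy data, invented for illustration only.
# Assumed feature order: (airway suctioning, ICU stay, nasogastric intubation).
train_x = [(0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 1, 1)]
train_y = [2, 3, 7, 12, 20, 24]  # hypothetical admission NIHSS scores

severe_estimate = knn_predict(train_x, train_y, (1, 1, 1), k=2)
mild_estimate = knn_predict(train_x, train_y, (0, 0, 0), k=3)
print(severe_estimate, mild_estimate)
```

With binary features, many hospitalizations share identical feature vectors, so the choice of tie-breaking rule and of k can matter more than it would with continuous predictors.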

Acknowledgments

This study is based in part on data from the National Health Insurance Research Database provided by the Bureau of National Health Insurance, Department of Health and managed by National Health Research Institutes. The interpretation and conclusions contained herein do not represent those of Bureau of National Health Insurance, Department of Health or National Health Research Institutes.

Sources of funding: This research was supported in part by the National Cheng Kung University (grant number NCKUH-10206008) and the National Science Council of the Republic of China (grant number NSC 102-2410-H-194-104-MY2).

References

[1] Lopez AD, Mathers CD, Ezzati M, Jamison DT, Murray CJL. Global and regional burden of disease and risk factors, 2001: systematic analysis of population health data. Lancet 2006;367:1747–57.
[2] Lichtman JH, Leifheit-Limson EC, Jones SB, Wang Y, Goldstein LB. 30-Day risk-standardized mortality and readmission rates after ischemic stroke in critical access hospitals. Stroke 2012;43:2741–2747.
[3] Lichtman JH, Jones SB, Wang Y, Leifheit-Limson EC, Goldstein LB. Seasonal variation in 30-day mortality after stroke: teaching versus nonteaching hospitals. Stroke 2013;44:531–533.
[4] Tamm A, Siddiqui M, Shuaib A, Butcher K, Jassal R, Muratoglu M, et al. Impact of stroke care unit on patient outcomes in a community hospital. Stroke 2014;45:211–216.
[5] Virnig BA, McBean M. Administrative data for public health surveillance and planning. Annu Rev Public Health 2001;22:213–230.
[6] Teale EA, Forster A, Munyombwe T, Young JB. A systematic review of case-mix adjustment models for stroke. Clin Rehabil 2012;26:771–86.
[7] Lilford R, Mohammed MA, Spiegelhalter D, Thomson R. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. Lancet 2004;363:1147–54.
[8] Fonarow GC, Pan W, Saver JL, Smith EE, Reeves MJ, Broderick JP, et al. Comparison of 30-day mortality models for profiling hospital performance in acute ischemic stroke with vs without adjustment for stroke severity. JAMA 2012;308:257–64.
[9] Katzan IL, Spertus J, Bettger JP, Bravata DM, Reeves MJ, Smith EE, et al. Risk adjustment of ischemic stroke outcomes for comparing hospital performance: a statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke 2014;45:918–44.
[10] Kumar S, Selim MH, Caplan LR. Medical complications after stroke. Lancet Neurol 2010;9:105–18.
[11] Balami JS, Chen R-L, Grunwald IQ, Buchan AM. Neurological complications of acute ischaemic stroke. Lancet Neurol 2011;10:357–71.
[12] Lohr KN. Use of insurance claims data in measuring quality of care. Int J Technol Assess Health Care 1990;6:263–71.
[13] Fisher ES, Whaley FS, Krushat WM, Malenka DJ, Fleming C, Baron JA, et al. The accuracy of Medicare's hospital claims data: progress has been made, but problems remain. Am J Public Health 1992;82:243–8.
[14] Davis K, Huang AT. Learning from Taiwan: experience with universal health insurance. Ann Intern Med 2008;148:313–4.
[15] Hsieh F-I, Lien L-M, Chen S-T, Bai C-H, Sun M-C, Tseng H-P, et al. Get With the Guidelines-Stroke performance indicators: surveillance of stroke care in the Taiwan Stroke Registry: Get With the Guidelines-Stroke in Taiwan. Circulation 2010;122:1116–23.
[16] Hall MA. Correlation-based feature selection for machine learning. PhD dissertation, The University of Waikato, 1999. http://www.cms.waikato.ac.nz/~ml/publications/1999/99MH-Thesis.pdf (accessed 24 August 2014).
[17] Brott T, Adams HP, Olinger CP, Marler JR, Barsan WG, Biller J, et al. Measurements of acute cerebral infarction: a clinical examination scale. Stroke 1989;20:864–70.
[18] Lyden P, Lu M, Jackson C, Marler J, Kothari R, Brott T, et al. Underlying structure of the National Institutes of Health Stroke Scale: results of a factor analysis. NINDS tPA Stroke Trial Investigators. Stroke 1999;30:2347–2354.
[19] Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA, Curtis LH. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. Am Heart J 2009;157:995–1000.
[20] Cheng C-L, Kao Y-HY, Lin S-J, Lee C-H, Lai M-L. Validation of the National Health Insurance Research Database with ischemic stroke cases in Taiwan. Pharmacoepidemiol Drug Saf 2011;20:236–42.
[21] Pasquali SK, Jacobs JP, Shook GJ, O'Brien SM, Hall M, Jacobs ML, et al. Linking clinical registry data with administrative data using indirect identifiers: implementation and validation in the congenital heart surgery population. Am Heart J 2010;160:1099–104.
[22] Goldstein R. Testing dependent correlation coefficients. Stata Technical Bulletin 1996;32:18.
[23] Lee H-C, Chang K-C, Huang Y-C, Lan C-F, Chen J-J, Wei S-H. Inpatient rehabilitation utilization for acute stroke under a universal health insurance system. Am J Manag Care 2010;16:e67–e74.
[24] Zhu M, Chen W, Hirdes JP, Stolee P. The K-nearest neighbor algorithm predicted rehabilitation potential better than current Clinical Assessment Protocol. J Clin Epidemiol 2007;60:1015–21.
[25] Austin PC, Tu JV, Ho JE, Levy D, Lee DS. Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. J Clin Epidemiol 2013;66:398–407.
[26] Allore H, Tinetti ME, Araujo KLB, Hardy S, Peduzzi P. A case study found that a regression tree outperformed multiple linear regression in predicting the relationship between impairments and Social and Productive Activities scores. J Clin Epidemiol 2005;58:154–61.
[27] Yoo I, Alafaireet P, Marinov M, Pena-Hernandez K, Gopidi R, Chang J-F, et al. Data mining in healthcare and biomedicine: a survey of the literature. J Med Syst 2012;36:2431–48.
[28] Cote R, Battista RN, Wolfson C, Boucher J, Adam J, Hachinski V. The Canadian Neurological Scale: validation and reliability assessment. Neurology 1989;39:638–643.
[29] Multicenter trial of hemodilution in ischemic stroke--background and study protocol. Scandinavian Stroke Study Group. Stroke 1985;16:885–890.
[30] Faraji F, Ghasami K, Talaie-Zanjani A, Mohammadbeigi A. Prognostic factors in acute stroke, regarding to stroke severity by Canadian Neurological Stroke Scale: A hospital-based study. Asian J Neurosurg 2013;8:78–82.
[31] Muir KW, Weir CJ, Murray GD, Povey C, Lees KR. Comparison of neurological scales and scoring systems for acute stroke prognosis. Stroke 1996;27:1817–1820.
[32] Tseng M-C, Lin H-J. Readmission after hospitalization for stroke in Taiwan: results from a national sample. J Neurol Sci 2009;284:52–5.
[33] Chang K-C, Tseng M-C, Weng H-H, Lin Y-H, Liou C-W, Tan T-Y. Prediction of length of stay of first-ever ischemic stroke. Stroke 2002;33:2670–2674.
[34] Smith MA, Frytak JR, Liou J-I, Finch MD. Rehospitalization and survival for stroke patients in managed care and traditional Medicare plans. Med Care 2005;43:902–10.
[35] Quan H, Parsons GA, Ghali WA. Validity of information on comorbidity derived rom ICD-9-CCM administrative data. Med Care 2002;40:675–85.
[36] Quan H, Parsons GA, Ghali WA. Validity of procedure codes in International Classification of Diseases, 9th revision, clinical modification administrative data. Med Care 2004;42:801–9.
[37] Ting G, Schneeweiss S, Scranton R, Katz JN, Weinblatt ME, Young M, et al. Development of a health care utilisation data-based index for rheumatoid arthritis severity: a preliminary study. Arthritis Res Ther 2008;10:R95.
[38] Chang H-Y, Weiner JP, Richards TM, Bleich SN, Segal JB. Validating the adapted Diabetes Complications Severity Index in claims data. Am J Manag Care 2012;18:721–6.
[39] Ananthakrishnan AN, McGinley EL, Binion DG, Saeian K. A novel risk score to stratify severity of Crohn's disease hospitalizations. Am J Gastroenterol 2010;105:1799–807.

Table 1. Characteristics of the patients in the derivation cohort. Characteristic

n = 3,577

69 (12)

Female

1,463 (41)

Risk factors 2,896 (81)

Hypertension Diabetes mellitus

1,579 (44)

Hyperlipidemia

2,009 (56)

605 (17)

Prior stroke Coronary artery disease

Congestive heart failure

Atrial fibrillation

Current smoker

Age, mean (SD)

Demographics

1,085 (30) 467 (13) 199 (6) 839 (23)

Clinical data

Admission NIHSS score, median (IQR)

5 (3-10)

Intravenous thrombolysis

205 (6)

Length of stay, median (IQR)

7 (4-10)

Data are numbers (percentage) unless specified otherwise.

IQR, interquartile range; SD, standard deviation.

Table 2. Characteristics of the patients in the validation cohorts.

| Characteristic | CMMC (n = 3,816) | NCKUH (n = 1,563) | LH (n = 962) | TSLH (n = 276) |
|---|---|---|---|---|
| Enrollment period | Aug, 2006 - Dec, 2010 | Aug, 2006 - Dec, 2009 | Aug, 2006 - Dec, 2010 | Aug, 2009 - Dec, 2010 |
| Age, mean (SD) | 67 (13)** | 68 (13)** | 68 (12)* | 68 (13) |
| Female, n (%) | 1,526 (40) | 643 (41) | 383 (40) | 101 (37) |
| NIHSS, median (IQR) | 4 (2-9)** | 5 (3-10) | 5 (2-12) | 3 (1-6)** |
| LOS, median (IQR) | 5 (4-10)** | 6 (5-11) | 6 (4-9)** | 5 (3-7)** |

* P < 0.05 as compared with the derivation cohort; ** P < 0.01.

IQR, interquartile range; LOS, length of stay; NIHSS, National Institutes of Health Stroke Scale; SD, standard deviation.

Table 3. Final set of features after the three-step feature selection procedure.

| Feature | Explanation |
|---|---|
| Airway suctioning | Patient having undergone airway suctioning |
| Bacterial sensitivity test | Bacteria isolated by culture and antibiotic sensitivity test having been performed |
| General ward stay | Patient having stayed in the general ward |
| ICU stay | Patient having stayed in the ICU |
| Nasogastric intubation | Patient having undergone nasogastric intubation |
| Osmotherapy | Patient having received osmotherapy (mannitol or glycerol infusion) |
| Urinary catheterization | Patient having undergone (indwelling) urinary catheterization |

ICU, intensive care unit.

Table 4. The multiple linear regression model for the stroke severity index.

| Feature | Coefficient |
|---|---|
| Airway suctioning | 3.5083* |
| Bacterial sensitivity test | 1.3642* |
| General ward stay | -5.5761* |
| ICU stay | 4.1770* |
| Nasogastric intubation | 4.5809* |
| Osmotherapy | 2.1448* |
| Urinary catheterization | 1.6569* |
| Constant | 9.6804 |

* P < 0.001

ICU, intensive care unit.
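To make the use of the Table 4 model concrete, the sketch below computes the stroke severity index as the constant plus the sum of the coefficients of the features present in a hospitalization. It assumes that each feature is a 0/1 indicator (1 if the corresponding item was billed during the stay) and that the coefficients pair with the features in the order listed in Table 4; the function and variable names are hypothetical.

```python
# Coefficients as listed in Table 4 (multiple linear regression model).
COEFFICIENTS = {
    "airway_suctioning": 3.5083,
    "bacterial_sensitivity_test": 1.3642,
    "general_ward_stay": -5.5761,
    "icu_stay": 4.1770,
    "nasogastric_intubation": 4.5809,
    "osmotherapy": 2.1448,
    "urinary_catheterization": 1.6569,
}
CONSTANT = 9.6804


def stroke_severity_index(features):
    """Return the index for one hospitalization.

    `features` maps feature names to 0/1 (absent/present).
    Names that are not supplied are treated as absent.
    """
    unknown = set(features) - set(COEFFICIENTS)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return CONSTANT + sum(
        coef * int(bool(features.get(name, 0)))
        for name, coef in COEFFICIENTS.items()
    )


# Hypothetical patients, for illustration only.
mild = stroke_severity_index({"general_ward_stay": 1})
severe = stroke_severity_index(
    {
        "icu_stay": 1,
        "nasogastric_intubation": 1,
        "airway_suctioning": 1,
        "urinary_catheterization": 1,
    }
)
print(round(mild, 4), round(severe, 4))
```

Under these assumptions, a patient who only stayed in the general ward scores about 4.10, while the hypothetical ICU patient with tube feeding, suctioning, and a urinary catheter scores about 23.60.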


Supplemental Material (Online Only)

Tables

Supplemental Table 1. Features (billing codes) selected by the correlation-based feature selection filter.

| Billing code | Item | Billing code | Item |
|---|---|---|---|
| 00201A | physician fee, ED, triage category 1 | A017712209 | cefazolin |
| 00201B | physician fee, ED, triage category 1 | A022274100 | loperamide |
| 02007A | daily physician fee, general ward | A022682265 | sodium chloride, potassium acetate, sodium acetate, magnesium chloride, hexahydrate, dextrose, potassium biphosphate |
| 02012A | daily physician fee, ICU | A037697100 | sennoside A+B |
| 03002A | daily ward fee, general ward | A039385100 | methocarbamol |
| 03027A | daily nursing fee, general ward | A039604277 | sodium chloride |
| 09005C | blood glucose | A042435100 | benzonatate |
| 09031C | gamma-glutamyl transferase | A044136100 | cilostazol |
| 09037C | blood ammonia | A0449771G0 | aluminum dihydroxyallantoinate |
| 09041B | blood gas analysis | AC195031G0 | diphenidol |
| 10502B | diphenylhydantoin | B009554100 | digoxin |
| 12053B | antinuclear antibody | B011315100 | flupentixol, melitracen |
| 13009B | bacterial sensitivity test | B014861216 | amiodarone |
| 18006B | doppler echocardiography | B015398265 | dopamine |
| 20013B | dopscan | B017530212 | ranitidine |
| 30028B | anticardiolipin antibody, IgM | B019457216 | midazolam |
| 39016B | daily IV pump | B021914100 | losartan |
| 42007A | PT, moderate level | B022013219 | norepinephrine |
| 47006C | acid enema | B022559199 | polystyrene sulfonate |
| 47013C | urinary catheterization | B023306245 | pantoprazole |
| 47014C | indwelling urinary catheterization | B023619100 | aspirin |
| 47017C | nasogastric intubation | B023728255 | levofloxacin |
| 47018C | daily nasogastric feeding | B024720212 | famotidine |
| 47031C | endotracheal intubation | K000739299 | insulin |
| 47041C | airway suctioning | N000817100 | phenytoin |
| 47042C | daily airway suctioning | N004463265 | dextrose |
| 48011C | dressing change, small wound | N012916100 | meclizine |
| A000480209 | epinephrine | OT2 | OT - passive ROM |
| A0016461G0 | diazepam | PTC1 | PT - facilitation techniques |
| A004951209 | atropine | PTC6 | PT - ambulation training |
| A009633266 | mannitol | PTM5 | PT - passive ROM |
| A0103581G0 | diltiazem | PTM8 | PT - tilting table training |
| A011552277 | dextrose | ST1 | ST - auditory comprehension training |
| A0150781G0 | diphenidol | | |

ED, emergency department; ICU, intensive care unit; IV, intravenous; OT, occupational therapy; PT, physical therapy; ROM, range of motion; ST, speech therapy.
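The idea behind a correlation-based feature selection (CFS) filter, as described by Hall [16], is to favor subsets whose features correlate strongly with the outcome but weakly with each other. The sketch below scores a subset with the CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-outcome correlation and r_ff the mean feature-feature correlation. It is a simplified illustration, not the study's pipeline: the use of absolute Pearson correlations, the greedy forward search, the feature names, and the toy data are all assumptions made for this example.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, columns, target):
    """CFS merit: k * mean|r_cf| / sqrt(k + k(k-1) * mean|r_ff|)."""
    k = len(subset)
    r_cf = sum(abs(pearson_r(columns[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson_r(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, target):
    """Add the feature that most improves the merit until nothing improves it."""
    selected, best = [], 0.0
    remaining = set(columns)
    while remaining:
        merit, feature = max(
            (cfs_merit(selected + [f], columns, target), f)
            for f in sorted(remaining)
        )
        if merit <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = merit
    return selected, best


# Toy data, invented for illustration: two informative features and one duplicate.
columns = {
    "icu_stay": [0, 0, 1, 1, 0, 0, 1, 1],
    "icu_daily_fee": [0, 0, 1, 1, 0, 0, 1, 1],  # redundant copy of icu_stay
    "ng_tube": [0, 1, 0, 1, 0, 1, 0, 1],
}
target = [0, 4, 5, 9, 0, 4, 5, 9]  # hypothetical severity scores

selected, merit = greedy_cfs(columns, target)
print(selected, round(merit, 4))
```

In this toy example the redundant copy is left out because adding it lowers the merit, which is the behavior that makes such a filter useful for billing data, where several codes often describe the same service.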

Supplemental Table 2. Features determined by the 3-step feature selection procedure and their corresponding billing codes.

| Feature | Billing codes identified in the derivation cohort | Similar billing codes in Taiwan’s National Health Insurance fee schedule |
|---|---|---|
| Airway suctioning | 47041C, 47042C | |
| Bacterial sensitivity test | 13009B | 13010B, 13011B |
| General ward stay | 03002A | 03001K, 03005K, 03006A, 03008B, 03002AB, 02006K, 02007A, 02008B, 03026K, 03027A |
| ICU stay | 02012A | 02011K, 02013B, 03010E, 03011F, 03012G, 03047E, 03048F, 03049G |
| Nasogastric intubation | 47017C, 47018C | |
| Osmotherapy (mannitol or glycerol infusion) | A009633266 | A009633255, A009633277, A009745277, A013354277, A015561255, A015561266, A015561277, A016476238, A016476266, A016476277, A031387238, A033425266, A042601238, B014379277, B020322265, B020322277, N012343266, A023733263, A023733265, A023733266, A023733277, A024986209, A024986265, A024986266, A024986277, A025104266, A025104277, A025355266, A025355277, A026793265, A026793266, A026793277, A028475265, A028475277, A029475265, A029475266, A029475277, A034722277, AC23733263, AC23733266, AC23733277, AC24986265, AC24986266, AC24986277, AC28475277, AC29475265, B006604277, B017082263, B017082277, B017728265, B017728277 |
| Urinary catheterization | 47013C, 47014C | |

ICU, intensive care unit.
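The mapping in Supplemental Table 2 can be turned into a small lookup that derives the seven features from the billing codes of one hospitalization claim. The sketch below is illustrative only: it uses the codes identified in the derivation cohort, assumes a feature counts as present when any of its codes appears at least once, and uses hypothetical names and a hypothetical claim.

```python
# Billing codes identified in the derivation cohort (Supplemental Table 2).
FEATURE_CODES = {
    "airway_suctioning": {"47041C", "47042C"},
    "bacterial_sensitivity_test": {"13009B"},
    "general_ward_stay": {"03002A"},
    "icu_stay": {"02012A"},
    "nasogastric_intubation": {"47017C", "47018C"},
    "osmotherapy": {"A009633266"},
    "urinary_catheterization": {"47013C", "47014C"},
}


def extract_features(billing_codes):
    """Return a 0/1 indicator per feature for one hospitalization claim."""
    billed = {code.strip().upper() for code in billing_codes}
    return {
        feature: int(bool(codes & billed))
        for feature, codes in FEATURE_CODES.items()
    }


# Hypothetical claim: general ward stay, nasogastric feeding, mannitol, aspirin.
claim = ["03002A", "47018C", "A009633266", "B023619100"]
features = extract_features(claim)
print(features)
```

When applying the same approach to claims from other hospitals, the similar codes listed in the third column of the table could be added to the corresponding sets.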

Supplemental Table 3. Distribution of features among the derivation and validation cohorts.

| Feature | Derivation cohort: CYCH (n = 3,577) | Validation cohort: CMMC (n = 3,816) | Validation cohort: NCKUH (n = 1,563) | Validation cohort: LH (n = 962) | Validation cohort: TSLH (n = 276) |
|---|---|---|---|---|---|
| Airway suctioning | 664 (18.6) | 551 (14.4)b | 258 (16.5) | 117 (12.2)b | 29 (10.5)b |
| Bacterial sensitivity test | 327 (9.1) | 643 (16.9)b | 121 (7.7) | 50 (5.2)b | 27 (9.8) |
| General ward stay | 3,465 (96.9) | 3,746 (98.2)b | 1,551 (99.2)b | 932 (96.9) | 276 (100)b |
| ICU stay | 846 (23.7) | 387 (10.1)b | 129 (8.3)b | 144 (15.0)b | 23 (8.3)b |
| Nasogastric intubation | 1,012 (28.3) | 848 (22.2)b | 450 (28.8) | 232 (24.1)a | 42 (15.2)b |
| Osmotherapy (mannitol or glycerol infusion) | 357 (10.0) | 187 (4.9)b | 116 (7.4)a | 35 (3.6)b | 6 (2.2)b |
| Urinary catheterization | 865 (24.2) | 775 (20.3)b | 357 (22.8) | 194 (20.2)b | 36 (13.0)b |

Data are numbers (percentage).

a P < 0.05 as compared with the derivation cohort; b P < 0.01.

ICU, intensive care unit.

Supplemental Table 4. Performance of various models evaluated by using the Pearson correlation coefficients between the stroke severity indices and the real NIHSS scores.

| Model | Derivation cohort: CYCH | Validation cohort: CMMC | Validation cohort: NCKUH | Validation cohort: LH | Validation cohort: TSLH |
|---|---|---|---|---|---|
| KNN | 0.743 (0.737-0.749) | 0.700 (0.684-0.716) | 0.677 (0.649-0.703) | 0.706 (0.673-0.737) | 0.698 (0.632-0.754) |
| RT | 0.737 (0.731-0.742) | 0.699 (0.682-0.715) | 0.677 (0.650-0.703) | 0.706 (0.673-0.736) | 0.697 (0.631-0.753) |
| MLR | 0.742 (0.736-0.747) | 0.708 (0.692-0.724) | 0.691 (0.664-0.716) | 0.725 (0.694-0.754) | 0.705 (0.641-0.760) |

Data are the Pearson correlation coefficients (95% confidence interval).

KNN, k-nearest neighbor; MLR, multiple linear regression; RT, regression tree.
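For readers who want to compute this kind of performance measure themselves, the sketch below calculates a Pearson correlation coefficient and an approximate 95% confidence interval. The interval uses the Fisher z-transformation, which is a common choice but an assumption here, since this section does not state how the reported intervals were obtained; the toy data are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a correlation via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Toy data, invented for illustration: index values vs. NIHSS scores.
index_values = [4.1, 5.8, 9.7, 14.3, 23.6]
nihss_scores = [2, 5, 8, 15, 22]
r = pearson_r(index_values, nihss_scores)

# Interval for a correlation of 0.705 observed in 276 patients.
low, high = fisher_ci(0.705, 276)
print(round(r, 3), round(low, 3), round(high, 3))
```

For r = 0.705 and n = 276, this approximation gives an interval of roughly 0.640 to 0.760, close to the values reported for the MLR model in the TSLH cohort.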
