Zech J, et al. J Am Med Inform Assoc 2015;22:682–687. doi:10.1093/jamia/ocu005, Research and Applications

Identifying homelessness using health information exchange data

RECEIVED 29 June 2014 REVISED 9 October 2014 ACCEPTED 20 October 2014 PUBLISHED ONLINE FIRST 10 February 2015

John Zech¹, Gregg Husk², Thomas Moore³, Gilad J Kuperman⁴, Jason S Shapiro¹

ABSTRACT

RESEARCH AND APPLICATIONS

Background: Homeless patients experience poor health outcomes and consume a disproportionate amount of health care resources compared with domiciled patients. There is increasing interest in the federal government in providing care coordination for homeless patients, which will require a systematic way of identifying these individuals.

Objective: We analyzed address data from Healthix, a New York City–based health information exchange, to identify patterns that could indicate homelessness.

Methods: Patients were categorized as likely to be homeless if they registered with the address of a hospital, homeless shelter, place of worship, or an address containing a keyword synonymous with “homelessness.”

Results: We identified 78 460 out of 7 854 927 Healthix patients (1%) as likely to have been homeless over the study period of September 30, 2008 to July 19, 2013. We found that registration practices for these patients varied widely across sites.

Conclusions: The use of health information exchange data enabled us to identify a large number of patients likely to be homeless and to observe the wide variation in registration practices for homeless patients within and across sites. Consideration of these results may suggest a way to improve the quality of record matching for homeless patients. Validation of these results is necessary to confirm the homeless status of identified individuals. Ultimately, creating a standardized and structured field to record a patient’s housing status may be a preferable approach.

Key words: homelessness, health information exchange, health care costs, health care reform

INTRODUCTION

Homelessness has an enormous effect on the health of individuals who are homeless, with a 3–4-fold increase in the mortality rate and increased rates of incidence of certain diseases [1–7]. On average, a patient who is homeless uses a disproportionately large amount of health care resources compared with patients who are not homeless, estimated to be nearly 4 times the amount of the average Medicaid recipient [8]. Taxpayers are responsible for much of this expense: federal and state government pays directly for the care of homeless patients insured through Medicaid, and federal, state, and local government pays indirectly for the care of uninsured homeless patients by providing hospitals with government reimbursements for charity care [4,8,9]. Homelessness in New York City, in particular, is a large problem [10]. With the passage of the Affordable Care Act, the federal government has indicated its interest in enrolling patients with chronic medical and/or behavioral health conditions into payment reform models like “health homes” and accountable care organizations [11–13]. In order to enroll homeless individuals into such programs, the government will first require a way of identifying them. This step poses a challenge, as housing status is not consistently gathered at health care facilities during patient registration for health care encounters.

We describe a technique to identify potentially homeless individuals using address data collected during patient registration at a health care facility and maintained by a health information exchange (HIE). HIEs enable sharing of patient records across different health care facilities to support patient care, but HIE data can also enable health services research on an interinstitutional population [14,15]. We demonstrate our technique on an active HIE in New York City: Healthix [16]. We describe the variation in registration patterns for homeless patients across health care organizations participating in Healthix, and the variety of addresses used by individuals likely to be homeless.

Correspondence to Jason S. Shapiro, Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai, Box 1620, One Gustav Levy Place, New York, NY 10029, USA; [email protected] © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: [email protected] For numbered affiliations see end of article.

METHODS

Setting and Population

Data for this analysis were provided by Healthix. As of July 2013, the organization linked records for over 7 million individual patients across 32 major hospitals and 250 total participating facilities in New York City and Long Island [16]. Healthix maintains a master patient index (MPI) containing all patient registration records from participating sites. The MPI includes patient names, birthdates, and addresses. The address field is mandatory at each facility, and registration staff cannot leave it blank. Each time an individual patient registers at a health care facility with new demographic data (e.g., a new address), the site sends these data to Healthix and a new registration record is created in the MPI.

Assignment of Records to Patients

Our analysis required us to connect patient records across sites, determining which records belong to the same patient. We did not use the Healthix MPI for connecting patient records, as we believed that homeless individuals may be registered with a variety of different addresses at different sites and that this could cause the MPI to inaccurately split the records of a single homeless individual into multiple profiles. For the purposes of this analysis, we considered all records with matching first name, last name, and date of birth to belong to the same patient, regardless of address, a technique used elsewhere [17].

Data Gathering

Healthix ran queries on the MPI as it existed on July 19, 2013; the data covered a period starting September 30, 2008. However, the number of participating sites changed over this period, as sites were added on a rolling basis.

Data Cleaning and Processing

Addresses were preprocessed using open-source address standardization software that was modified to fix common typographical errors [18]. This step allowed us to match common variants of addresses associated with homelessness, and it prevented common variations of an address from being counted as 2 separate addresses. As an example, “101 Main St. Apt 5C” and “101 Main Street, #5C” would be mapped to a common address street line through the use of this standardization software. Address street lines that contained specific keywords of interest, such as “homeless” or “shelter,” were mapped to dummy variables to indicate the presence of these words. We considered 2 addresses to match when there was an exact match between the standardized address street line and the first 5 digits of the ZIP Code.

Finally, a deidentified patient-level dataset was created for analysis using techniques described in detail elsewhere [19,20]. All 18 Health Insurance Portability and Accountability Act (HIPAA) identifiers were removed, and indicator variables were used to represent whether or not a patient had registered with particular categories of addresses, as described below.

Overview of Approach

Figure 1 illustrates our approach. We expected to find a variety of addresses used to register undomiciled patients. We categorized every registration address used by a patient as being either a “proxy” address indicating homelessness, or a “nonproxy” address indicating a domiciled patient. We developed the categories for proxy addresses for homelessness (Table 1). Addresses in the MPI were compared against the 4 types of proxy addresses from our framework: keyword, hospital, worship, and shelter. Patients were assigned to groups on the basis of their registration history; for example, any patient who had registered at least once with a keyword address was included in the keyword group. We believe that patients we define as undomiciled are very likely to satisfy the definitions of homelessness of both the Department of Health and Human Services and the Department of Housing and Urban Development [21,22]. However, our approach is not able to detect all individuals defined as homeless under these criteria, specifically patients whose address gives no indication of their homeless status.

Analyses Performed

We examined the variation in registration practices across sites by calculating the percentage of undomiciled patients at each site who registered with each category of proxy address for homelessness. To understand the registration practices of undomiciled patients, we calculated the average number of sites visited and average number of addresses used by undomiciled and domiciled patients. We also calculated how frequently undomiciled patients transitioned between a proxy and nonproxy address. Finally, to determine how much patient overlap existed between these categories, we calculated how frequently patients who registered with each category of proxy addresses also registered with another category of proxy addresses [23,24]. This approach was reviewed by the Mount Sinai Institutional Review Board and deemed not human research.

RESULTS

Linking the 16 892 903 patient registration records in the MPI on the basis of first name, last name, and date of birth, 7 854 927 unique patients were identified in the dataset.

We found considerable variation in the patterns of use of proxy addresses within and across sites, as is illustrated in Figure 2. The percentage of undomiciled patients registered with each category of proxy address varied widely across sites. For example, the percent of undomiciled patients registered with a hospital proxy address varied from 0.9% to 92.8%.

Undomiciled patients visited an average of 2.02 health care facilities, compared to the domiciled average of 1.59 facilities. Undomiciled patients used more addresses than the domiciled population on average (2.34 vs 1.36, t test P < 0.001). Undomiciled patients also registered with more nonproxy addresses for homelessness than proxy addresses (1.25 vs 1.09, t test P < 0.001). A majority (56.0%) of undomiciled patients made at least 1 transition between a nonproxy and proxy address, with 34.5%, 17.0%, 1.9%, and 2.5% making 1, 2, 3, and 4 or more transitions, respectively.

In Table 2 we analyze the overlap among unique patients for each type of proxy address in a pair-wise comparison. We found that overlap between the groups was limited. For example, of the 6970 patients who registered with a shelter address, only 676 had separately registered with a keyword address.

DISCUSSION

By analyzing address data from the registrations of provider organizations, we were able to identify patterns that are likely


Figure 1: Overview of our approach to analyzing data from the Healthix Patient Demographics Database.


Table 1: Categories for Proxy Addresses for Homelessness

| Category | Includes |
|---|---|
| Keyword | Addresses that include a variant of keywords “homeless” and “undomiciled” |
| Hospital | Addresses of health care facilities participating in Healthix |
| Shelter | Addresses of 270 shelters in New York City and Long Island |
| Worship | Addresses of 9677 places of worship in New York City and Long Island |

The undomiciled group consists of patients who had registered at least once with an address matching any one of the keyword, hospital, worship, or shelter categories. The domiciled group consists of patients who had never registered with any of the keyword, hospital, worship, or shelter addresses.
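The grouping rule in Table 1 can be illustrated with a short sketch. This is not the authors' code: the toy normalization in `match_key`, the function names `categorize_address` and `classify_patient`, and the three placeholder address lists are assumptions made for illustration. A real analysis would use proper address standardization software and the full lists of facilities, shelters, and places of worship.

```python
import re

# The paper names "homeless" and "undomiciled"; other variants would be added here.
KEYWORD_PATTERN = re.compile(r"\b(homeless|undomiciled)\b", re.IGNORECASE)

# Placeholder proxy lists keyed by (standardized street line, 5-digit ZIP).
# These entries are invented; they stand in for the lists described in Table 1.
PROXY_LISTS = {
    "hospital": {("1 EXAMPLE HOSPITAL PLZ", "10029")},
    "shelter": {("400 EXAMPLE AVE", "10016")},
    "worship": {("12 EXAMPLE CHURCH ST", "11201")},
}


def match_key(street_line, zip_code):
    """Toy stand-in for address standardization: uppercase, drop punctuation, keep ZIP5."""
    street = re.sub(r"[^\w\s]", "", street_line).upper()
    street = re.sub(r"\s+", " ", street).strip()
    return street, zip_code[:5]


def categorize_address(street_line, zip_code):
    """Return the proxy category of one address, or None for a nonproxy address."""
    if KEYWORD_PATTERN.search(street_line):
        return "keyword"
    key = match_key(street_line, zip_code)
    for category, addresses in PROXY_LISTS.items():
        if key in addresses:
            return category
    return None


def classify_patient(addresses):
    """Return (group, categories) for a patient's full registration history."""
    categories = {categorize_address(s, z) for s, z in addresses} - {None}
    # One proxy address anywhere in the history is enough to be undomiciled.
    group = "undomiciled" if categories else "domiciled"
    return group, categories


history = [("101 Main St. Apt 5C", "10001"), ("Homeless", "10029-1234")]
print(classify_patient(history))  # ('undomiciled', {'keyword'})
```

Because group membership depends on the whole registration history, a patient with one keyword address and several ordinary addresses is still placed in the undomiciled group, as the table footnote specifies.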

indicators of homelessness. The technique we present may offer a tool to improve patient record matching. Hospitals and HIEs use algorithms that rely on patient demographic data, including address data, to match patient records. The results of this record matching are stored in the registration system (MPI) of each site and the HIE. If 2 patient records contain differing address information, the matching algorithm will likely split those records into 2 different patients, even if they correspond to 1 unique individual who has registered at different times with different addresses. This fact is especially relevant for undomiciled patients, who register with more addresses and visit more health care facilities than the average patient. This issue could be mitigated by setting the MPI’s matching algorithm to down-weight address information for records belonging to patients having a previous instance of a proxy undomiciled address. We believe that better HIE record matching for homeless patients could improve HIE usefulness and HIE-enabled care coordination efforts aimed at helping this population.

Running this analysis at the level of a HIE offered several advantages over doing the analysis at a single site. Using HIE data allowed us to observe the wide variation in registration practices for homeless patients that exists across sites. HIE data also enabled us to include 7.8 million individuals in our analysis; no site within Healthix had more than 1 million unique patient registrations during our study period. Our approach should be replicable at other HIEs that use centralized MPIs relying on demographic data (name, address, and birth date).
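For readers who wish to replicate the approach, the linkage rule from the Methods (exact match on first name, last name, and date of birth, ignoring address) can be sketched as follows. The record layout, the trimming and case-folding of names, the sample records, and the function name `link_records` are assumptions made for illustration; the paper does not describe its implementation at this level of detail.

```python
from collections import defaultdict


def link_records(records):
    """Group registration records into patients by (first name, last name, date of birth).

    Address is deliberately left out of the key, so one person who registers
    with several addresses still resolves to a single patient.
    """
    patients = defaultdict(list)
    for rec in records:
        # Trimming and case-folding are assumptions; the paper only says the fields must match.
        key = (rec["first"].strip().lower(), rec["last"].strip().lower(), rec["dob"])
        patients[key].append(rec)
    return dict(patients)


# Invented sample records for illustration only.
records = [
    {"first": "John", "last": "Smith", "dob": "1960-01-02", "site": "A", "address": "101 MAIN ST APT 5C"},
    {"first": "John", "last": "Smith", "dob": "1960-01-02", "site": "B", "address": "HOMELESS"},
    {"first": "Jon", "last": "Smith", "dob": "1960-01-02", "site": "C", "address": "HOMELESS"},
]

for key, recs in link_records(records).items():
    sites = {r["site"] for r in recs}
    addresses = {r["address"] for r in recs}
    print(key, "sites:", len(sites), "addresses:", len(addresses))
```

In this toy input the two "John Smith" records merge into one patient with 2 sites and 2 addresses, while the record spelled "Jon" forms a separate patient, which is the expected behavior of exact matching.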


Figure 2: Percentage of all undomiciled patients identified using each of the 4 categories of proxy addresses at each of the 32 hospitals contributing data to Healthix. Please note that the sum of these percentages can exceed 100% at a site as patients can be registered in multiple categories.
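The quantity plotted in Figure 2 can be computed as in the sketch below, which assumes a hypothetical list of (site, patient, category) observations for undomiciled patients. The toy data and the name `site_category_percentages` are invented for illustration. Because one patient can appear under several categories at the same site, the percentages for a site need not sum to 100%.

```python
from collections import defaultdict


def site_category_percentages(observations):
    """Percent of each site's undomiciled patients registered under each proxy category.

    `observations` is an iterable of (site, patient_id, category) tuples.
    """
    patients_by_site = defaultdict(set)
    patients_by_site_category = defaultdict(set)
    for site, patient, category in observations:
        patients_by_site[site].add(patient)
        patients_by_site_category[(site, category)].add(patient)
    return {
        (site, category): 100.0 * len(patients) / len(patients_by_site[site])
        for (site, category), patients in patients_by_site_category.items()
    }


# Invented toy data: patient p1 is registered under two categories at site A.
observations = [
    ("A", "p1", "keyword"),
    ("A", "p1", "hospital"),
    ("A", "p2", "hospital"),
    ("B", "p3", "shelter"),
]
for (site, category), pct in sorted(site_category_percentages(observations).items()):
    print(f"site {site} {category}: {pct:.1f}%")
```

For site A in this toy input, the hospital and keyword percentages are 100% and 50%, so the site total is 150%.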

Table 2: Patient Overlap between Groups

| Groups | Keyword | Worship | Shelter | Hospital | Undomiciled |
|---|---|---|---|---|---|
| Keyword | 17 310 | 162 | 676 | 2596 | 17 310 |
| Nonkeyword | 0 | 9535 | 6294 | 46 673 | 61 150 |
| Worship | 162 | 9697 | 204 | 429 | 9697 |
| Nonworship | 17 148 | 0 | 6766 | 48 840 | 68 763 |
| Shelter | 676 | 204 | 6970 | 1072 | 6970 |
| Nonshelter | 16 634 | 9493 | 0 | 48 197 | 71 490 |
| Hospital | 2596 | 429 | 1072 | 49 269 | 49 269 |
| Nonhospital | 14 714 | 9268 | 5898 | 0 | 29 191 |
| Undomiciled | 17 310 | 9697 | 6970 | 49 269 | 78 460 |

The primary limitation of this work is that we have classified patients as being homeless only on the basis of their historical address information. We believe that patients who register with these addresses are highly likely to be homeless, but we did not have another gold standard data source to use for validation. Furthermore, an approach based on historical address information may misclassify patients transitioning between normal housing and homelessness. Whereas an International Classification of Diseases, Ninth Revision (ICD-9) code exists for homelessness, Healthix only contains diagnosis data from registration systems, which typically do not represent final ICD-9–coded billing diagnoses and were therefore unusable. If a reliable demographic file of patients known to be homeless could be identified for some subset of our data, further analysis might be able to validate our approach.

A second limitation is that we have united patient records solely on the basis of matching first name, last name, and date of birth. Cases will arise in which a typographical error (e.g., “Jon Smith” instead of “John Smith”) or a name change may cause records from the same patient at different sites to be

split into multiple patient profiles in our approach. Alternatively, relatively rare cases will occur in which 2 people share the exact same first name, last name, and date of birth. In that case, those individuals would be improperly united into 1 patient profile in our approach, although there would be no way for us to directly measure this effect.

A third limitation is the existence of additional proxy addresses for homelessness not included in our analysis. We used addresses only from the New York City and Long Island area and would not have been able to detect registrations with the address of a shelter or place of worship outside of this region. We mitigated the impact of this issue by reviewing lists of commonly used registration addresses to ensure that we included all commonly used addresses indicative of homelessness in our analysis.

A final limitation is that we were not able to detect homeless individuals who sought care exclusively at hospitals not participating in Healthix, including the 11 public hospitals in the New York City Health and Hospitals Corporation.

We observed wide variation in the types of proxy addresses with which undomiciled patients were registered both within and across sites. We also observed that undomiciled patients registered with many nonproxy addresses not indicative of homelessness. These findings demonstrate the difficulty of identifying homeless patients after they have accessed health care services. A patient’s housing status is frequently known to registration staff at the time a patient is registered, and we believe it may be beneficial to adopt a new policy that requires structured, standardized data on housing status to be collected whenever a patient registers for a health care encounter.

CONCLUSION

We have presented an algorithm that may identify homelessness on the basis of historical address information present in a HIE. This algorithm classified 1.0% of all participants in a New York City–based HIE as homeless. We believe that future work to validate these results is important and would help determine the broader utility of this approach. Ultimately, creating a standardized and structured field to record a patient’s housing status may be a preferable approach.

ACKNOWLEDGEMENTS

We gratefully acknowledge the support of the Department of Emergency Medicine at Mount Sinai, which hosted John Zech as a research student throughout the duration of this study. We also gratefully acknowledge Healthix, Inc. and the New York e-Health Collaborative, which created the deidentified patient-level dataset used in the study.

CONTRIBUTORSHIP STATEMENT

J.S.S., T.M., and G.H. jointly conceived of the study and contributed to its design. J.S.S. directed and supervised the study. T.M. facilitated acquisition of the data. J.Z. performed the literature search, suggested new lines of analysis, refined the data cleaning and analysis approach, performed the data analysis, and drafted the original manuscript with input from J.S.S. G.H., T.M., and G.J.K. each reviewed and critically revised the manuscript for important intellectual content.

FUNDING

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

COMPETING INTERESTS

None.

REFERENCES

1. Weinstein LC, Lanoue MD, Plumb JD, et al. A primary care-public health partnership addressing homelessness, serious mental illness, and health disparities. J Am Board Fam Med. 2013;26:279–287.
2. O’Connell JJ. Premature mortality in homeless populations: a review of the literature. http://santabarbarastreetmedicine.org/wordpress/wp-content/uploads/2011/04/PrematureMortalityFinal.pdf. Accessed January 16, 2014.
3. Brickner PW, Scanlan BC, Conanan B, et al. Homeless persons and health care. Ann Intern Med. 1986;104:405–409.
4. Zlotnick C, Zerger S. Survey findings on characteristics and health status of clients treated by the federally funded (US) Health Care for the Homeless Programs. Health Soc Care Commun. 2009;17:18–26.
5. Torres RA, Mani S, Altholz J, et al. Human immunodeficiency virus infection among homeless men in a New York City shelter association with mycobacterium tuberculosis infection. Arch Intern Med. 1990;150:2030–2036.
6. Zlotnick C, Zerger S, Wolfe PB. Health care for the homeless: what we have learned in the past 30 years and what’s next. Am J Public Health. 2013;103:1–7.
7. O’Toole TP, Conde-Martel A, Gibbon JL, et al. Where do people go when they first become homeless? A survey of homeless adults in the USA. Health Soc Care Comm. 2007;15:446–453.
8. Bharel M, Lin W-C, Zhang J, et al. Health care utilization patterns of homeless individuals in Boston: preparing for Medicaid expansion under the Affordable Care Act. Am J Public Health. 2013;103:1–7.
9. The Kaiser Commission On Medicaid and the Uninsured. The uninsured, a primer. http://kff.org/uninsured/report/the-uninsured-a-primer-key-facts-about-health-insurance-on-the-eve-of-coverage-expansions/. Accessed February 12, 2014.
10. Doran KM, Misa EJ, Shah NR. Housing as health care - New York’s boundary-crossing experiment. N Engl J Med. 2013;369:2374–2376.
11. Centers for Medicare and Medicaid Services. Department of Health and Human Services. Letter to state Medicaid directors and state health officials re: health homes for enrollees with chronic conditions. http://downloads.cms.gov/cmsgov/archived-downloads/SMDL/downloads/SMD10024.pdf. Accessed January 16, 2014.

12. Devore S, Champion RW. Driving population health through accountable care organizations. Health Aff. 2011;30:41–50.
13. Berwick DM, Hackbarth AD. Eliminating waste in US health care. JAMA. 2012;307:1513–1516.
14. Johnson KB, Unertl KM, Chen Q, et al. Health information exchange usage in emergency departments and clinics: the who, what, and why. J Am Med Inform Assoc. 2011;18:690–697.
15. Shapiro JS. Evaluating public health uses of health information exchange. J Biomed Inform. 2007;40:S46–S49.
16. Healthix. Healthix: about us. https://services.lipixportal.org/HealthixPortal/Home/About. Accessed November 11, 2013.
17. McCoy AB, Wright A, Kahn MG, et al. Matching identifiers in electronic health records: implications for duplicate records and patient safety. BMJ Qual Saf. 2013;22:219–224.
18. The Analysis and Solutions Company. Address Standardization Solution. http://www.analysisandsolutions.com/software/addr/. Accessed August 2, 2013.
19. Grinspan ZM, Abramson EL, Banerjee S, et al. Potential value of health information exchange for people with epilepsy: crossover patterns and missing clinical data. In: Proceeding of the AMIA Annual Symposium, Washington, D.C; 2013. Vol. 2013:527–536.
20. Neamatullah I, Douglass MM, Lehman LH, et al. Automated de-identification of free-text medical records. BMC Med Inform Decis Mak. 2008;8:32.
21. Public Health Service Act. Section 330. 42 U.S.C. § 254(b).
22. Homeless emergency assistance and rapid transition to housing: defining “homeless.” Fed Regist. 2010;75:20541–20546.
23. R Core Team. R: A Language and Environment for Statistical Computing. 2013.
24. Wickham H. ggplot2: Elegant Graphics for Data Analysis. New York, NY: Springer; 2009.

AUTHOR AFFILIATIONS

1 Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
2 Department of Emergency Medicine, Mount Sinai Beth Israel, New York, NY, 10003, USA
3 Healthix, Inc., New York, NY, 10013, USA
4 New York-Presbyterian Hospital, New York, NY, 10032, USA
