Using network projections to explore co-incidence and context in large clinical datasets: application to homelessness among U.S. veterans

Network projections can provide an efficient format for exploring co-incidence in large clinical datasets. This study presents a network projection approach and explores its utility for finding patterns in health care data that could be exploited to prevent homelessness among U.S. Veterans. The study split Veteran ICD-9-CM data into two time periods (0-59 and 60-364 days prior to the first evidence of homelessness) and then used Pajek social network analysis software to visualize these data as three different networks. A multi-relational network simultaneously displayed the magnitude of ties between the most frequent ICD-9-CM pairings. A new association network visualized ICD-9-CM pairings that greatly increased or decreased between the two periods. A signed, subtraction network visualized the presence, absence, and magnitude difference between ICD-9-CM associations by time period.
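As a rough illustration of the projection idea described above, the following Python sketch builds a code-to-code co-occurrence network for each time period and then subtracts one from the other. It is not the study's Pajek workflow. The function names, the toy visit records, and the exact subtraction rule (near-period weight minus far-period weight) are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(visits):
    """Project visits onto a code-code network.

    Each visit is a list of ICD-9-CM codes. The weight of an edge is the
    number of visits in which both codes appear together.
    """
    weights = Counter()
    for codes in visits:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(codes)), 2):
            weights[(a, b)] += 1
    return weights


def signed_subtraction(near, far):
    """Assumed subtraction rule: near-period weight minus far-period weight.

    Positive edges are stronger close to the event, negative edges are
    stronger in the more distant period, and an edge present in only one
    period carries its full weight with the matching sign.
    """
    return {edge: near.get(edge, 0) - far.get(edge, 0)
            for edge in set(near) | set(far)}


# Hypothetical toy records, not data from the study.
far_visits = [["V70.0", "401.9"], ["401.9", "250.00"], ["V70.0", "401.9"]]
near_visits = [["303.90", "V60.0"], ["303.90", "V60.0", "V62.0"],
               ["401.9", "250.00"]]

far = cooccurrence(far_visits)    # 60-364 days before the event
near = cooccurrence(near_visits)  # 0-59 days before the event
diff = signed_subtraction(near, far)

for edge, weight in sorted(diff.items(), key=lambda kv: kv[1]):
    print(edge, weight)
```

In this toy example, the edge between the two codes seen only in the distant period gets a negative weight, while pairings seen only near the event get positive weights. A pairing that is equally frequent in both periods nets to zero.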

A cohort of 9,468 U.S. Veterans was identified as having administrative evidence of homelessness and visits in both time periods. They were seen in 222,599 outpatient visits that generated 484,339 ICD-9-CM codes (average of 11.4 (range 1-23) visits and 2.2 (range 1-60) ICD-9-CM codes per visit). Using the three network projection methods, we were able to show distinct differences in the pattern of co-morbidities in the two time periods. In the more distant time period preceding homelessness, the network was dominated by routine health maintenance visits and physical ailment diagnoses. In the 0-59 days immediately preceding the identification of homelessness, alcohol-related diagnoses were noted, along with economic circumstances such as unemployment, legal circumstances, and housing instability.

Network visualizations of large clinical datasets, which are traditionally treated as tabular and difficult to manipulate, reveal rich, previously hidden connections between data variables related to homelessness. A key feature is the ability to visualize how variables change over time and in proximity to the event of interest. These visualizations lend support to cognitive tasks such as exploration of large clinical datasets as a prelude to hypothesis generation.

Publication Date: In Press
Journal Name: Journal of Biomedical Informatics
United States