Expanding Secondary Use of Health Data: An NSF Biomedical Informatics Workshop

Data Acquisition

Capturing data and natural language processing

Data Standards

Limitations of current standards, such as HL7 V3, and the potential utility of emerging standards

Semantic Interoperability

Vocabularies, ontologies, and techniques for semantic level sharing of data

Data Management

Beyond the capture stage, management challenges associated with the scale, heterogeneity, distribution, and fragmentary nature of data

Data Presentation

Visual, adaptive, and optimal presentation of data for enhancing use and understanding

Data Services

Emerging applications for supporting research, quality and safety management, public health studies, etc.

Owing to the intense focus on computerization of health care services and the emphasis on sharing data in digital form, the volume of health data is increasing at a rapid pace. Under the theme of "Expanding Secondary Use of Health Data", the American Medical Informatics Association recently organized two important meetings [1], which focused on expanding the use of health data beyond the documentation of, and applications in, clinical care. The key technical impediments identified and discussed at these meetings concerned the broad-based and repeated collection, storage, aggregation, linkage, transmission, and presentation of health data [2].

Screenshot of CureHunter, a data mining tool supporting automated discovery of associations among key biologically relevant elements. A panel presentation by Judge Schonfeld, CEO of CureHunter (www.curehunter.com), is scheduled to take place at the workshop.

In the long range plan of the National Library of Medicine, published last year [3], similar themes related to barriers to access and to expanding the use of health data in research received attention. A key goal articulated in the report was the development of "integrated biomedical, clinical, and public health information systems that promote scientific discovery and speed the translation of research into practice." The specific recommendations associated with this goal were threefold: 1) improving biomedical knowledge representation; 2) creating next-generation electronic health record standards for patient-centric care, clinical practice, and public health; and 3) linking databases to discover associations between clinical evidence, genetic information, and environmental factors.

We recognize that many barriers to expanded use of health data must be understood in a broad context that includes social, economic, and political factors. In the NSF Biomedical Informatics Workshop, however, the computational and informatics challenges will be the central focus. Our primary aim is to understand the intrinsic nature of these challenges as clearly as possible, to develop a small, prioritized list of key computational and informatics problems that may be solvable within a relatively short time horizon, and then to explore the broader contextual issues in relation to these problems.

[1] Toward a National Framework for the Secondary Use of Health Data. Accessed on February 25, 2007.
[2] Please see the report above for elaboration on how health data evolve through various stages of use.
[3] The NLM long range plan.