Publication details

Enabling Integrative Genomic Analysis of High-Impact Human Diseases through Text Mining
Conference Proceeding
Reference:
J. Dudley, A. J. Butte. Pacific Symposium on Biocomputing, Big Island, Hawaii, 13, 580-591. Published 2008.
Abstract:

characterizations from public genomic data repositories remains a major bottleneck in
efforts to translate genomics experiments to medicine. Through comprehensive,
integrative genomic analysis of all available human disease characterizations we gain
crucial insight into the molecular phenomena underlying pathogenesis as well as intra- and
inter-disease differentiation. Such knowledge is crucial in the development of
improved clinical diagnostics and the identification of molecular targets for novel
therapeutics. In this study we build on our previous work to realize the next important
step in large-scale translational discovery and analysis, which is to automatically identify
those genomic experiments in which a disease state is compared to a normal control state.
We present an automated text mining method that employs Natural Language Processing
(NLP) techniques to automatically identify disease-related experiments in the NCBI Gene
Expression Omnibus (GEO) that include measurements for both disease and normal
control states. In this manner, we find that 62% of disease-related experiments contain
sample subsets that can be automatically identified as normal controls. Furthermore, we
calculate that the identified experiments characterize diseases that contribute to 30% of
all human disease-related mortality in the United States. This work demonstrates that we
now have the necessary tools and methods to initiate large-scale translational
bioinformatics inquiry across the broad spectrum of high-impact human disease.

View the Genomic Nosology for Medicine (GNOMED) project
Information last updated: Mon Feb 25 2008
Stanford School of Medicine