OAMiner: Integrative Knowledge Anchored Hypothesis Discovery

The objective of this proposal, which addresses the ever-increasing need for integrated knowledge discovery in biology and medicine, is to enable the discovery, verification, and validation of hypotheses concerning interrelationships among image-based, phenotypic, and bio-molecular features in heterogeneous data sets by leveraging multiple conceptual knowledge sources, ultimately supporting "high throughput" knowledge-driven translational science. To keep the project scope manageable, Osteoarthritis Initiative (OAI) data sets will be used as the primary, motivating use case for the development and evaluation of the projected research products. This project necessarily involves analysis of initial hypotheses by subject matter experts (SMEs) for system training and verification. However, the ultimate goal of our proposed approach is to minimize the need for human intervention to identify or validate knowledge-anchored hypotheses. To generate such hypotheses, four interrelated knowledge sources are used: 1) full-text published bio-medical literature, accessed by both conventional text mining and NLP analyses of articles found in the MEDLINE database and associated full-text repositories; 2) publicly available ontologies included in the National Library of Medicine's Unified Medical Language System (UMLS); 3) one or more databases containing phenotypic and functional data (e.g., quality of life, psychological, strength, and performance measures); and 4) features derived from computerized image analysis (e.g., cross-sectional area of the quadriceps).

