How can clinicians easily ask complex questions of knowledge and data sources when they may be unfamiliar with how the information in a source is structured, or even with whether the source exists? Consider the domain of HIV drug resistance research, where investigators ask which mutations emerge when a drug is administered and whether the presence of a new mutation is associated with poor drug response. Answering such questions requires not only considerable knowledge of the domain but also an ability to evaluate temporal relations among the data. Answers may depend on results from several resources, such as RxNorm for drug information and the Stanford HIV Drug Resistance Database for clinical cases. Investigators would normally need to identify and select the appropriate sources, specify suitable parameters in each source’s query language, and integrate the answer fragments that are returned into a comprehensive answer.
To make such question answering easier, our group, in collaboration with researchers at PARC and SRI, is developing a system that lets users request clinical information through an English natural language interface. Our approach is domain independent, but our implemented system, called Quadri (Question answering about drug resistance information), supports the types of clinical research questions asked about HIV drug resistance. Quadri uses natural language technology (PARC’s Bridge) to transform such questions into an unambiguous logical form, which is then sent to a theorem prover (SRI’s SNARK) that operates over an axiomatic theory of the subject domain.
Our NIH-supported work is distinguished from prior efforts in intelligent query interfaces by (1) using language analysis that does not prematurely eliminate syntactic ambiguity but rather preserves it in a compact form; (2) using domain knowledge to prune ambiguities found during both the language analysis and the search for answers; (3) generating a logical form for a query that captures logical dependencies and that uses a higher-level vocabulary interpreted by axioms of the domain theory (handling logical constructs beyond the capabilities of SQL); (4) enabling users to extend, refine, and alter their questions using a stream of queries, and to ask follow-up questions that use the results of preceding queries; and (5) giving feedback on a query’s logical interpretation and an explanation of how the answer was obtained.
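To make concrete the kind of temporal reasoning that a question such as "which mutations emerge when a drug is administered" involves, the following is a minimal Python sketch. It is not Quadri's implementation and does not use Bridge, SNARK, RxNorm, or the Stanford database; the record types, the function `emergent_mutations`, the definition of "emerge" (absent at the last test before treatment start, present in a test taken during treatment), and all data values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Treatment:
    """Hypothetical record: one drug given over a closed date interval."""
    drug: str
    start: date
    stop: date


@dataclass(frozen=True)
class GenotypeTest:
    """Hypothetical record: mutations observed in one test on one date."""
    taken: date
    mutations: frozenset


def emergent_mutations(treatment, tests):
    """Return mutations absent at baseline but seen during treatment.

    Assumed definition: the baseline is the latest test taken on or
    before the treatment start; a mutation "emerges" if it is missing
    from the baseline and present in any test taken after the start
    and on or before the stop date.
    """
    before = [t for t in tests if t.taken <= treatment.start]
    during = [t for t in tests if treatment.start < t.taken <= treatment.stop]
    if not before or not during:
        # Without both a baseline and an on-treatment test,
        # emergence cannot be established under this definition.
        return frozenset()
    baseline = max(before, key=lambda t: t.taken).mutations
    seen = frozenset().union(*(t.mutations for t in during))
    return seen - baseline


# Made-up toy data; the drug and mutation names are placeholders.
treatment = Treatment("DrugA", date(2000, 2, 1), date(2000, 8, 1))
tests = [
    GenotypeTest(date(2000, 1, 1), frozenset({"MUT1"})),
    GenotypeTest(date(2000, 5, 1), frozenset({"MUT1", "MUT2"})),
    GenotypeTest(date(2000, 10, 1), frozenset({"MUT1", "MUT2", "MUT3"})),
]
emerged = emergent_mutations(treatment, tests)
print(sorted(emerged))
```

Even in this toy form, the answer depends on comparing time intervals across two kinds of records, which is the sort of dependency that a hand-written query against each source would have to encode explicitly.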