Is repeat blood-testing necessary? AI could help decide

You want and expect your doctor to be thorough, but over-testing raises concerns of its own. Can a doctor be too thorough? Jonathan Chen, MD, PhD, Assistant Professor of Medicine, believes that the answer is yes, especially when it comes to diagnostic blood testing. Blood tests are an important diagnostic tool, but repeated tests often yield diminishing returns. They frequently show no change, and they are uncomfortable for the patient. Couple this with concerns about rising healthcare expenses, and it is clear that a solution is needed.

Machine learning algorithms can synthesize volumes of electronic medical record data to systematically identify low-yield tests, quantifying the predictability of results to encourage high-value care. Chen and a team of researchers and physicians have taken steps to address the problem by creating an algorithm that can predict whether a given blood test will come back "normal." Their work was recently published in JAMA Network Open. The algorithm can tell doctors whether a repeat test is likely to produce a result that differs from the original test. These algorithms can often predict a greater than 95% chance that a test will yield normal results, but it is a separate issue for laboratorians, clinicians, and patients to decide in individual cases whether it is *worth* checking for something when you are already 95% sure of the answer.
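As a rough illustration of the decision described above, the sketch below flags tests whose predicted probability of a normal result exceeds a cutoff. It is not the published algorithm: the function name `flag_low_yield`, the 0.95 cutoff (borrowed from the figure quoted above), and the example probabilities are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical cutoff taken from the ">95%" figure in the text;
# not a recommended clinical value.
NORMAL_PROBABILITY_THRESHOLD = 0.95


def flag_low_yield(predictions: List[Tuple[str, float]],
                   threshold: float = NORMAL_PROBABILITY_THRESHOLD) -> List[str]:
    """Return the names of tests whose predicted probability of a
    normal result is strictly greater than the threshold."""
    return [name for name, p_normal in predictions if p_normal > threshold]


# Made-up probabilities for illustration only; not results from the study.
example_predictions = [
    ("sodium", 0.98),
    ("hemoglobin", 0.91),
    ("platelet count", 0.97),
    ("troponin", 0.60),
]

low_yield = flag_low_yield(example_predictions)
print(low_yield)
```

A flagged test is only a candidate for skipping. Whether to order it anyway remains a judgment for clinicians and patients, and the cutoff would need to be chosen with that in mind.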

Read the Scope Article

Read the JAMA Article

Dr. Zihuai He receives R01 to develop statistical and computational methods for integrative analysis of Alzheimer's disease genetics

Genetic factors play an important role in the development of Alzheimer's disease. While much progress has been made in Alzheimer’s disease genetics, the role of noncoding variants is largely unknown. The noncoding genome covers ~98% of the human genome and includes elements that regulate when, where, and to what degree protein-coding genes (e.g. APOE) are transcribed. The NIH research grant will support the research group led by Dr. Zihuai He to develop innovative methodologies for the analysis of noncoding variants, combining whole-genome sequencing, epigenetic technologies and multi-layered phenotypic data such as imaging and biomarkers. The proposed methods will be applied to a total of roughly 20,000 whole genomes unifying the Alzheimer's Disease Neuroimaging Initiative (ADNI), the Alzheimer's Disease Sequencing Project (ADSP), the Religious Orders Study and Memory and Aging Project (ROSMAP) and a newly established cohort, the Stanford Extreme Phenotypes in Alzheimer's Disease (StEP AD). The application of the proposed methods will significantly improve our understanding of the genetic architecture of Alzheimer's disease and, critically, provide a set of well-defined, novel targets for the development of genomic-driven medicine.

Multi-omics data fusion with AMARETTO fosters a Community of Cancer Informatics Researchers

The Gevaert lab developed the initial AMARETTO algorithm in 2013 and has since expanded the team through collaborations with Dr. Nathalie Pochet at Brigham and Women's Hospital/Harvard University and Dr. Mikel Hernaez at the University of Illinois Urbana-Champaign. Together, these teams have turned a tool that required data scientists to operate into a more user-friendly tool in the GenePattern environment, supported by the Informatics Technology for Cancer Research program at the National Cancer Institute. In addition, AMARETTO has been expanded to link patient data to model systems for improved drug target discovery, to link with perturbation data from the LINCS project to validate predictions, and to integrate images and image phenotypes, including radiography and histopathology imaging data. Read more about how AMARETTO was expanded.


The CEDAR project has been chosen as a key participant in the GO FAIR Metadata for Machines (M4M) workshops and the FAIR Funder pilot program. The GO FAIR initiative, an international program to advance FAIR (findable, accessible, interoperable, reusable) data and services, launched the M4M workshop series to stimulate the creation and re-use of FAIR metadata standards and machine-ready metadata templates. The M4M workshops have adopted CEDAR to provide the metadata capabilities for the initial workshops.

The M4M workshops are agile events that bring together domain experts, metadata specialists, and technical developers to define metadata elements and standards, create machine-actionable templates for collecting metadata according to those standards, and register the templates for open access, discovery, and re-use. CEDAR has worked with the Leiden-based GO FAIR team for over a year to demonstrate and evaluate the CEDAR technologies for these tasks, and participated in two recent M4M workshops. At the second of these, two national science funders, the Health Research Board of Ireland (HRB) and the Netherlands Organisation for Health Research and Development (ZonMw), took the first steps toward a complete life cycle of FAIR metadata for their communities by creating CEDAR templates to capture metadata of interest to research funding agencies.

In the coming months, CEDAR expects to participate in several more M4M workshops and to advance the FAIR Funder pilot program through many more of the 7 stages shown in the diagram. CEDAR initially provided services to define metadata elements (1), create machine-actionable templates (2), and register the templates in the CEDAR repository for later re-use (3). CEDAR enhancements already under way will provide advanced search and open publication for the CEDAR metadata resources that have been developed, as well as integrated metadata authoring in coordination with the DSL Data Wizard (4). Also important to GO FAIR: CEDAR already provides rigorous semantic capabilities that let CEDAR metadata be published as JSON-LD or simple RDF triples, two highly interoperable formats for sharing research metadata.
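To give a sense of how the two formats relate, here is a minimal sketch of a metadata record written as JSON-LD and the triples it corresponds to. The record, its field names, and the example.org identifier are hypothetical; the property IRIs are schema.org terms chosen for illustration and are not CEDAR's actual template schema. The helper `to_triples` is a deliberate simplification, not a full JSON-LD processor.

```python
import json

# Hypothetical metadata record; the field names and IRIs are illustrative
# (schema.org terms) and are not CEDAR's actual template schema.
record = {
    "@context": {
        "name": "http://schema.org/name",
        "funder": "http://schema.org/funder",
    },
    "@id": "http://example.org/dataset/1",
    "name": "Example dataset",
    "funder": "Example funding agency",
}


def to_triples(doc: dict) -> list:
    """Expand a flat JSON-LD-style record into (subject, predicate, object) triples.

    Simplification: handles only top-level string values whose keys are
    mapped to IRIs in @context; real JSON-LD processing covers far more.
    """
    context = doc["@context"]
    subject = doc["@id"]
    return [(subject, context[key], value)
            for key, value in doc.items()
            if key in context]


json_ld_text = json.dumps(record, indent=2)  # JSON-LD serialization of the record
triples = to_triples(record)
for s, p, o in triples:
    print(f'<{s}> <{p}> "{o}" .')  # one N-Triples-style statement per line
```

The same facts appear in both forms: JSON-LD keeps a familiar JSON shape for authoring tools, while each triple states one fact that can be merged with triples from other sources.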

The CEDAR technologies adopted by GO FAIR's M4M workshops and the FAIR Funder pilot program are helping the GO FAIR community move metadata management from ad-hoc, individually developed solutions to a more rigorous, structured, and user-friendly approach to metadata.


Explore BMIR

At BMIR, we develop computational methods for biomedical discovery that influence medical decisions.

Learn more about the cutting-edge ways we are advancing technology and biomedicine to improve human health.

Our state-of-the-art research advances patient care by improving semantic technology, biostatistics, and the modeling of biomedical systems. Read more about our research labs.

Join us for our weekly research talks featuring world-renowned scientists, faculty, staff, and students.

BMIR Colloquia and Research in Progress talks occur on Thursdays from 12-1 PM during the academic year in Medical School Office Building room X275, 1265 Welch Road, Stanford, CA. See schedule.  

Notable Projects and Services

CEDAR is making data submission smarter and faster, so that biomedical researchers and analysts can create and use better metadata.


Diagnostics - Infectious Diseases

EteRNA, an online puzzle, enlists video gamers to try to design a sensor module that could make diagnosing TB as easy as taking a home pregnancy test.



The NCBO manages a repository of all the world’s publicly available biomedical ontologies and terminologies—now more than 390 in number.

Green Button

Green button: the promise of personalizing medical practice guidelines in real time

Protégé is the most widely used ontology-development system in the world.


CoINcIDE is a novel methodological framework for discovering patient subtypes across multiple datasets that requires no between-dataset transformations.
