Mammography is the best test we have for the early detection of breast cancer, but it is not perfect, largely because its performance is degraded by significant variability in practice among radiologists. We set out to develop a probabilistic expert system that would raise the performance of all radiologists to the level of expert knowledge. This expert system has been shown to discriminate effectively between benign and malignant conditions based on individual patient risk factors and mammographic findings. In this experiment, we test whether the expert system can also generate well-calibrated probability estimates of malignancy from mammographic findings for use in decision-making.