by Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sandberg, Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. Lungren, Andrew Y. Ng

License: Research Only
What is CheXpert?

CheXpert is a large dataset of chest X-rays and a competition for automated chest X-ray interpretation, featuring uncertainty labels and radiologist-labeled reference standard evaluation sets.

Why CheXpert?

Chest radiography is the most common imaging examination globally, and is critical for the screening, diagnosis, and management of many life-threatening diseases. Automated chest radiograph interpretation at the level of practicing radiologists could provide substantial benefit in many medical settings, from improved workflow prioritization and clinical decision support to large-scale screening and global population health initiatives. For progress in both the development and validation of automated algorithms, we realized there was a need for a labeled dataset that (1) was large, (2) had strong reference standards, and (3) provided expert human performance metrics for comparison.

How did we collect and label CheXpert?

CheXpert is a large public dataset for chest radiograph interpretation, consisting of 224,316 chest radiographs of 65,240 patients. We retrospectively collected the chest radiographic examinations from Stanford Hospital, performed between October 2002 and July 2017 in both inpatient and outpatient centers, along with their associated radiology reports.

Label Extraction from Radiology Reports

Each report was labeled for the presence of 14 observations as positive, negative, or uncertain. We decided on the 14 observations based on their prevalence in the reports and their clinical relevance, conforming to the Fleischner Society's recommended glossary whenever applicable. We then developed an automated rule-based labeler that extracts observations from the free-text radiology reports to be used as structured labels for the images. Our labeler operates in three distinct stages: mention extraction, mention classification, and mention aggregation.
In the mention extraction stage, the labeler searches for mentions of the observations in the Impression section of each radiology report, which summarizes the key findings of the radiographic study. In the mention classification stage, each extracted mention is classified as negative, uncertain, or positive. In the mention aggregation stage, the classifications of the individual mentions are combined to arrive at a final label for each of the 14 observations.
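The three stages above can be sketched as a minimal rule-based pipeline. This is an illustrative toy, not the actual CheXpert labeler: the observation vocabulary, the negation and uncertainty cue phrases, and the positive-over-uncertain-over-negative aggregation rule are all simplified assumptions for the sake of the example; the real labeler relies on a much larger hand-curated rule set.

```python
import re

# Toy vocabulary: two of the observations, each with example surface phrases.
# (Hypothetical subset chosen for illustration only.)
OBSERVATIONS = {
    "Cardiomegaly": ["cardiomegaly", "enlarged heart"],
    "Pleural Effusion": ["pleural effusion", "effusion"],
}
# Toy cue phrases for classifying a mention (assumed, not the real rules).
NEGATION_CUES = ["no ", "without ", "negative for "]
UNCERTAINTY_CUES = ["may ", "possible ", "cannot exclude", "suggestive of"]


def extract_mentions(impression):
    """Stage 1: find observation mentions in the Impression section."""
    mentions = []
    for sentence in re.split(r"[.;]\s*", impression.lower()):
        for obs, phrases in OBSERVATIONS.items():
            if any(p in sentence for p in phrases):
                mentions.append((obs, sentence))
    return mentions


def classify_mention(sentence):
    """Stage 2: classify one mention as negative, uncertain, or positive."""
    if any(cue in sentence for cue in NEGATION_CUES):
        return "negative"
    if any(cue in sentence for cue in UNCERTAINTY_CUES):
        return "uncertain"
    return "positive"


def aggregate(mentions):
    """Stage 3: combine per-mention classes into one label per observation.

    Here positive overrides uncertain, which overrides negative (an assumed
    precedence rule for this sketch).
    """
    priority = {"negative": 0, "uncertain": 1, "positive": 2}
    labels = {}
    for obs, sentence in mentions:
        cls = classify_mention(sentence)
        if obs not in labels or priority[cls] > priority[labels[obs]]:
            labels[obs] = cls
    return labels


report = "Possible cardiomegaly. No pleural effusion."
print(aggregate(extract_mentions(report)))
# → {'Cardiomegaly': 'uncertain', 'Pleural Effusion': 'negative'}
```

Running the three stages over an example impression yields one structured label per mentioned observation; observations never mentioned simply receive no label, mirroring the positive/negative/uncertain scheme described above.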