TY - JOUR
T1 - DeepHeme, a high-performance, generalizable deep ensemble for bone marrow morphometry and hematologic diagnosis
AU - Sun, Shenghuan
AU - Yin, Zhanghan
AU - Van Cleave, Jacob G.
AU - Wang, Linlin
AU - Fried, Brenda
AU - Bilal, Khawaja H.
AU - Lucas, Fabienne
AU - Isgor, Irem S.
AU - Webb, Dylan C.
AU - Singi, Siddharth
AU - Brown, Laura
AU - Shouval, Roni
AU - Lin, Jeff
AU - Yan, Ethan S.
AU - Spector, Jacob D.
AU - Ardon, Orly
AU - Boiocchi, Leonardo
AU - Sardana, Rohan
AU - Baik, Jeeyeon
AU - Zhu, Menglei
AU - Syed, Aijazuddin
AU - Yabe, Mariko
AU - Lu, Chuanyi M.
AU - Roshal, Mikhail
AU - Vanderbilt, Chad
AU - Goldgof, Dmitry B.
AU - Dogan, Ahmet
AU - Prakash, Sonam
AU - Carmichael, Iain
AU - Butte, Atul J.
AU - Goldgof, Gregory M.
N1 - Publisher Copyright: Copyright © 2025 The Authors, some rights reserved.
PY - 2025/6/11
Y1 - 2025/6/11
N2 - Cytomorphological analysis of the bone marrow aspirate (BMA) is pivotal for the diagnostic workup of a broad range of hematological disorders. However, this skill is error prone, highly complex, and time consuming. Deep learning–based models for the automatic classification of bone marrow cell morphology demonstrate the potential to improve diagnostic efficiency and accuracy. However, existing deep learning approaches in this field fall short of expert-level performance and lack generalizability beyond a single dataset. Working with multiple hematopathologists, we curated a dataset from the University of California, San Francisco, which included a training set of 30,394 images from 40 patients with morphologically normal marrows and a test set of 8507 images from 10 different patients, all derived from 400×-equivalent whole-slide images (WSIs). We then developed DeepHeme, a snapshot ensemble deep learning classifier, which outperformed previous models in accuracy while expanding the total number of differentiable cell classes. We externally validated DeepHeme using an independent dataset from the Memorial Sloan Kettering Cancer Center, which included 2694 images from 10 morphologically normal patients and 11,076 images from 655 patients with normal or diseased marrow, scanned using a different WSI system, demonstrating robust generalizability. At the level of individual cell classifications, we systematically compared DeepHeme’s diagnostic performance with that of three medical experts from different academic hospitals, demonstrating that DeepHeme achieved accuracy comparable to, or exceeding, that of human experts. Accurate and generalizable cell classification represents a step toward automated analysis of hematopathology slides and the development of quantitative, morphology-based, predictive markers.
UR - http://www.scopus.com/inward/record.url?scp=105008146719&partnerID=8YFLogxK
U2 - 10.1126/scitranslmed.adq2162
DO - 10.1126/scitranslmed.adq2162
M3 - Article
C2 - 40498857
AN - SCOPUS:105008146719
SN - 1946-6234
VL - 17
JO - Science Translational Medicine
JF - Science Translational Medicine
IS - 802
M1 - eadq2162
ER -