Longitudinal clinical data improve survival prediction after hematopoietic cell transplantation using machine learning

Yiwang Zhou, Jesse Smith, Dinesh Keerthi, Cai Li, Yilun Sun, Suraj Sarvode Mothi, David C. Shyr, Barbara Spitzer, Andrew Harris, Avijit Chatterjee, Subrata Chatterjee, Roni Shouval, Swati Naik, Alice Bertaina, Jaap Jan Boelens, Brandon M. Triplett, Li Tang, Akshay Sharma

Research output: Contribution to journal › Article › peer-review


Serial prognostic evaluation after allogeneic hematopoietic cell transplantation (allo-HCT) might help identify patients at high risk of lethal organ dysfunction. Current prediction algorithms based on models that do not incorporate changes to patients’ clinical condition after allo-HCT have limited predictive ability. We developed and validated a robust risk-prediction algorithm to predict short- and long-term survival after allo-HCT in pediatric patients that includes baseline biological variables and changes in the patients’ clinical status after allo-HCT. The model was developed using clinical data from children and young adults treated at a single academic quaternary-care referral center. The model was created using a randomly split training data set (70% of the cohort), internally validated (remaining 30% of the cohort) and then externally validated on patient data from another tertiary-care referral center. Repeated clinical measurements performed from 30 days before allo-HCT to 30 days afterwards were extracted from the electronic medical record and incorporated into the model to predict survival at 100 days, 1 year, and 2 years after allo-HCT. Naïve-Bayes machine learning models incorporating longitudinal data were significantly better than models constructed from baseline variables alone at predicting whether patients would be alive or deceased at the given time points. This proof-of-concept study demonstrates that unlike traditional prognostic tools that use fixed variables for risk assessment, incorporating dynamic variability using clinical and laboratory data improves the prediction of mortality in patients undergoing allo-HCT.
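As a rough illustration of two steps the abstract describes (the random 70%/30% split and the use of repeated measurements from 30 days before to 30 days after allo-HCT), the sketch below uses only the Python standard library. The abstract does not say how the longitudinal measurements were summarized before entering the model, so the mean/last/slope summary, the function names, and the creatinine values are all hypothetical and are not the authors' method.

```python
import random
from statistics import mean


def split_70_30(patient_ids, seed=0):
    """Randomly split patient IDs into training (70%) and internal validation (30%)."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    cut = round(0.7 * len(ids))
    return ids[:cut], ids[cut:]


def window_features(measurements, start=-30, end=30):
    """Summarize (day, value) pairs measured within [start, end] days of allo-HCT (day 0).

    Assumed summary (not from the paper): mean, last value, least-squares slope.
    Returns None when no measurement falls inside the window.
    """
    pts = sorted((d, v) for d, v in measurements if start <= d <= end)
    if not pts:
        return None
    days = [d for d, _ in pts]
    vals = [v for _, v in pts]
    d_bar, v_bar = mean(days), mean(vals)
    denom = sum((d - d_bar) ** 2 for d in days)
    # Slope is 0.0 when all measurements share the same day (no trend estimable).
    slope = sum((d - d_bar) * (v - v_bar) for d, v in pts) / denom if denom else 0.0
    return {"mean": v_bar, "last": vals[-1], "slope": slope}


# Hypothetical repeated creatinine values as (day relative to allo-HCT, mg/dL).
creatinine = [(-45, 0.5), (-30, 0.5), (0, 0.7), (30, 0.9)]
features = window_features(creatinine)  # the day -45 value lies outside the window

# Hypothetical cohort of 10 patient IDs.
train_ids, valid_ids = split_70_30(range(10))
```

Features of this kind, computed per patient and per laboratory or clinical variable, could then be appended to the baseline variables before fitting a classifier on the training split and scoring it on the validation split.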

Original language: English
Pages (from-to): 686-698
Number of pages: 13
Journal: Blood Advances
Issue number: 3
State: Published - 13 Feb 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2024 by The American Society of Hematology.


Y.Z., J.S., D.C.S., C.L., Y.S., S.S.M., S.N., B.M.T., L.T., and A.S. are supported for their work at St Jude Children’s Research Hospital by the American Lebanese Syrian Associated Charities and National Institutes of Health/National Cancer Institute grant (P30 CA021765). R.S. was supported by a Memorial Sloan Kettering Cancer Center Core grant (P30 CA008748) from the National Institutes of Health/National Cancer Institute. compensation; received research funding from CRISPR Therapeutics and honoraria from Vindico Medical Education; and is also the St Jude Children’s Research Hospital site principal investigator for clinical trials of genome editing for sickle cell disease sponsored by Vertex Pharmaceuticals/CRISPR Therapeutics (NCT03745287), Novartis Pharmaceuticals (NCT04443907), and Beam Therapeutics (NCT05456880) (the industry sponsors provide funding for the clinical trials, which includes salary support paid to the investigator’s institution and A.S. has no direct financial interest in these therapies). The remaining authors declare no competing financial interests.

Funders and funder numbers
Beam Therapeutics: NCT05456880
Vindico Medical Education: NCT03745287
National Institutes of Health
National Cancer Institute: P30 CA021765, P30 CA008748
St. Jude Children's Research Hospital
Novartis Pharmaceuticals Corporation: NCT04443907
American Lebanese Syrian Associated Charities

