Insights from a machine learning model for predicting the hospital Length of Stay (LOS) at the time of admission

Lior Turgeman, Jerrold H. May, Roberta Sciulli

Research output: Contribution to journal › Article › peer-review

82 Scopus citations


A model that accurately predicts, at the time of admission, the Length of Stay (LOS) for hospitalized patients could be an effective tool for healthcare providers. Such a model could enable early interventions to prevent complications and allow more efficient utilization of manpower and facilities in hospitals. In this study, we apply a regression tree (Cubist) model for predicting the LOS, based on static inputs, that is, values that are known at the time of admission and that do not change during the patient's hospital stay. The model was trained and validated on de-identified administrative data from the Veterans Health Administration (VHA) hospitals in Pittsburgh, PA. We chose to use a Cubist model because it produced more accurate predictions than did alternative techniques. In addition, tree models enable us to examine the rules learned from the data, in order to better understand the factors that are most correlated with hospital LOS. Cubist recursively partitions the data set and estimates a linear regression for each partition. Because the error level differs across partitions, it is possible to deduce the characteristics of patients whose LOS can be accurately predicted at admission, and the characteristics of patients for whom the LOS estimate at that point in time is more uncertain. For example, our model indicates that the prediction error is greater for patients who had more admissions in the recent past, and for those who had longer previous hospital stays. Our results suggest that mapping the cases into a higher-dimensional space with a Radial Basis Function (RBF) kernel helps a Support Vector Machine (SVM) separate them by their level of Cubist error.
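
The idea of partitioning the data, fitting a linear regression in each partition, and then comparing the error level across partitions can be illustrated with a minimal sketch. This is not Cubist and not the authors' model: it is a toy tree with a single split on one feature, written with the Python standard library only. The data are synthetic, and the feature name `prior_admissions`, the coefficients, and the noise levels are all assumptions chosen so that patients with more prior admissions are harder to predict.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x on one feature; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def abs_errors(xs, ys, model):
    """Absolute residuals of a fitted line on the given points."""
    a, b = model
    return [abs(y - (a + b * x)) for x, y in zip(xs, ys)]


def fit_one_split_model_tree(xs, ys, min_leaf=10):
    """Pick the threshold whose two leaf regressions give the lowest
    total absolute error, and report the error level of each leaf."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # a threshold is only possible between distinct values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        leaves, total = [], 0.0
        for part in (pairs[:i], pairs[i:]):
            px = [p[0] for p in part]
            py = [p[1] for p in part]
            model = fit_line(px, py)
            errs = abs_errors(px, py, model)
            total += sum(errs)
            leaves.append({"model": model, "mae": mean(errs), "n": len(part)})
        if best is None or total < best["total_error"]:
            best = {"threshold": threshold, "left": leaves[0],
                    "right": leaves[1], "total_error": total}
    return best


# Synthetic data (assumption): LOS in days versus number of prior admissions,
# with a different trend and much larger noise for frequently admitted patients.
rng = random.Random(0)
xs, ys = [], []
for prior_admissions in range(10):
    for _ in range(20):
        if prior_admissions < 5:
            los = 2.0 + 0.3 * prior_admissions + rng.gauss(0, 0.3)
        else:
            los = 4.0 + 1.0 * prior_admissions + rng.gauss(0, 2.5)
        xs.append(prior_admissions)
        ys.append(los)

tree = fit_one_split_model_tree(xs, ys)
print("split at prior_admissions <=", tree["threshold"])
for side in ("left", "right"):
    leaf = tree[side]
    print(side, "n =", leaf["n"], "mean abs error =", round(leaf["mae"], 2))
```

Comparing the per-leaf mean absolute error shows which partition the model predicts well and which it does not. A real Cubist model builds many nested rules over many inputs and adds further refinements, so this sketch only conveys the partition-then-regress structure.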

Original language: English
Pages (from-to): 376-385
Number of pages: 10
Journal: Expert Systems with Applications
State: Published - 15 Jul 2017
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2017 Elsevier Ltd


Keywords

  • Continuous association rule mining algorithm (CARMA)
  • Cubist decision tree
  • Decision function
  • Error distribution
  • Length of Stay (LOS)
  • Support vector machine (SVM)

