Predicting autism traits from baby wellness records: A machine learning approach

Ayelet Ben-Sasson, Joshua Guedalia, Keren Ilan, Meirav Shaham, Galit Shefer, Roe Cohen, Yuval Tamir, Lidia V. Gabis

Research output: Contribution to journal › Article › peer-review


Early detection of autism spectrum condition is crucial for children to maximally benefit from early intervention. The study examined a machine learning model predicting the increased likelihood for autism from wellness records from 0 to 24 months. The study included 591,989 non-autistic and 12,846 autistic children. A gradient boosting model was evaluated with threefold cross-validation, and the SHapley Additive exPlanations (SHAP) tool quantified feature importance. The model had an average area under the curve of 0.81 (SD = 0.004). The high-likelihood group detected by the model had an autism spectrum condition incidence rate of 0.073, 3.42-fold higher than in the entire cohort (0.02). Sex-specific models had higher specificity (0.81 boys and 0.79 girls) than sensitivity (0.64 boys and 0.66 girls). The common predictors were more parental concerns, older mothers, never nursing, lower initial and higher last weight percentiles, and several delayed milestones. SHAP results show common, important predictors in the full sample and in the separate boys’ and girls’ models. These included birth, growth, familial, and postnatal parameters, as well as delayed language, fine motor, and social milestones from 12 to 24 months. Machine learning algorithms can help detect increased autism signs by relying on the multidimensional data routinely recorded during the first 2 years.

Lay abstract: Timely identification of autism spectrum conditions is necessary for children to receive the most benefit from early interventions. Emerging technological advancements provide avenues for detecting subtle, early indicators of autism from routinely collected health information. This study tested a model that provides a likelihood score for autism diagnosis from baby wellness visit records collected during the first 2 years of life. It included records of 591,989 non-autistic children and 12,846 children with autism. The model identified two-thirds of the autism spectrum condition group (boys 63% and girls 66%). Sex-specific models had several predictive features in common. These included language development, fine motor skills, and social milestones from visits at 12–24 months, mother’s age, and lower initial growth but higher last growth measurements. Parental concerns about development or hearing impairment were other predictors. The models differed in other growth measurements and birth parameters. These models can support the detection of early signs of autism in girls and boys by using information routinely recorded during the first 2 years of life.
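
The reported cohort incidence and fold increase can be checked with a few lines of arithmetic. The sketch below is not the authors' code. It uses only the counts and rounded rates quoted in the abstract, and the function and constant names are hypothetical, chosen for illustration.

```python
# Worked arithmetic for the incidence figures quoted in the abstract.
# All names are illustrative; the inputs are the published (rounded) values.

N_NON_AUTISTIC = 591_989      # non-autistic children in the cohort
N_AUTISTIC = 12_846           # autistic children in the cohort
HIGH_LIKELIHOOD_RATE = 0.073  # incidence in the model's high-likelihood group


def cohort_incidence(n_cases: int, n_non_cases: int) -> float:
    """Share of cases among all children in the cohort."""
    return n_cases / (n_cases + n_non_cases)


def fold_increase(group_rate: float, base_rate: float) -> float:
    """How many times higher the group rate is than the base rate."""
    return group_rate / base_rate


base_rate = cohort_incidence(N_AUTISTIC, N_NON_AUTISTIC)
fold = fold_increase(HIGH_LIKELIHOOD_RATE, base_rate)

print(f"cohort incidence: {base_rate:.4f}")
print(f"fold increase:    {fold:.2f}")
```

With these inputs the cohort incidence comes out near 0.021, which matches the rounded 0.02 in the abstract, and the fold increase comes out near 3.44. The small gap from the reported 3.42 is plausibly due to rounding of the 0.073 rate, although the abstract does not say so.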

Original language: English
Early online date: 29 May 2024
State: E-pub ahead of print - 29 May 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s) 2024.


  • autism spectrum conditions
  • developmental milestones
  • electronic health records
  • machine learning
  • screening

