
The Forest or the Trees? Tackling Simpson's Paradox with Classification Trees

  • Galit Shmueli
  • Inbal Yahav

Research output: Contribution to journal, Article, peer-reviewed

21 Scopus citations

Abstract

Studying causal effects is central to research in operations management in manufacturing and services, from evaluating prevention procedures, to effects of policies and new operational technologies and practices. The growing availability of micro-level data creates challenges for researchers and decision makers in terms of choosing the right level of data aggregation for inference and decisions. Simpson's paradox describes the case where the direction of a causal effect is reversed in the aggregated data compared to the disaggregated data. Detecting whether Simpson's paradox occurs in a dataset used for decision making is therefore critical. This study introduces the use of Classification and Regression Trees for automated detection of potential Simpson's paradoxes in data with few or many potential confounding variables, and even with large samples (big data). Our approach relies on the tree structure and the location of the cause vs. the confounders in the tree. We discuss theoretical and computational aspects of the approach and illustrate it using several real applications in e-governance and healthcare.
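The reversal described in the abstract can be made concrete with a small check of the definition itself. The sketch below is not the authors' tree-based method: it only compares the direction of an effect in the aggregated data with its direction inside each stratum of a single candidate confounder. The function names, the record layout, and the counts (patterned after the widely cited kidney-stone treatment example) are illustrative assumptions, not data from the paper.

```python
from collections import defaultdict


def success_rate(rows, treatment):
    """Share of successes among rows that received the given treatment."""
    # Assumes at least one row with this treatment; otherwise this divides by zero.
    outcomes = [r["success"] for r in rows if r["treatment"] == treatment]
    return sum(outcomes) / len(outcomes)


def effect_sign(rows):
    """Sign of rate(A) - rate(B): +1, -1, or 0."""
    diff = success_rate(rows, "A") - success_rate(rows, "B")
    return (diff > 0) - (diff < 0)


def simpson_reversal(rows, confounder):
    """True if all strata share one effect direction that is opposite to the aggregate."""
    strata = defaultdict(list)
    for r in rows:
        strata[r[confounder]].append(r)
    aggregate = effect_sign(rows)
    stratum_signs = {effect_sign(s) for s in strata.values()}
    return aggregate != 0 and stratum_signs == {-aggregate}


def expand(treatment, size, successes, total):
    """Turn a (successes, total) cell into individual illustrative records."""
    return [
        {"treatment": treatment, "size": size, "success": i < successes}
        for i in range(total)
    ]


# Illustrative counts only (hypothetical record layout, not from the paper).
rows = (
    expand("A", "small", 81, 87)
    + expand("A", "large", 192, 263)
    + expand("B", "small", 234, 270)
    + expand("B", "large", 55, 80)
)

print("aggregate sign:", effect_sign(rows))
print("reversal across 'size':", simpson_reversal(rows, "size"))
```

With many potential confounders, checking every stratification this way quickly becomes impractical, which is the setting the abstract's tree-based approach is aimed at.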

Original language: English
Pages (from-to): 696-716
Number of pages: 21
Journal: Production and Operations Management
Volume: 27
Issue number: 4
State: Published - Apr 2018

Bibliographical note

Publisher Copyright:
© 2017 Production and Operations Management Society

Funding

Galit Shmueli was partially funded by Grant 105-2410-H-007-034-MY3 from the Ministry of Science and Technology, Taiwan.

Funder: Ministry of Science and Technology, Taiwan
Funder number: 105-2410-H-007-034-MY3

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 9 - Industry, Innovation, and Infrastructure

    Keywords

    • Simpson's paradox
    • causal effect
    • classification and regression trees
    • data aggregation
    • decision making
