Anticipatory troubleshooting

Netanel Hasidi, Meir Kalech

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


Troubleshooting is the process of diagnosing and repairing a system that is behaving abnormally. It involves performing various diagnostic and repair actions. Performing these actions may incur costs, and traditional troubleshooting algorithms aim to minimize the costs incurred until the system is fixed. Prognosis deals with predicting future failures. We propose to incorporate prognosis and diagnosis techniques to solve troubleshooting problems. This integration enables (1) better fault isolation and (2) more intelligent decision making with respect to the repair actions to employ to minimize troubleshooting costs over time. In particular, we consider an anticipatory troubleshooting challenge in which we aim to minimize the costs incurred to fix the system over time, while reasoning about both current and future failures. Anticipatory troubleshooting raises two main dilemmas: the fix–replace dilemma and the replace-healthy dilemma. The fix–replace dilemma is the question of how to repair a faulty component: fixing it or replacing it with a new one. The replace-healthy dilemma is the question of whether a healthy component should be replaced with a new one in order to prevent it from failing in the future. We propose to solve these dilemmas by modeling them as a Markov decision problem and reasoning about future failures using techniques from the survival analysis literature. The resulting algorithm was evaluated experimentally, showing that the proposed anticipatory troubleshooting algorithms yield lower overall costs compared to troubleshooting algorithms that do not reason about future faults.
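To make the fix–replace dilemma concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a Weibull survival curve, a fix that leaves the component at its current age, a replacement that resets the age to zero, and a single-step lookahead in place of the full Markov decision problem described above. All function names, parameters, and cost values are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t/scale)^shape): probability of surviving past age t (assumed model)."""
    return math.exp(-((t / scale) ** shape))


def failure_prob(age, horizon, shape, scale):
    """P(component fails within `horizon` | it has survived to `age`)."""
    return 1.0 - weibull_survival(age + horizon, shape, scale) / weibull_survival(age, shape, scale)


def choose_repair(age, horizon, fix_cost, replace_cost, failure_cost, shape, scale):
    """One-step lookahead: action cost now plus expected cost of a future failure."""
    # Assumption: fixing keeps the component's age; replacing resets it to 0.
    fix_total = fix_cost + failure_prob(age, horizon, shape, scale) * failure_cost
    replace_total = replace_cost + failure_prob(0.0, horizon, shape, scale) * failure_cost
    if fix_total <= replace_total:
        return "fix", fix_total
    return "replace", replace_total


if __name__ == "__main__":
    # Hypothetical costs: an old component favors replacement, a young one favors fixing.
    print(choose_repair(age=8, horizon=5, fix_cost=1, replace_cost=3,
                        failure_cost=10, shape=2, scale=10))
    print(choose_repair(age=1, horizon=5, fix_cost=1, replace_cost=3,
                        failure_cost=10, shape=2, scale=10))
```

Under these assumptions, a shape parameter above 1 models wear-out, so the cheaper fix becomes less attractive as the component ages. With a shape of 1 the failure probability does not depend on age, and the cheaper action always wins.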

Original language: English
Article number: 995
Pages (from-to): 1-22
Number of pages: 22
Journal: Applied Sciences (Switzerland)
Issue number: 3
State: Published - 1 Feb 2021
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.


  • Diagnosis
  • Survival analysis
  • Troubleshooting

