A Game-Theoretic Foundation for the Maximum Software Resilience against Dense Errors

Chung Hao Huang, Doron A. Peled, Sven Schewe, Farn Wang

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Safety-critical systems need to maintain their functionality in the presence of multiple errors caused by component failures or disastrous environment events. We propose a game-theoretic foundation for synthesizing control strategies that maximize the resilience of a software system in defense against a realistic error model. The new control objective of such a game is called k-resilience. In order to be k-resilient, a system needs to rapidly recover from infinitely many waves of a small number of up to k close errors, provided that the blocks of up to k errors are separated by short time intervals, which can be used by the system to recover. We first argue why we believe this to be the right level of abstraction for safety-critical systems when local faults are few and far between. We then show how the analysis of k-resilience problems can be formulated as a model-checking problem of a mild extension to the alternating-time μ-calculus (AMC). The witness for k-resilience, which can be provided by the model checker, can be used for providing control strategies that are optimal with respect to resilience. We show that the computational complexity of constructing such optimal control strategies is low and demonstrate the feasibility of our approach through an implementation and experimental results.

Original language: English
Article number: 7360234
Pages (from-to): 605-622
Number of pages: 18
Journal: IEEE Transactions on Software Engineering
Volume: 42
Issue number: 7
State: Published - 1 Jul 2016

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

Funding

This article is an extended version of [25]. All tool implementation and related experiment materials are available at https://github.com/yyergg/Resil. Peled is partially supported by ISF Grant 126/12: "Efficient Synthesis Method of Control for Concurrent Systems." Schewe is supported by the Engineering and Physical Sciences Research Council (EPSRC), grant EP/H046623/1, United Kingdom. Wang is supported by Grant MOST 103-2221-E-002-150-MY3, Taiwan, ROC, and by a research project of the Research Center for Information Technology Innovation (CITI), Academia Sinica, Taiwan, ROC. For more information, please email [email protected].

Funders (funder number)
CITI
Research Center for Information Technology Innovation
Engineering and Physical Sciences Research Council (MOST 103-2221-E-002-150-MY3, EP/H046623/1)
Academia Sinica
Israel Science Foundation (126/12)

    Keywords

    • Fault tolerance
    • complexity
    • formal verification
    • game
    • model-checking
    • resilience
    • strategy
