Abstract
Safety-critical systems need to maintain their functionality in the presence of multiple errors caused by component failures or disastrous environment events. We propose a game-theoretic foundation for synthesizing control strategies that maximize the resilience of a software system in defense against a realistic error model. The new control objective of such a game is called k-resilience. In order to be k-resilient, a system needs to rapidly recover from infinitely many waves of a small number of up to k close errors, provided that the blocks of up to k errors are separated by short time intervals, which can be used by the system to recover. We first argue why we believe this to be the right level of abstraction for safety-critical systems when local faults are few and far between. We then show how the analysis of k-resilience problems can be formulated as a model-checking problem of a mild extension to the alternating-time μ-calculus (AMC). The witness for k-resilience, which can be provided by the model checker, can be used for providing control strategies that are optimal with respect to resilience. We show that the computational complexity of constructing such optimal control strategies is low and demonstrate the feasibility of our approach through an implementation and experimental results.
| Original language | English |
|---|---|
| Article number | 7360234 |
| Pages (from-to) | 605-622 |
| Number of pages | 18 |
| Journal | IEEE Transactions on Software Engineering |
| Volume | 42 |
| Issue number | 7 |
| State | Published - 1 Jul 2016 |
Bibliographical note
Publisher Copyright: © 2015 IEEE.
Funding
This article is an extended version of [25]. All tool implementation and related experiment materials are available at https://github.com/yyergg/Resil. Peled is partially supported by ISF Grant 126/12: "Efficient Synthesis Method of Control for Concurrent Systems." Schewe is supported by the Engineering and Physical Sciences Research Council (EPSRC), grant EP/H046623/1, United Kingdom. Wang is supported by Grant MOST 103-2221-E-002-150-MY3, Taiwan, ROC and a research project by the Research Center for Information Technology Innovation (CITI), Academia Sinica, Taiwan, ROC. For more information, please email to [email protected].
| Funders | Funder number |
|---|---|
| Research Center for Information Technology Innovation (CITI), Academia Sinica | |
| Engineering and Physical Sciences Research Council | EP/H046623/1 |
| Ministry of Science and Technology, Taiwan | MOST 103-2221-E-002-150-MY3 |
| Israel Science Foundation | 126/12 |
Keywords
- Fault tolerance
- complexity
- formal verification
- game
- model-checking
- resilience
- strategy