Early detection of message forwarding faults

Amir Herzberg, Shay Kutten

Research output: Contribution to journal › Article › peer-review

16 Scopus citations


In most communication networks, pairs of processors communicate by sending messages over a path connecting them. We present communication-efficient protocols that quickly detect and locate any failure along the path. Whenever there is excessive delay in forwarding messages along the path, the protocols detect a failure (even when the delay is caused by maliciously programmed processors). The protocols ensure optimal time for either message delivery or failure detection. We observe that the actual delivery time δ of a message over a link is usually much smaller than the a priori known upper bound D on that delivery time. The main contribution of this paper is the way to model and take advantage of this observation. We introduce the notion of asynchronously early-terminating protocols, as well as protocols that are asynchronously early-terminating, i.e., time optimal in both the worst case and typical cases. More precisely, we present a time complexity measure according to which one evaluates protocols in terms of both D and δ. We observe that asynchronous early termination is a form of competitiveness. The protocols presented here are asynchronously early-terminating since they are time optimal in terms of both D and δ. Previous communication-efficient solutions were slow in the case where δ ≪ D. We observe that this is the most typical case. It is suggested that the time complexity measure introduced, as well as the notion of asynchronous early termination, can be useful when evaluating protocols for other tasks in communication networks. The model introduced can be a useful step towards a formal analysis of real-time systems. Our protocols have O(n log n) worst-case communication complexity. We show that this is the best possible for protocols that immediately send any acknowledgment they ever send. Then we show an early-terminating protocol which uses timing and delay to reduce the communication complexity in the typical executions where the number of failures is small and δ ≪ D. In such executions, its message complexity is linear, as is the complexity of non-fault-tolerant protocols.
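
To make the role of D and δ concrete, here is a minimal sketch of a toy model. It is not one of the paper's protocols. It assumes a path of `n_links` links, a fault-free run in which every link takes exactly δ, and a naive sender that knows only the bound D and therefore waits a full round trip of worst-case delays before declaring a failure. All function names are hypothetical and chosen for illustration only.

```python
import math


def fault_free_round_trip(n_links: int, delta: float) -> float:
    """Actual round-trip time when every link really takes `delta`.

    Assumption: the message and its acknowledgment each cross all
    n_links links, so the total is 2 * n_links * delta.
    """
    return 2 * n_links * delta


def naive_end_to_end_timeout(n_links: int, D: float) -> float:
    """Timeout a sender needs if it relies only on the a priori bound D.

    Assumption: with no finer information, the sender must allow
    n_links * D for the message and n_links * D for the acknowledgment.
    """
    return 2 * n_links * D


def naive_slowdown(n_links: int, delta: float, D: float) -> float:
    """Ratio of the naive detection time to the actual round-trip time.

    In this toy model the ratio simplifies to D / delta, which is large
    exactly in the case delta << D that the abstract calls typical.
    """
    return naive_end_to_end_timeout(n_links, D) / fault_free_round_trip(n_links, delta)


if __name__ == "__main__":
    n, delta, D = 10, 0.01, 1.0
    print("round trip with actual delays:", fault_free_round_trip(n, delta))
    print("naive timeout based on D:     ", naive_end_to_end_timeout(n, D))
    print("slowdown factor:              ", naive_slowdown(n, delta, D))
```

In this simplified setting, a time measure stated only in terms of D cannot distinguish a detector that reacts within a few multiples of δ from one that always waits for the worst case, which is the gap a measure in both D and δ is meant to expose.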

Original language: English
Pages (from-to): 1169-1196
Number of pages: 28
Journal: SIAM Journal on Computing
Issue number: 4
State: Published - 2000
Externally published: Yes


  • Competitive algorithms
  • Distributed algorithms
  • Fault tolerance
  • Network protocols
  • Real time
  • Time adaptive

