Fault tolerant gradient clock synchronization

Johannes Bund, Christoph Lenzen, Will Rosenbaum

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

Synchronizing clocks in distributed systems is well-understood, both in terms of fault-tolerance in fully connected systems, and the optimal achievable local skew in general fault-free networks. However, so far nothing non-trivial is known about the local skew that can be achieved in non-fully-connected topologies even under a single Byzantine fault. In this work, we show that asymptotically optimal local skew can be achieved in the presence of Byzantine faults. Our approach combines the Lynch-Welch algorithm [19] for synchronizing a clique of n nodes with up to f < n/3 Byzantine faults, and the gradient clock synchronization (GCS) algorithm by Lenzen et al. [15] in order to render the latter resilient to faults. This is not possible on general graphs, so we augment an arbitrary input graph G by replacing each node with a fully connected cluster of 3f + 1 copies, and execute an instance of the Lynch-Welch algorithm within each cluster. We interpret the clusters as supernodes executing the GCS algorithm on G, where each node in the cluster maintains an estimate of the logical clock of its supernode. By also fully connecting clusters corresponding to neighbors in G, supernodes maintain estimates of neighboring clusters' logical clocks. We achieve asymptotically optimal local skew, assuming that no cluster contains more than f faulty nodes. This construction yields factors of O(f) and O(f^2) overheads in terms of nodes and edges, respectively. Since tolerating f faulty neighbors trivially requires degrees larger than f, these overheads are asymptotically optimal.
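The graph augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `augment_graph`, the `(node, copy_index)` labelling of copies, and the edge representation are all assumptions made for illustration. The sketch only builds the augmented topology (a clique of 3f + 1 copies per node, with clusters of adjacent nodes fully connected to each other) and does not implement the Lynch-Welch or GCS algorithms.

```python
from itertools import combinations


def augment_graph(nodes, edges, f):
    """Build the cluster-augmented graph (hypothetical helper, for illustration).

    nodes: iterable of hashable node labels of the input graph G
    edges: iterable of (u, v) pairs, the undirected edges of G
    f:     maximum number of faulty copies tolerated per cluster
    """
    k = 3 * f + 1  # cluster size stated in the abstract
    # Each original node v becomes k copies labelled (v, 0), ..., (v, k - 1).
    copies = {v: [(v, i) for i in range(k)] for v in nodes}

    aug_edges = set()
    # Intra-cluster edges: every cluster is a clique.
    for v in copies:
        for a, b in combinations(copies[v], 2):
            aug_edges.add(frozenset((a, b)))
    # Inter-cluster edges: clusters of neighbors in G are fully connected.
    for u, v in edges:
        for a in copies[u]:
            for b in copies[v]:
                aug_edges.add(frozenset((a, b)))

    aug_nodes = [c for v in copies for c in copies[v]]
    return aug_nodes, aug_edges


# Example: a path on three nodes with f = 1 (clusters of size 4).
path_nodes = [0, 1, 2]
path_edges = [(0, 1), (1, 2)]
aug_nodes, aug_edges = augment_graph(path_nodes, path_edges, f=1)
```

For this path, the node count grows from 3 to 3 * 4 = 12, a factor of 3f + 1. Each original edge becomes 4 * 4 = 16 edges, and each cluster adds 6 internal edges, which matches the O(f) and O(f^2) overhead factors stated in the abstract.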

Original language: English
Title of host publication: PODC 2019 - Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing
Publisher: Association for Computing Machinery
Pages: 357-365
Number of pages: 9
ISBN (Electronic): 9781450362177
DOIs
State: Published - 16 Jul 2019
Externally published: Yes
Event: 38th ACM Symposium on Principles of Distributed Computing, PODC 2019 - Toronto, Canada
Duration: 29 Jul 2019 → 2 Aug 2019

Publication series

Name: Proceedings of the Annual ACM Symposium on Principles of Distributed Computing

Conference

Conference: 38th ACM Symposium on Principles of Distributed Computing, PODC 2019
Country/Territory: Canada
City: Toronto
Period: 29/07/19 → 2/08/19

Bibliographical note

Publisher Copyright:
© 2019 Association for Computing Machinery. All rights reserved.

Keywords

  • Clock synchronization
  • Fault tolerance
  • Gradient clock synchronization
  • Local skew
