Self-Stabilizing Clock Synchronization in the Presence of Byzantine Faults

Shlomi Dolev and Jennifer L. Welch

We initiate a study of bounded clock synchronization under a more severe fault model than that proposed by Lamport and Melliar-Smith. Realistic aspects of the problem of synchronizing clocks in the presence of faults are considered. One aspect is that clock synchronization is an on-going task, thus the assumption that in any period of the execution at least two thirds of the processors are nonfaulty is too optimistic. To cope with this reality we suggest self-stabilizing protocols that stabilize in any (long enough) period in which less than a third of the processors are faulty. Another aspect is that the clock value is bounded. A single transient fault may cause the clock to reach the upper bound. Therefore we suggest a bounded clock that wraps around when appropriate. We present two randomized self-stabilizing protocols for synchronizing bounded clocks in the presence of Byzantine processor failures. The first protocol assumes that processors have a common pulse, while the second protocol does not. A new type of distributed counter based on the Chinese remainder theorem is used as part of the first protocol.