Transaction Commit in a Realistic Timing Model

Brian A. Coan and Jennifer Lundelius Welch

An important problem in the construction of fault-tolerant distributed database systems is the design of nonblocking transaction commit protocols. This problem has been extensively studied for synchronous systems (i.e., systems where no messages ever arrive late). In this paper, the synchrony assumption is relaxed. A new partially synchronous timing model is described. Developed for this model is a new nonblocking randomized transaction commit protocol, which incorporates an agreement protocol of Ben-Or. The new protocol works as long as fewer than half the processors fail. A matching lower bound is proved, showing that the number of processor faults tolerated is optimal. If half or more of the processors fail, the protocol degrades gracefully: it blocks, but no processor produces a wrong answer. A notion of asynchronous round is defined, and the protocol is shown to terminate in a small constant expected number of asynchronous rounds. In contrast it is shown that no protocol in this model can guarantee that a processor terminates in a bounded expected number of its own steps, even if processors are synchronous.