Jennifer L. Welch
Research Summary and List of Papers

[Clock Synchronization] [Timing Issues] [Communication Protocols] [Shared Objects] [Modularity] [Mobile Ad Hoc Networks] [Metamorphic Robots] [Randomized Distributed Data Structures]

Clock Synchronization: Synchronized clocks are very useful in a distributed system. However, getting and keeping clocks synchronized in a distributed system is challenging due to uncertainties in message delays, failures, and clock drift. The first two papers give lower bounds on how closely clocks can be synchronized even in the well-behaved case of no failures and no clock drift. The next group of papers give algorithms for synchronizing clocks under various failure models. An old survey: Back to beginning
Timing Issues: The level of synchrony available in a system can impact what problems can be solved and how efficiently they can be solved. The first two papers show how the time required to solve a synchronization problem called the session problem is affected by the kind of timing behavior exhibited by the system. This paper analyzes the response time for the dining philosophers resource allocation problem. The next paper provides an algorithm and two related lower bounds for a version of the transaction commit problem in a partially synchronous model. This paper shows that a system with asynchronous processors and messages can simulate, in the presence of various kinds of processor failures, a system with synchronous processors but asynchronous failures. This result implies that the impossibility of fault-tolerant consensus in the synchronous processors model follows immediately from the impossibility of fault-tolerant consensus in the totally asynchronous model. Back to beginning
Communication Protocols: These papers are all concerned with some aspect of interprocessor communication using messages. The first paper analyzed how to choose how long to wait before deciding to retransmit a (possibly lost) message. The next papers give algorithms and impossibility results for setting up and tearing down connections and delivering messages within connections, under various system assumptions. Other topics: Back to beginning
Shared Objects: The first two papers explored the complexity of implementing, in a wait-free manner, multi-valued shared registers (the "logical" registers) out of binary-valued shared registers (the "physical" registers). Wait-free means that any operation of a user of the implemented register will terminate successfully even if all the other users crash. The complexity measures studied were the number of physical reads per logical read, the number of physical writes per logical write, and the number of physical registers required. The next two papers concern implementing a shared object on top of message-passing, with emphasis on the consistency conditions provided (not on fault-tolerance). The first paper presents a framework for defining consistency conditions that allows optimized, non-sequential memory accesses and describes, and proves correct, several programming paradigms for use with such conditions. The second paper proves that implementing linearizability is strictly more expensive than implementing sequential consistency, in terms of elapsed time per operation, for three common shared data types. The last two papers studied how to implement a shared object on top of message passing in the presence of crash failures. Back to beginning
Modularity: These two papers constructed efficient algorithms for Byzantine agreement (getting processors to agree on a decision value even when some of them can exhibit arbitrary, or Byzantine, failures); using subroutines in a clever way allowed us to achieve good complexity bounds. The next two papers described existing distributed algorithms (one for finding a minimum spanning tree and another for solving a resource allocation problem) in modular ways and used that modularity to understand more about the algorithms and give rigorous correctness proofs. Back to beginning
Mobile Ad Hoc Networks: Mobile ad hoc networks are formed by a collection of, potentially mobile, wireless nodes; communication links form and disappear as nodes come into and go out of each other's communication range. Such networks have many practical applications, including home networking, personal area networking, search-and-rescue, and military operations. There has been a significant amount of activity in the area of routing and medium access control for mobile ad hoc networks. In contrast, we are investigating the design of distributed algorithms, and services based on these distributed algorithms. The eventual objective is to develop a middleware that provides services useful to upper layers. The next paper is grouped here since it concerns mobile computing but it is for the cellular model, not the ad hoc network model. Back to beginning
Metamorphic Robots: The robotics community has been exploring how to get a collection of homogeneous interlocking robots to change their global shape. Much previous work focused on centralized algorithms for the problem. We found some simple distributed algorithms for some simple cases. We hope to use them as building blocks to achieve more complex reconfigurations. Back to beginning
Randomized Distributed Data Structures: Randomized algorithms can be simpler and more efficient than deterministic algorithms for the same problem and can even allow the existence of solutions to some problems for which there are no deterministic algorithms. The drawback is that absolute correctness cannot be guaranteed in all situations; typically a randomized algorithm has some, preferably minuscule, probability of either not terminating or returning an incorrect answer. We are interested in developing useful and rigorous specifications of distributed data structures that have probabilistic behavior; finding efficient distributed algorithms to implement the shared objects so specified; and identifying classes of distributed applications that can tolerate probabilistic data structures. Back to beginning