
Building Applications on HYDRANET

From an application perspective, HYDRANET allows processes to execute remotely while keeping their "home IP address". This is conceptually very simple, but extremely versatile. We have developed a replica management protocol that allows servers to dynamically install replicas on host servers and to remove them when they are no longer needed. We are currently developing a suite of applications that illustrates the wide variety of benefits that can be gained from network support for service replication.

The first such application is HYDRAWEB, a distributed Web server with a traditional server on the origin host and active caches on host servers. Active caches provide the means to eliminate most of the problems encountered in Web caching:

Servers can dynamically install active caches and push data to make them hot in order to proactively diffuse expected flash crowds. While the caches remain installed, the servers can keep them hot by pushing modified or soon-to-expire data.
The active caches run under the control of the origin site. This means that access control, hit metering, and dynamic page reconfiguration (for example, for dynamic placement of advertising banners) are done on behalf of and under the control of the original site.
Similarly, copyright issues for cached data are eliminated, because the site controls the placement of its data.
The management of cache sites is simplified, because they are under centralized control of the original site. The configuration of distributed Web caches is a serious problem. As an example, [28] describes how fourteen separate Australian SQUID sites link themselves directly onto the cache tree in the U.S., so that each page is fetched across the Pacific separately by each of the fourteen sites.
The generation of dynamically computed pages can be done at the cache sites.
Explicit mirroring is eliminated.

Given the above points, this solution can be applied to cache breakers, that is, Web sites that are notoriously problematic for caches, and can thereby dramatically improve the performance of already deployed passive Web cache infrastructures. In addition, active caches on the host servers can take over control in the case of failure or excessive congestion of the origin host, effectively providing a highly fault-tolerant Web service.

HYDRAWEB is a simple Java-based realization of an active-cache based replicated Web server. It consists of replicas and replica managers. Replicas are small programs that can be downloaded onto host servers. They provide an HTTP interface to clients (i.e., they behave like Web servers). Whenever possible, client requests are handled locally by the replica. For this purpose, the replicas maintain a cache of HTTP objects. The replica manager on the origin host installs and removes replicas and generally manages them. Whenever a client request cannot be handled locally by a replica, the replica manager is contacted, and the request is handled on the origin server.
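The division of labor between replicas and the replica manager can be illustrated with a minimal sketch. It is written in Python for brevity and is not the HYDRAWEB code, which is Java-based. All names (`ReplicaManager`, `Replica`, `install`, `push`, `handle_miss`, `cache_on_miss`) are hypothetical, network transport and HTTP parsing are omitted, and the origin Web server is stood in for by a plain callable.

```python
from typing import Callable, Dict, Iterable


class ReplicaManager:
    """Runs on the origin host; installs, preheats, and removes replicas."""

    def __init__(self, origin_fetch: Callable[[str], bytes]) -> None:
        # origin_fetch stands in for a request to the origin Web server.
        self._origin_fetch = origin_fetch
        self.replicas: Dict[str, "Replica"] = {}

    def install(self, host: str, preheat: Iterable[str] = ()) -> "Replica":
        """Install a replica on a host server and push objects to make it hot."""
        replica = Replica(self)
        for url in preheat:
            replica.push(url, self._origin_fetch(url))
        self.replicas[host] = replica
        return replica

    def remove(self, host: str) -> None:
        """Remove a replica that is no longer needed."""
        self.replicas.pop(host, None)

    def handle_miss(self, url: str) -> bytes:
        """Serve a request that a replica could not handle locally."""
        return self._origin_fetch(url)


class Replica:
    """Runs on a host server; answers from its cache whenever possible."""

    def __init__(self, manager: ReplicaManager, cache_on_miss: bool = False) -> None:
        self._manager = manager
        self._cache: Dict[str, bytes] = {}
        # Assumption: whether a replica stores objects fetched on a miss is a
        # policy choice; this sketch leaves it off by default.
        self._cache_on_miss = cache_on_miss
        self.hits = 0
        self.misses = 0

    def push(self, url: str, body: bytes) -> None:
        """Accept an object pushed by the replica manager."""
        self._cache[url] = body

    def get(self, url: str) -> bytes:
        """Handle a client request locally, or forward it on a cache miss."""
        body = self._cache.get(url)
        if body is not None:
            self.hits += 1
            return body
        self.misses += 1
        body = self._manager.handle_miss(url)
        if self._cache_on_miss:
            self._cache[url] = body
        return body
```

In this sketch a miss costs one extra exchange between the replica and the replica manager in addition to the fetch from the origin Web server, which is the structural reason a missed page can be slower through a replica than a direct request.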

To measure the performance of HYDRAWEB, we installed a Web server at the International Computer Science Institute at UC Berkeley. A number of Web clients at the Computer Science Department of Texas A&M University requested objects from that server in a controlled fashion. We run a redirector and a host server locally at Texas A&M, and deployed HYDRAWEB by installing a replica manager on the origin host at Berkeley, which in turn installs a replica locally at Texas A&M. Figure 4 illustrates the setup. In these experiments, the replica is preheated at installation by receiving a predefined set of HTTP objects to cache locally. After it is installed, it does no caching on its own. Whenever a request misses the cache, the replica forwards the request to the replica manager, which replies to the replica after having contacted its local Web server.

Figure 4: Deployment of HYDRAWEB between Texas A&M and Berkeley

Figure 5 compares the Web service latencies with and without HYDRAWEB deployed. All requests are for pages of 1200 bytes each. Figure 5(a) shows the service latency distribution for a client at Texas A&M for accesses to the Web server at ICSI with no local HYDRAWEB replica installed. The average time to get a page from the Web server is 1.5 sec. The ping trace in Figure 5(a) shows that the round-trip time for 64-byte ping packets during the experiment averages 390 msec.

Figure 5(b) shows the result of the same experiment, but with a local replica installed. The replica cache registers 1235 hits against 765 misses. The average service latency for a page in the cache is 120 msec, while for a missed page it is 2.2 sec. Compared to the experiment without HYDRAWEB, page misses in this experiment take 700 msec longer to be served. This is in part due to protocol overhead. When a page miss occurs, the replica contacts the replica manager, which gets the page from the Web server and returns it to the replica. The latter then sends it to the client. However, the results of the two experiments should not be directly compared. As the ping traces indicate, the Internet was more congested during the second experiment. Indeed, the round-trip time for ping packets in the second experiment averages 520 msec.
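As a rough worked calculation, the hit and miss figures above can be combined into an overall hit ratio and a hit-weighted average latency. The sketch below uses only the numbers reported in this section; the function name `expected_latency` and the variable names are illustrative, and the weighted average is our own summary statistic rather than a measured result.

```python
# Figures reported for the experiment with a local replica (Figure 5(b)).
hits = 1235
misses = 765
hit_latency_s = 0.120      # average latency for a page in the cache
miss_latency_s = 2.2       # average latency for a missed page

# Figure reported for the experiment without HYDRAWEB (Figure 5(a)).
baseline_latency_s = 1.5


def expected_latency(hits: int, misses: int,
                     hit_latency_s: float, miss_latency_s: float) -> float:
    """Return the average latency weighted by the observed hit ratio."""
    hit_ratio = hits / (hits + misses)
    return hit_ratio * hit_latency_s + (1.0 - hit_ratio) * miss_latency_s


hit_ratio = hits / (hits + misses)
average_s = expected_latency(hits, misses, hit_latency_s, miss_latency_s)
miss_penalty_s = miss_latency_s - baseline_latency_s

print(f"hit ratio:        {hit_ratio:.2%}")
print(f"weighted average: {average_s:.2f} sec")
print(f"miss penalty:     {miss_penalty_s:.1f} sec")
```

With 2000 requests in total, the hit ratio is 61.75% and the weighted average comes to roughly 0.92 sec per page, against 1.5 sec without a replica. Because the network was more congested during the second experiment, this figure is indicative only and is not a controlled comparison.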

Figure 5: Web Access Delays from Texas A&M to Berkeley


Riccardo Bettati

Tue Jun 9 00:52:24 CDT 1998