Web Caching-Related Papers and Articles


These papers are placed into one of the following categories:

  • Origin Server Performance and Characterization
  • Proxy Cache Traffic Characterization
  • User Perception of Web Performance
  • Web Client Traffic Characterization
  • Web Cache Consistency
  • Web Cache Replacement
  • HTTP
  • Inter-cache Communication
  • Prefetching
  • Issues and Politics

Origin Server Performance and Characterization

Traffic Model and Performance Evaluation of Web Servers

Zhen Liu, Nicolas Niclausse, Cesar Jalpa-Villanueva, and Sylvain Barbier

December 1999

In this paper we present a new model of Web traffic and
its applications in the performance evaluation of Web
servers. We consider typical behavior of a user’s hypertext
navigation within a Web server. We propose a traffic model
at the session level, formulated as a stochastic marked
point process, which describes when users arrive and how
they browse the server. We provide results of statistical
analyses and goodness-of-fit of various simple parametric
distributions and of their mixtures. We developed a Web
server benchmark: WAGON (Web trAffic GeneratOr and beNchmark),
and we validated the traffic model by comparing various
characteristics of the synthetic traffic generated by WAGON
against measurements. We then report benchmark and analysis
results on the Apache server, the currently most used Web
server software. We analyze the impact of the traffic
parameters on the HTTP request arrival process and the
packet arrival process. We also show that the aggregate
traffic is self-similar in most cases, and that, more
importantly, the Hurst parameter is increasing in the
traffic intensity. We further provide performance comparison
results between HTTP/1.0 and HTTP/1.1 and show that HTTP/1.1
could be much worse for users as well as for servers if
some control parameters of the server and of browsers are
incorrectly set.

Measuring the Capacity of a Web Server

Gaurav Banga and Peter Druschel

The widespread use of the World Wide Web and related
applications places interesting performance demands on
network servers. The ability to measure the effect of these
demands is important for tuning and optimizing the various
software components that make up a Web server. To measure
these effects, it is necessary to generate realistic HTTP
client requests. Unfortunately, accurate generation of
such traffic in a testbed of limited scope is not trivial.
In particular, the commonly used approach is unable to
generate client request-rates that exceed the capacity of
the server being tested even for short periods of time.
This paper examines pitfalls that one encounters when
measuring Web server capacity using a synthetic workload.
We propose and evaluate a new method for Web traffic
generation that can generate bursty traffic, with peak
loads that exceed the capacity of the server. Finally, we
use the proposed method to measure the performance of a Web server.

Characterizing Reference Locality in the WWW

Virgilio Almeida, Azer Bestavros, Mark Crovella, and Adriana de Oliveira.

In this paper we propose models for both temporal and spatial
locality of reference in streams of requests arriving at Web servers.
We show that simple models based on document popularity alone are
insufficient for capturing either temporal or spatial locality.
Instead, we rely on an equivalent, but numerical, representation of
a reference stream: a stack distance trace. We show that temporal
locality can be characterized by the marginal distribution of the
stack distance trace, and we propose models for typical distributions
and compare their cache performance to our traces. We also show
that spatial locality in a reference stream can be characterized
using the notion of self-similarity. Self-similarity describes
long-range correlations in the dataset, which is a property that
previous researchers have found hard to incorporate into synthetic
reference strings. We show that stack distance strings appear to
be strongly self-similar, and we provide measurements of the degree
of self-similarity in our traces. Finally, we discuss methods for
generating synthetic Web traces that exhibit the properties of
temporal and spatial locality that we measured in our data.
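
The stack distance representation the authors build on can be computed directly from a reference stream. A minimal sketch (function and variable names are illustrative, not the authors' code): each request's distance is the object's depth in an LRU stack, with first references treated as infinite (here, None).

```python
def stack_distances(refs):
    """Convert a reference stream into a stack distance trace.

    For each request, the distance is the item's depth in an LRU
    stack (1 = most recently used); first references yield None,
    standing in for an infinite stack distance."""
    stack = []   # most recently used object at index 0
    trace = []
    for r in refs:
        if r in stack:
            trace.append(stack.index(r) + 1)
            stack.remove(r)
        else:
            trace.append(None)
        stack.insert(0, r)
    return trace
```

The marginal distribution of the finite values in this trace is what the paper models to capture temporal locality.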

TCP Behavior of a Busy Internet Server: Analysis and Improvements

Hari Balakrishnan,
Venkata N. Padmanabhan,
Srinivasan Seshan,
Mark Stemm,
and Randy H. Katz, 1997

In this paper, we analyze the way in which Web browsers
use TCP connections based on extensive traffic traces
obtained from a busy Web server (the official Web server
of the 1996 Atlanta Olympic games). At the time of operation,
this Web server was one of the busiest on the Internet,
handling tens of millions of requests per day from hundreds
of thousands of clients. We first describe the techniques
used to gather these traces and reconstruct the behavior
of TCP on the server. We then present a detailed analysis
of TCP's loss recovery and congestion control behavior from
the recorded transfers. The two most important results are
that: (1) short Web transfers lead to poor loss recovery
performance for TCP, and (2) concurrent connections cause
an overly aggressive use of the network. We then discuss
techniques designed to solve these problems. To improve
the loss recovery performance of short transfers, we present
a new technique for TCP loss recovery. To improve the
congestion control and loss recovery performance of multiple
parallel connections, we present a new transport-level
integrated connection approach to congestion control and
loss recovery. Simulation and trace analysis results show
that our enhanced loss recovery scheme could have eliminated
25% of all timeout events, and our integrated connection
approach provides greater fairness and improved startup
performance for parallel connections. Our solutions are
more general than a specific protocol enhancement such as
the use of persistent connections in P-HTTP and HTTP/1.1,
and include techniques, such as improved TCP loss
recovery, that are not addressed by these protocols.

Web traffic characterization: an assessment of the impact of
caching documents from NCSA's web server

Hans-Werner Braun and k claffy, 1994

Proceedings of the Second WWW Conference

We analyze two days of queries to the popular NCSA Mosaic
server to assess the geographic distribution of transaction
requests. The wide geographic diversity of query sources
and popularity of a relatively small portion of the web
server file set present a strong case for deployment of
geographically distributed caching mechanisms to improve
server and network efficiency.

Proxy Cache Traffic Characterization

Web Caching and Zipf-like Distributions: Evidence and Implications

Lee Breslau, Pei Cao, Li Fan, Graham Phillips and Scott Shenker.

This paper addresses two unresolved issues about Web caching. […]
We first
investigate the page request distribution seen by Web proxy caches
using traces from a variety of sources. We find that the distribution
does not follow Zipf’s law precisely, but instead follows a Zipf-like
distribution with the exponent varying from trace to trace.
Furthermore, we find that there is i) a weak correlation between
the access frequency of a Web page and its size and ii) a weak
correlation between access frequency and its rate of change. We
then consider a simple model where the Web accesses are independent
and the reference probability of the documents follows a Zipf-like
distribution. We find that the model yields asymptotic behaviors
that are consistent with the experimental observations, suggesting
that the various observed properties of hit ratios and temporal
locality are indeed inherent to Web accesses observed by proxies.
Finally, we revisit Web cache replacement algorithms and show that
the algorithm that is suggested by this simple model performs best
on real trace data. The results indicate that while page requests
do indeed reveal short-term correlations and other structures, a
simple model for an independent request stream following a Zipf-like
distribution is sufficient to capture certain asymptotic properties
observed at Web proxies.
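
The Zipf-like distribution the paper fits is easy to state concretely. A small sketch (illustrative names; the exponent 0.7 is just an example in the range the paper reports, not a value from any particular trace):

```python
def zipf_probs(n, alpha):
    """Zipf-like popularity: the i-th most popular page is requested
    with probability proportional to 1 / i**alpha."""
    weights = [1.0 / (i ** alpha) for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Pure Zipf's law is alpha = 1; the trace-dependent exponents the
# paper finds are below 1, spreading requests over more documents.
p_zipf = zipf_probs(10000, 1.0)
p_like = zipf_probs(10000, 0.7)
share_zipf = sum(p_zipf[:100])   # request share of the 100 hottest pages
share_like = sum(p_like[:100])
```

A smaller exponent means the hottest documents capture a smaller share of requests, which is why cache hit ratio grows only slowly (roughly logarithmically) with cache size.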

Measured Access Characteristics of World-Wide-Web Client
Proxy Caches

Bradley M. Duska, David Marwood, and Michael J. Feeley

This paper presents an analysis of access traces collected from
seven proxy servers deployed in various locations throughout the
Internet. The traces record a total of 47.4 million requests made
by 23,700 clients over a twenty-one day period. We use a combination
of static analysis and trace-driven cache simulation to characterize
the locality and sharing properties of these accesses.
Our analysis shows that a 2- to 10-GB second-level cache yields
hit rates between 24% and 45% with 85% of these hits due to sharing
among different clients. Caches with more clients exhibit more
sharing and thus higher hit rates. Between 2% and 7% of accesses
are consistency misses to unmodified objects, using the Squid and
CERN proxy cache coherence protocols. Sharing is bimodal. Requests
for shared objects are divided evenly between objects that are
narrowly shared and those that are shared by many clients; widely
shared objects also tend to be shared by clients from unrelated

Quantifying the Overall Impact of Caching and Replication in the Web

M. Baentsch, A. Lauer, L. Baum, G. Molter, S. Rothkugel,
P. Sturm, 1997

This paper discusses the benefits and drawbacks of caching
and replication strategies in the WWW with respect to the
Internet infrastructure. Bandwidth consumption, latency,
and overall error rates are considered to be most important
from a network point of view. The dependencies of these
values with input parameters like degree of replication,
document popularity, actual cache hit rates, and error
rates are highlighted. In order to determine the influence
of different caching and replication strategies on the
behavior of a single proxy server with respect to these
values, trace-based simulations are used. Since the overall
effects of such strategies can hardly be decided with this
approach alone, a mathematical model has been developed to
deal with their influence on the network as a whole.
Together, this two-tiered approach permits us to propose
quantitative assessments on the influence different caching
and replication proposals (are going to) have on the Internet.

User Perception of Web Performance

A Tool For Exploring User-Perceived Web Performance

Mimika Koletsou and Geoffrey M. Voelker

Proceedings of the Sixth Web Caching and Content Distribution Workshop

In this paper we describe the Medusa proxy, a tool for exploring
user-perceived Web performance. The Medusa proxy is a non-caching
forwarding proxy used in conjunction with a user’s browser. The
two key features that the Medusa proxy provides are (1) the ability
to simultaneously mirror HTTP requests from the browser to different
Web delivery systems and directly compare the results, and (2) the
ability to transform requests, e.g., to transform Akamai URLs back
into URLs to the customer's origin servers.

We show how the Medusa proxy can be used to explore user-perceived
Web performance using two experiments that determine (1) to what
extent using large-scale caching systems like the NLANR cache
hierarchy improve user-perceived latency, and (2) to what extent
content distribution networks like Akamai improve latency compared
with using their customer’s origin server. We found that the NLANR
cache hierarchy served 63% of HTTP requests in our workload at
least as fast as the origin servers, resulting in a decrease of
average latency of 15%. By transforming and mirroring Akamai URLs
to both Akamai edge servers and their customer’s origin servers,
we find that Akamai edge servers are able to serve HTTP requests
an average of 5.7 times as fast as their customers' origin servers. However, because
requests to Akamai edge servers are only 6% of our workload, we
find that using Akamai reduces overall mean latency by only 2% for
our workload.

Client-Perceived Response Times on the WWW

Ramakrishnan Rajamony and Mootaz Elnozahy

The response time of a WWW service often plays an important
role in its success or demise. From a user’s perspective,
the response time is the time elapsed from when a request
is initiated at a client to the time that the response is
fully loaded by the client. This paper presents a framework
for accurately measuring the client-perceived response time
in a WWW service. Our framework provides feedback to the
service provider and eliminates the uncertainties that are
common in existing methods. This feedback can be used to
determine whether performance expectations are met, and
whether additional resources (e.g. more powerful server or
better network connection) are needed. The framework can
also be used when a consolidator provides Web hosting
service, in which case the framework provides quantitative
measures to verify the consolidator’s compliance to a
specified Service Level Agreement. Our approach assumes
the existing infrastructure of the Internet with its current
technologies and protocols. No modification is necessary
to existing browsers or servers, and we accommodate
intermediate proxies that cache documents. The only
requirement is to instrument the documents to be measured,
which can be done automatically using a tool we provide.

Web Client Traffic Characterization

Web Proxy Caching: The Devil is in the Details

Ramón Cáceres, Fred Douglis, Anja Feldmann, Gideon Glass, and Michael Rabinovich

Proceedings of the 1998 Workshop on Internet Server Performance

Much work in the analysis of proxy caching has focused on
high-level metrics such as hit rates, and has approximated
actual reference patterns by ignoring exceptional cases
such as connection aborts. Several of these low-level
details have a strong impact on performance, particularly
in heterogeneous bandwidth environments such as modem pools
connected to faster networks. Trace-driven simulation of
the modem pool of a large ISP suggests that “cookies”
dramatically affect the cacheability of resources; wasted
bandwidth due to aborted connections can more than offset
the savings from cached documents; and using a proxy to
keep from repeatedly opening new TCP connections can reduce
latency more than simply caching data.

Characteristics of WWW Client-based Traces

Carlos R. Cunha, Azer Bestavros, and Mark E. Crovella, 1995

The explosion of WWW traffic necessitates an accurate picture of
WWW use, and in particular requires a good understanding of client
requests for WWW documents. To address this need, we have collected
traces of actual executions of NCSA Mosaic, reflecting over half
a million user requests for WWW documents. In this paper we describe
the methods we used to collect our traces, and the formats of the
collected data. Next, we present a descriptive statistical summary
of the traces we collected, which identifies a number of trends
and reference patterns in WWW use. In particular, we show that many
characteristics of WWW use can be modelled using power-law
distributions, including the distribution of document sizes, the
popularity of documents as a function of size, the distribution of
user requests for documents, and the number of references to
documents as a function of their overall rank in popularity (Zipf’s
law). Finally, we show how the power-law distributions derived from
our traces can be used to guide system designers interested in
caching WWW documents.

Web Cache Consistency

World-Wide Web Cache Consistency

James Gwertzman and Margo Seltzer

The bandwidth demands of the World Wide Web continue to
grow at a hyper-exponential rate. Given this rocketing
growth, caching of web objects as a means to reduce network
bandwidth consumption is likely to be a necessity in the
very near future. Unfortunately, many Web caches do not
satisfactorily maintain cache consistency. This paper
presents a survey of contemporary cache consistency mechanisms
in use on the Internet today and examines recent research
in Web cache consistency. Using trace-driven simulation,
we show that a weak cache consistency protocol (the one
used in the Alex ftp cache) reduces network bandwidth
consumption and server load more than either time-to-live
fields or an invalidation protocol and can be tuned to
return stale data less than 5% of the time.
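
The weak consistency protocol studied here is an adaptive TTL in the style of the Alex cache: an object that has gone unchanged for a long time is assumed to stay valid for a fixed fraction of its age. A sketch (the factor k = 0.1 and the helper names are illustrative assumptions, not values from the paper):

```python
def is_fresh(fetched_at, last_modified, now, k=0.1):
    """Adaptive-TTL (Alex-style) freshness check, with all times in
    seconds. The object is assumed fresh for k times its age at fetch
    time, so long-unchanged objects are cached longer; k = 0.1 is an
    illustrative tuning knob of the kind the paper evaluates."""
    ttl = k * (fetched_at - last_modified)
    return (now - fetched_at) < ttl
```

Tuning k trades bandwidth savings against the probability of serving stale data, which is the knob behind the "less than 5% stale" result above.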

A Name-Based Mapping Scheme for Rendezvous

David G. Thaler and Chinya V. Ravishankar

Clusters of identical intermediate servers are often created
to improve availability and robustness in many domains.
The use of proxy servers for the WWW and of Rendezvous
Points in multicast routing are two such situations.
However, this approach is inefficient if identical requests
are received and processed by multiple servers. We present
an analysis of this problem, and develop a method called
the Highest Random Weight (HRW) Mapping that eliminates
these difficulties. Given an object name, HRW maps it to
a server within a given cluster using the object name,
rather than any a priori knowledge of server states. Since
HRW always maps a given object name to the same server
within a given cluster, it may be used locally at client
sites to achieve consensus on object-server mappings. We
present an analysis of HRW and validate it with simulation
results showing that it gives faster service times than
traditional request allocation schemes such as round-robin
or least-loaded, and adapts well to changes in the set of
servers. HRW is particularly applicable to domains in
which there are a large number of requestable objects, and
where there is a significant probability that a requested
object will be requested again. HRW has now been adopted
by the multicast routing protocol PIMv2 as its mechanism
for clients to identify Rendezvous Points.
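
The HRW mapping itself fits in a few lines. A sketch (SHA-256 stands in for whatever hash an implementation chooses; names are illustrative):

```python
import hashlib

def hrw_server(object_name, servers):
    """Highest Random Weight mapping: score every (object, server)
    pair with a hash of the pair and pick the server with the highest
    score. Every client computes the same mapping from the object
    name alone, with no shared state or knowledge of server load."""
    def score(server):
        digest = hashlib.sha256(f"{object_name}|{server}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(servers, key=score)
```

The key property: removing a server only remaps the objects that were assigned to it; every other object keeps its server, which is what lets independent client sites reach consensus on object-server mappings.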

The Effect of Consistency on Cache Response Time

John Dilley

Caching in the World Wide Web improves response time,
reduces network bandwidth demand and reduces load on origin
web servers. Caches achieve these benefits by storing copies
of recently requested objects near end users, avoiding
future need to transfer those objects. Cached objects are
usually served more quickly and do not consume external
network or server resources. Before returning an object,
a cache must guess if the object it holds is still consistent
with the original copy of the object. The cache may choose
to validate the object’s consistency with the origin server
or may serve it directly to the user. If the cache must
communicate with the origin server the response will take
longer than one directly from cache. This report analyzes
the impact of cache consistency on the response time of
client requests. The analysis divides cache responses into
classes according to whether or not the cache communicated
with a remote server and whether or not object data was
served from the cache. Analysis of traces from deployed
proxy cache servers demonstrates that a round trip to a
remote server is the dominant factor for response time.
This study concludes that improving cache consistency will
reduce response time and allow a cache to serve more user requests.

Errors in timestamp-based HTTP header values

Jeffrey Mogul, Compaq WRL

Many of the caching mechanisms in HTTP, especially in
HTTP/1.0, depend on header fields that carry absolute
timestamp values. Errors in these values could lead to
undetected cache incoherence, or to excessive cache misses.
Using an extensive proxy trace, we looked for HTTP responses
exhibiting several different categories of timestamp-related
errors. A significant fraction of these responses have
detectable errors in timestamp-based header fields.

Web Cache Replacement

Removal Policies in Network Caches for World-Wide Web Documents

S. Williams, M. Abrams, C.R. Standridge, G. Abdulla, and E.A. Fox

In Proceedings of ACM SIGCOMM '96.

World-Wide Web proxy servers that cache documents can
potentially reduce three quantities: the number of requests
that reach popular servers, the volume of network traffic
resulting from document requests, and the latency that an
end-user experiences in retrieving a document. This paper
examines the first two using the measures of cache hit rate
(or fraction of client-requested URLs returned by the proxy)
and weighted cache hit rate (or fraction of client-requested
bytes returned by the proxy). A client request for an
uncached document may cause the removal of one or more
cached documents. Variable document sizes and types allow
a rich variety of policies to select a document for removal,
in contrast to policies for CPU caches or demand paging,
which manage homogeneous objects. We present a taxonomy of
removal policies. Through trace-driven simulation, we
determine the maximum possible hit rate and weighted hit
rate that a cache could ever achieve, and the removal policy
that maximizes hit rate and weighted hit rate. The experiments
use five traces of 38 to 190 days of client URL requests.
Surprisingly, the criteria used by current proxy-server
removal policies (LRU and a proposal by Pitkow and Recker)
are among the worst performing criteria in our simulation;
instead, replacing documents based on size maximizes hit
rate in each of the studied workloads.
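
The two metrics above can be stated precisely. A small sketch over hypothetical per-request trace records (names are illustrative):

```python
def cache_metrics(trace):
    """trace: list of (was_hit, size_in_bytes), one entry per client
    request. Hit rate counts requests; weighted hit rate counts
    bytes, so a few large misses can drag it well below the plain
    hit rate even when most requests are hits."""
    requests = len(trace)
    hits = sum(1 for hit, _ in trace if hit)
    total_bytes = sum(size for _, size in trace)
    hit_bytes = sum(size for hit, size in trace if hit)
    return hits / requests, hit_bytes / total_bytes
```

The gap between the two metrics is exactly why size-based removal can maximize hit rate while doing less well on weighted hit rate: keeping many small documents wins requests, not bytes.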

WWW Proxy Caching Algorithms

Pei Cao and Sandy Irani

In Proceedings of the USENIX Symposium on Internet
Technologies and Systems, December 8-11, 1997, Monterey,
California, USA

Web caches can not only reduce network traffic and downloading
latency, but can also affect the distribution of web traffic
over the network through cost-aware caching. This paper
introduces GreedyDual-Size, which incorporates locality
with cost and size concerns in a simple and non-parameterized
fashion for high performance. Trace-driven simulations
show that with the appropriate cost definition, GreedyDual-Size
outperforms existing web cache replacement algorithms in
many aspects, including hit ratios, latency reduction and
network cost reduction. In addition, GreedyDual-Size can
potentially improve the performance of main-memory caching
of Web documents.
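
A compact sketch of the GreedyDual-Size idea, specialized to a uniform cost of 1 per retrieval so each object's value is 1/size (names are illustrative; the paper's algorithm generalizes the cost function):

```python
def gds_hits(requests, capacity):
    """GreedyDual-Size sketch with cost 1 per retrieval. Each cached
    object carries a value H = L + cost/size; the lowest-H object is
    evicted and the baseline L is raised to its H, so small objects
    and recently used objects are preferentially retained."""
    L = 0.0
    cache = {}   # object -> (H value, size)
    used = 0
    hits = 0
    for obj, size in requests:
        if obj in cache:
            hits += 1
            cache[obj] = (L + 1.0 / size, size)   # refresh value on a hit
            continue
        if size > capacity:
            continue   # object can never fit in the cache
        while used + size > capacity:
            victim = min(cache, key=lambda o: cache[o][0])
            L = cache[victim][0]          # inflate the baseline
            used -= cache[victim][1]
            del cache[victim]
        cache[obj] = (L + 1.0 / size, size)
        used += size
    return hits
```

Raising L on every eviction is what gives the algorithm its implicit aging: objects untouched for a long time end up below the baseline and become eviction candidates regardless of size.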

Optimal replacement policies for non-uniform cache objects with optional eviction

Omri Bahat and Armand M. Makowski

In Proceedings of INFOCOM 2003, March 30-April 3, 2003,
San Francisco, California, USA

Replacement policies for general caching applications and Web caching
in particular have been discussed extensively in the literature.
Many ad-hoc policies have been proposed
that attempt to take advantage of the retrieval latency of
documents, their size, the popularity of references and temporal
locality of requested documents. However, the problem of finding
optimal replacement policies under these factors has not been
pursued in any systematic manner. In this paper, we take a
step in that direction: We first show, still under the Independent
Reference Model, that a simple Markov stationary replacement
policy, called the policy C0, minimizes the long-run average
metric induced by non-uniform document costs when document
eviction is optional. We then propose a framework for operating
caching systems with multiple performance metrics. We do so by
solving a constrained caching problem with a single constraint.
The resulting constrained optimal replacement policy is obtained
by simple randomization between two Markov stationary optimal
replacement policies C0 but induced by different costs.


HTTP

Key Differences between HTTP/1.0 and HTTP/1.1

Balachander Krishnamurthy, Jeffrey C. Mogul, and David M. Kristol

The HTTP/1.1 protocol is the result of four years of
discussion and debate among a broad group of Web researchers
and developers. It improves upon its phenomenally successful
predecessor, HTTP/1.0, in numerous ways. We discuss the
differences between HTTP/1.0 and HTTP/1.1, as well as some
of the rationale behind these changes.

What's Wrong with HTTP and Why It Doesn't Matter

Jeffrey Mogul, Compaq Western Research Lab

This is the presentation material from Jeff’s invited talk at
the 1999 USENIX Annual Technical Conference.

Protocol Compliance on the Web–A Longitudinal Study

Balachander Krishnamurthy and Martin Arlitt

With the recent (draft) standardization of the HTTP/1.1
protocol on the Web, it is natural to ask what percentage
of popular Web sites speak HTTP/1.1 and how compliant are
these so-called HTTP/1.1 servers. We attempt to answer
these questions through a series of experiments based on
the protocol standard. The tests are run on a comprehensive
list of popular Web sites to which a good fraction of the
Web traffic is directed. Our experiments were conducted on
a global extensible testing infrastructure that we built
to answer the above questions. The same infrastructure will
be used to answer questions like the percentage of the
traffic flow that is end-to-end HTTP/1.1. Our results show
reasons for concern on the state of HTTP/1.1 protocol
compliance: some servers do not properly support basic
features such as the HEAD method, while many popular servers
do not support some of the key features (such as persistent
connections) that were added in HTTP/1.1. Perhaps most
alarming of all, some servers crashed during testing. As
a result we believe that changes to the protocol specification
may be required.

HTTP/TCP connection and flow characteristics

Joachim Charzinski

Performance Evaluation, Volume 42, Issues 2-3, September 2000

Currently, most traffic in the Internet backbone and access
networks is World Wide Web (WWW) traffic. On the basis of
recent long-time traffic traces we present characteristics
of WWW traffic for different flow levels, namely port-to-port,
host-to-host, and total client access. Using flow duration
distributions, we obtain estimates for the point in time
when a shortcut connection should be established. Investigations
of the correlation structure of different flow characteristics
reveal that symmetrical connections occur even at high bit
rates. A large number of TCP connections can therefore
benefit from symmetric access network bandwidths although
HTTP traffic on average is asymmetric.

DHTTP: An Efficient and Cache-Friendly Transfer Protocol for Web Traffic

Michael Rabinovich and Hua Wang

Proceedings of IEEE Infocom 2001

Today’s Web interactions are frequently short, with an
increasing number of responses carrying only control
information and no data. While HTTP uses client-initiated
TCP for all Web interactions, TCP is not always well-suited
for short interactions. Furthermore, client-initiated TCP
handicaps the deployment of interception caches in the
network because of the possibility of disrupted connections
when some client packets bypass the cache on their way to
the server. We propose a new transfer protocol for Web
traffic, called Dual-transport HTTP (DHTTP), which splits
the traffic between UDP and TCP channels. When choosing
the TCP channel, it is the server who opens the connection
back to the client. Among important aspects of DHTTP are
adapting to bottleneck shifts between a server and the
network and coping with the unreliable nature of UDP. The
comparative performance study of DHTTP and HTTP using
trace-driven simulation as well as testing real HTTP and
DHTTP servers showed a significant performance advantage
of DHTTP when the bottleneck is at the server and comparable
performance when the bottleneck is in the network. By using
server-initiated TCP, DHTTP also eliminates the possibility
of disrupted TCP connections in the presence of interception
caches thereby allowing unrestricted caching within backbones.

Network Performance Effects of HTTP/1.1, CSS1, and PNG

W3C, 1997

We describe our investigation of the effect of persistent
connections, pipelining and link level document compression
on our client and server HTTP implementations. A simple
test setup is used to verify HTTP/1.1’s design and understand
HTTP/1.1 implementation strategies. We present TCP and
real time performance data between the libwww robot
and both the W3C’s Jigsaw and Apache HTTP servers
using HTTP/1.0, HTTP/1.1 with persistent connections,
HTTP/1.1 with pipelined requests, and HTTP/1.1 with pipelined
requests and deflate data compression. We also
investigate whether the TCP Nagle algorithm has an effect
on HTTP/1.1 performance. While somewhat artificial and
possibly overstating the benefits of HTTP/1.1, we believe
the tests and results approximate some common behavior seen
in browsers. The results confirm that HTTP/1.1 is meeting
its major design goals. Our experience has been that
implementation details are very important to achieve all
of the benefits of HTTP/1.1.

An Empirical Model of HTTP Network Traffic

Bruce A. Mah

Proceedings of IEEE INFOCOM '97

The workload of the global Internet is dominated by the
Hypertext Transfer Protocol (HTTP), an application protocol
used by World Wide Web clients and servers. Simulation
studies of IP networks will require a model of the traffic
patterns of the World Wide Web, in order to investigate
the effects of this increasingly popular application. We
have developed an empirical model of network traffic produced
by HTTP. Instead of relying on server or client logs, our
approach is based on packet traces of HTTP conversations.
Through traffic analysis, we have determined statistics
and distributions for higher-level quantities such as the
size of HTTP files, the number of files per “Web page”,
and user browsing behavior. These quantities form a model
that can then be used by simulations to mimic World Wide Web
network applications.

Inter-cache Communication

Compressed Bloom Filters

Michael Mitzenmacher

ACM Symposium on Principles of Distributed Computing (PODC) 2001

A Bloom filter is a simple space-efficient randomized data
structure for representing a set in order to support
membership queries. Although Bloom filters allow false
positives, for many applications the space savings outweigh
this drawback when the probability of an error is sufficiently
low. We introduce compressed Bloom filters, which improve
performance when the Bloom filter is passed as a message,
and its transmission size is a limiting factor. For example,
Bloom filters have been suggested as a means for sharing
Web cache information. In this setting, proxies do not
share the exact contents of their caches, but instead
periodically broadcast Bloom filters representing their
cache. By using compressed Bloom filters, proxies can reduce
the number of bits broadcast, the false positive rate,
and/or the amount of computation per lookup. The cost is
the processing time for compression and decompression,
which can use simple arithmetic coding, and more memory
use at the proxies, which utilize the larger uncompressed
form of the Bloom filter.
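
The underlying data structure can be sketched in a few lines; the paper's contribution is compressing the transmitted bit vector, which this sketch omits (SHA-256 with a per-probe salt stands in for the k independent hash functions; names are illustrative):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hashed bit positions per key.
    Membership tests may yield false positives, never false
    negatives, which is acceptable when a wrong 'peer has it'
    answer only costs an extra remote miss."""
    def __init__(self, m_bits, k):
        self.m = m_bits
        self.k = k
        self.bits = 0   # integer used as a bit vector

    def _positions(self, key):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def __contains__(self, key):
        return all(self.bits >> p & 1 for p in self._positions(key))
```

In the cache-sharing setting above, each proxy would periodically broadcast its filter; peers probe it before issuing a remote request. Compressing the filter lets a proxy use a larger, sparser bit vector for the same transmission budget, lowering the false positive rate.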

On the Scale and Performance of Cooperative Web Proxy Caching

Alec Wolman, Geoffrey M. Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, and Henry M. Levy

Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP ’99)

While algorithms for cooperative proxy caching have been
widely studied, little is understood about cooperative-caching
performance in the large-scale World Wide Web environment.
This paper uses both trace-based analysis and analytic
modelling to show the potential advantages and drawbacks
of inter-proxy cooperation. With our traces, we evaluate
quantitatively the performance-improvement potential of
cooperation between 200 small-organization proxies within
a university environment, and between two large-organization
proxies handling 23,000 and 60,000 clients, respectively.
With our model, we extend beyond these populations to
project cooperative caching behavior in regions with millions
of clients. Overall, we demonstrate that cooperative caching
has performance benefits only within limited population
bounds. We also use our model to examine the implications
of future trends in Web-access behavior and traffic.

Cache Digests

Alex Rousskov and Duane Wessels

Third International Web Caching Workshop, 1998

This paper presents Cache Digest, a novel protocol and
optimization technique for cooperative Web caching. Cache
Digest allows proxies to make information about their cache
contents available to peers in a compact form. A peer uses
digests to identify neighbors that are likely to have a
given document. Cache Digest is a promising alternative to
traditional per-request query/reply schemes such as ICP.

We discuss the design ideas behind Cache Digest and its
implementation in the Squid proxy cache. The performance
of Cache Digest is compared to ICP using real-world Web
caches operated by NLANR. Our analysis shows that Cache
Digest outperforms ICP in several categories. Finally, we
outline improvements to the techniques we are currently
working on.

ICP and the Squid Web Cache

Duane Wessels and k claffy

IEEE Journal on Selected Areas in Communications,
April 1998, Vol. 16, No. 3, pages 345-357.

We describe the structure and functionality of the Internet
Cache Protocol (ICP) and its implementation in the Squid
Web Caching software. ICP is a lightweight message format
used for communication among Web caches. Caches exchange
ICP queries and replies to gather information to use in
selecting the most appropriate location from which to
retrieve an object.

We present background on the history of ICP, and discuss
issues in ICP deployment, efficiency, security, and
interaction with other aspects of Web traffic behavior. We
catalog successes, failures, and lessons learned from using
ICP to deploy a global Web cache hierarchy.
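The query/reply exchange that ICP uses to pick a retrieval location can be sketched as follows; the `query` callback stands in for an ICP_OP_QUERY/ICP_OP_HIT exchange over UDP, and real implementations also weigh reply latency, so this is a simplification.

```python
def icp_select(url, peers, query):
    """Pick a retrieval source ICP-style: query each peer per request,
    take the first HIT reply, and fall back to the origin otherwise."""
    for peer in peers:
        if query(peer, url) == "HIT":
            return peer
    return "origin"

# Illustrative usage with in-memory stand-ins for peer caches.
peer_contents = {"cache-1": set(), "cache-2": {"/logo.gif"}}
source = icp_select("/logo.gif", ["cache-1", "cache-2"],
                    lambda peer, url: "HIT" if url in peer_contents[peer] else "MISS")
print(source)  # cache-2
```

Note the per-request cost: every cache miss triggers a round of queries to all peers, which is exactly the overhead that digest- and summary-based schemes below aim to avoid.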

Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol

Li Fan, Pei Cao, Jussara Almeida, and Andrei Z. Broder

Proceedings of SIGCOMM ’98

The sharing of caches among Web proxies is an important
technique to reduce Web traffic and alleviate network
bottlenecks. Nevertheless it is not widely deployed due to
the overhead of existing protocols. In this paper we
demonstrate the benefits of cache sharing, measure the
overhead of the existing protocols, and propose a new
protocol called “Summary Cache”. In this new protocol,
each proxy keeps a summary of the cache directory of each
participating proxy, and checks these summaries for potential
hits before sending any queries. Two factors contribute to
our protocol’s low overhead: the summaries are updated only
periodically, and the directory representations are very
economical, as low as 8 bits per entry. Using trace-driven
simulations and a prototype implementation, we show that,
compared to existing protocols such as the Internet Cache
Protocol (ICP), Summary Cache reduces the number of
inter-cache protocol messages by a factor of 40 to 65,
reduces the bandwidth consumption by over 50%, and eliminates
75% to 95% of the protocol CPU overhead, all while
maintaining almost the same cache hit ratio as ICP. Hence
Summary Cache scales to a large number of proxies.
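The protocol's central idea, consulting compact per-peer summaries before issuing any inter-cache query, can be sketched as below; a plain Python set stands in for the paper's economical directory representation, so the names and structure are illustrative only.

```python
def lookup(url, local_cache, peer_summaries, fetch_from_peer, fetch_from_origin):
    """Serve a URL, checking peer summaries before contacting any peer.

    peer_summaries maps peer name -> a summary supporting membership tests.
    Because summaries are updated only periodically, a summary hit may be
    stale (or a false positive), so a peer fetch can still miss and fall
    through to the origin server.
    """
    if url in local_cache:
        return local_cache[url]
    for peer, summary in peer_summaries.items():
        if url in summary:  # potential hit: only now query the peer
            doc = fetch_from_peer(peer, url)
            if doc is not None:
                local_cache[url] = doc
                return doc
    doc = fetch_from_origin(url)  # no summary hit: skip the peers entirely
    local_cache[url] = doc
    return doc

# Illustrative usage with in-memory stand-ins for the network calls.
peer_store = {"proxy-b": {"/a.html": "<html>A</html>"}}
summaries = {"proxy-b": set(peer_store["proxy-b"])}
doc = lookup("/a.html", {}, summaries,
             lambda peer, url: peer_store[peer].get(url),
             lambda url: "<html>origin</html>")
print(doc)  # <html>A</html>
```

The contrast with ICP is that misses in every summary generate no network traffic at all, which is where the factor-of-40-to-65 message reduction comes from.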


Prefetching Hyperlinks

Dan Duchamp, AT&T Labs — Research

This paper develops a new method for prefetching Web pages
into the client cache. Clients send reference information
to Web servers, which aggregate the reference information
in near-real-time and then disperse the aggregated information
to all clients, piggybacked on GET responses. The information
indicates how often hyperlink URLs embedded in pages have
been previously accessed relative to the embedding page.
Based on knowledge about which hyperlinks are generally
popular, clients initiate prefetching of the hyperlinks
and their embedded images according to any algorithm they
prefer. Both client and server may cap the prefetching
mechanism’s space overhead and waste of network resources
due to speculation. The result of these differences is
improved prefetching: lower client latency (by 52.3%) and
less wasted network bandwidth (24.0%).
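The paper's core heuristic, prefetching hyperlinks that are popular relative to their embedding page, can be sketched as follows; the threshold value and counter layout are illustrative assumptions rather than the authors' exact algorithm, which leaves the prefetching policy to the client.

```python
def links_to_prefetch(page_accesses, link_accesses, threshold=0.3):
    """Return embedded hyperlinks accessed often relative to their page.

    page_accesses: how many times the embedding page itself was fetched.
    link_accesses: hyperlink URL -> count of follow-ups from this page.
    A client may prefetch any subset of the returned links (and may cap
    the total to bound speculative bandwidth, as the paper allows).
    """
    if page_accesses == 0:
        return []
    return [url for url, count in sorted(link_accesses.items(),
                                         key=lambda kv: kv[1], reverse=True)
            if count / page_accesses >= threshold]

# Server-aggregated counts, piggybacked on a GET response in the paper.
stats = {"/news.html": 80, "/about.html": 5, "/contact.html": 40}
print(links_to_prefetch(100, stats))  # ['/news.html', '/contact.html']
```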

Issues and Politics

Copyright Law on
the Internet: The Special Problem of Caching and Copyright

Cyberspace Law Institute, 1995

An author’s exclusive right to control the making of “copies”
of original works is the fundamental building block of any
copyright regime. Application of this concept to electronic
documents on the global Internet is proving difficult, in
light of the fact that “copying” — duplicating the
information stored in binary files — is an essential step
in the transmission of information across the Internet,
and thus is required for any and all utilization of works
of authorship in this environment. To the extent the
transitory duplication of electronic files is characterized
as copying for purposes of section 106 of the U.S. Copyright
Act — a view that, arguably, has been endorsed by several
courts and, less arguably, by the federal government’s
National Information Infrastructure Working Group on
Intellectual Property — much of what takes place over
electronic networks (including the Internet) is potentially
subject to a claim of copyright infringement.


Eric Schlachter

Intellectual Property Magazine, 1996

Copying and storing Web pages is vital to the Internet’s
survival — but is it legal?

Turning On and Turning Off

Ronald S. Katz and Janet Arnold Hart

Intellectual Property Magazine, 1995

Can merely turning on a computer constitute copyright
infringement? In the Ninth Circuit, yes.

The Digital Millennium Copyright Act

Jonathan Band, 1998

On October 12, 1998, Congress passed the Digital Millennium
Copyright Act (DMCA), a complex piece of legislation which
makes major changes in U.S. copyright law to address the
digitally networked environment. The President is expected
to sign the DMCA shortly. This memorandum discusses the
law’s five titles …

Is the Internet Heading for a Cache Crunch?

Russel Baird Tewksbury

This article outlines the hidden dangers of network/web
caching and the implementation of a global mesh of caches.
According to the author, “the integrity of information will
decline, and data security risks are sure to escalate. Both
Web caching and the international hierarchical cache system
set the stage for abuses such as: individual monitoring
and surveillance, tampering, identity theft, censorship,
intellectual property/copyright infringement, invasion of
privacy, and taxation.”

$Date: 2006/12/28 01:03:03 $