Prefetching Concepts of Wcol

Why prefetch? How to prefetch?

This page describes the motivation and core design of Wcol's prefetching scheme.


Why prefetch?

The WWW is too slow

Retrieving a resource over the WWW often takes a long time. While waiting, people worry that the server is down or that the URL is misspelled. Long retrieval times therefore put stress on users. If we reduce them, we can browse many resources without stress.

Prefetching is the best way to improve retrieval time

Mirroring and caching are existing ways to address this problem, but neither is a good solution.

Mirroring reduces retrieval time. However, mirroring requires its targets to be selected manually. Who selects the mirroring targets? The number of WWW resources is very large and still increasing. Who can select effective mirroring targets? Do you want a god or goddess to do it?

Caching reduces the retrieval time of repeatedly accessed resources. But it has no effect on the first retrieval of a resource, and repeated accesses are a minority of all retrievals. Several papers report that the hit rate of WWW caching systems is 30% to 50%. This means that at least half of all retrievals are not repeated accesses and gain nothing from caching.

The alternative solution is prefetching. Prefetching reduces the waiting time of many resource retrievals and also improves the retrieval time of other accesses. Its target range is wider than that of both mirroring and caching.

              target selection   target range
Mirroring     manual             narrow
Caching       automatic          narrow
Prefetching   automatic          wide

So, we prefetch

Prefetching makes WWW service fast. However, the best prefetching scheme is not obvious. We studied what 'effective prefetching' is and how to do it. The result of these studies is Wcol.


How to prefetch?

Check your utilization

The WWW is a page-oriented service

How do you browse WWW resources? In our observations, we recognized three important things. They are well known, but they are very important, because they mean that WWW surfing is page oriented.

How many repeated retrievals are there?

Caching and several other ways of improving retrieval assume that repeated resource retrievals occur frequently. Is that really so?

In our observations, repeated retrievals are a small part of all retrievals. So we expect little benefit from relying on them to improve retrieval time.

###image of 'the frequency of retrievals'###
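The share of repeated retrievals can be estimated from an access log. The following is a minimal sketch in Python, not part of Wcol: the function name `repeated_fraction` and the sample log are hypothetical, and the log is assumed to be a simple list of requested URLs in order. The value it computes is also the best hit rate an unbounded cache could reach on that log.

```python
def repeated_fraction(urls):
    """Return the fraction of retrievals whose URL already appeared earlier in the log."""
    seen = set()      # URLs retrieved at least once so far
    repeats = 0       # retrievals that a cache could have served
    for url in urls:
        if url in seen:
            repeats += 1
        else:
            seen.add(url)
    return repeats / len(urls) if urls else 0.0


# Hypothetical log: five retrievals, only "/a" is requested twice.
sample_log = ["/a", "/b", "/a", "/c", "/d"]
print(repeated_fraction(sample_log))  # 0.2
```

Every retrieval that is not counted here is a first access, which caching cannot speed up.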

Interactive prefetching scheme

A prefetching system should prefetch in a page-oriented way, because the WWW is a page-oriented service.

What is page-oriented prefetching? We designed a prefetching scheme, called the interactive prefetching scheme, that does the following at each client request.

Basic Algorithm

input: base_page (the page requested by the browser)
begin
  anchor_set := SeekAnchor(base_page);
  foreach A in anchor_set do
  begin
    Get(A);
    included_set := SeekIncluded(A);
    foreach I in included_set do
    begin
      Get(I);
    end;
  end;
end.
Basic algorithm of the interactive prefetching scheme
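The basic algorithm above can be sketched in Python as follows. This is an illustration under stated assumptions, not Wcol's implementation: anchors are taken to be `<a href>` targets and included resources to be `<img src>` targets, and `get` is a hypothetical caller-supplied function that retrieves a URL (and would store it in the prefetch cache) and returns its body as text.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects anchor targets and included (inline) resources of one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.anchors = []    # assumed: <a href="..."> targets
        self.included = []   # assumed: <img src="..."> targets

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.anchors.append(urljoin(self.page_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.included.append(urljoin(self.page_url, attrs["src"]))


def seek_anchor(page_url, page_html):
    """Counterpart of SeekAnchor: absolute URLs the page links to."""
    parser = LinkParser(page_url)
    parser.feed(page_html)
    return parser.anchors


def seek_included(page_url, page_html):
    """Counterpart of SeekIncluded: absolute URLs of resources embedded in the page."""
    parser = LinkParser(page_url)
    parser.feed(page_html)
    return parser.included


def prefetch(base_url, base_html, get):
    """Prefetch every linked page and the resources included in each of them.

    `get(url)` is a hypothetical fetch function supplied by the caller.
    Returns the URLs retrieved, in order.
    """
    fetched = []
    for anchor in seek_anchor(base_url, base_html):
        body = get(anchor)                      # Get(A)
        fetched.append(anchor)
        for item in seek_included(anchor, body):
            get(item)                           # Get(I)
            fetched.append(item)
    return fetched
```

Like the pseudocode, the sketch does not skip URLs that were already retrieved and does not fetch the resources included in the base page itself, since the browser requests those on its own.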

Alternatives

Statistical prefetching scheme

Hybrid of caching and mirroring

Which prefetching scheme is good?

In this section, we discuss which of the many prefetching schemes is good. Since mirroring can also be regarded as prefetching over a long period with manual selection, we discuss mirroring too.