ALCOMFT-TR-02-30

ALCOM-FT
 

Peter Sanders and Kurt Mehlhorn
Scanning Multiple Sequences Via Cache Memory
MPI. Work packages 1, 4 and 5. May 2002.
Abstract: We consider the simple problem of scanning multiple sequences. There are k sequences of total length N which are to be scanned concurrently. One pointer into each sequence is maintained and an adversary specifies which pointer is to be advanced. The concept of scanning multiple sequence is ubiquitous in algorithms designed for hierarchical memory.

In the external memory model of computation with block size B, a memory consisting of m blocks, and at most m sequences the problem is trivially solved with N/B memory misses by reserving one block of memory for each sequence.

However, in a cache memory with associativity a, every access may lead to a cache fault if k > a. For a direct mapped cache (a = 1) two sequences suffice. We show that by randomizing the starting addresses of the sequences the number of cache misses can be kept to O(N/B) provided that k = O(m/B1/a), i.e., the number of sequences that can be supported is decreased by a factor B1/a. We also show a corresponding lower bound.

Our result leads to a general method for converting sequence based algorithms designed for the external memory model of computation to cache memory even for caches with small associativity.

Postscript file: ALCOMFT-TR-02-30.ps.gz (140 kb).

System maintainer Gerth Stølting Brodal <gerth@cs.au.dk>