ALCOMFT-TR-02-55
|
![ALCOM-FT](../Main/logo_160x41.gif)
|
Gerth Stølting Brodal, Rune B. Lyngsø, Anna Östlin and Christian N. S. Pedersen
Solving the String Statistics Problem in Time O(nlog n)
Århus.
Work package 4.
May 2002.
Abstract: The string statistics problem consists of preprocessing a string of
length n such that given a query pattern of length m, the
maximum number of non-overlapping occurrences of the query pattern
in the string can be reported efficiently. Apostolico and Preparata
introduced the minimal augmented suffix tree (MAST) as a data
structure for the string statistics problem, and showed how to
construct the MAST in time O(nlog2 n) and how it supports
queries in time O(m) for constant sized alphabets. A subsequent
theorem by Fraenkel and Simpson stating that a string has at most a
linear number of distinct squares implies that the MAST requires
space O(n). In this paper we improve the construction time for
the MAST to O(nlog n) by extending the algorithm of Apostolico
and Preparata to exploit properties of efficient joining and
splitting of search trees together with a refined analysis.
Postscript file: ALCOMFT-TR-02-55.ps.gz (177 kb).
System maintainer Gerth Stølting Brodal <gerth@cs.au.dk>