Two space saving tricks for linear time LCP array computation

Giovanni Manzini

Research output: Chapter in book/report/conference proceedings › Contribution to a volume (chapter or essay) › peer-reviewed

Abstract

In this paper we consider the linear time algorithm of Kasai et al. [6] for the computation of the Longest Common Prefix (LCP) array given the text and the suffix array. We show that this algorithm can be implemented without any auxiliary array in addition to the ones required for the input (the text and the suffix array) and the output (the LCP array). Thus, for a text of length n, we reduce the space occupancy of this algorithm from 13n bytes to 9n bytes. We also consider the problem of computing the LCP array by "overwriting" the suffix array. For this problem we propose an algorithm whose space occupancy can be bounded in terms of the empirical entropy of the input text. Experiments show that for linguistic texts our algorithm uses roughly 7n bytes. Our algorithm makes use of the Burrows-Wheeler Transform even if it does not represent any data in compressed form. To our knowledge this is the first application of the Burrows-Wheeler Transform outside the domain of data compression. The source code for the algorithms described in this paper has been included in the lightweight suffix sorting package [13] which is freely available under the GNU GPL.
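For orientation, the following is a minimal sketch of the baseline algorithm of Kasai et al. in its usual textbook form, not the space-saving variant the abstract describes. It keeps an auxiliary rank array (the inverse of the suffix array). Assuming 1-byte characters and 4-byte integers, the text, suffix array, LCP array and rank array take n + 4n + 4n + 4n = 13n bytes, which presumably is where the 13n figure comes from; dropping the rank array gives 9n. The names `naive_suffix_array` and `kasai_lcp` are illustrative and do not come from the paper or its software package.

```python
def naive_suffix_array(text):
    """Build a suffix array by plain sorting (illustration only, not linear time)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def kasai_lcp(text, sa):
    """Compute the LCP array from text and suffix array in O(n) time.

    lcp[r] is the length of the longest common prefix of the suffixes
    at sa[r - 1] and sa[r]; lcp[0] is set to 0 by convention.
    """
    n = len(text)

    # Auxiliary array: rank[i] is the position of suffix i in sa.
    # This is the extra 4n bytes that the paper shows how to avoid.
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r

    lcp = [0] * n
    h = 0  # length of the match carried over from the previous suffix
    for i in range(n):  # visit suffixes in text order
        r = rank[i]
        if r == 0:
            h = 0  # the smallest suffix has no predecessor
            continue
        j = sa[r - 1]  # suffix that precedes suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[r] = h
        # The next suffix (i + 1) shares at least h - 1 characters
        # with its own predecessor, so matching resumes from there.
        if h > 0:
            h -= 1
    return lcp


text = "banana"
sa = naive_suffix_array(text)
lcp = kasai_lcp(text, sa)
print(sa)   # [5, 3, 1, 0, 4, 2]
print(lcp)  # [0, 1, 3, 0, 0, 2]
```

The linear running time follows from the fact that `h` decreases by at most one per iteration of the outer loop, so the inner loop performs at most 2n character comparisons that succeed in total.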

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Torben Hagerup, Jyrki Katajainen
Publisher: Springer Verlag
Pages: 372-383
Number of pages: 12
ISBN (electronic): 3540223398, 9783540223399
Publication status: Published - 2004
Externally published

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3111
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349
