TY - GEN
T1 - The myriad virtues of wavelet trees
AU - Ferragina, Paolo
AU - Giancarlo, Raffaele
AU - Manzini, Giovanni
PY - 2006
Y1 - 2006
N2 - Wavelet Trees have been introduced in [Grossi, Gupta and Vitter, SODA '03] and have been rapidly recognized as a very flexible tool for the design of compressed full-text indexes and data compressors. Although several papers have investigated the beauty and usefulness of this data structure in the full-text indexing scenario, its impact on data compression has not been fully explored. In this paper we provide a complete theoretical analysis of a wide class of compression algorithms based on Wavelet Trees. We also show how to improve their asymptotic performance by introducing a novel framework, called Generalized Wavelet Trees, that aims for the best combination of binary compressors (like, Run-Length encoders) versus non-binary compressors (like, Huffman and Arithmetic encoders) and Wavelet Trees of properly-designed shapes. As a corollary, we prove high-order entropy bounds for the challenging combination of Burrows-Wheeler Transform and Wavelet Trees.
AB - Wavelet Trees have been introduced in [Grossi, Gupta and Vitter, SODA '03] and have been rapidly recognized as a very flexible tool for the design of compressed full-text indexes and data compressors. Although several papers have investigated the beauty and usefulness of this data structure in the full-text indexing scenario, its impact on data compression has not been fully explored. In this paper we provide a complete theoretical analysis of a wide class of compression algorithms based on Wavelet Trees. We also show how to improve their asymptotic performance by introducing a novel framework, called Generalized Wavelet Trees, that aims for the best combination of binary compressors (like, Run-Length encoders) versus non-binary compressors (like, Huffman and Arithmetic encoders) and Wavelet Trees of properly-designed shapes. As a corollary, we prove high-order entropy bounds for the challenging combination of Burrows-Wheeler Transform and Wavelet Trees.
UR - http://www.scopus.com/inward/record.url?scp=33746345004&partnerID=8YFLogxK
U2 - 10.1007/11786986_49
DO - 10.1007/11786986_49
M3 - Conference contribution
AN - SCOPUS:33746345004
SN - 3540359044
SN - 9783540359043
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 560
EP - 571
BT - Automata, Languages and Programming - 33rd International Colloquium, ICALP 2006, Proceedings
PB - Springer Verlag
T2 - 33rd International Colloquium on Automata, Languages and Programming, ICALP 2006
Y2 - 10 July 2006 through 14 July 2006
ER -