TY - GEN
T1 - A new class of searchable and provably highly compressible string transformations
AU - Giancarlo, Raffaele
AU - Manzini, Giovanni
AU - Rosone, Giovanna
AU - Sciortino, Marinella
N1 - Publisher Copyright:
© Raffaele Giancarlo, Giovanni Manzini, Giovanna Rosone, and Marinella Sciortino.
PY - 2019/6/1
Y1 - 2019/6/1
N2 - The Burrows-Wheeler Transform is a string transformation that plays a fundamental role for the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation outside the domains of strings. However, efforts to find non-trivial alternatives of the original, now 25 years old, Burrows-Wheeler string transformation have met limited success. In this paper we bring new lymph to this area by introducing a whole new family of transformations that have all the “myriad virtues” of the BWT: they can be computed and inverted in linear time, they produce provably highly compressible strings, and they support linear time pattern search directly on the transformed string. This new family is a special case of a more general class of transformations based on context adaptive alphabet orderings, a concept introduced here. This more general class includes also the Alternating BWT, another invertible string transforms recently introduced in connection with a generalization of Lyndon words.
AB - The Burrows-Wheeler Transform is a string transformation that plays a fundamental role for the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation outside the domains of strings. However, efforts to find non-trivial alternatives of the original, now 25 years old, Burrows-Wheeler string transformation have met limited success. In this paper we bring new lymph to this area by introducing a whole new family of transformations that have all the “myriad virtues” of the BWT: they can be computed and inverted in linear time, they produce provably highly compressible strings, and they support linear time pattern search directly on the transformed string. This new family is a special case of a more general class of transformations based on context adaptive alphabet orderings, a concept introduced here. This more general class includes also the Alternating BWT, another invertible string transforms recently introduced in connection with a generalization of Lyndon words.
KW - Burrows-wheeler transformation
KW - Combinatorics on words
KW - Data indexing and compression
UR - http://www.scopus.com/inward/record.url?scp=85068025041&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.CPM.2019.12
DO - 10.4230/LIPIcs.CPM.2019.12
M3 - Conference contribution
AN - SCOPUS:85068025041
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019
A2 - Pisanti, Nadia
A2 - Pissis, Solon P.
PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
T2 - 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019
Y2 - 18 June 2019 through 20 June 2019
ER -