TY - JOUR
T1 - Wheeler graphs
T2 - A framework for BWT-based data structures
AU - Gagie, Travis
AU - Manzini, Giovanni
AU - Sirén, Jouni
N1 - Publisher Copyright:
© 2017 The Authors
PY - 2017/10/25
Y1 - 2017/10/25
N2 - The famous Burrows–Wheeler Transform (BWT) was originally defined for a single string but variations have been developed for sets of strings, labeled trees, de Bruijn graphs, etc. In this paper we propose a framework that includes many of these variations and that we hope will simplify the search for more. We first define Wheeler graphs and show they have a property we call path coherence. We show that if the state diagram of a finite-state automaton is a Wheeler graph then, by its path coherence, we can order the nodes such that, for any string, the nodes reachable from the initial state or states by processing that string are consecutive. This means that even if the automaton is non-deterministic, we can still store it compactly and process strings with it quickly. We then rederive several variations of the BWT by designing straightforward finite-state automata for the relevant problems and showing that their state diagrams are Wheeler graphs.
AB - The famous Burrows–Wheeler Transform (BWT) was originally defined for a single string but variations have been developed for sets of strings, labeled trees, de Bruijn graphs, etc. In this paper we propose a framework that includes many of these variations and that we hope will simplify the search for more. We first define Wheeler graphs and show they have a property we call path coherence. We show that if the state diagram of a finite-state automaton is a Wheeler graph then, by its path coherence, we can order the nodes such that, for any string, the nodes reachable from the initial state or states by processing that string are consecutive. This means that even if the automaton is non-deterministic, we can still store it compactly and process strings with it quickly. We then rederive several variations of the BWT by designing straightforward finite-state automata for the relevant problems and showing that their state diagrams are Wheeler graphs.
KW - Burrows–Wheeler transform
KW - Compressed data structures
KW - Pattern matching
UR - http://www.scopus.com/inward/record.url?scp=85021759994&partnerID=8YFLogxK
U2 - 10.1016/j.tcs.2017.06.016
DO - 10.1016/j.tcs.2017.06.016
M3 - Article
SN - 0304-3975
VL - 698
SP - 67
EP - 78
JO - Theoretical Computer Science
JF - Theoretical Computer Science
ER -