Compressed spaced suffix arrays

Travis Gagie, Giovanni Manzini, Daniel Valenzuela

Research output: Contribution to journal, Conference article, peer-reviewed

Abstract

Spaced seeds are important tools for similarity search in bioinformatics, and using several seeds together often significantly improves their performance. With existing approaches, however, for each seed we keep a separate linear-size data structure, either a hash table or a spaced suffix array (SSA). In this paper we show how to compress SSAs relative to normal suffix arrays (SAs) and still support fast random access to them. We first prove a theoretical upper bound on the space needed to store an SSA when we already have the SA. We then present experiments indicating that our approach works even better in practice.
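To make the objects in the abstract concrete, the following minimal sketch builds a plain SA and an SSA for a small text by naive sorting. It is not the compression scheme of the paper. It only illustrates how the two arrays order the same positions differently. The function names, the seed notation (a string of `1` for "care" and `0` for "don't care" positions), the choice to apply the seed once to the start of each suffix, and the tie-breaking by text position are all assumptions made for illustration.

```python
def suffix_array(text):
    """Naive SA: positions sorted by the full suffix starting there."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def spaced_suffix_array(text, seed):
    """Naive SSA (illustrative definition, see assumptions above).

    Positions are sorted by the characters that the seed selects from
    the suffix starting at each position; ties are broken by position.
    """
    # Offsets of the "care" positions in the seed, e.g. "101" -> [0, 2].
    care = [k for k, bit in enumerate(seed) if bit == "1"]

    def key(i):
        # Keep only the selected characters that fall inside the text.
        masked = "".join(text[i + k] for k in care if i + k < len(text))
        return (masked, i)

    return sorted(range(len(text)), key=key)


if __name__ == "__main__":
    text = "abracadabra"
    print(suffix_array(text))
    print(spaced_suffix_array(text, "101"))
```

Both arrays are permutations of the same text positions, which is what makes it plausible to store one relative to the other rather than keeping a separate linear-size structure per seed.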

Original language: English
Pages (from-to): 37-45
Number of pages: 9
Journal: CEUR Workshop Proceedings
Volume: 1146
Publication status: Published - 2014
Externally published
Event: 2nd International Conference on Algorithms for Big Data, ICABD 2014 - Palermo, Italy
Duration: 7 Apr 2014 to 9 Apr 2014
