Multiple seeds sensitivity using a single seed with threshold

Lavinia Egidi, Giovanni Manzini

Risultato della ricerca: Contributo su rivistaArticolo in rivistapeer review

Abstract

Spaced seeds are a fundamental tool for similarity search in biosequences. The best sensitivity/selectivity trade-offs are obtained using many seeds simultaneously: This is known as the multiple seed approach. Unfortunately, spaced seeds use a large amount of memory and the available RAM is a practical limit to the number of seeds one can use simultaneously. Inspired by some recent results on lossless seeds, we revisit the approach of using a single spaced seed and considering two regions homologous if the seed hits in at least t sufficiently close positions. We show that by choosing the locations of the don't care symbols in the seed using quadratic residues modulo a prime number, we derive single seeds that when used with a threshold t > 1 have competitive sensitivity/selectivity trade-offs, indeed close to the best multiple seeds known in the literature. In addition, the choice of the threshold t can be adjusted to modify sensitivity and selectivity a posteriori, thus enabling a more accurate search in the specific instance at issue. The seeds we propose also exhibit robustness and allow flexibility in usage.

Lingua originaleInglese
Numero di articolo1550011
RivistaJournal of Bioinformatics and Computational Biology
Volume13
Numero di pubblicazione4
DOI
Stato di pubblicazionePubblicato - 7 ago 2015

Fingerprint

Entra nei temi di ricerca di 'Multiple seeds sensitivity using a single seed with threshold'. Insieme formano una fingerprint unica.

Cita questo