Abstract
High-throughput sequencing machines can read millions of DNA molecules in parallel, quickly and at relatively low cost. As a consequence, researchers have access to databases with millions of genomic samples. Searching and analyzing data at this scale requires efficient algorithms. A universal hitting set is a set of words such that every sufficiently long string must contain at least one of them. Small universal hitting sets can make many analyses of high-throughput sequencing data more efficient. However, generating minimum-size universal hitting sets is a hard problem. In this chapter, we cover our algorithmic developments for producing compact universal hitting sets and some of their potential applications.
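To make the definition concrete: a set of words of length k (k-mers) is a universal hitting set for length L if every string of length L contains at least one of them. The following is a minimal brute-force sketch of that check, not the chapter's algorithm. The function names and the parameters `k`, `L`, and `alphabet` are illustrative assumptions, and the exhaustive enumeration is only practical for very small values.

```python
from itertools import product


def kmers(s, k):
    """Return the set of all length-k substrings of s."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def is_universal_hitting_set(candidate, k, L, alphabet="ACGT"):
    """Check by exhaustive enumeration whether `candidate` (a collection
    of k-mers) hits every string of length L over `alphabet`.

    Illustrative only: the loop visits len(alphabet) ** L strings.
    """
    candidate = set(candidate)
    for letters in product(alphabet, repeat=L):
        s = "".join(letters)
        # A string is "hit" if it shares at least one k-mer with the set.
        if kmers(s, k).isdisjoint(candidate):
            return False
    return True


# Toy example over a two-letter alphabet, k = 2, L = 3.
print(is_universal_hitting_set({"AA", "CC", "AC"}, k=2, L=3, alphabet="AC"))
print(is_universal_hitting_set({"AA", "CC"}, k=2, L=3, alphabet="AC"))
```

In the toy example, the first set hits every length-3 string, while the second misses "ACA", whose 2-mers are "AC" and "CA". Finding the smallest such set is the hard problem the abstract refers to; this sketch only verifies a given candidate.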
Original language | English |
---|---|
Title of host publication | Methods in Molecular Biology |
Publisher | Humana Press Inc. |
Pages | 95-105 |
Number of pages | 11 |
State | Published - 2021 |
Externally published | Yes |
Publication series
Name | Methods in Molecular Biology |
---|---|
Volume | 2243 |
ISSN (Print) | 1064-3745 |
ISSN (Electronic) | 1940-6029 |
Bibliographical note
Publisher Copyright: © 2021, Springer Science+Business Media, LLC, part of Springer Nature.
Keywords
- Minimizers
- Universal hitting sets
- de Bruijn graph