Abstract
The recovery problem takes as input a corrupted text T that was originally periodic, and asks for its original period. The algorithm receives T without any information about either the period's length or the period itself. An algorithm that solves this problem is called a recovery algorithm. To make recovery possible, one must assume that not "too many" errors corrupted the initial periodic string. This assumption is called the error bound. Previous recovery algorithms showed that an error bound of n/((2+ε)p) leads to O(log_{1+ε} n) period candidates that are guaranteed to include the original period, where p is the length of the original period (unknown to the algorithm) and ε > 0 is an arbitrary constant. This paper provides the first analysis of the relationship between the error bound and the number of candidates, and identifies the error parameters that still guarantee recovery. We improve the previously known upper error bound on the number of corruptions, n/((2+ε)p), which yields O(log_{1+ε} n) period candidates. We show how to (1) remove ε from the bound, and (2) relax the error bound to allow more errors while keeping the candidate set of size O(log n). Relaxing the previously known upper bound turns out to be quite challenging. To achieve this result we provide what is, to our knowledge, the first non-trivial lower bound on the Hamming distance between two periodic strings. This proof leads to an error bound that produces a family of period candidates of size 2 log_3 n. We show that this result is tight and further provide a compact representation of the period candidates, which we call the canonic period seed. In addition to providing less restrictive error bounds that guarantee a smaller candidate set, we also provide a hierarchy of more restrictive upper error bounds that asymptotically reduces the size of the potential period candidate set.
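To make the notions of error bound and period candidate concrete, the following is a minimal brute-force sketch, not the recovery algorithm analyzed in the paper. It assumes the previously known bound n/((2+ε)p) as stated above, and it additionally assumes that a period repeats at least twice, so only lengths up to n/2 are tried. The function names are hypothetical and chosen for illustration only.

```python
from collections import Counter


def hamming_to_periodic(text: str, period: str) -> int:
    """Count positions where text differs from period repeated to len(text)."""
    p = len(period)
    return sum(1 for i, ch in enumerate(text) if ch != period[i % p])


def majority_period(text: str, p: int) -> str:
    """Pick the most common symbol in each residue class modulo p.

    Among all strings of length p, this one minimizes the Hamming
    distance to text (ties go to the symbol seen first).
    """
    return "".join(Counter(text[r::p]).most_common(1)[0][0] for r in range(p))


def brute_force_candidates(text: str, eps: float) -> list[str]:
    """Return majority periods whose error count is within n / ((2 + eps) * p).

    Assumption: only period lengths p <= n // 2 are considered.
    """
    n = len(text)
    candidates = []
    for p in range(1, n // 2 + 1):
        cand = majority_period(text, p)
        if hamming_to_periodic(text, cand) <= n / ((2 + eps) * p):
            candidates.append(cand)
    return candidates


# One corruption ("c" replaced by "x") in a text with original period "abc".
print(brute_force_candidates("abcabcabxabc", eps=1.0))
```

This sketch scans every length and is far slower than a dedicated recovery algorithm; it only illustrates how the error bound filters the candidate set.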
Original language | English |
---|---|
Title of host publication | 28th Annual European Symposium on Algorithms, ESA 2020 |
Editors | Fabrizio Grandoni, Grzegorz Herman, Peter Sanders |
Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
ISBN (Electronic) | 9783959771627 |
State | Published - 1 Aug 2020 |
Event | 28th Annual European Symposium on Algorithms, ESA 2020 - Virtual, Pisa, Italy; Duration: 7 Sep 2020 → 9 Sep 2020 |
Publication series
Name | Leibniz International Proceedings in Informatics, LIPIcs |
---|---|
Volume | 173 |
ISSN (Print) | 1868-8969 |
Conference
Conference | 28th Annual European Symposium on Algorithms, ESA 2020 |
---|---|
Country/Territory | Italy |
City | Virtual, Pisa |
Period | 7/09/20 → 9/09/20 |
Bibliographical note
Publisher Copyright: © Amihood Amir, Itai Boneh, Michael Itzhaki, and Eitan Kondratovsky
Funding
Partly supported by ISF grant 1475/18 and BSF grant 2018141.
Funders | Funder number |
---|---|
United States-Israel Binational Science Foundation | 2018141 |
Israel Science Foundation | 1475/18 |
Keywords
- Hamming Distance
- Period Recovery
- Period Recovery Hierarchy