Period recovery of strings over the Hamming and edit distances

Amihood Amir, Mika Amit, Gad M. Landau, Dina Sokol

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

A string T of length m is periodic in P of length p if P is a substring of T, T[i]=T[i+p] for all 0≤i≤m−p−1, and m≥2p. The shortest such prefix P is called the period of T (i.e., P=T[0..p−1]). In this paper we investigate the period recovery problem: given a string S of length n, find the primitive period(s) P such that the distance between S and a string T that is periodic in P is below a threshold τ. We consider the period recovery problem over both the Hamming distance and the edit distance. For the Hamming distance case, we present an O(n log n)-time algorithm, where τ is given as ⌊[Formula presented]⌋ for ϵ>0. For the edit distance case, with τ=⌊[Formula presented]⌋ and ϵ>0, we provide an O(n^{4/3})-time algorithm.
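The definitions in the abstract can be made concrete with a small brute-force sketch for the Hamming distance case. This is not the paper's O(n log n) algorithm. It is a naive illustration that tries every period length p ≤ n/2, builds a candidate P by majority vote in each residue class modulo p, and keeps the primitive candidates whose distance does not exceed the threshold. The function names are hypothetical, the threshold `tau` is passed in by the caller because the abstract's formula for τ is not reproduced here, and treating "below a threshold" as "at most `tau`" is an assumption. Only one majority-vote candidate is examined per period length, so ties between equally good candidates are not enumerated.

```python
from collections import Counter


def is_periodic(t: str, p: int) -> bool:
    """Check the abstract's definition: m >= 2p and T[i] == T[i+p] for 0 <= i <= m-p-1."""
    m = len(t)
    return p >= 1 and m >= 2 * p and all(t[i] == t[i + p] for i in range(m - p))


def is_primitive(s: str) -> bool:
    """A string is primitive if it is not a power (repetition) of a shorter string."""
    return len(s) > 0 and (s + s).find(s, 1) == len(s)


def best_period_hamming(s: str, p: int) -> tuple[str, int]:
    """For a fixed length p, pick the majority symbol in each residue class mod p.

    Returns the candidate P and the Hamming distance between s and the
    length-n string obtained by repeating P.
    """
    columns = [Counter(s[r::p]) for r in range(p)]
    period = "".join(col.most_common(1)[0][0] for col in columns)
    dist = sum(sum(col.values()) - col.most_common(1)[0][1] for col in columns)
    return period, dist


def recover_periods_hamming(s: str, tau: int) -> list[tuple[str, int]]:
    """Naive period recovery: all primitive majority-vote candidates within tau mismatches.

    Assumption: a candidate qualifies when its distance is at most tau.
    """
    found = []
    for p in range(1, len(s) // 2 + 1):  # m >= 2p restricts p to at most n/2
        period, dist = best_period_hamming(s, p)
        if dist <= tau and is_primitive(period):
            found.append((period, dist))
    return found
```

For example, `recover_periods_hamming("abcabcabxabc", 1)` reports the candidate `"abc"` with one mismatch, while the non-primitive candidate `"abcabc"` is filtered out. The sketch takes time quadratic in n in the worst case, which is what the paper's algorithm improves on.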

Original language: English
Pages (from-to): 2-18
Number of pages: 17
Journal: Theoretical Computer Science
Volume: 710
State: Published - 1 Feb 2018

Bibliographical note

Publisher Copyright:
© 2017 Elsevier B.V.

Funding

We thank the anonymous referees for their comments and suggestions. Amihood Amir was partially supported by Israel Science Foundation grant 571/14 and by grant No. 2014028 from the United States-Israel Binational Science Foundation (BSF). Mika Amit and Gad M. Landau were partially supported by Israel Science Foundation grant 571/14, grant No. 2014028 from the United States-Israel Binational Science Foundation (BSF), and the DFG. Dina Sokol was partially supported by United States-Israel Binational Science Foundation (BSF) grant No. 2014028.

Funders (with funder numbers where listed):
Deutsche Forschungsgemeinschaft
United States-Israel Binational Science Foundation
Israel Science Foundation: 571/14, 2014028

    Keywords

    • Approximate periodicity
    • Edit distance
    • Hamming distance
    • Period recovery
