Cross-document pattern matching

Tsvi Kopelowitz, Gregory Kucherov, Yakov Nekrich, Tatiana Starikovskaya

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

We study a new variant of the pattern matching problem called cross-document pattern matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, and efficient linear space solutions are proposed with query time bounds that either do not depend at all on the pattern size or depend on it in a very limited way (doubly logarithmic). As a side result, we propose an improved solution to the weighted ancestor problem.
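
To make the query in the abstract concrete, here is a minimal sketch of a naive baseline. It is not the indexing scheme proposed in the paper: it simply extracts the pattern from one document and scans the selected document, so its running time grows with the length of that document rather than matching the bounds described above. The function name `cross_document_occurrences` and its parameters are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def cross_document_occurrences(
    docs: Sequence[str], src: int, start: int, end: int, target: int
) -> List[int]:
    """Naive baseline (NOT the paper's index): report every starting
    position in docs[target] where docs[src][start:end] occurs."""
    if not (0 <= start < end <= len(docs[src])):
        raise ValueError("pattern must be a non-empty substring of docs[src]")

    # The pattern is given implicitly, as a substring of another document.
    pattern = docs[src][start:end]
    text = docs[target]

    positions = []
    pos = text.find(pattern)
    while pos != -1:
        positions.append(pos)
        # Advance by one so that overlapping occurrences are also reported.
        pos = text.find(pattern, pos + 1)
    return positions
```

A query is specified by indices alone (source document, substring boundaries, target document), which is why an index can aim for query times that depend little or not at all on the pattern length.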

Original language: English
Pages (from-to): 40-47
Number of pages: 8
Journal: Journal of Discrete Algorithms
Volume: 24
State: Published - Jan 2014
Externally published: Yes

Bibliographical note

Funding Information:
The authors gratefully acknowledge the help of Travis Gagie, who suggested using a compressed suffix tree in the succinct version of the document reporting problem. T. Starikovskaya has been supported by the mobility grant funded by the French Ministry of Foreign Affairs through the EGIDE agency, by RFBR grant 10-01-93109-CNRS-a, and by the Dynasty Foundation.

Keywords

  • Algorithms
  • Document reporting
  • Pattern matching
  • Weighted ancestor problem
