Abstract
The importance of hypertext has been steadily growing over the past decade. The Internet and other information systems use hypertext format, with data organized associatively rather than sequentially or relationally. A myriad of textual problems have been considered in the pattern matching field, with many nontrivial results. Nevertheless, surprisingly little work has been done on the natural combination of pattern matching and hypertext. In contrast to regular text, hypertext has a nonlinear structure, so the techniques of pattern matching for text cannot be directly applied to hypertext. Manber and Wu (1992, "IAPR Workshop on Structural and Syntactic Pattern Recognition, Bern, Switzerland") pioneered the study of pattern matching in hypertext and defined a hypertext model for pattern matching. Akutsu (1993, "Proceedings of the 4th Symposium on Combinatorial Pattern Matching, Padova, Italy," pp. 1-10) developed an algorithm that can be used for exact pattern matching in a tree-structured hypertext. Park and Kim (1995, "6th Symposium on Combinatorial Pattern Matching, Helsinki, Finland") considered regular pattern matching in hypertext. They developed a complex algorithm that works for hypertext whose underlying structure is a DAG. In this paper we present a much simpler algorithm that achieves the same complexity and runs on any hypertext graph. We then extend the problem to approximate pattern matching in hypertext, first considering Hamming distance and then edit distance. We show that, in contrast to regular text, it does make a difference whether the errors occur in the hypertext or in the pattern. The approximate pattern matching problem in hypertext with errors in the hypertext turns out to be NP-complete, whereas the approximate pattern matching problem in hypertext with errors in the pattern has a polynomial time solution.
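To make the problem statement concrete, here is a minimal sketch of Hamming-distance matching with errors in the pattern. It is not the algorithm from the paper. It assumes a simplified hypertext model in which every node carries exactly one character and the pattern must be spelled along a directed path (cycles allowed). The function name `min_pattern_mismatches` and the `labels`/`edges` representation are illustrative assumptions. A result of 0 corresponds to an exact match.

```python
from math import inf
from typing import Dict, List, Optional


def min_pattern_mismatches(
    labels: List[str], edges: Dict[int, List[int]], pattern: str
) -> Optional[int]:
    """Fewest pattern characters that must change so the pattern is spelled
    along some directed path; None if no path has len(pattern) nodes.

    Assumed model (not from the paper): node v holds the single character
    labels[v], and edges[u] lists the nodes reachable from u in one step.
    """
    if not pattern:
        return 0
    n = len(labels)
    # cost[v]: fewest mismatches for the pattern prefix read so far,
    # over all paths that end at node v.
    cost = [0 if labels[v] == pattern[0] else 1 for v in range(n)]
    for ch in pattern[1:]:
        nxt = [inf] * n
        for u in range(n):
            if cost[u] == inf:
                continue
            for v in edges.get(u, []):
                candidate = cost[u] + (labels[v] != ch)
                if candidate < nxt[v]:
                    nxt[v] = candidate
        cost = nxt
    best = min(cost, default=inf)
    return None if best == inf else int(best)


# Hypothetical example graph: a -> b -> c -> a (a cycle), plus b -> a'.
labels = ["a", "b", "c", "a"]
edges = {0: [1], 1: [2, 3], 2: [0]}
print(min_pattern_mismatches(labels, edges, "abcabca"))  # exact match around the cycle
print(min_pattern_mismatches(labels, edges, "abcb"))     # one pattern character must change
```

The table is indexed by pattern position, so revisiting a node on a cycle causes no difficulty, and the sketch runs in time proportional to the pattern length times the graph size. This shortcut is unavailable when the errors are in the hypertext, because changing a node's character affects every visit to that node.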
| Original language | English |
|---|---|
| Pages (from-to) | 82-99 |
| Number of pages | 18 |
| Journal | Journal of Algorithms |
| Volume | 35 |
| Issue number | 1 |
| DOIs | |
| State | Published - Apr 2000 |
Bibliographical note
Funding Information: (1) A preliminary version of this paper appeared in WADS 97. (2) Partially supported by NSF Grant CCR-96-101709, BSF Grant 96-00509, and a Bar-Ilan University Internal Research Grant. (3) Partially supported by the Israel Ministry of Science and the Arts Grant 8560. This work is part of this author’s Ph.D. dissertation.
Funding
| Funders | Funder number |
|---|---|
| National Science Foundation | CCR-96-101709 |
| United States-Israel Binational Science Foundation | 96-00509 |
| Bar-Ilan University | |
| Ministry of science and technology, Israel | |
Keywords
- Combinatorial algorithms on words
- Design and analysis of algorithms
- Hypertext
- Pattern matching
- Pattern matching on hypertext