Abstract
XML documents combine features of classical IR systems, which allow free text, with explicit structures as in databases. Many query languages have been designed specifically for IR applications on XML documents. This work concentrates on one type of such language and investigates the problem of processing queries that include metrical constraints. The main question is how to define the distance between terms in different locations of the XML tree in an intuitively justifiable way, without jeopardizing the ability to get good retrieval results in terms of recall and precision. A new definition is given, and its usefulness is shown on several examples from the INEX collection.
Original language | English |
---|---|
Pages (from-to) | 86-97 |
Number of pages | 12 |
Journal | Journal of the American Society for Information Science and Technology |
Volume | 59 |
Issue number | 1 |
State | Published - 1 Jan 2008 |