Modeling word occurrences for the compression of concordances

A. Bookstein, S. T. Klein, T. Raita

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations

Abstract

Effective compression of a text-based information retrieval system involves compressing not only the text itself but also the concordance through which that text is accessed, which occupies an amount of storage comparable to the text. The concordance can be complicated, especially if it permits hierarchical access to the database, but one or more components of the hierarchy can usually be conceptualized as a bitmap. In the model considered, each bitmap value of zero or one is generated in a given state, and the generation is governed by the transition probabilities of the model. Such a model has been referred to as a Hidden Markov Model and is difficult to approximate. The results obtained indicate that the approach can improve the compression of concordances.
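
As an illustration of the kind of generating process the abstract describes, the following minimal Python sketch emits a bitmap from a small hidden-state model. It is not the authors' model: the number of states, the names `generate_bitmap`, `TRANSITION` and `P_ONE`, and all probability values are assumptions chosen only to show how a hidden state can make 1-bits appear in clusters.

```python
import random


def generate_bitmap(n_bits, transition, p_one, start_state=0, seed=None):
    """Generate n_bits from a hidden-state model (illustrative only).

    transition[s][t] is the probability of moving from state s to state t;
    p_one[s] is the probability that state s emits a 1-bit.
    """
    rng = random.Random(seed)
    state = start_state
    bits = []
    for _ in range(n_bits):
        # Emit a bit according to the current (hidden) state.
        bits.append(1 if rng.random() < p_one[state] else 0)
        # Move to the next state according to the transition probabilities.
        state = rng.choices(range(len(transition)), weights=transition[state])[0]
    return bits


# Hypothetical two-state model: state 0 is "sparse", state 1 is "cluster".
# These values are assumptions, not figures from the paper.
TRANSITION = [[0.95, 0.05],
              [0.20, 0.80]]
P_ONE = [0.02, 0.60]

bitmap = generate_bitmap(64, TRANSITION, P_ONE, seed=1)
print("".join(map(str, bitmap)))
```

An observer sees only the bits, not the state sequence that produced them, which is what makes the model "hidden" and why fitting it to real concordance bitmaps is harder than fitting a memoryless model.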

Original language: English
Pages (from-to): 462
Number of pages: 1
Journal: Proceedings of the Data Compression Conference
State: Published - 1995
Externally published: Yes
Event: Proceedings of the 5th Data Compression Conference - Snowbird, UT, USA
Duration: 28 Mar 1995 → 30 Mar 1995
