Improved hierarchical bit-vector compression in document retrieval systems

Y. Choueka, A. S. Fraenkel, S. T. Klein, E. Segal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

35 Scopus citations

Abstract

The "concordance" of an information retrieval system can often be stored in the form of bit-maps, which are usually very sparse and should be compressed. Hierarchical bit-vector compression consists of partitioning a vector V_i into equi-sized blocks, constructing a new bit-vector V_{i+1} which points to the non-zero blocks in V_i, dropping the zero-blocks of V_i, and repeating the process for V_{i+1}. We refine the method by pruning some of the tree branches if they ultimately point to very few documents; these document numbers are then added to an appended list which is compressed by the prefix-omission technique. The new method was thoroughly tested on the bit-maps of the Responsa Retrieval Project, and gave a relative improvement of about 40% over the conventional hierarchical compression method.
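
The conventional hierarchical scheme summarized in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `compress` and `decompress`, the default `block_size` of 4, and the choice to store the top-level vector in full are assumptions made for the example. The pruning refinement and the prefix-omission list described in the paper are not included.

```python
def compress(bits, block_size=4):
    """Return a list of levels; each level keeps only its non-zero blocks.

    The last entry is the top-level vector, stored in full.
    (Illustrative sketch; names and parameters are assumptions.)
    """
    levels = []
    current = list(bits)
    while len(current) > block_size:
        # Pad with zeros so the vector splits into whole blocks.
        current = current + [0] * ((-len(current)) % block_size)
        blocks = [current[i:i + block_size]
                  for i in range(0, len(current), block_size)]
        # Keep the non-zero blocks of V_i ...
        levels.append([b for b in blocks if any(b)])
        # ... and build V_{i+1}: one bit per block, 1 if the block is non-zero.
        current = [1 if any(b) else 0 for b in blocks]
    levels.append(current)
    return levels


def decompress(levels, length, block_size=4):
    """Rebuild the original vector of the given length from the levels."""
    current = levels[-1]
    for kept in reversed(levels[:-1]):
        stored = iter(kept)
        expanded = []
        for bit in current:
            # A 1 bit points to the next stored block; a 0 bit is a zero-block.
            expanded.extend(next(stored) if bit else [0] * block_size)
        current = expanded
    return current[:length]


def stored_bits(levels):
    """Count the bits actually kept across all levels."""
    return sum(len(b) for lvl in levels[:-1] for b in lvl) + len(levels[-1])
```

For a sparse 64-bit vector with only two bits set, `compress` keeps two blocks at each of the two lower levels plus a 4-bit top-level vector, so far fewer than 64 bits are stored, and `decompress` recovers the original vector exactly.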

Original language: English
Title of host publication: Proceedings of the 9th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1986
Editors: Fausto Rabitti
Publisher: Association for Computing Machinery, Inc
Pages: 88-96
Number of pages: 9
ISBN (Electronic): 0897911873, 9780897911870
State: Published - 1 Sep 1986
Event: 9th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1986 - Pisa, Italy
Duration: 8 Sep 1986 to 10 Sep 1986

Publication series

Name: Proceedings of the 9th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1986

Conference

Conference: 9th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1986
Country/Territory: Italy
City: Pisa
Period: 8/09/86 to 10/09/86

Bibliographical note

Publisher Copyright:
© Organization of the 1986-ACM Conference on Research and Development in Information Retrieval.
