Compression of correlated bit-vectors

A. Bookstein, S. T. Klein

Research output: Contribution to journal › Article › peer-review

39 Scopus citations

Abstract

Bitmaps are data structures occurring often in information retrieval. They are useful, but are also large and expensive to store. For this reason, considerable effort has been devoted to finding techniques for compressing them. These techniques are most effective for sparse bitmaps. We propose a preprocessing stage, in which bitmaps are first clustered and the clusters used to transform their member bitmaps into sparser ones that can be more effectively compressed. The clustering method efficiently generates a graph structure on the bitmaps. In some situations, it is desirable to impose restrictions on the graph; finding the optimal graph satisfying these restrictions is shown to be NP-complete. The results of applying our algorithm to the Bible are presented: for some sets of bitmaps, our method almost doubled the compression savings.
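
The preprocessing idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the transformation or the edge weights, so the sketch assumes that each bitmap is XORed with its parent in a maximum spanning tree, with similarity defined as the number of agreeing bit positions. The function names (`spanning_tree_parents`, `transform`, `restore`) and the sample data are hypothetical.

```python
def popcount(x: int) -> int:
    """Number of 1-bits in a non-negative integer."""
    return bin(x).count("1")


def spanning_tree_parents(bitmaps, n_bits):
    """Prim's algorithm for a maximum spanning tree over the bitmaps.

    Assumption: edge weight = n_bits - Hamming distance, so similar
    bitmaps are linked. The sparsest bitmap is chosen as the root.
    Returns a dict mapping each index to its parent index (None for root).
    """
    n = len(bitmaps)
    root = min(range(n), key=lambda i: popcount(bitmaps[i]))
    parent = {root: None}
    # best[i] = (similarity to the closest tree node so far, that node)
    best = {
        i: (n_bits - popcount(bitmaps[i] ^ bitmaps[root]), root)
        for i in range(n)
        if i != root
    }
    while best:
        i = max(best, key=lambda k: best[k][0])
        _, p = best.pop(i)
        parent[i] = p
        for j in best:
            sim = n_bits - popcount(bitmaps[i] ^ bitmaps[j])
            if sim > best[j][0]:
                best[j] = (sim, i)
    return parent


def transform(bitmaps, parent):
    """Replace each non-root bitmap by its XOR with its parent."""
    return [
        b if parent[i] is None else b ^ bitmaps[parent[i]]
        for i, b in enumerate(bitmaps)
    ]


def restore(transformed, parent):
    """Invert transform() by walking from each node up to the root."""
    restored = {}

    def get(i):
        if i not in restored:
            p = parent[i]
            restored[i] = transformed[i] if p is None else transformed[i] ^ get(p)
        return restored[i]

    return [get(i) for i in range(len(transformed))]


# Hypothetical 16-bit bitmaps: three highly correlated, one unrelated.
bitmaps = [0b1111000011110000, 0b1111000011110001,
           0b1111000011100000, 0b0000000000000011]
parent = spanning_tree_parents(bitmaps, n_bits=16)
sparser = transform(bitmaps, parent)
```

On this sample the transformed bitmaps contain fewer 1-bits in total than the originals, and `restore(sparser, parent)` recovers the input exactly. The XOR step does not guarantee a sparser result for every bitmap; it helps only when the linked bitmaps are actually correlated.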

Original language: English
Pages (from-to): 387-400
Number of pages: 14
Journal: Information Systems
Volume: 16
Issue number: 4
State: Published - 1991
Externally published: Yes

Keywords

  • Data compression
  • bitmap compression
  • clustering:application
  • data storage
  • maximum spanning tree:application
