TY - JOUR
T1 - Models of bitmap generation
T2 - A systematic approach to bitmap compression
AU - Bookstein, Abraham
AU - Klein, Shmuel T.
PY - 1992
Y1 - 1992
AB - In large IR systems, information about word occurrence may be stored in the form of a bit matrix, with rows corresponding to different words and columns to documents. Such a matrix is generally very large and very sparse. New methods for compressing such matrices are presented, which exploit possible correlations between rows and between columns. The methods are based on partitioning the matrix into small blocks and predicting the 1-bit distribution within a block by means of various bit generation models. Each block is then encoded using Huffman or arithmetic coding. The methods also use a new way of enumerating subsets of fixed size from a given superset. Preliminary experimental results indicate improvements over previous methods.
UR - http://www.scopus.com/inward/record.url?scp=28044448250&partnerID=8YFLogxK
U2 - 10.1016/0306-4573(92)90065-8
DO - 10.1016/0306-4573(92)90065-8
M3 - Article
AN - SCOPUS:28044448250
SN - 0306-4573
VL - 28
SP - 735
EP - 748
JO - Information Processing and Management
JF - Information Processing and Management
IS - 6
ER -