TY - GEN
T1 - An optimal bloom filter replacement based on matrix solving
AU - Porat, Ely
PY - 2009
Y1 - 2009
N2 - We suggest a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters. The space requirements of the dictionary we suggest are much smaller than those of a hashtable. We allow storing n keys, each mapped to value which is a string of k bits. Our suggested method requires nk + o(n) bits space to store the dictionary, and O(n) time to produce the data structure, and allows answering a membership query in O(1) memory probes. The dictionary size does not depend on the size of the keys. However, reducing the space requirements of the data structure comes at a certain cost. Our dictionary has a small probability of a one sided error. When attempting to obtain the value for a key that is stored in the dictionary we always get the correct answer. However, when testing for membership of an element that is not stored in the dictionary, we may get an incorrect answer, and when requesting the value of such an element we may get a certain random value. Our method is based on solving equations in GF(2 k ) and using several hash functions. Another significant advantage of our suggested method is that we do not require using sophisticated hash functions. We only require pairwise independent hash functions. We also suggest a data structure that requires only nk bits space, has O(n 2) preprocessing time, and has a O(logn) query time. However, this data structures requires a uniform hash functions. In order replace a Bloom Filter of n elements with an error proability of 2 - k, we require nk + o(n) memory bits, O(1) query time, O(n) preprocessing time, and only pairwise independent hash function. Even the most advanced previously known Bloom Filter would require nk + O(n) space, and a uniform hash functions, so our method is significantly less space consuming especially when k is small. Our suggested dictionary can replace Bloom Filters, and has many applications. A few application examples are dictionaries for storing bad passwords, differential files in databases, Internet caching and distributed storage systems.
AB - We suggest a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters. The space requirements of the dictionary we suggest are much smaller than those of a hashtable. We allow storing n keys, each mapped to value which is a string of k bits. Our suggested method requires nk + o(n) bits space to store the dictionary, and O(n) time to produce the data structure, and allows answering a membership query in O(1) memory probes. The dictionary size does not depend on the size of the keys. However, reducing the space requirements of the data structure comes at a certain cost. Our dictionary has a small probability of a one sided error. When attempting to obtain the value for a key that is stored in the dictionary we always get the correct answer. However, when testing for membership of an element that is not stored in the dictionary, we may get an incorrect answer, and when requesting the value of such an element we may get a certain random value. Our method is based on solving equations in GF(2 k ) and using several hash functions. Another significant advantage of our suggested method is that we do not require using sophisticated hash functions. We only require pairwise independent hash functions. We also suggest a data structure that requires only nk bits space, has O(n 2) preprocessing time, and has a O(logn) query time. However, this data structures requires a uniform hash functions. In order replace a Bloom Filter of n elements with an error proability of 2 - k, we require nk + o(n) memory bits, O(1) query time, O(n) preprocessing time, and only pairwise independent hash function. Even the most advanced previously known Bloom Filter would require nk + O(n) space, and a uniform hash functions, so our method is significantly less space consuming especially when k is small. Our suggested dictionary can replace Bloom Filters, and has many applications. A few application examples are dictionaries for storing bad passwords, differential files in databases, Internet caching and distributed storage systems.
UR - http://www.scopus.com/inward/record.url?scp=70350339307&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-03351-3_25
DO - 10.1007/978-3-642-03351-3_25
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:70350339307
SN - 3642033504
SN - 9783642033506
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 263
EP - 273
BT - Computer Science - Theory and Applications - 4th International Computer Science Symposium in Russia, CSR 2009, Proceedings
T2 - 4th International Computer Science Symposium in Russia, CSR 2009
Y2 - 18 August 2009 through 23 August 2009
ER -