Abstract
Cardinality estimation algorithms receive a stream of elements that may appear in arbitrary order, with possible repetitions, and return the number of distinct elements. Such algorithms usually seek to minimize the required storage at the price of inaccuracy in their output. This paper shows how to generalize every cardinality estimation algorithm that relies on extreme order statistics (min/max sketches) to a weighted version, where each item is associated with a weight and the goal is to estimate the total sum of weights. The proposed unified scheme uses the unweighted estimator as a black box and manipulates the input using properties of the beta distribution.
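The idea in the abstract can be illustrated with a minimal sketch. It is one plausible reading, not the paper's own construction: the transform, the estimator, and all names (`uniform_hash`, `to_beta`, `estimate_from_minima`, `weighted_sum_estimate`) are assumptions made for illustration. It relies on the fact that the minimum of independent Beta(1, w_i) variables is distributed as Beta(1, Σ w_i), so a min sketch fed Beta(1, w) values instead of uniform hashes behaves as if it had seen Σ w_i distinct elements.

```python
import hashlib
import math


def uniform_hash(item: str, seed: int) -> float:
    """Map (item, seed) to a pseudo-uniform value strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    x = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (x + 0.5) / 2**52


def to_beta(u: float, weight: float) -> float:
    """Inverse-CDF transform from Uniform(0,1) to Beta(1, weight).

    Equivalent to 1 - (1 - u) ** (1 / weight), written with
    log1p/expm1 for numerical stability. (Assumed transform.)
    """
    return -math.expm1(math.log1p(-u) / weight)


def estimate_from_minima(minima):
    """Unweighted min-sketch estimator, used here as the black box.

    If each minimum is Beta(1, n), then -log(1 - min) is Exp(n),
    and (m - 1) / sum is an unbiased estimate of n.
    """
    m = len(minima)
    return (m - 1) / sum(-math.log1p(-x) for x in minima)


def weighted_sum_estimate(stream, m=256):
    """Estimate the sum of weights over distinct items of a non-empty stream.

    `stream` yields (item, weight) pairs; an item is assumed to carry
    the same positive weight every time it appears.
    """
    minima = [1.0] * m
    for item, weight in stream:
        for j in range(m):
            v = to_beta(uniform_hash(item, j), weight)
            if v < minima[j]:
                minima[j] = v
    return estimate_from_minima(minima)
```

For example, calling `weighted_sum_estimate` on 200 distinct items whose weights sum to 600 should return a value near 600, with relative error on the order of 1/√m. Repeated items leave the sketch unchanged, because an item with a fixed weight always hashes to the same values.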
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 336-342 |
Number of pages | 7 |
Journal | Information Processing Letters |
Volume | 115 |
Issue number | 2 |
State | Published - Feb 2015 |
Externally published | Yes |
Bibliographical note
Funding Information: The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007–2013) under grant agreement No. 610802.
Publisher Copyright:
© 2014 Elsevier B.V. All rights reserved.
Keywords
- Algorithms
- Big data processing
- Statistical