A unified scheme for generalizing cardinality estimators to sum aggregation

Reuven Cohen, Liran Katzir, Aviv Yehezkel

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Cardinality estimation algorithms receive a stream of elements that may appear in arbitrary order, with possible repetitions, and return the number of distinct elements. Such algorithms usually seek to minimize the required storage at the price of inaccuracy in their output. This paper shows how to generalize every cardinality estimation algorithm that relies on extreme order statistics (min/max sketches) to a weighted version, where each item is associated with a weight and the goal is to estimate the total sum of weights. The proposed unified scheme uses the unweighted estimator as a black box, and manipulates the input using properties of the beta distribution.
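
As a rough illustration of the idea summarized above (not the paper's exact construction), the sketch below relies on one property of the beta distribution: if U is uniform on (0, 1), then U^(1/w) follows Beta(w, 1), which is also the distribution of the maximum of w independent uniforms when w is an integer. Feeding h(x)^(1/w) instead of h(x) into a max sketch therefore makes the sketch behave as if it had seen sum-of-weights distinct elements. All names here (uniform_hash, MaxSketch, estimate_weighted_sum) are hypothetical, and the estimator (m - 1) / Σ(-ln M_j) over m independent maxima is a simple choice made for this example, assumed rather than taken from the paper.

```python
import hashlib
import math


def uniform_hash(item: str, seed: int) -> float:
    """Map (item, seed) to a pseudo-uniform value strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    k = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (k + 0.5) / 2**52  # exactly representable, never 0 or 1


class MaxSketch:
    """Unweighted estimator: keeps the maximum of m independent hash streams."""

    def __init__(self, m: int = 256):
        self.m = m
        self.maxima = [0.0] * m

    def update(self, j: int, value: float) -> None:
        if value > self.maxima[j]:
            self.maxima[j] = value

    def estimate(self) -> float:
        if min(self.maxima) == 0.0:  # nothing has been inserted yet
            return 0.0
        # For n distinct uniform hashes, -ln(max) is exponential with rate n,
        # so the sum over m sketches is Gamma(m, n) and (m - 1) / sum estimates n.
        total = sum(-math.log(v) for v in self.maxima)
        # total can only be 0 if floating-point rounding pushed every maximum to 1.0
        return (self.m - 1) / total if total > 0 else math.inf


def estimate_weighted_sum(stream, m: int = 256) -> float:
    """Estimate the sum of weights over distinct items in (item, weight) pairs.

    Assumes each item always arrives with the same positive weight.
    """
    sketch = MaxSketch(m)
    for item, weight in stream:
        if weight <= 0:
            raise ValueError("weights must be positive")
        for j in range(m):
            u = uniform_hash(item, j)
            # U ** (1 / w) ~ Beta(w, 1): the only change to the unweighted input
            sketch.update(j, u ** (1.0 / weight))
    return sketch.estimate()


if __name__ == "__main__":
    stream = [(f"item-{i}", 1 + i % 5) for i in range(500)]  # true sum is 1500
    print(estimate_weighted_sum(stream + stream))  # repetitions do not change it
```

With weight 1 for every item the transformation is the identity, so the same sketch reduces to plain distinct counting. For very large weights, u ** (1 / w) crowds toward 1.0 in floating point; a practical implementation would likely work in the log domain instead.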

Original language: English
Pages (from-to): 336-342
Number of pages: 7
Journal: Information Processing Letters
Volume: 115
Issue number: 2
State: Published - Feb 2015
Externally published: Yes

Bibliographical note

Funding Information:
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007–2013) under grant agreement No. 610802.

Publisher Copyright:
© 2014 Elsevier B.V. All rights reserved.

Keywords

  • Algorithms
  • Big data processing
  • Statistical
