A unified scheme for generalizing cardinality estimators to sum aggregation

Reuven Cohen, Liran Katzir, Aviv Yehezkel

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Cardinality estimation algorithms receive a stream of elements that may appear in arbitrary order, with possible repetitions, and return the number of distinct elements. Such algorithms usually seek to minimize the required storage at the price of inaccuracy in their output. This paper shows how to generalize every cardinality estimation algorithm that relies on extreme order statistics (min/max sketches) to a weighted version, where each item is associated with a weight and the goal is to estimate the total sum of weights. The proposed unified scheme uses the unweighted estimator as a black-box, and manipulates the input using properties of the beta distribution.
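The weighted generalization described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper; it rests on the following assumptions. The unweighted estimator is a simple max sketch that keeps m maxima of uniform hash values and estimates the count as (m - 1) / sum(-ln max). The input manipulation replaces each uniform hash value u by u^(1/w), which follows a Beta(w, 1) distribution, so the maximum over all items behaves like the maximum of (sum of weights) uniform draws. The names `uniform_hash`, `MaxSketch` and `add_weighted` are hypothetical.

```python
import hashlib
import math


def uniform_hash(item, seed):
    """Deterministically map (item, seed) to a pseudo-uniform value in (0, 1)."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    x = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (x + 0.5) / 2**52  # strictly between 0 and 1


class MaxSketch:
    """Assumed unweighted estimator: m maxima, one per hash function."""

    def __init__(self, m=256):
        self.m = m
        self.maxima = [0.0] * m

    def add_hashed(self, values):
        # Keep, per hash function, the largest value seen so far.
        for k, v in enumerate(values):
            if v > self.maxima[k]:
                self.maxima[k] = v

    def add(self, item):
        self.add_hashed(uniform_hash(item, k) for k in range(self.m))

    def estimate(self):
        if any(v == 0.0 for v in self.maxima):
            return 0.0  # nothing has been added yet
        # Each -ln(max) is exponential with rate n, so the sum is Gamma(m, n).
        return (self.m - 1) / sum(-math.log(v) for v in self.maxima)


def add_weighted(sketch, item, weight):
    """Feed the black-box sketch Beta(weight, 1) values instead of uniform ones."""
    sketch.add_hashed(
        uniform_hash(item, k) ** (1.0 / weight) for k in range(sketch.m)
    )


sketch = MaxSketch(m=256)
for item, weight in [("a", 3.0), ("b", 5.0), ("a", 3.0)]:
    add_weighted(sketch, item, weight)
print(sketch.estimate())  # approximates 3 + 5 = 8; the repeat of "a" has no effect
```

Because the hash is deterministic, a repeated item with the same weight produces the same transformed values and leaves the sketch unchanged, which matches the "possible repetitions" setting. With weight 1 the transformation is the identity, so the sketch reduces to the unweighted estimator.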

Original language: English
Pages (from-to): 336-342
Number of pages: 7
Journal: Information Processing Letters
Volume: 115
Issue number: 2
State: Published - Feb 2015
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2014 Elsevier B.V. All rights reserved.

Funding

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007–2013) under grant agreement No. 610802.

Funder: Seventh Framework Programme
Funder number: 610802

Keywords

• Algorithms
• Big data processing
• Statistical
