Concentration Inequalities for Random Sets

E. Segal-Halevi, A. Hassidim

Research output: Working paper / PreprintPreprint

Abstract

In a large, possibly infinite population, each subject is colored red with probability p, independently of the others. Then, a finite sub-population is selected, possibly as a function of the coloring. The imbalance in the sub-population is defined as the difference between the number of reds in it and p times its size. This paper presents high-probability upper bounds (tail-bounds) on this imbalance. To present the upper bounds we define the *UI dimension* --- a new measure for the richness of a set-family. We present three simple rules for upper-bounding the UI dimension of a set-family. Our upper bounds on the imbalance in a sub-population depend only on the size of the sub-population and on the UI dimension of its support. We relate our results to known concepts from machine learning, particularly the VC dimension and Rademacher complexity.
Original languageEnglish
Number of pages13
Volume8580
DOIs
StatePublished - 27 Dec 2016

Publication series

NamearXiv preprint arXiv:1612.,

Fingerprint

Dive into the research topics of 'Concentration Inequalities for Random Sets'. Together they form a unique fingerprint.

Cite this