Efficient randomized algorithms for the repeated median line estimator

Jiri Matousek, David M. Mount, Nathan S. Netanyahu

Research output: Contribution to conference › Paper › peer-review

7 Scopus citations

Abstract

The problem of fitting a straight line to a finite collection of points in the plane is an important problem in statistical estimation. Recently there has been a great deal of interest in robust estimators, because of their lack of sensitivity to outlying data points. The basic measure of the robustness of an estimator is its breakdown point: intuitively, the fraction (up to 50%) of outlying data points that can corrupt the estimator. One problem with robust estimators is that achieving high breakdown points (near 50%) has proved to be computationally demanding. In this paper we present the best known theoretical algorithm and the first practical subquadratic algorithm for computing a 50% breakdown point estimator: the Siegel, or repeated median, estimator. We first present a randomized algorithm that runs in O(n log n) expected time, where n is the number of given points. This algorithm relies, however, on sophisticated data structures. We also present a very simple O(n log² n) randomized algorithm for this problem, which uses no complex data structures. We provide empirical evidence that, for many realistic input distributions, the running time of this second algorithm is actually O(n log n) expected time.
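For orientation, the estimator itself can be stated in a few lines. The sketch below is a brute-force O(n²) reference computation of the repeated median slope, not either of the subquadratic algorithms described in the abstract. It takes, for each point, the median of the slopes to all other points, and then the median of those medians. The function names, the assumption of distinct x-coordinates, and the intercept rule (median of the residuals y - slope * x, one common choice) are illustrative assumptions rather than details taken from the paper.

```python
from statistics import median


def repeated_median_slope(points):
    """Brute-force O(n^2) repeated median slope.

    Assumes (for this sketch) that all x-coordinates are distinct.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    inner_medians = []
    for i, (xi, yi) in enumerate(points):
        # Slopes of the lines through point i and every other point j.
        slopes = [
            (yj - yi) / (xj - xi)
            for j, (xj, yj) in enumerate(points)
            if j != i
        ]
        inner_medians.append(median(slopes))
    # Outer median over the per-point medians.
    return median(inner_medians)


def repeated_median_line(points):
    """Return (slope, intercept); the intercept rule is an assumed common choice."""
    slope = repeated_median_slope(points)
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept


# Nine points on y = 2x + 1, two of which are replaced by gross outliers.
sample = [(x, 2 * x + 1) for x in range(9)]
sample[3] = (3, 100)
sample[7] = (7, -50)
slope, intercept = repeated_median_line(sample)
```

On this sample the two corrupted points do not move the fit away from the underlying line, which illustrates the robustness that a high breakdown point is meant to capture.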

Original language: English
Pages: 74-82
Number of pages: 9
State: Published - 1993
Externally published: Yes
Event: Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms - Austin, TX, USA
Duration: 25 Jan 1993 → 27 Jan 1993

Conference

Conference: Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms
City: Austin, TX, USA
Period: 25/01/93 → 27/01/93
