Abstract
The linear least trimmed squares (LTS) estimator is a statistical technique for fitting a linear model to a set of points. It was proposed by Rousseeuw as a robust alternative to the classical least squares estimator. Given a set of n points in ℝ^d, the objective is to minimize the sum of the smallest 50% of the squared residuals (or, more generally, any given fraction). There exist practical heuristics for computing the linear LTS estimator, but they provide no guarantees on the accuracy of the final result. Two results are presented. First, a measure of the numerical condition of a set of points is introduced. Based on this measure, a probabilistic analysis of the accuracy of the best LTS fit resulting from a set of random elemental fits is presented. This analysis shows that as the condition of the point set improves, the accuracy of the resulting fit also increases. Second, a new approximation algorithm for LTS, called Adaptive-LTS, is described. Given bounds on the minimum and maximum slope coefficients, this algorithm returns an approximation to the optimal LTS fit whose slope coefficients lie within the given bounds. Empirical evidence of this algorithm's efficiency and effectiveness is provided for a variety of data sets.
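To make the LTS objective and the notion of random elemental fits concrete, the following is a minimal sketch for the planar case (a line y = slope·x + intercept), not the paper's Adaptive-LTS algorithm. The function names `lts_cost` and `best_elemental_fit`, the rounding rule for the number of retained residuals, the trial count, and the toy data are all assumptions made for illustration.

```python
import random


def lts_cost(points, slope, intercept, fraction=0.5):
    """Sum of the smallest `fraction` of squared residuals for a candidate line."""
    squared = sorted((y - (slope * x + intercept)) ** 2 for x, y in points)
    # Number of residuals kept; truncating with int() is an assumed convention.
    h = max(1, int(len(points) * fraction))
    return sum(squared[:h])


def best_elemental_fit(points, trials=200, fraction=0.5, seed=0):
    """Return (cost, slope, intercept) of the best line through random point pairs."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        # In the plane, an elemental fit is the line through two sampled points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line: not representable as y = slope*x + intercept
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        cost = lts_cost(points, slope, intercept, fraction)
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    return best


# Toy data (hypothetical): 8 inliers on y = 2x + 1 plus 3 gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(8)]
points += [(1.0, 40.0), (3.0, -30.0), (6.0, 55.0)]

cost, slope, intercept = best_elemental_fit(points)
print(cost, slope, intercept)
```

Because the trimmed cost ignores the largest residuals, any elemental fit through two inliers recovers the underlying line here despite the outliers. A heuristic of this kind carries no accuracy guarantee in general, which is the gap the abstract's probabilistic analysis addresses.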
Original language | English |
---|---|
Pages (from-to) | 148-170 |
Number of pages | 23 |
Journal | Computational Statistics and Data Analysis |
Volume | 99 |
DOIs | |
State | Published - 1 Jul 2016 |
Bibliographical note
Publisher Copyright: © 2016 Elsevier B.V. All rights reserved.
Funding
We would like to thank Peter Rousseeuw for providing us with the DPOSS data set. Also, we are very grateful to the anonymous referees for their informative and helpful comments. The work of D.M. Mount has been partially supported by NSF grant CCF-1117259 and ONR grant N00014-08-1-1015.
Funders | Funder number |
---|---|
National Science Foundation | CCF-1117259 |
Office of Naval Research | N00014-08-1-1015 |
Keywords
- Approximation algorithms
- Computational geometry
- Least trimmed squares
- Linear estimation
- Robust estimation