TY - GEN

T1 - Boosting optimal logical patterns using noisy data

AU - Goldberg, Noam

AU - Shan, Chung-chieh

PY - 2007

Y1 - 2007

N2 - We consider the supervised learning of a binary classifier from noisy observations. We use smooth boosting to linearly combine abstaining hypotheses, each of which maps a subcube of the attribute space to one of the two classes. We introduce a new branch-and-bound weak learner to maximize the agreement rate of each hypothesis. Dobkin et al. give an algorithm for maximizing agreement with real-valued attributes [9]. Our algorithm improves on the time complexity of Dobkin et al.'s as long as the data can be binarized so that the number of binary attributes is o((log of the number of observations) × (number of real-valued attributes)). Furthermore, we have fine-tuned our branch-and-bound algorithm with a queueing discipline and an optimality gap to make it fast in practice. Finally, since logical patterns in Hammer et al.'s Logical Analysis of Data (LAD) framework [8, 6] are equivalent to abstaining monomial hypotheses, any boosting algorithm can be combined with our proposed weak learner to construct LAD models. On various data sets, our method outperforms state-of-the-art methods that use suboptimal or heuristic weak learners, such as SLIPPER. It is competitive with other optimizing classifiers that combine monomials, such as LAD. Compared to LAD, our method eliminates many free parameters that restrict the hypothesis space and require extensive fine-tuning by cross-validation.

UR - http://www.scopus.com/inward/record.url?scp=49549092654&partnerID=8YFLogxK

U2 - 10.1137/1.9781611972771.21

DO - 10.1137/1.9781611972771.21

M3 - Conference contribution

AN - SCOPUS:49549092654

SN - 9780898716306

T3 - Proceedings of the 7th SIAM International Conference on Data Mining

SP - 228

EP - 236

BT - Proceedings of the 7th SIAM International Conference on Data Mining

PB - Society for Industrial and Applied Mathematics

T2 - 7th SIAM International Conference on Data Mining

Y2 - 26 April 2007 through 28 April 2007

ER -