TY - JOUR

T1 - An improved branch-and-bound method for maximum monomial agreement

AU - Eckstein, Jonathan

AU - Goldberg, Noam

PY - 2012/3

Y1 - 2012/3

N2 - The NP-hard maximum monomial agreement problem consists of finding a single logical conjunction that is most consistent with or "best fits" a weighted data set of "positive" and "negative" binary vectors. Computing weighted voting classifiers using boosting methods involves a maximum agreement subproblem at each iteration, although such subproblems are typically solved in practice by heuristic methods. Here, we describe an exact branch-and-bound method for maximum agreement over Boolean monomials, improving on the earlier work of Goldberg and Shan [Goldberg, N., C. Shan. 2007. Boosting optimal logical patterns. Proc. 7th SIAM Internat. Conf. Data Mining, SIAM, Philadelphia, 228-236]. Specifically, we develop a tighter upper bounding function and an improved branching procedure that exploits knowledge of the bound and the particular data set, while having a lower branching factor. Experimental results show that the new method is able to solve larger problem instances and runs faster within a linear programming boosting procedure applied to medium-sized data sets from the UCI Machine Learning Repository. The new algorithm also runs much faster than applying a commercial mixed-integer programming solver, which uses linear programming relaxation-based bounds, to an integer linear programming formulation of the problem.

AB - The NP-hard maximum monomial agreement problem consists of finding a single logical conjunction that is most consistent with or "best fits" a weighted data set of "positive" and "negative" binary vectors. Computing weighted voting classifiers using boosting methods involves a maximum agreement subproblem at each iteration, although such subproblems are typically solved in practice by heuristic methods. Here, we describe an exact branch-and-bound method for maximum agreement over Boolean monomials, improving on the earlier work of Goldberg and Shan [Goldberg, N., C. Shan. 2007. Boosting optimal logical patterns. Proc. 7th SIAM Internat. Conf. Data Mining, SIAM, Philadelphia, 228-236]. Specifically, we develop a tighter upper bounding function and an improved branching procedure that exploits knowledge of the bound and the particular data set, while having a lower branching factor. Experimental results show that the new method is able to solve larger problem instances and runs faster within a linear programming boosting procedure applied to medium-sized data sets from the UCI Machine Learning Repository. The new algorithm also runs much faster than applying a commercial mixed-integer programming solver, which uses linear programming relaxation-based bounds, to an integer linear programming formulation of the problem.

KW - Branch and bound

KW - Combinatorial optimization

KW - Machine Learning

UR - http://www.scopus.com/inward/record.url?scp=84862020960&partnerID=8YFLogxK

U2 - 10.1287/ijoc.1110.0459

DO - 10.1287/ijoc.1110.0459

M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???

AN - SCOPUS:84862020960

SN - 1091-9856

VL - 24

SP - 328

EP - 341

JO - INFORMS Journal on Computing

JF - INFORMS Journal on Computing

IS - 2

ER -