TY - JOUR
T1 - An improved branch-and-bound method for maximum monomial agreement
AU - Eckstein, Jonathan
AU - Goldberg, Noam
PY - 2012/3
Y1 - 2012/3
N2 - The NP-hard maximum monomial agreement problem consists of finding a single logical conjunction that is most consistent with or "best fits" a weighted data set of "positive" and "negative" binary vectors. Computing weighted voting classifiers using boosting methods involves a maximum agreement subproblem at each iteration, although such subproblems are typically solved in practice by heuristic methods. Here, we describe an exact branch-and-bound method for maximum agreement over Boolean monomials, improving on the earlier work of Goldberg and Shan [Goldberg, N., C. Shan. 2007. Boosting optimal logical patterns. Proc. 7th SIAM Internat. Conf. Data Mining, SIAM, Philadelphia, 228-236]. Specifically, we develop a tighter upper bounding function and an improved branching procedure that exploits knowledge of the bound and the particular data set, while having a lower branching factor. Experimental results show that the new method is able to solve larger problem instances and runs faster within a linear programming boosting procedure applied to medium-sized data sets from the UCI Machine Learning Repository. The new algorithm also runs much faster than applying a commercial mixed-integer programming solver, which uses linear programming relaxation-based bounds, to an integer linear programming formulation of the problem.
AB - The NP-hard maximum monomial agreement problem consists of finding a single logical conjunction that is most consistent with or "best fits" a weighted data set of "positive" and "negative" binary vectors. Computing weighted voting classifiers using boosting methods involves a maximum agreement subproblem at each iteration, although such subproblems are typically solved in practice by heuristic methods. Here, we describe an exact branch-and-bound method for maximum agreement over Boolean monomials, improving on the earlier work of Goldberg and Shan [Goldberg, N., C. Shan. 2007. Boosting optimal logical patterns. Proc. 7th SIAM Internat. Conf. Data Mining, SIAM, Philadelphia, 228-236]. Specifically, we develop a tighter upper bounding function and an improved branching procedure that exploits knowledge of the bound and the particular data set, while having a lower branching factor. Experimental results show that the new method is able to solve larger problem instances and runs faster within a linear programming boosting procedure applied to medium-sized data sets from the UCI Machine Learning Repository. The new algorithm also runs much faster than applying a commercial mixed-integer programming solver, which uses linear programming relaxation-based bounds, to an integer linear programming formulation of the problem.
KW - Branch and bound
KW - Combinatorial optimization
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=84862020960&partnerID=8YFLogxK
U2 - 10.1287/ijoc.1110.0459
DO - 10.1287/ijoc.1110.0459
M3 - Article
AN - SCOPUS:84862020960
SN - 1091-9856
VL - 24
SP - 328
EP - 341
JO - INFORMS Journal on Computing
JF - INFORMS Journal on Computing
IS - 2
ER -