Direct loss minimization for structured prediction

David McAllester, Tamir Hazan, Joseph Keshet

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

81 Scopus citations

Abstract

In discriminative machine learning one is interested in training a system to optimize a certain desired measure of performance, or loss. In binary classification one typically tries to minimizes the error rate. But in structured prediction each task often has its own measure of performance such as the BLEU score in machine translation or the intersection-over-union score in PASCAL segmentation. The most common approaches to structured prediction, structural SVMs and CRFs, do not minimize the task loss: the former minimizes a surrogate loss with no guarantees for task loss and the latter minimizes log loss independent of task loss. The main contribution of this paper is a theorem stating that a certain perceptronlike learning rule, involving features vectors derived from loss-adjusted inference, directly corresponds to the gradient of task loss. We give empirical results on phonetic alignment of a standard test set from the TIMIT corpus, which surpasses all previously reported results on this problem.

Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems 23
Subtitle of host publication24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010
PublisherNeural Information Processing Systems
ISBN (Print)9781617823800
StatePublished - 2010
Externally publishedYes
Event24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010 - Vancouver, BC, Canada
Duration: 6 Dec 20109 Dec 2010

Publication series

NameAdvances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

Conference

Conference24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010
Country/TerritoryCanada
CityVancouver, BC
Period6/12/109/12/10

Fingerprint

Dive into the research topics of 'Direct loss minimization for structured prediction'. Together they form a unique fingerprint.

Cite this