SVM Model Tampering and Anchored Learning: A case study in Hebrew NP chunking

Yoav Goldberg, Michael Elhadad

Research output: Contribution to journal › Article › peer-review

Abstract

We study the issue of porting a known NLP method to a language with few existing NLP resources, specifically Hebrew SVM-based chunking. We introduce two SVM-based methods, Model Tampering and Anchored Learning. These allow fine-grained analysis of the learned SVM models, which provides guidance for identifying errors in the training corpus, distinguishing the role and interaction of lexical features, and eventually constructing a model with ∼10% error reduction. The resulting chunker is shown to be robust in the presence of noise in the training corpus, relies on fewer lexical features than was previously understood, and achieves an F-measure of 92.2 on automatically PoS-tagged text. The SVM analysis methods also provide general insight into SVM-based chunking. © 2007 Association for Computational Linguistics.
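The "Model Tampering" idea described in the abstract — probing a trained linear SVM by removing parts of the learned model and observing the effect on predictions — can be sketched in miniature. The following is a hypothetical toy illustration, not the paper's actual implementation: the feature vectors, weights, and data are invented, and a real chunker would use a trained SVM over rich lexical and PoS features.

```python
# Toy sketch of "model tampering": zero out individual feature weights in a
# trained linear (SVM-like) model and measure the effect on predictions.
# All weights and data below are hypothetical, purely for illustration.

# Hypothetical learned weights for 3 features (e.g. word, PoS tag, suffix).
weights = [2.0, 0.5, -1.5]
bias = 0.1

def predict(x, w, b):
    # Linear decision rule: sign of w.x + b.
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Tiny invented evaluation set: (binary feature vector, gold label).
data = [([1, 0, 0], 1), ([0, 1, 0], 1), ([0, 0, 1], -1), ([1, 0, 1], 1)]

def accuracy(w):
    # Fraction of examples the (possibly tampered) model classifies correctly.
    return sum(predict(x, w, bias) == y for x, y in data) / len(data)

baseline = accuracy(weights)
for i in range(len(weights)):
    tampered = list(weights)
    tampered[i] = 0.0  # "tamper": erase feature i from the learned model
    print(f"feature {i}: accuracy {accuracy(tampered):.2f} "
          f"(baseline {baseline:.2f})")
```

Features whose removal leaves accuracy unchanged are candidates for pruning, while large drops flag features the model genuinely depends on — the kind of fine-grained analysis the abstract attributes to the method.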

