A corpus-independent feature set for style-based text categorization

M. Koppel, Navot Akiva, I. Dagan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

We suggest a corpus-independent feature set appropriate for style-based text categorization problems. To achieve this, we introduce a new measure on linguistic features, called stability, which captures the extent to which a language element, such as a word or syntactic construct, is replaceable by semantically equivalent elements. This measure may be perceived as quantifying the degree of available “synonymy” for a language item. We show that frequent but unstable features are especially useful for style-based text categorization.
Original language: American English
Title of host publication: IJCAI
State: Published - 2003

Bibliographical note

Place of conference: Mexico
