Abstract
In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automatic tools to measure the frequencies of various stylistic idiosyncrasies in a text. These frequencies then serve as features for support vector machines that learn to classify texts by the author's native language.
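The two steps described in the abstract (frequency features, then a support vector machine) can be illustrated with a minimal sketch. Everything concrete in it is an assumption rather than the paper's setup: the abstract does not list its features, so `FUNCTION_WORDS` is a placeholder feature set; `stylistic_features`, `train_linear_svm`, and `predict` are hypothetical helper names; the trainer is a plain subgradient-descent linear SVM chosen only to keep the example dependency-free; and the two toy texts are invented.

```python
import re
from collections import Counter

# Hypothetical feature set: a handful of English function words.
# The abstract does not name its features; these are placeholders.
FUNCTION_WORDS = ["the", "a", "of", "in", "on"]


def stylistic_features(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss.

    Labels in `y` must be +1 or -1. Returns (weights, bias).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularisation step
            if margin < 1.0:  # hinge loss is active for this example
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Invented toy texts (not from the paper): +1 uses articles, -1 omits them.
texts = [
    "The cat sat on the mat in the house",
    "Cat sat on mat in house",
]
labels = [1, -1]

X = [stylistic_features(t) for t in texts]
w, b = train_linear_svm(X, labels)
print(predict(w, b, stylistic_features("She left a book on the table")))
```

With realistic data, one would likely use many more features and an established SVM implementation (for example, scikit-learn's `LinearSVC`) instead of this hand-rolled trainer, and handle more than two native languages with a multiclass scheme.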
Original language | English |
---|---|
Pages | 624-628 |
Number of pages | 5 |
State | Published - 2005 |
Event | KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Chicago, IL, United States; Duration: 21 Aug 2005 → 24 Aug 2005 |
Conference
Conference | KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
---|---|
Country/Territory | United States |
City | Chicago, IL |
Period | 21/08/05 → 24/08/05 |
Keywords
- Author profiling
- Text mining