Automatically Categorizing Written Texts by Author Gender

Moshe Koppel, Shlomo Argamon, Anat Rachel Shimoni

Research output: Contribution to journal, Article, peer-review

439 Scopus citations


The problem of automatically determining the gender of a document’s author would appear to be a more subtle problem than those of categorization by topic or authorship attribution. Nevertheless, it is shown that automated text categorization techniques can exploit combinations of simple lexical and syntactic features to infer the gender of the author of an unseen formal written document with approximately 80 per cent accuracy. The same techniques can be used to determine if a document is fiction or non-fiction with approximately 98 per cent accuracy.
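
As a rough illustration of what categorization from simple lexical features can look like, the sketch below trains a linear classifier on function-word frequencies. It is not the authors' method: the abstract does not name the features or the learning algorithm, so the word list `FUNCTION_WORDS`, the perceptron learner, and the toy texts with placeholder classes +1 and -1 are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical feature list: a handful of English function words.
# The abstract does not say which features were actually used.
FUNCTION_WORDS = ["the", "of", "a", "she", "for", "with", "not", "and"]


def features(text):
    """Relative frequency of each function word, plus a constant bias term."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS] + [1.0]


def train_perceptron(examples, epochs=20):
    """Train on (text, label) pairs, where label is +1 or -1."""
    weights = [0.0] * (len(FUNCTION_WORDS) + 1)
    for _ in range(epochs):
        for text, label in examples:
            x = features(text)
            score = sum(w * xi for w, xi in zip(weights, x))
            if label * score <= 0:  # misclassified: move weights toward label
                weights = [w + label * xi for w, xi in zip(weights, x)]
    return weights


def predict(weights, text):
    """Return +1 or -1 for an unseen text."""
    score = sum(w * xi for w, xi in zip(weights, features(text)))
    return 1 if score > 0 else -1


# Toy data with arbitrary placeholder classes; not drawn from the paper.
toy_examples = [
    ("the cat of the house", 1),
    ("she went with a friend", -1),
    ("the end of the road", 1),
    ("she came with a gift", -1),
]
toy_weights = train_perceptron(toy_examples)
```

A call such as `predict(toy_weights, "the top of the hill")` then classifies an unseen text. Real experiments of this kind would need a large labelled corpus, a much richer feature set (including the syntactic features the abstract mentions), and held-out evaluation to estimate accuracy.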

Original language: English
Pages (from-to): 401-412
Number of pages: 12
Journal: Literary and Linguistic Computing
Issue number: 4
State: Published - Nov 2002

