Abstract
We discuss a real-world application of a recently proposed machine learning method for authorship verification. Authorship verification is considered an extremely difficult task in computational text classification, because it does not assume that the correct author of an anonymous text is among the available candidate authors. To determine whether two documents were written by the same author, the verification method discussed uses repeated feature subsampling and a pool of impostor authors. We use this technique to attribute a newly discovered Latin text from antiquity (the Compendiosa expositio) to Apuleius. This North African writer was one of the most important authors of the Roman Empire in the 2nd century and authored one of the world's first novels. This attribution has profound and wide-reaching cultural value, because it has been over a century since a new text by a major author from antiquity was discovered. This research therefore illustrates the rapidly growing potential of computational methods for studying the global textual heritage.
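The verification idea named in the abstract (repeated feature subsampling against a pool of impostor authors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the min-max similarity measure, the 50% feature fraction, the iteration count, and the toy frequency vectors are all assumptions chosen for illustration. Documents are assumed to be represented as equal-length vectors of non-negative feature frequencies.

```python
import random


def minmax_similarity(a, b, idx):
    """Min-max similarity of two frequency vectors, restricted to the indices in idx."""
    num = sum(min(a[i], b[i]) for i in idx)
    den = sum(max(a[i], b[i]) for i in idx)
    return num / den if den else 0.0


def impostor_score(x, y, impostors, n_iter=100, feature_frac=0.5, seed=0):
    """Fraction of iterations in which y is more similar to x than every impostor is.

    Each iteration draws a fresh random subset of the features (assumed
    fraction: feature_frac) and compares x with y and with each impostor
    on that subset only.
    """
    rng = random.Random(seed)
    n_features = len(x)
    k = max(1, int(feature_frac * n_features))
    wins = 0
    for _ in range(n_iter):
        idx = rng.sample(range(n_features), k)
        target = minmax_similarity(x, y, idx)
        best_impostor = max(minmax_similarity(x, imp, idx) for imp in impostors)
        if target > best_impostor:
            wins += 1
    return wins / n_iter


# Hypothetical toy data: frequency vectors for an anonymous text, a text by
# the candidate author, and two impostor texts.
anonymous = [5, 3, 1, 2]
candidate = [5, 3, 1, 2]
impostor_pool = [[0, 1, 6, 4], [2, 0, 5, 5]]

score = impostor_score(anonymous, candidate, impostor_pool)
```

A score close to 1 means the candidate's text beat the impostors on nearly every feature subset, which supports a same-author verdict; a score close to 0 supports the opposite. Where to place the decision threshold is left open here and would need to be calibrated on texts of known authorship.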
Original language | English |
---|---|
Pages (from-to) | 239-242 |
Number of pages | 4 |
Journal | Journal of the Association for Information Science and Technology |
Volume | 67 |
Issue number | 1 |
State | Published - 1 Jan 2016 |
Bibliographical note
Publisher Copyright: © 2015 ASIS&T
Keywords
- authorship
- automatic classification
- computational linguistics
- machine learning
- verification