Abstract
This article presents a unique text and data mining method for mining temporal data, specifically for finding the era in which an anonymous author lived. Finding this era can assist in examining a fake document or in establishing the time period in which a writer lived. The study and the experiments concern rabbinic texts written in Hebrew and, in some parts, in Aramaic and Yiddish. These texts are undated and contain no bibliographic sections, which poses an interesting challenge. This work proposes algorithms that use key phrases and key words that allow the temporal organization of citations, together with linguistic patterns. Based on these key phrases, key words, and references, we established several types of "Iron-clad," Heuristic, and Greedy rules for estimating a writer's years of birth and death, framed as a classification task. Experiments were conducted on corpora containing documents authored by 12, 24, and 36 rabbinic writers and demonstrated promising results.
Original language | English |
---|---|
Article number | A7 |
Journal | ACM Transactions on Knowledge Discovery from Data |
Volume | 13 |
Issue number | 1 |
State | Published - Jan 2019 |
Bibliographical note
Publisher Copyright: © 2019 Copyright is held by the owner/author(s).
Keywords
- Hebrew-Aramaic documents
- Temporal-data
- key-phrases
- knowledge discovery
- text and data mining
- time analysis
- undated documents