Mining and using key-words and key-phrases to identify the era of an anonymous text

Dror Mughaz, Yaakov Hacohen-Kerner, Dov Gabbay

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

4 Scopus citations

Abstract

This study aims to determine the time-frame in which the author of a given document lived. The documents are rabbinic documents written in Hebrew-Aramaic. They are undated and contain no bibliographic section, which makes the task challenging. To determine the time-frame, we define a set of key-phrases and formulate rules of various types: “Iron-clad”, Heuristic, and Greedy. These rules are based on key-phrases and key-words in the authors' documents. Identifying an author's time-frame can help determine the generation in which specific documents were written, can help in examining documents (for example, to conclude whether they were edited), and can help identify an anonymous author. We tested these rules on two corpora of responsa documents. The results are promising and are better for the larger corpus than for the smaller one.
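The abstract does not spell out the rules themselves, so the following is only a minimal sketch of how key-phrase-based rules might narrow a time-frame. It assumes, purely for illustration, that the undated author refers to authors whose years are known, and that a key-phrase next to each reference marks the cited author as living or deceased at the time of writing. The names `Reference`, `estimate_time_frame`, and the labels `"living"`, `"deceased"`, and `"none"` are hypothetical and are not taken from the paper; the sketch also does not distinguish the “Iron-clad”, Heuristic, and Greedy rule types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """A mention of a dated author inside the undated author's documents."""
    cited_birth: int  # known birth year of the cited author
    cited_death: int  # known death year of the cited author
    key_phrase: str   # hypothetical label: "living", "deceased", or "none"


def estimate_time_frame(refs):
    """Return (earliest, latest) bounds on when the undated author was writing.

    Illustrative assumptions, not the paper's actual rules:
      * nobody can cite an author who has not been born yet;
      * a "deceased" key-phrase means the text was written after the death;
      * a "living" key-phrase means the text was written before the death.
    """
    earliest = float("-inf")
    latest = float("inf")
    for ref in refs:
        earliest = max(earliest, ref.cited_birth)
        if ref.key_phrase == "deceased":
            earliest = max(earliest, ref.cited_death)
        elif ref.key_phrase == "living":
            latest = min(latest, ref.cited_death)
    return earliest, latest


# Made-up example data, for demonstration only.
example_refs = [
    Reference(1500, 1570, "deceased"),
    Reference(1550, 1620, "living"),
    Reference(1530, 1600, "none"),
]
example_frame = estimate_time_frame(example_refs)
print(example_frame)
```

Each reference can only tighten the bounds, so more references (as in a larger corpus) would tend to give a narrower estimate under these assumptions. If no reference constrains a side, that bound stays infinite.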

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Ngoc Thanh Nguyen, Ryszard Kowalczyk, Alexandre Miguel Pinto, Jorge Cardoso
Publisher: Springer Verlag
Pages: 119-143
Number of pages: 25
ISBN (Print): 9783319592671
State: Published - 2017
Event: 1st International KEYSTONE Conference, IKC 2015 - Coimbra, Portugal
Duration: 8 Sep 2015 to 9 Sep 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10190
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st International KEYSTONE Conference, IKC 2015
Country/Territory: Portugal
City: Coimbra
Period: 8/09/15 to 9/09/15

Bibliographical note

Publisher Copyright:
© Springer International Publishing AG 2017.

Keywords

  • Hebrew-Aramaic documents
  • Key-phrases
  • Key-words
  • Knowledge discovery
  • Text mining
  • Time analysis
  • Undated documents
  • Undated references
