Semiautomatic construction of cross-period thesaurus

Chaya Liebeskind, Ido Dagan, Jonathan Schler

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

A cross-period (diachronic) thesaurus enables users to search for information using modern terminology and obtain semantically related terms from earlier historical periods. The complex task of supporting the construction of a diachronic thesaurus by a domain expert lexicographer has hardly been addressed computationally until now. In this article, we introduce a semiautomatic iterative Query Expansion (QE) scheme for supporting diachronic thesaurus construction, which identifies candidate related terms based on statistical corpus-based measures. We use ancient-modern period classification to increase the performance of the statistical cooccurrence measures and extend our methods to deal with Multi-Word Expressions (MWEs). We demonstrate the empirical benefit of our scheme for a Jewish cross-period thesaurus and evaluate its impact on recall and on the effectiveness of the lexicographer's manual efforts.

Original language: English
Article number: 22
Journal: Journal on Computing and Cultural Heritage
Volume: 9
Issue number: 4
State: Published - Dec 2016

Bibliographical note

Publisher Copyright:
© 2016 ACM 1556-4673/2016/12-ART22 $15.00.

Keywords

  • Cultural heritage
  • Diachronic thesaurus
  • Hebrew
  • Semantic similarity
