TY - JOUR
T1 - Extraction of time-related expressions using text mining with application to Hebrew
AU - Mughaz, Dror
AU - HaCohen-Kerner, Yaakov
AU - Gabbay, Dov
N1 - Publisher Copyright: © 2024 Public Library of Science. All rights reserved.
PY - 2024/2
Y1 - 2024/2
AB - In this research, we extract time-related expressions from a rabbinic text in a semi-automatic manner. These expressions usually appear next to rabbinic references (name / nickname / acronym / book-name). The first step toward our goal is to find all the expressions near references in the corpus. However, not all of the phrases around the references are time-related expressions. Therefore, these phrases are initially considered to be potential time-related expressions. To extract the time-related expressions, we formulate two new statistical functions, and we use screening and heuristic methods. We tested these statistical functions, grammatical screenings, and heuristic methods on a corpus containing responsa documents. In this corpus, many rabbinic citations are known and marked. The statistical functions and the screening methods filtered the potential time-related expressions and reduced 99.88% of the initial expressions (from 484,681 to 575).
UR - http://www.scopus.com/inward/record.url?scp=85185786417&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0293196
DO - 10.1371/journal.pone.0293196
M3 - Article
C2 - 38394097
AN - SCOPUS:85185786417
SN - 1932-6203
VL - 19
JO - PLoS ONE
JF - PLoS ONE
IS - 2
M1 - e0293196
ER -