Full text document retrieval: Hebrew legal texts (Report on the first phase of the responsa retrieval project)

Y. Choueka, M. Cohen, J. Dueck, A. S. Fraenkel, M. Slae

Research output: Contribution to conference › Paper › peer-review

18 Scopus citations

Abstract

A full text retrieval system was designed for the responsa literature, which is a large corpus of Hebrew legal cases. The unique problems of the data base - mixture of Hebrew, Aramaic and vernaculars, lack of vowels and punctuation, extreme language inflection problems, homographs, existence of thousands of grammatical variants of any given keyword - dictated development of new methods. Among them we list "grammatical synthesis", which synthesizes all grammatical variants of a given keyword; "Compact KWIC", which enables the user to have a glimpse of the nature of the search before having performed it; effective citation index imbedded in full text searches; and, in general, extensive use of both positive and negative feedback within a single search run. A number of searches performed on a relatively small data base gave in each case a recall of 100%. The average precision was 34%. A KWIC of strategic portions of retrieved documents usually enables a quick disposal of non-relevant material.

Original language: English
Pages: 61-79
Number of pages: 19
State: Published - 1 Apr 1971
Event: 1971 Annual ACM Conference on Research and Development in Information Retrieval, SIGIR 1971 - College Park, United States
Duration: 1 Apr 1971 to 2 Apr 1971

Conference

Conference: 1971 Annual ACM Conference on Research and Development in Information Retrieval, SIGIR 1971
Country/Territory: United States
City: College Park
Period: 1/04/71 to 2/04/71

Bibliographical note

Publisher Copyright:
© 1971 ACM.

Funding

(1) Department of Mathematics, Bar-Ilan University. (2) The Inter-Kibbutz Computer Center, Tel-Aviv. (3) Institute of Research in Jewish Law, The Hebrew University of Jerusalem. (4) Department of Applied Mathematics, The Weizmann Institute of Science, and Department of Mathematics, Bar-Ilan University. Work supported, in part, by the U.S. National Bureau of Standards. Reproduction in whole or in part is permitted for any purpose of the U.S. Government.

The surviving remnants of this vast literature consist of thousands of volumes, written by thousands of authors, containing hundreds of millions of words. Only very partial and incomplete indexes exist. Among the modern-day indexing projects we mention the classification of selected responsa based on the code Shulchan Aruch ("Otzar Haposkim" Vols. i-ii, Jerusalem, 1968); the hierarchical index being constructed now by the Institute of Research in Jewish Law at the Hebrew University ("Index to the Responsa of Rabbi Asher ben Yechiel", M. Elon editor, Inst. Res. Jewish Law, Hebrew University, Jerusalem, 1963); and the card catalog index to the historical material of Middle-Eastern responsa being prepared in the Department of Jewish History at Bar-Ilan University. (For other references, see S. B. Freehof, "The Responsa Literature", The Jewish Publication Society of America, Philadelphia, 1959).

Funders: U.S. National Bureau of Standards

    Keywords

    • Case law retrieval
    • Feedback
    • Full text retrieval
    • Grammatical synthesis
    • Hebrew computational linguistics
    • Legal cases
    • Metrical operators
    • Responsa
