Minimal Supervision for Morphological Inflection

Omer Goldman, Reut Tsarfaty

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Neural models for the various flavours of morphological reinflection tasks have proven to be extremely accurate given ample labeled data, yet labeled data may be slow and costly to obtain. In this work we aim to overcome this annotation bottleneck by bootstrapping labeled data from a seed as small as five labeled inflection tables, accompanied by a large bulk of unlabeled text. Our bootstrapping method exploits the orthographic and semantic regularities in morphological systems in a two-phased setup, where word tagging based on analogies is followed by word pairing based on distances. Our experiments with the Paradigm Cell Filling Problem over eight typologically different languages show that in languages with relatively simple morphology, orthographic regularities on their own allow inflection models to achieve respectable accuracy. Combined orthographic and semantic regularities alleviate difficulties with particularly complex morpho-phonological systems. We further show that our bootstrapping methods substantially outperform hallucination-based methods commonly used for overcoming the annotation bottleneck in morphological reinflection tasks.
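The two-phased idea in the abstract (tag words by analogy, then pair them by distance) can be illustrated with a toy sketch. This is not the authors' implementation: the seed table, the word list, the suffix-rule notion of analogy, the cell labels, and all function names below are assumptions made for illustration, and only orthographic (edit) distance is used, with the semantic component omitted.

```python
from collections import Counter, defaultdict

# Hypothetical toy data (English, not taken from the paper).
SEED = {"walk": {"PRS": "walk", "PST": "walked", "PTCP": "walking"}}
UNLABELED = ["talked", "talking", "jumped", "jumping", "talk", "jump"]


def suffix_rule(src, tgt):
    """Strip the common prefix and return (removed suffix, added suffix)."""
    i = 0
    while i < min(len(src), len(tgt)) and src[i] == tgt[i]:
        i += 1
    return src[i:], tgt[i:]


def edit_distance(a, b):
    """Plain Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def bootstrap(seed, words, lemma_cell="PRS"):
    # Collect suffix rules between every ordered pair of cells in the seed.
    rules = {}
    for table in seed.values():
        for a, form_a in table.items():
            for b, form_b in table.items():
                if a != b:
                    rules[suffix_rule(form_a, form_b)] = (a, b)

    # Phase 1 (tagging by analogy): a word pair that mirrors a seed rule
    # votes for the corresponding cell of each word in the pair.
    votes = defaultdict(Counter)
    for c in words:
        for d in words:
            rule = suffix_rule(c, d)
            if c != d and rule in rules:
                a, b = rules[rule]
                votes[c][a] += 1
                votes[d][b] += 1
    tags = {w: v.most_common(1)[0][0] for w, v in votes.items()}

    # Phase 2 (pairing by distance): for each word tagged with the lemma
    # cell, pick the closest tagged word in every other cell.
    by_cell = defaultdict(list)
    for w, cell in tags.items():
        by_cell[cell].append(w)
    tables = {}
    for lemma in by_cell[lemma_cell]:
        tables[lemma] = {lemma_cell: lemma}
        for cell, candidates in by_cell.items():
            if cell != lemma_cell:
                tables[lemma][cell] = min(
                    candidates, key=lambda w: edit_distance(lemma, w)
                )
    return tables


if __name__ == "__main__":
    for lemma, table in bootstrap(SEED, UNLABELED).items():
        print(lemma, table)
```

On this toy input the sketch rebuilds full tables for "talk" and "jump" from a single seed table. Real morphological systems with stem changes or ambiguous suffixes would defeat a pure suffix-rule heuristic, which is consistent with the abstract's point that semantic regularities help in more complex systems.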

Original language: English
Title of host publication: EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 2078-2088
Number of pages: 11
ISBN (Electronic): 9781955917094
State: Published - 2021
Event: 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021 - Virtual, Punta Cana, Dominican Republic
Duration: 7 Nov 2021 to 11 Nov 2021

Publication series

Name: EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
Country/Territory: Dominican Republic
City: Virtual, Punta Cana
Period: 7/11/21 to 11/11/21

Bibliographical note

Publisher Copyright:
© 2021 Association for Computational Linguistics

Funding

We thank Jonathan Berant for his helpful advice and discussion throughout. We also thank the audiences of the BIU-NLP seminar, the 18th SIG-MORPHON meeting, and the 1st UniMorph Meeting for comments and discussion. This research is funded by ERC-StG grant 677352 from the European Research Council and ISF grant 1739/26 from the Israel Science Foundation, for which we are grateful.

Funders                      Funder number
ERC-STG                      677352
European Research Council
Israel Science Foundation    1739/26
