Abstract
How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Cross-linguistic comparisons of RNNs' syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training corpus. We propose a paradigm that addresses these issues: we create synthetic versions of English, which differ from English in one or more typological parameters, and generate corpora for those languages based on a parsed English corpus. We report a series of experiments in which RNNs were trained to predict agreement features for verbs in each of those synthetic languages. Among other findings, (1) performance was higher in subject-verb-object order (as in English) than in subject-object-verb order (as in Japanese), suggesting that RNNs have a recency bias; (2) predicting agreement with both subject and object (polypersonal agreement) improves over predicting each separately, suggesting that underlying syntactic knowledge transfers across the two tasks; and (3) overt morphological case makes agreement prediction significantly easier, regardless of word order.
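The corpus-generation idea described above, deriving a synthetic language from a parsed English sentence by changing one typological parameter, can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Token` class, the `linearize` and `reorder` functions, the dependency labels (`nsubj`, `obj`, `advmod`), and the toy sentence are all hypothetical and chosen only for illustration. The sketch assumes a projective dependency tree and changes only the position of the object relative to the verb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the original English sentence
    form: str    # surface word
    head: int    # idx of the head token; 0 marks the root
    deprel: str  # dependency label (hypothetical label set)


def linearize(tokens, head_idx, object_before_verb):
    """Return the word forms of the subtree rooted at head_idx, in output order."""
    head = next(t for t in tokens if t.idx == head_idx)
    deps = sorted((t for t in tokens if t.head == head_idx), key=lambda t: t.idx)
    left = [t for t in deps if t.idx < head_idx]
    right = [t for t in deps if t.idx > head_idx]
    if object_before_verb:
        # SOV variant: move object subtrees from after the head to just before it.
        left = left + [t for t in right if t.deprel == "obj"]
        right = [t for t in right if t.deprel != "obj"]
    out = []
    for t in left:
        out += linearize(tokens, t.idx, object_before_verb)
    out.append(head.form)
    for t in right:
        out += linearize(tokens, t.idx, object_before_verb)
    return out


def reorder(tokens, object_before_verb=True):
    """Linearize a whole sentence, optionally in subject-object-verb order."""
    root = next(t for t in tokens if t.head == 0)
    return linearize(tokens, root.idx, object_before_verb)


# Toy parsed sentence (an assumption for illustration, not corpus data).
SENTENCE = [
    Token(1, "the", 2, "det"),
    Token(2, "dog", 3, "nsubj"),
    Token(3, "chases", 0, "root"),
    Token(4, "the", 5, "det"),
    Token(5, "cats", 3, "obj"),
    Token(6, "today", 3, "advmod"),
]

print(" ".join(reorder(SENTENCE, object_before_verb=False)))
# the dog chases the cats today
print(" ".join(reorder(SENTENCE, object_before_verb=True)))
# the dog the cats chases today
```

Because the transformation operates on the parse tree rather than on the surface string, the same sentence content is preserved across the synthetic variants, which is what allows word order to be varied in isolation.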
Original language | English |
---|---|
Title of host publication | Long and Short Papers |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 3532-3542 |
Number of pages | 11 |
ISBN (Electronic) | 9781950737130 |
State | Published - 2019 |
Event | 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019 - Minneapolis, United States. Duration: 2 Jun 2019 → 7 Jun 2019 |
Publication series
Name | NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference |
---|---|
Volume | 1 |
Conference
Conference | 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019 |
---|---|
Country/Territory | United States |
City | Minneapolis |
Period | 2/06/19 → 7/06/19 |
Bibliographical note
Publisher Copyright: © 2019 Association for Computational Linguistics