Abstract
Multilingual models have been widely used for cross-lingual transfer to low-resource languages. However, the performance on these languages is hindered by their underrepresentation in the pretraining data. To alleviate this problem, we propose a novel multilingual training technique based on teacher-student knowledge distillation. In this setting, we utilize monolingual teacher models optimized for their language. We use those teachers along with balanced (sub-sampled) data to distill the teachers' knowledge into a single multilingual student. Our method outperforms standard training methods in low-resource languages and retains performance on high-resource languages.
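The setup described in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss function, the temperature, or the sub-sampling rule, so everything here is an assumption: the sketch uses the generic temperature-softened KL-divergence distillation objective and sub-samples every language down to the size of the smallest corpus. The names `distillation_loss`, `balanced_subsample`, and `multilingual_distillation_loss` are hypothetical and are not taken from the paper.

```python
import math
import random


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between softened distributions.

    Assumption: a generic distillation objective, not necessarily the paper's.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def balanced_subsample(corpora, seed=0):
    """Sub-sample each language's corpus to the size of the smallest one.

    Assumption: one possible reading of "balanced (sub-sampled) data".
    """
    rng = random.Random(seed)
    n = min(len(sentences) for sentences in corpora.values())
    return {lang: rng.sample(sentences, n) for lang, sentences in corpora.items()}


def multilingual_distillation_loss(examples, student, teachers, temperature=2.0):
    """Average the loss over languages, using each language's own teacher.

    `examples` maps a language code to one input; `student` and every entry
    of `teachers` are callables that return a list of logits.
    """
    losses = [
        distillation_loss(student(x), teachers[lang](x), temperature)
        for lang, x in examples.items()
    ]
    return sum(losses) / len(losses)
```

In this sketch the single student is scored against a different monolingual teacher for each language, and the balanced sub-sampling keeps a high-resource language from dominating the training signal.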
Original language | English |
---|---|
Title of host publication | SIGTYP 2023 - 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, Proceedings of the Workshop |
Editors | Lisa Beinborn, Koustava Goswami, Saliha Muradoglu, Alexey Sorokin, Ritesh Kumar, Andreas Shcherbakov, Edoardo M. Ponti, Ryan Cotterell, Ekaterina Vylomova |
Publisher | Association for Computational Linguistics |
Pages | 1-11 |
Number of pages | 11 |
ISBN (Electronic) | 9781959429562 |
State | Published - 2023 |
Externally published | Yes |
Event | 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, SIGTYP 2023, co-located with the 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 - Hybrid, Dubrovnik, Croatia; Duration: 6 May 2023 → … |
Publication series
Name | SIGTYP 2023 - 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, Proceedings of the Workshop |
---|
Conference
Conference | 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, SIGTYP 2023, co-located with the 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 |
---|---|
Country/Territory | Croatia |
City | Hybrid, Dubrovnik |
Period | 6/05/23 → … |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
Funding
We thank anonymous reviewers for their valuable comments on the previous versions of this article. This work was supported in part by a research gift from the Allen Institute for AI, and a research grant 2336 from the Israeli Ministry of Science and Technology. Tomasz Limisiewicz’s visit to the Hebrew University has been supported by grant 338521 of the Charles University Grant Agency and the Mobility Fund of Charles University.
Funders | Funder number |
---|---|
Mobility Fund of Charles University | |
Grantová Agentura, Univerzita Karlova | 338521 |
Ministry of Science and Technology, Israel | 2336 |