Abstract
This paper presents work on novel machine translation (MT) systems between spoken and signed languages, where signed languages are represented in SignWriting, a sign language writing system. Our work seeks to address the lack of out-of-the-box support for signed languages in current MT systems and is based on the SignBank dataset, which contains pairs of spoken language text and SignWriting content. We introduce novel methods to parse, factorize, decode, and evaluate SignWriting, leveraging ideas from neural factored MT. In a bilingual setup, translating from American Sign Language to (American) English, our method achieves over 30 BLEU, while in two multilingual setups, translating in both directions between spoken languages and signed languages, we achieve over 20 BLEU. We find that common MT techniques used to improve spoken language translation similarly affect the performance of sign language translation. These findings validate our use of an intermediate text representation for signed languages to include them in NLP research.
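To make the idea of parsing and factorizing SignWriting concrete, the following is a minimal sketch, not the paper's implementation. It assumes signs are written in Formal SignWriting (FSW), where a sign box marker (`B`, `L`, `M`, or `R`) with coordinates is followed by symbols of the form `S` + three hex digits (base symbol) + one fill digit + one rotation digit + `XXXxYYY` position. The function name `factorize_sign` and the factor names (`base`, `fill`, `rotation`, `x`, `y`) are illustrative assumptions; the exact factors used in the paper may differ.

```python
import re

# Assumed FSW layout: box marker + "XXXxYYY", then a sequence of symbols.
BOX_RE = re.compile(r"([BLMR])(\d{3})x(\d{3})")
# Assumed symbol layout: S + base (3 hex) + fill (0-5) + rotation (0-f) + "XXXxYYY".
SYMBOL_RE = re.compile(r"S([123][0-9a-f]{2})([0-5])([0-9a-f])(\d{3})x(\d{3})")


def factorize_sign(fsw: str) -> dict:
    """Split one FSW sign into a box header and per-symbol factors."""
    box = BOX_RE.search(fsw)
    if box is None:
        raise ValueError(f"no sign box found in {fsw!r}")
    symbols = []
    # Only scan after the box header, skipping any optional sorting prefix.
    for match in SYMBOL_RE.finditer(fsw, box.end()):
        base, fill, rotation, x, y = match.groups()
        symbols.append(
            {"base": base, "fill": fill, "rotation": rotation, "x": int(x), "y": int(y)}
        )
    return {"box": (box.group(1), int(box.group(2)), int(box.group(3))), "symbols": symbols}


if __name__ == "__main__":
    # Illustrative FSW string with two symbols.
    for symbol in factorize_sign("M518x529S14c20481x471S27106503x489")["symbols"]:
        print(symbol)
```

Splitting each symbol into separate streams in this way is what would allow a factored MT model to embed the base symbol, its modifiers, and its position independently rather than treating every full symbol string as an opaque token.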
Original language | English |
---|---|
Title of host publication | EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023 |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 1661-1679 |
Number of pages | 19 |
ISBN (Electronic) | 9781959429470 |
State | Published - 2023 |
Event | 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 - Findings of EACL 2023 - Dubrovnik, Croatia; Duration: 2 May 2023 → 6 May 2023 |
Publication series
Name | EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023 |
---|
Conference
Conference | 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 - Findings of EACL 2023 |
---|---|
Country/Territory | Croatia |
City | Dubrovnik |
Period | 2/05/23 → 6/05/23 |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
Funding
This work is funded by the following projects: EASIER (Grant agreement number 101016982) and IICT (Grant agreement number PFFS-21-47). We are grateful for their support. We also thank Rico Sennrich for his suggestions.