Abstract
Large language models (LLMs) excel in language understanding and generation, especially in English, which has ample public benchmarks for various natural language processing (NLP) tasks. Nevertheless, their reliability across different languages and domains remains uncertain. Our shared task introduces a novel benchmark to assess the ability of multilingual LLMs to comprehend and produce language in low-resource settings, with an emphasis on capturing logical, factual, or causal relationships within lengthy text contexts. The shared task consists of two subtasks crucial to information retrieval: Named Entity Recognition (NER) and Reading Comprehension (RC), in seven data-scarce languages: Azerbaijani, Igbo, Indonesian, Swiss German, Turkish, Uzbek and Yorùbá, which previously lacked annotated resources for information retrieval tasks. Our evaluation of leading LLMs reveals that, despite their competitive performance, they still have notable weaknesses, such as producing output in the non-target language or providing counterfactual information that cannot be inferred from the context. As more advanced models emerge, the benchmark will remain essential for supporting fairness and applicability in information retrieval systems.
| Original language | English |
|---|---|
| Title of host publication | MRL 2023 - 3rd Workshop on Multi-Lingual Representation Learning, Proceedings of the Workshop |
| Editors | Duygu Ataman |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 106-117 |
| Number of pages | 12 |
| ISBN (Electronic) | 9798891760561 |
| State | Published - 2023 |
| Event | 3rd Workshop on Multi-lingual Representation Learning, MRL 2023 - Singapore, Singapore. Duration: 7 Dec 2023 → … |
Publication series

| Name | MRL 2023 - 3rd Workshop on Multi-Lingual Representation Learning, Proceedings of the Workshop |
|---|---|
Conference

| Conference | 3rd Workshop on Multi-lingual Representation Learning, MRL 2023 |
|---|---|
| Country/Territory | Singapore |
| City | Singapore |
| Period | 7/12/23 → … |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
Funding
We thank our sponsors Google DeepMind and Bloomberg for making this shared task possible. We also thank HumanSignal for providing us access to Label Studio’s Enterprise version, which allowed us to execute the large-scale collaboration needed to perform human annotations across multiple tasks.