Abstract
Annotated in-domain corpora are crucial to the successful development of dialogue systems for automated agents, and in particular for developing the natural language understanding (NLU) components of such systems. Unfortunately, such resources are scarce. In this work, we introduce an annotated natural language human-agent dialogue corpus in the negotiation domain. The corpus was collected using Amazon Mechanical Turk following the 'Wizard-of-Oz' approach, in which a human 'wizard' translates the participants' natural language utterances into a semantic language in real time. Once dialogue collection was completed, utterances were annotated with intent labels by two independent annotators, achieving high inter-annotator agreement. Our initial experiments with an SVM classifier show that automatically inferring such labels from the utterances is far from trivial. We make our corpus publicly available to aid the development of dialogue systems for negotiation agents, and suggest that analogous corpora can be created following our methodology and using our available source code. To the best of our knowledge, this is the first publicly available negotiation dialogue corpus.
Original language | English |
---|---|
Title of host publication | Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 |
Editors | Nicoletta Calzolari, Khalid Choukri, Helene Mazo, Asuncion Moreno, Thierry Declerck, Sara Goggi, Marko Grobelnik, Jan Odijk, Stelios Piperidis, Bente Maegaard, Joseph Mariani |
Publisher | European Language Resources Association (ELRA) |
Pages | 3141-3145 |
Number of pages | 5 |
ISBN (Electronic) | 9782951740891 |
State | Published - 2016 |
Event | 10th International Conference on Language Resources and Evaluation, LREC 2016 - Portoroz, Slovenia. Duration: 23 May 2016 → 28 May 2016 |
Publication series
Name | Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 |
---|---|
Conference
Conference | 10th International Conference on Language Resources and Evaluation, LREC 2016 |
---|---|
Country/Territory | Slovenia |
City | Portoroz |
Period | 23/05/16 → 28/05/16 |
Bibliographical note
Funding Information: We thank Sarit Kraus, Avi Rosenfeld, Erel Segal-Halevi, Osnat Drein and Inon Zuckerman for their assistance and contribution. This work was partly supported by ERC Grant #267523.
Funding
Funders | Funder number |
---|---|
European Commission | 267523 |
Keywords
- Crowdsourcing
- Dialogue systems
- Negotiation corpora