A Dataset for N-ary Relation Extraction of Drug Combinations

Aryeh Tiktinsky, Vijay Viswanathan, Danna Niezni, Dana Meron Azagury, Yosi Shamay, Hillel Taub-Tabib, Tom Hope, Yoav Goldberg

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

4 Scopus citations

Abstract

Combination therapies have become the standard of care for diseases such as cancer, tuberculosis, malaria and HIV. However, the combinatorial set of available multi-drug treatments creates a challenge in identifying effective combination therapies available in a situation. To assist medical professionals in identifying beneficial drug-combinations, we construct an expert-annotated dataset for extracting information about the efficacy of drug combinations from the scientific literature. Beyond its practical utility, the dataset also presents a unique NLP challenge, as the first relation extraction dataset consisting of variable-length relations. Furthermore, the relations in this dataset predominantly require language understanding beyond the sentence level, adding to the challenge of this task. We provide a promising baseline model and identify clear areas for further improvement. We release our dataset, code, and baseline models publicly to encourage the NLP community to participate in this task.

Original language: English
Title of host publication: NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 3190-3203
Number of pages: 14
ISBN (Electronic): 9781955917711
State: Published - 2022
Event: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022 - Seattle, United States
Duration: 10 Jul 2022 to 15 Jul 2022

Publication series

Name: NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

Conference

Conference: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022
Country/Territory: United States
City: Seattle
Period: 10/07/22 to 15/07/22

Bibliographical note

Publisher Copyright:
© 2022 Association for Computational Linguistics.

Funding

This project has received funding in part from the European Research Council (ERC) under the European Union's Horizon2020 research and innovation programme, grant agreement 802774 (iEXTRACT), and in part from the NSF Convergence Accelerator Award #2132318. We would also like to thank our annotators from the Shamay lab at the Faculty of Biomedical Engineering, Technion, including Shaked Launer-Wachs, Yuval Harris, Maytal Avrashami, Hagit Sason-Bauer and Yakir Amrusi.

Funders and funder numbers
European Union's Horizon2020 research and innovation programme: 802774
National Science Foundation: 2132318
European Research Council
Technion-Israel Institute of Technology
Yakir Amrusi
