Abstract
We develop Process Execution Graphs (PEG), a document-level representation of real-world wet lab biochemistry protocols, addressing challenges such as cross-sentence relations, long-range coreference, grounding, and implicit arguments. We manually annotate PEGs in a corpus of complex lab protocols with a novel interactive textual simulator that keeps track of entity traits and semantic constraints during annotation. We use this data to develop graph-prediction models, finding them to be good at entity identification and local relation extraction, while our corpus facilitates further exploration of challenging long-range relations.
Original language | English |
---|---|
Title of host publication | EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 2190-2202 |
Number of pages | 13 |
ISBN (Electronic) | 9781954085022 |
State | Published - 2021 |
Externally published | Yes |
Event | 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Virtual, Online. Duration: 19 Apr 2021 → 23 Apr 2021 |
Publication series
Name | EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference |
---|
Conference
Conference | 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 |
---|---|
City | Virtual, Online |
Period | 19/04/21 → 23/04/21 |
Bibliographical note
Publisher Copyright: © 2021 Association for Computational Linguistics
Funding
We would like to thank Peter Clark, Noah Smith, Yoav Goldberg, Dafna Shahaf, and Reut Tsarfaty for many fruitful discussions and helpful comments, as well as the X-WLP annotators: Pranay Methuku, Rider Osentoski, Noah Zhang and Michael Zhan. This work was partially supported by an Allen Institute for AI Research Gift to Gabriel Stanovsky. This material is based upon work supported by the NSF (IIS-1845670) and the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001119C0108. The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.
Funders | Funder number |
---|---|
National Science Foundation | IIS-1845670 |
U.S. Department of Defense | |
Defense Advanced Research Projects Agency | HR001119C0108 |
Allen Institute | |