Abstract
Domain experts often need to extract structured information from large corpora. We advocate for a search paradigm called “extractive search”, in which a search query is enriched with capture-slots, to allow for rapid extraction of this kind. Such an extractive search system can be built around syntactic structures, resulting in high-precision, low-recall results. We show how the recall can be improved using neural retrieval and alignment. The goals of this paper are to concisely introduce the extractive-search paradigm and to demonstrate a prototype neural retrieval system for extractive search, along with its benefits and potential. Our prototype is available at https://spike.neural-sim.apps.allenai.org/ and a video demonstration is available at https://vimeo.com/559586687.
Original language | English |
---|---|
Title of host publication | ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the System Demonstrations |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 210-217 |
Number of pages | 8 |
ISBN (Electronic) | 9781954085565 |
DOIs | |
State | Published - 2021 |
Event | Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstration, ACL-IJCNLP 2021 - Virtual, Online. Duration: 1 Aug 2021 → 6 Aug 2021 |
Publication series
Name | ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the System Demonstrations |
---|---|
Conference
Conference | Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstration, ACL-IJCNLP 2021 |
---|---|
City | Virtual, Online |
Period | 1/08/21 → 6/08/21 |
Bibliographical note
Publisher Copyright: © 2021 Association for Computational Linguistics
Funding
This project received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, grant agreement No. 802774 (iEXTRACT).
Funders | Funder number |
---|---|
European Union’s Horizon 2020 research and innovation programme | |
Horizon 2020 Framework Programme | 802774 |
European Commission | |