Abstract
Keyphrase extraction aims to automatically select a small set of phrases in a document that best describe its main ideas. There is a great need for better keyphrase extraction methods in the absence of labeled data, as unsupervised algorithms currently fail to achieve adequate performance compared to their supervised counterparts. In this paper we suggest a widely applicable distant supervision framework based on auxiliary data from query logs. By propagating information from queries and the subsequent consumption of content, weak labels are produced, transforming the problem into an easier supervised task. Evaluation on a large dataset shows the superiority of this approach over unsupervised alternatives.
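To make the weak-labeling idea concrete, the following is a minimal sketch and not the paper's actual method. It assumes a query log of `(query, clicked_doc_id)` pairs and marks a candidate phrase as a positive example when it exactly matches a query that led to consumption of that document. The function names `candidate_phrases` and `weak_labels`, the n-gram candidate generation, and the `min_clicks` threshold are all hypothetical choices made for illustration.

```python
import re
from collections import defaultdict


def candidate_phrases(text, max_len=3):
    """Enumerate lowercase word n-grams (n <= max_len) as candidate phrases.

    Assumption: plain n-grams stand in for whatever candidate generation
    a real system would use.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    }


def weak_labels(documents, query_log, min_clicks=1):
    """Assign a 0/1 weak label to every candidate phrase of every document.

    documents: dict mapping doc_id -> text
    query_log: iterable of (query, clicked_doc_id) pairs (assumed format)
    A phrase is labeled 1 if it equals a query that led to at least
    `min_clicks` clicks on the same document, else 0.
    """
    clicks = defaultdict(lambda: defaultdict(int))
    for query, doc_id in query_log:
        normalized = " ".join(re.findall(r"[a-z0-9]+", query.lower()))
        clicks[doc_id][normalized] += 1

    labels = {}
    for doc_id, text in documents.items():
        doc_queries = clicks.get(doc_id, {})
        labels[doc_id] = {
            phrase: int(doc_queries.get(phrase, 0) >= min_clicks)
            for phrase in candidate_phrases(text)
        }
    return labels


# Toy data, invented purely for illustration.
documents = {
    "d1": "Keyphrase extraction with distant supervision from query logs",
}
query_log = [
    ("keyphrase extraction", "d1"),
    ("query logs", "d1"),
    ("weather oxford", "d1"),  # query text absent from the document
]

labels = weak_labels(documents, query_log)
positives = sorted(p for p, y in labels["d1"].items() if y == 1)
print(positives)  # ['keyphrase extraction', 'query logs']
```

The resulting phrase-level labels could then be used to train any standard supervised classifier over phrase features, which is the sense in which weak labels turn the unsupervised problem into a supervised one.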
Original language | English |
---|---|
Title of host publication | Proceedings - 2020 IEEE 6th International Conference on Big Data Computing Service and Applications, BigDataService 2020 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 70-77 |
Number of pages | 8 |
ISBN (Electronic) | 9781728170220 |
State | Published - Aug 2020 |
Externally published | Yes |
Event | 6th IEEE International Conference on Big Data Computing Service and Applications, BigDataService 2020 - Oxford, United Kingdom; Duration: 3 Aug 2020 → 6 Aug 2020 |
Publication series
Name | Proceedings - 2020 IEEE 6th International Conference on Big Data Computing Service and Applications, BigDataService 2020 |
---|---|
Conference
Conference | 6th IEEE International Conference on Big Data Computing Service and Applications, BigDataService 2020 |
---|---|
Country/Territory | United Kingdom |
City | Oxford |
Period | 3/08/20 → 6/08/20 |
Bibliographical note
Publisher Copyright: © 2020 IEEE.
Keywords
- Document Analysis
- Keyphrase Extraction
- Knowledge Extraction