Abstract
Mining commonsense knowledge from corpora suffers from reporting bias, over-representing the rare at the expense of the trivial (Gordon and Van Durme, 2013). We study to what extent pre-trained language models overcome this issue. We find that while their generalization capacity allows them to better estimate the plausibility of frequent but unspoken of actions, outcomes, and properties, they also tend to overestimate that of the very rare, amplifying the bias that already exists in their training corpus.
Original language | English |
---|---|
Title of host publication | COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference |
Editors | Donia Scott, Nuria Bel, Chengqing Zong |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 6863-6870 |
Number of pages | 8 |
ISBN (Electronic) | 9781952148279 |
State | Published - 2020 |
Externally published | Yes |
Event | 28th International Conference on Computational Linguistics, COLING 2020 - Virtual, Online, Spain. Duration: 8 Dec 2020 → 13 Dec 2020 |
Conference
Conference | 28th International Conference on Computational Linguistics, COLING 2020 |
---|---|
Country/Territory | Spain |
City | Virtual, Online |
Period | 8 Dec 2020 → 13 Dec 2020 |
Bibliographical note
Publisher Copyright: © 2020 COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference. All rights reserved.
Funding
This research was supported in part by NSF (IIS-1524371, IIS-1714566), DARPA under the CwC program through the ARO (W911NF-15-1-0543), and DARPA under the MCS program through NIWC Pacific (N66001-19-2-4031).
Funders | Funder number |
---|---|
National Science Foundation | IIS-1714566, IIS-1524371 |
Army Research Office | W911NF-15-1-0543 |
Defense Advanced Research Projects Agency | |
Naval Information Warfare Center Pacific | N66001-19-2-4031 |