Abstract
The ability to control for the kinds of information encoded in neural representations has a variety of use cases, especially in light of the challenge of interpreting these models. We present Iterative Null-space Projection (INLP), a novel method for removing information from neural representations. Our method is based on repeated training of linear classifiers that predict a certain property we aim to remove, followed by projection of the representations onto the classifiers' null-space. By doing so, the classifiers become oblivious to that target property, making it hard to linearly separate the data according to it. While applicable for multiple uses, we evaluate our method on bias and fairness use-cases, and show that our method is able to mitigate bias in word embeddings, as well as to increase fairness in a setting of multi-class classification.
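The loop described in the abstract (train a linear classifier for the property, project the representations onto its null-space, repeat) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a binary property, uses a least-squares probe as a stand-in for the linear classifier, and runs on synthetic data; the names `fit_linear_probe`, `probe_accuracy`, and `inlp`, as well as the choice of three iterations, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_linear_probe(X, y):
    """Fit a least-squares linear probe for binary labels y in {0, 1}.

    Assumption: least squares stands in for whatever linear classifier is used.
    Returns the weight vector w; centering absorbs the bias term.
    """
    targets = 2.0 * y - 1.0                 # map {0, 1} -> {-1, +1}
    Xc = X - X.mean(axis=0)                 # centering plays the role of a bias
    w, *_ = np.linalg.lstsq(Xc, targets - targets.mean(), rcond=None)
    return w


def probe_accuracy(X, y):
    """Training accuracy of a freshly fitted least-squares probe on (X, y)."""
    w = fit_linear_probe(X, y)
    targets = 2.0 * y - 1.0
    scores = (X - X.mean(axis=0)) @ w + targets.mean()
    return float(np.mean((scores > 0) == (y == 1)))


def inlp(X, y, n_iters=3):
    """Return (P, directions); P projects onto the joint null-space of all probes."""
    d = X.shape[1]
    P = np.eye(d)
    directions = []
    for _ in range(n_iters):
        # Train a probe on the currently projected data, then keep its weight
        # vector inside the remaining subspace for numerical safety.
        w = P @ fit_linear_probe(X @ P, y)
        norm = np.linalg.norm(w)
        if norm < 1e-10:                    # nothing left for the probe to use
            break
        w = w / norm
        directions.append(w)
        P = P - np.outer(w, w)              # remove the direction the probe relied on
    return P, np.array(directions)


# Synthetic data (an assumption for illustration): the property is planted on axis 0.
rng = np.random.default_rng(0)
n, d = 400, 10
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, 0] += 3.0 * (2 * y - 1)

P, directions = inlp(X, y, n_iters=3)
X_clean = X @ P                             # P is symmetric, so this projects each row
print("probe accuracy before:", probe_accuracy(X, y))
print("probe accuracy after: ", probe_accuracy(X_clean, y))
```

Because each new direction is kept orthogonal to the earlier ones, `P` remains an orthogonal projection, and every probe trained along the way scores all projected points identically. A single iteration generally does not remove all linearly recoverable information, which is why the procedure is iterated; how many iterations suffice depends on the data and is treated here as a tuning choice.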
| Original language | English |
|---|---|
| Title of host publication | ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 7237-7256 |
| Number of pages | 20 |
| ISBN (Electronic) | 9781952148255 |
| State | Published - 2020 |
| Event | 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020 - Virtual, Online, United States. Duration: 5 Jul 2020 → 10 Jul 2020 |
Publication series
| Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
|---|---|
| ISSN (Print) | 0736-587X |
Conference
| Conference | 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020 |
|---|---|
| Country/Territory | United States |
| City | Virtual, Online |
| Period | 5/07/20 → 10/07/20 |
Bibliographical note
Publisher Copyright: © 2020 Association for Computational Linguistics
Funding
We thank Jacob Goldberger and Jonathan Berant for fruitful discussions. This project received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, grant agreement No. 802774 (iEXTRACT).
| Funders | Funder number |
|---|---|
| European Union’s Horizon 2020 research and innovation programme | |
| Horizon 2020 Framework Programme | 802774 |
| European Commission | |