Is Probing All You Need? Indicator Tasks as an Alternative to Probing Embedding Spaces

Tal Levy, Omer Goldman, Reut Tsarfaty

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

The ability to identify and control different kinds of linguistic information encoded in vector representations of words has many use cases, especially for explainability and bias removal. This is usually done via a set of simple classification tasks, termed probes, to evaluate the information encoded in the embedding space. However, the involvement of a trainable classifier leads to entanglement between the probe's results and the classifier's nature. As a result, contemporary work on probing includes tasks that do not involve training auxiliary models. In this work we introduce the term indicator tasks for non-trainable tasks that are used to query embedding spaces for the existence of certain properties. We claim that tasks of this kind may point in a direction opposite to that of probes, and that this contradiction complicates the decision on whether a property exists in an embedding space. We demonstrate our claims with two test cases, one dealing with gender debiasing and another with the erasure of morphological information from embedding spaces. We show that the application of a suitable indicator provides a more accurate picture of the information captured and removed compared to probes. We thus conclude that indicator tasks should be implemented and taken into consideration when eliciting information from embedded representations.
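The distinction the abstract draws between a trainable probe and a non-trainable indicator can be illustrated with a minimal sketch. This is not the authors' method or data: the embeddings are synthetic, the "property" is a hypothetical binary label planted along one axis, and the indicator (a cosine-similarity gap between same-label and different-label pairs) is an illustrative choice rather than one of the paper's indicators. All function and variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n = 16, 200

# Synthetic "embeddings" (assumption): a hypothetical binary property
# shifts each vector along axis 0; the other axes are pure noise.
labels = rng.integers(0, 2, size=n)
emb = rng.normal(size=(n, dim))
emb[:, 0] += 3.0 * (2 * labels - 1)


def probe_accuracy(x, y, steps=500, lr=0.1):
    """Trainable probe: logistic regression fit by gradient descent
    on the first half of the data, evaluated on the second half."""
    half = len(y) // 2
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x[:half] @ w + b)))
        grad = p - y[:half]
        w -= lr * x[:half].T @ grad / half
        b -= lr * grad.mean()
    pred = (x[half:] @ w + b) > 0
    return float((pred == y[half:]).mean())


def indicator_gap(x, y):
    """Non-trainable indicator: mean cosine similarity of same-label
    pairs minus that of different-label pairs. No parameters are fit."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = unit @ unit.T
    same = y[:, None] == y[None, :]
    off_diag = ~np.eye(len(y), dtype=bool)  # ignore self-similarity
    return float(sim[same & off_diag].mean() - sim[~same].mean())


# Crude "erasure" (assumption): zero out the axis that carries the property.
erased = emb.copy()
erased[:, 0] = 0.0

probe_before, gap_before = probe_accuracy(emb, labels), indicator_gap(emb, labels)
probe_after, gap_after = probe_accuracy(erased, labels), indicator_gap(erased, labels)

print("probe accuracy:", probe_before, "->", probe_after)
print("indicator gap: ", gap_before, "->", gap_after)
```

In this toy setup the two measurements should agree, since the property lives on a single known axis. The abstract's claim concerns real embedding spaces, where a probe's result also depends on the classifier being trained and may therefore diverge from what an indicator reports.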

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: EMNLP 2023
Publisher: Association for Computational Linguistics (ACL)
Pages: 5243-5254
Number of pages: 12
ISBN (Electronic): 9798891760615
State: Published - 2023
Event: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023 - Singapore, Singapore
Duration: 6 Dec 2023 to 10 Dec 2023

Publication series

Name: Findings of the Association for Computational Linguistics: EMNLP 2023

Conference

Conference: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
Country/Territory: Singapore
City: Singapore
Period: 6/12/23 to 10/12/23

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.

Funding

We thank Shauli Ravfogel and an anonymous reviewer (VEsr in openreview) for insightful comments and fruitful discussions. This research was funded by the European Research Council (ERC-StG grant number 677352), the Israeli Ministry of Science and Technology (MOST grant number 3-17992), and the Israeli Innovation Authority (IIA KAMIN grant), for which we are grateful.

Funders (with funder number where listed)
ERC-STG: 677352
ERCStG
Shauli Ravfogel
European Research Council
Ministry of Science, Technology and Space: 3-17992
Ministry of science and technology, Israel
Israel Innovation Authority
