Demystifying Prompts in Language Models via Perplexity Estimation

Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Language models can be prompted to perform a wide variety of tasks with zero- and few-shot in-context learning. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens. In this paper, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is predicted by the extent to which the model is familiar with the language it contains. Over a wide range of tasks, we show that the lower the perplexity of a prompt, the better it is able to perform the task, when considering reasonable prompts that are related to the task. As part of our analysis, we also devise a method to automatically extend a small seed set of manually written prompts by paraphrasing with GPT3 and backtranslation. This larger set allows us to verify that perplexity is a strong predictor of the success of a prompt, and we show that the lowest-perplexity prompts are consistently effective.
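The abstract's central claim, that lower-perplexity prompts tend to perform better, can be illustrated with a short scoring sketch. The code below is not the authors' implementation: it assumes the Hugging Face transformers library, uses GPT-2 as a stand-in for the larger models studied in the paper, and ranks a set of hypothetical candidate prompts by their perplexity under the model, selecting the lowest-perplexity one.

```python
# Minimal sketch (not the paper's code): rank candidate prompts by
# language-model perplexity and pick the lowest-perplexity prompt.
# Assumes the Hugging Face `transformers` library; GPT-2 is a stand-in
# for the larger models evaluated in the paper.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

def prompt_perplexity(prompt: str) -> float:
    """Perplexity of the prompt: exp of the mean per-token negative log-likelihood."""
    enc = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the returned loss is the mean
        # next-token cross-entropy over the prompt's tokens.
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

# Hypothetical candidate prompts for a sentiment-classification task.
candidates = [
    "Review: {text} Sentiment:",
    "Decide whether the following review is positive or negative. {text}",
    "{text} The sentiment of this review is",
]
ranked = sorted(candidates, key=prompt_perplexity)
print(ranked[0])  # lowest-perplexity prompt, predicted to perform best
```

In practice one would score each filled-in prompt (or average over several inputs) rather than the bare template, but the ranking step is the same: lower perplexity is taken as a signal that the model is more familiar with the prompt's language.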

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: EMNLP 2023
Publisher: Association for Computational Linguistics (ACL)
Pages: 10136-10148
Number of pages: 13
ISBN (Electronic): 9798891760615
State: Published - 2023
Externally published: Yes
Event: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023 - Singapore, Singapore
Duration: 6 Dec 2023 - 10 Dec 2023

Publication series

Name: Findings of the Association for Computational Linguistics: EMNLP 2023

Conference

Conference: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
Country/Territory: Singapore
City: Singapore
Period: 6/12/23 - 10/12/23

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.

