Abstract
Large language models can perform downstream tasks in a zero-shot fashion, given natural language prompts that specify the desired behavior. Such prompts are typically hand-engineered, but they can also be learned from labeled data with gradient-based methods. However, what makes prompts effective remains underexplored, especially when the prompts are in natural language. In this paper, we investigate common attributes shared by effective prompts in classification problems. We first propose a human-readable prompt tuning method (FLUENTPROMPT) based on Langevin dynamics that incorporates a fluency constraint to find a distribution of effective and fluent prompts. Our analysis reveals that effective prompts are topically related to the task domain and calibrate the prior probability of output labels. Based on these findings, we also propose a method for generating prompts using only unlabeled data, outperforming strong baselines by an average of 7.0% accuracy across three tasks. We release our code and data at github.com/swj0419/FluentPrompt.
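The abstract describes prompt tuning via Langevin dynamics with a fluency constraint. The core update adds Gaussian noise to a gradient step, with the noise temperature annealed over time. The sketch below illustrates that update rule on a toy quadratic energy standing in for the actual task-loss and fluency terms; all names, weights, and schedules here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def langevin_step(theta, grad_energy, step_size, temperature, rng):
    """One (annealed) Langevin update: gradient step plus scaled Gaussian noise."""
    noise = rng.normal(size=theta.shape)
    return theta - step_size * grad_energy(theta) + np.sqrt(2 * step_size * temperature) * noise

# Toy energy: a "task" term pulling theta toward t_task and a "fluency" term
# pulling it toward t_fluent, weighted by lam (a stand-in fluency weight).
t_task = np.array([1.0, 0.0])
t_fluent = np.array([0.0, 1.0])
lam = 0.5

def grad_energy(theta):
    # Gradient of 0.5*|theta - t_task|^2 + 0.5*lam*|theta - t_fluent|^2
    return (theta - t_task) + lam * (theta - t_fluent)

rng = np.random.default_rng(0)
theta = np.zeros(2)
for step in range(1000):
    # Geometrically annealed temperature: early steps explore, late steps settle.
    temperature = 0.99 ** step
    theta = langevin_step(theta, grad_energy, step_size=0.05,
                          temperature=temperature, rng=rng)

# As the temperature decays, theta concentrates near the energy minimum
# (t_task + lam * t_fluent) / (1 + lam), i.e. roughly [2/3, 1/3] here.
```

In the actual method the energy would be computed over discrete prompt tokens against a language model, with the fluency constraint keeping sampled prompts human-readable; this sketch only shows the sampling dynamics.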
| Original language | English |
|---|---|
| Title of host publication | Findings of the Association for Computational Linguistics |
| Subtitle of host publication | EMNLP 2023 |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 10994-11005 |
| Number of pages | 12 |
| ISBN (Electronic) | 9798891760615 |
| DOIs | |
| State | Published - 2023 |
| Externally published | Yes |
| Event | 2023 Findings of the Association for Computational Linguistics: EMNLP 2023 - Hybrid, Singapore |
| Duration | 6 Dec 2023 → 10 Dec 2023 |
Publication series
| Name | Findings of the Association for Computational Linguistics: EMNLP 2023 |
|---|
Conference
| Conference | 2023 Findings of the Association for Computational Linguistics: EMNLP 2023 |
|---|---|
| Country/Territory | Singapore |
| City | Hybrid |
| Period | 6/12/23 → 10/12/23 |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
Funding
We gratefully acknowledge support from NSF CAREER Grant No. IIS2142739. This material is funded in part by DARPA under Contract No. HR001120C0124. This research is supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via the HIATUS Program contract #2022-22072200004. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.
| Funders | Funder number |
|---|---|
| National Science Foundation | IIS2142739 |
| Defense Advanced Research Projects Agency | HR001120C0124 |
| Office of the Director of National Intelligence | |
| Intelligence Advanced Research Projects Activity | 2022-22072200004 |