Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models

Moab Arar, Rinon Gal, Yuval Atzmon, Gal Chechik, Daniel Cohen-Or, Ariel Shamir, Amit H. Bermano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Text-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts with natural language prompts. Recently, encoder-based techniques have emerged as an effective new approach for T2I personalization, reducing the need for multiple images and long training times. However, most existing encoders are limited to a single-class domain, which hinders their ability to handle diverse concepts. In this work, we propose a domain-agnostic method that does not require any specialized dataset or prior information about the personalized concepts. We introduce a novel contrastive-based regularization technique that maintains high fidelity to the target concept's characteristics while keeping the predicted embeddings close to editable regions of the latent space, by pushing the predicted tokens toward their nearest existing CLIP tokens. Our experimental results demonstrate the effectiveness of our approach and show that the learned tokens are more semantic than tokens predicted by unregularized models. This leads to a better representation that achieves state-of-the-art performance while being more flexible than previous methods.
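The core idea of the regularization, pulling a predicted concept embedding toward its nearest existing token in the CLIP embedding table, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name, the cosine-similarity matching, and the squared-distance penalty are illustrative choices, not the paper's actual contrastive formulation.

```python
import numpy as np

def nearest_token_pull(pred, vocab):
    """Illustrative regularizer (not the paper's implementation):
    find the existing token embedding most similar to the predicted
    embedding, then penalize the squared distance to it, keeping the
    prediction near well-behaved, editable regions of the space.

    pred:  (d,)   predicted concept embedding
    vocab: (V, d) table of existing token embeddings (e.g. CLIP's)
    Returns (loss, index_of_nearest_token).
    """
    # Match by cosine similarity between the prediction and every token.
    pred_n = pred / np.linalg.norm(pred)
    vocab_n = vocab / np.linalg.norm(vocab, axis=1, keepdims=True)
    sims = vocab_n @ pred_n                # (V,) similarities
    idx = int(np.argmax(sims))             # nearest existing token
    # Squared-distance pull toward that token embedding.
    loss = float(np.sum((pred - vocab[idx]) ** 2))
    return loss, idx
```

In a training loop, this term would be added (with a small weight) to the reconstruction objective, trading a little concept fidelity for embeddings that remain semantically meaningful and editable.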

Original language: English
Title of host publication: Proceedings - SIGGRAPH Asia 2023 Conference Papers, SA 2023
Editors: Stephen N. Spencer
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400703157
State: Published - 10 Dec 2023
Externally published: Yes
Event: SIGGRAPH Asia 2023 Conference Papers, SA 2023 - Sydney, Australia
Duration: 12 Dec 2023 - 15 Dec 2023

Publication series

Name: Proceedings - SIGGRAPH Asia 2023 Conference Papers, SA 2023


Conference: SIGGRAPH Asia 2023 Conference Papers, SA 2023

Bibliographical note

Publisher Copyright:
© 2023 ACM.


The first author is supported by the Miriam and Aaron Gutwirth scholarship. This work was partially supported by Len Blavatnik and the Blavatnik family foundation, the Deutsch Foundation, the Yandex Initiative in Machine Learning, BSF (grant 2020280) and ISF (grants 2492/20 and 3441/21).

Funders and funder numbers:
Deutsch Foundation
Miriam and Aaron Gutwirth
Yandex Initiative in Machine Learning
Blavatnik Family Foundation
United States-Israel Binational Science Foundation: 2020280
Israel Science Foundation: 2492/20, 3441/21


Keywords:
    • Encoders
    • Inversion
    • Personalization

