Abstract
Image collection summarization techniques aim to present a compact representation of an image gallery through a carefully selected subset of images that captures its semantic content. When it comes to web content, however, the ideal selection can vary based on the user's specific intentions and preferences. This is particularly relevant at Booking.com, where presenting properties and their visual summaries that align with users' expectations is crucial. To address this challenge, we consider user intentions in the summarization of property visuals by analyzing property reviews and extracting the most significant aspects mentioned by users. Incorporating insights from reviews into our visual summaries lets us present the content most relevant to a user. Moreover, we achieve this without the need for costly annotations. Our experiments, including human perceptual studies, demonstrate the superiority of our cross-modal approach, which we call CrossSummarizer, over the no-personalization and image-based clustering baselines.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 22983-22989 |
Number of pages | 7 |
Journal | Proceedings of the AAAI Conference on Artificial Intelligence |
Volume | 38 |
Issue number | 21 |
State | Published - 25 Mar 2024 |
Externally published | Yes |
Event | 38th AAAI Conference on Artificial Intelligence, AAAI 2024, Vancouver, Canada. Duration: 20 Feb 2024 → 27 Feb 2024 |
Bibliographical note
Publisher Copyright: Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.