Abstract
As conversational agents become increasingly multimodal, they invite human-like evaluations—especially in trust-sensitive contexts. Building on the human tendency to form rapid judgments from subtle visual and auditory cues, we explore how trust is constructed from faces and voices. In a behavioral experiment, 150 participants rated bimodal stimuli across four trust congruence conditions. We then trained a multimodal model using HuBERT and ResNet-50 with late fusion to predict trust scores. To examine alignment between human and model judgments, we applied Permutation Feature Importance (PFI) to compare the most influential features. Our results highlight the dominance of auditory cues in both human and model trust evaluations, while revealing subtle but meaningful differences in feature weighting across modalities and conditions.
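The pipeline described above (modality embeddings, late fusion, Permutation Feature Importance) can be sketched in miniature. Everything below is illustrative and not from the paper: the embedding dimensions are toy stand-ins (real HuBERT and ResNet-50 features are ~768- and ~2048-dimensional), the ridge regressors and the synthetic trust scores are assumptions, and fusion is done by averaging per-modality predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Toy stand-ins for per-stimulus modality embeddings.
audio = rng.normal(size=(n, 8))
visual = rng.normal(size=(n, 8))

# Synthetic trust scores with a heavier auditory weight,
# loosely mirroring the paper's reported auditory dominance.
trust = 0.8 * audio[:, 0] + 0.2 * visual[:, 0] + 0.05 * rng.normal(size=n)

def ridge_fit(X, y, lam=1e-2):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Late fusion: one regressor per modality, predictions averaged.
w_a = ridge_fit(audio, trust)
w_v = ridge_fit(visual, trust)

def predict(a, v):
    return 0.5 * (a @ w_a + v @ w_v)

def mse(y, p):
    return float(np.mean((y - p) ** 2))

base = mse(trust, predict(audio, visual))

def pfi(modality, n_repeats=20):
    """Permutation importance of a whole modality: shuffle its rows,
    re-evaluate, and report the mean increase in error over baseline."""
    errs = []
    for _ in range(n_repeats):
        perm = rng.permutation(n)
        if modality == "audio":
            errs.append(mse(trust, predict(audio[perm], visual)))
        else:
            errs.append(mse(trust, predict(audio, visual[perm])))
    return float(np.mean(errs)) - base

imp_audio = pfi("audio")
imp_visual = pfi("visual")
```

With the synthetic weights above, permuting the audio block degrades the fused prediction far more than permuting the visual block, which is the pattern a modality-level PFI comparison is meant to surface.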
| Original language | English |
|---|---|
| Title of host publication | CUI 2025 - Proceedings of the 2025 ACM Conference on Conversational User Interfaces |
| Publisher | Association for Computing Machinery, Inc |
| ISBN (Electronic) | 9798400715273 |
| State | Published - 7 Jul 2025 |
| Externally published | Yes |
| Event | 7th Conference on Conversational User Interfaces, CUI 2025 - Waterloo, Canada (8 Jul 2025 → 10 Jul 2025) |
Publication series
| Name | CUI 2025 - Proceedings of the 2025 ACM Conference on Conversational User Interfaces |
|---|---|
Conference
| Conference | 7th Conference on Conversational User Interfaces, CUI 2025 |
|---|---|
| Country/Territory | Canada |
| City | Waterloo |
| Period | 8/07/25 → 10/07/25 |
Bibliographical note
Publisher Copyright: © 2025 Copyright held by the owner/author(s).
Keywords
- Behavioral Experiment
- Feature Importance
- Human-AI Alignment
- Multimodal Trust Perception
Beyond Face Value: Visual and Auditory Signals in Human and Machine Trust Judgments