IlluSign: Illustrating Sign Language Videos by Leveraging the Attention Mechanism

  • Janna Bruner
  • Amit Moryossef
  • Lior Wolf

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Sign languages are dynamic visual languages that involve hand gestures, in combination with non-manual elements such as facial expressions. While video recordings of sign language are commonly used for education and documentation, the dynamic nature of signs can make it challenging to study them in detail, especially for new learners. This work aims to convert sign language video footage into static illustrations, which serve as an additional educational resource to complement video content. This process is usually done by an artist, and is therefore quite costly. We propose a method to illustrate sign language videos by leveraging generative models to capture both semantic and geometric image features. Our approach transfers a sketch-like style to keyframes and combines the start and end poses into a single image, enhanced with arrows to indicate hand motion. While many style transfer methods address domain adaptation at various levels of abstraction, applying a sketch-like style to sign language, particularly to detailed hand gestures, remains a significant challenge. To tackle this, we intervene in the denoising process of a diffusion model, injecting style as keys and values into high-resolution attention layers, and fusing geometric information from the image and edges as queries. For the final illustration, we use the attention mechanism to combine the attention weights from both the start and end illustrations, resulting in a soft combination. Our method offers a cost-effective solution for generating sign language illustrations at inference time, addressing the lack of such resources in educational materials.¹

¹ Watermarks, such as the SGB-FSS mark in the SignSuisse dataset [35], are ignored in this work.
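The attention intervention described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the blending weight `alpha`, and the choice to fuse image and edge queries by a weighted sum are all assumptions, as is the reading of the "soft combination" as a single softmax over the concatenated keys of the start and end illustrations. A real system would operate inside the attention layers of a diffusion model's denoising network.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def style_injected_attention(q_image, q_edges, k_style, v_style, alpha=0.5):
    """Attention with geometry-derived queries and style-derived keys/values.

    q_image, q_edges: (n_tokens, d) query features from the keyframe and its edge map.
    k_style, v_style: (m_tokens, d) key/value features from the sketch-style reference.
    alpha: hypothetical blending weight (an assumption, not taken from the paper).
    """
    # Assumed fusion rule: weighted sum of image and edge queries.
    q = alpha * q_image + (1.0 - alpha) * q_edges
    d = q.shape[-1]
    weights = softmax(q @ k_style.T / np.sqrt(d), axis=-1)
    return weights @ v_style, weights


def soft_combine(q, k_start, v_start, k_end, v_end):
    """Assumed soft combination of start and end illustrations.

    Each query attends jointly over the tokens of both illustrations, so the
    softmax decides how much of each pose contributes to every output token.
    """
    k = np.concatenate([k_start, k_end], axis=0)
    v = np.concatenate([v_start, v_end], axis=0)
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d), axis=-1)
    return weights @ v, weights


# Toy usage with random features (4 content tokens, 6 style tokens, width 8).
rng = np.random.default_rng(0)
q_img, q_edge = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
k_sty, v_sty = rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
styled, attn = style_injected_attention(q_img, q_edge, k_sty, v_sty)
```

Because the keys and values come only from the style reference, each output token is a convex combination of style features, while the queries determine where those features are placed.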

Original language: English
Title of host publication: 2025 IEEE 19th International Conference on Automatic Face and Gesture Recognition, FG 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331553418
State: Published - 2025
Externally published: Yes
Event: 19th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2025 - Tampa, United States
Duration: 26 May 2025 to 30 May 2025

Publication series

Name: 2025 IEEE 19th International Conference on Automatic Face and Gesture Recognition, FG 2025

Conference

Conference: 19th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2025
Country/Territory: United States
City: Tampa
Period: 26/05/25 to 30/05/25

Bibliographical note

Publisher Copyright:
© 2025 IEEE.
