Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes

  • Bar-Ilan University

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recent advancements in Text-to-Speech (TTS) models, particularly in voice cloning, have intensified the demand for adaptable and efficient deepfake detection methods. As TTS systems continue to evolve, detection models must be able to efficiently adapt to previously unseen generation models with minimal data. This paper introduces ADD-GP, a few-shot adaptive framework based on a Gaussian Process (GP) classifier for Audio Deepfake Detection (ADD). We show how the combination of a powerful deep embedding model with the flexibility of Gaussian processes can achieve strong performance and adaptability. Additionally, we show this approach can also be used for personalized detection, with greater robustness to new TTS models and one-shot adaptability. To support our evaluation, a benchmark dataset is constructed for this task using new state-of-the-art voice cloning models.

Original language: English
Pages (from-to): 2240-2244
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2025
Event: 26th Interspeech Conference 2025 - Rotterdam, Netherlands
Duration: 17 Aug 2025 to 21 Aug 2025

Bibliographical note

Publisher Copyright:
© 2025 International Speech Communication Association. All rights reserved.

Keywords

  • Audio Deepfake Detection
  • Few-Shot Adaptation
  • Gaussian Processes
