Fully automatic speaker separation system, with automatic enrolling of recurrent speakers

Raphael Cohen, Orgad Keller, Jason Levy, Amit Ashkenazi, Russell Levy, Micha Breakstone

Research output: Contribution to journal, Conference article, peer-review

Abstract

We present a system for speaker separation and identification, designed to operate without requiring any effort from the end-user. In the system, single-channel conversations are transformed into i-vectors, clustered into speakers and matched to a database of known speakers. Enrollment is automatic, and a voice print is constructed for the recording user by taking advantage of the metadata identifying that user's conversations. When available, further information from other sources, such as video and the ASR-transcribed content, is used to identify speakers. We describe the system architecture and a novel unsupervised enrollment algorithm, and discuss the difficulties encountered in solving this problem.
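The cluster-then-match flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain NumPy vectors stand in for i-vectors, a greedy cosine-similarity clustering is swapped in for whatever clustering the system actually uses, and the function names (`cluster_segments`, `match_speakers`) and both thresholds are hypothetical choices made only for illustration.

```python
import numpy as np


def normalize(v):
    """Scale vectors to unit length so dot products are cosine similarities."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def cluster_segments(embeddings, threshold=0.8):
    """Greedy clustering (an assumed stand-in, not the paper's method).

    Each segment embedding joins the most similar existing cluster if the
    cosine similarity reaches `threshold`; otherwise it opens a new cluster.
    Returns per-segment labels and unit-length cluster centroids.
    """
    embeddings = normalize(embeddings)
    sums, labels = [], []
    for e in embeddings:
        if sums:
            sims = normalize(np.stack(sums)) @ e
            best = int(np.argmax(sims))
            if sims[best] >= threshold:
                sums[best] = sums[best] + e
                labels.append(best)
                continue
        sums.append(e.copy())
        labels.append(len(sums) - 1)
    return np.array(labels), normalize(np.stack(sums))


def match_speakers(centroids, voice_prints, threshold=0.7):
    """Match each cluster centroid to the closest enrolled voice print.

    `voice_prints` is a hypothetical dict of name -> vector. A cluster with
    no print above `threshold` is reported as None (unknown speaker).
    """
    if not voice_prints:
        return [None] * len(centroids)
    names = list(voice_prints)
    prints = normalize(np.stack([voice_prints[n] for n in names]))
    matches = []
    for c in centroids:
        sims = prints @ c
        best = int(np.argmax(sims))
        matches.append(names[best] if sims[best] >= threshold else None)
    return matches


# Toy data: four segments from two speakers, one of whom is enrolled.
segments = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.95, 0.05]]
labels, centroids = cluster_segments(segments)
speakers = match_speakers(centroids, {"alice": [1, 0.05, 0]})
```

In this toy run the first two segments fall into one cluster and the last two into another; only the first cluster is close enough to the enrolled print to be named. In a real system the unmatched cluster would be a candidate for automatic enrollment, a step this sketch does not attempt.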

Original language: English
Pages (from-to): 1964-1965
Number of pages: 2
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2018-September
State: Published - 2018
Externally published: Yes
Event: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 - Hyderabad, India
Duration: 2 Sep 2018 to 6 Sep 2018

Bibliographical note

Publisher Copyright:
© 2018 International Speech Communication Association. All rights reserved.

Keywords

  • Diarization
  • Speaker separation
  • Speech recognition
