Training strategies for deep latent models and applications to speech presence probability estimation

Shlomo E. Chazan, Sharon Gannot, Jacob Goldberger

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this study we address latent variable models in the context of neural networks. We analyze a neural network architecture, the mixture of deep experts (MoDE), that models latent variables using the mixture-of-experts paradigm. Learning the parameters of latent variable models is usually done with the expectation-maximization (EM) algorithm. However, gradient-based back-propagation algorithms are the preferred strategy for training neural networks. We show that, in the case of neural networks with latent variables, the back-propagation algorithm is in fact a recursive variant of EM that is better suited to training neural networks. To demonstrate the viability of the proposed MoDE network, we apply it to the task of speech presence probability estimation, which is widely applicable to speech processing problems such as speaker diarization and separation, speech enhancement, and noise reduction. Experimental results show the benefits of the proposed architecture over standard fully-connected networks with the same number of parameters.
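As a rough illustration of the mixture-of-experts idea behind MoDE, the sketch below combines a few small expert networks through a softmax gating network to produce a per-frequency-bin speech presence probability. The layer sizes, number of experts, sigmoid outputs, and the NumPy-only forward pass are assumptions made for illustration; they are not the exact MoDE configuration or the training procedure studied in the paper.

```python
import numpy as np

# Minimal mixture-of-experts sketch (illustrative only; sizes and the sigmoid
# output for per-bin speech presence probability are assumptions, not the
# paper's exact MoDE architecture or training scheme).

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class MixtureOfExperts:
    def __init__(self, dim_in, dim_hidden, dim_out, n_experts, seed=0):
        rng = np.random.default_rng(seed)
        # Each expert is a small one-hidden-layer network; the gate is linear.
        self.experts = [
            (rng.standard_normal((dim_in, dim_hidden)) * 0.1,
             rng.standard_normal((dim_hidden, dim_out)) * 0.1)
            for _ in range(n_experts)
        ]
        self.gate_W = rng.standard_normal((dim_in, n_experts)) * 0.1

    def forward(self, x):
        # x: (batch, dim_in) noisy log-spectral features (hypothetical input).
        gate = softmax(x @ self.gate_W, axis=-1)            # (batch, n_experts)
        outs = np.stack([
            sigmoid(np.tanh(x @ W1) @ W2)                   # per-expert SPP estimate
            for (W1, W2) in self.experts
        ], axis=-1)                                         # (batch, dim_out, n_experts)
        # Soft combination: the gate's responsibilities weight the expert
        # outputs, mirroring the E-step responsibilities in an EM view.
        return np.einsum('bok,bk->bo', outs, gate)

# Usage: 257 frequency bins in, per-bin speech presence probability out.
model = MixtureOfExperts(dim_in=257, dim_hidden=128, dim_out=257, n_experts=4)
spp = model.forward(np.random.default_rng(1).standard_normal((8, 257)))
print(spp.shape)  # (8, 257), values in (0, 1)
```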

Original language: English
Title of host publication: Latent Variable Analysis and Signal Separation - 14th International Conference, LVA/ICA 2018, Proceedings
Editors: Sharon Gannot, Yannick Deville, Russell Mason, Mark D. Plumbley, Dominic Ward
Publisher: Springer Verlag
Pages: 319-328
Number of pages: 10
ISBN (Print): 9783319937632
DOIs
State: Published - 2018
Event: 14th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2018 - Guildford, United Kingdom
Duration: 2 Jul 2018 - 5 Jul 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10891 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2018
Country/Territory: United Kingdom
City: Guildford
Period: 2/07/18 - 5/07/18

Bibliographical note

Publisher Copyright:
© Springer International Publishing AG, part of Springer Nature 2018.

Keywords

  • DNNs
  • Expectation-maximization
  • Mixture of experts
