An inverse-gamma source variance prior with factorized parameterization for audio source separation

Dionyssos Kounades-Bastian, Laurent Girin, Xavier Alameda-Pineda, Sharon Gannot, Radu Horaud

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

6 Scopus citations

Abstract

In this paper we present a new statistical model for the power spectral density (PSD) of an audio signal and its application to multichannel audio source separation (MASS). The source signal is modeled with the local Gaussian model (LGM) and we propose to model its variance with an inverse-Gamma distribution, whose scale parameter is factorized as a rank-1 model. We discuss the interest of this approach and evaluate it in a MASS task with underdetermined convolutive mixtures. For this aim, we derive a variational EM algorithm for parameter estimation and source inference. The proposed model shows a benefit in source separation performance compared to a state-of-the-art LGM NMF-based technique.

Original languageEnglish
Title of host publication2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages136-140
Number of pages5
ISBN (Electronic)9781479999880
DOIs
StatePublished - 18 May 2016
Event41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 20 Mar 201625 Mar 2016

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2016-May
ISSN (Print)1520-6149

Conference

Conference41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country/TerritoryChina
CityShanghai
Period20/03/1625/03/16

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Keywords

  • Audio modeling
  • PSD model
  • audio source separation
  • local Gaussian model

Fingerprint

Dive into the research topics of 'An inverse-gamma source variance prior with factorized parameterization for audio source separation'. Together they form a unique fingerprint.

Cite this