The problem of source separation using an array of microphones in reverberant and noisy conditions is addressed. We consider applying the well-known linearly constrained minimum variance (LCMV) beamformer (BF) for extracting individual speakers. Constraints are defined using relative transfer functions (RTFs) for the sources, which are ratios of acoustic transfer functions (ATFs) between any microphone and a reference microphone. RTFs are usually estimated by methods that rely on single-talk time segments, where only a single source is active, and on reliable knowledge of the source activity. Two novel algorithms for estimating RTFs using the 'Triple N' ICA for convolutive mixtures (TRINICON) framework are proposed; neither resorts to the usually unavailable source activity pattern. The first algorithm estimates the RTFs of the sources by applying multiple two-channel geometrically constrained (GC) TRINICON units, where approximate direction-of-arrival information for the sources is used to ensure convergence to the desired solution. The GC-TRINICON is applied to all microphone pairs using a common reference microphone. In the second algorithm, we propose to estimate RTFs iteratively using GC-TRINICON: instead of using a fixed reference microphone as before, we suggest using the output signals of LCMV-BFs from the previous iteration as spatially processed references with improved signal-to-interference-and-noise ratio. For both algorithms, a simple detection of noise-only time segments is required for estimating the covariance matrix of noise and interference. We conduct an experimental study in which the performance of the proposed methods is confirmed and compared to that of corresponding supervised methods.
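As a rough illustration of how RTF-based constraints enter an LCMV beamformer, the following minimal sketch computes the textbook LCMV weights for a single frequency bin. It is not the authors' implementation: the function name `lcmv_weights`, the toy dimensions, and the randomly generated RTFs and noise covariance are all assumptions made for the example, and the RTFs are simply taken as given rather than estimated with GC-TRINICON.

```python
import numpy as np


def lcmv_weights(noise_cov, rtfs, desired):
    """Textbook LCMV weights for one frequency bin (illustrative helper).

    noise_cov: (M, M) Hermitian positive-definite noise/interference covariance.
    rtfs:      (M, K) constraint matrix; column k is the RTF of source k.
    desired:   (K,) desired response, so that rtfs^H w = desired.
    """
    # Solve Phi^{-1} C without forming the explicit inverse.
    phi_inv_c = np.linalg.solve(noise_cov, rtfs)
    # K x K matrix C^H Phi^{-1} C.
    gram = rtfs.conj().T @ phi_inv_c
    # w = Phi^{-1} C (C^H Phi^{-1} C)^{-1} g
    return phi_inv_c @ np.linalg.solve(gram, desired)


# Toy data (assumed, not from the paper): 4 microphones, 2 sources.
rng = np.random.default_rng(0)
M, K = 4, 2
rtfs = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
rtfs = rtfs / rtfs[0]  # normalize each column so the reference-mic entry is 1

a = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
noise_cov = a @ a.conj().T + np.eye(M)  # Hermitian positive definite

desired = np.array([1.0, 0.0])  # keep source 1, null source 2
w = lcmv_weights(noise_cov, rtfs, desired)
response = rtfs.conj().T @ w  # should reproduce `desired`
```

In this sketch the constraint vector `[1, 0]` passes the first source as received at the reference microphone and places a null on the second; swapping the entries would extract the second speaker instead. A full system would repeat this per frequency bin, with the noise covariance estimated from noise-only segments.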
|Number of pages||13|
|Journal||IEEE/ACM Transactions on Audio Speech and Language Processing|
|State||Published - Feb 2017|
Bibliographical note
Publisher Copyright:
© 2014 IEEE.
- Blind source separation
- relative transfer function
- voice activity