Neural Comput. 2015 Oct;27(10):2231-59. doi: 10.1162/NECO_a_00776. Epub 2015 Aug 27.

Linear Methods for Efficient and Fast Separation of Two Sources Recorded with a Single Microphone.

Neural computation

Saurabh Bhargava, Florian Blättler, Sepp Kollmorgen, Shih-Chii Liu, Richard H R Hahnloser

Affiliations

  1. Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, 8057, Switzerland.
  2. Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, 8057, Switzerland.
  3. Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, 8057, Switzerland.
  4. Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, 8057, Switzerland.
  5. Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, 8057, Switzerland.

PMID: 26313599 DOI: 10.1162/NECO_a_00776

Abstract

This letter addresses the problem of separating two speakers from a single microphone recording. Three linear methods are tested for source separation, all of which operate directly on sound spectrograms: (1) eigenmode analysis of covariance difference to identify spectro-temporal features associated with large variance for one source and small variance for the other source; (2) maximum likelihood demixing in which the mixture is modeled as the sum of two gaussian signals and maximum likelihood is used to identify the most likely sources; and (3) suppression-regression, in which autoregressive models are trained to reproduce one source and suppress the other. These linear approaches are tested on the problem of separating a known male from a known female speaker. The performance of these algorithms is assessed in terms of the residual error of estimated source spectrograms, waveform signal-to-noise ratio, and perceptual evaluation of speech quality scores. This work shows that the algorithms compare favorably to nonlinear approaches such as nonnegative sparse coding in terms of simplicity, performance, and suitability for real-time implementations, and they provide benchmark solutions for monaural source separation tasks.
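
To make methods (1) and (2) more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each source is available as a matrix of feature vectors (for example, spectrogram frames stacked as rows) and that the sources are zero-mean Gaussian with known covariances. Under those assumptions, the estimate of source A from a mixture x is the standard linear result C_a (C_a + C_b)^-1 x, which may differ in detail from the formulation used in the letter. All function names and the `ridge` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def covariance(frames):
    """Sample covariance of feature vectors stored as rows (n_frames, n_features)."""
    centered = frames - frames.mean(axis=0, keepdims=True)
    return centered.T @ centered / (frames.shape[0] - 1)


def covariance_difference_modes(frames_a, frames_b):
    """Eigenmodes of C_a - C_b, sorted by descending eigenvalue.

    Large positive eigenvalues mark directions with large variance for
    source A and small variance for source B; large negative ones the reverse.
    """
    diff = covariance(frames_a) - covariance(frames_b)
    # The difference of two symmetric matrices is symmetric, so eigh applies.
    eigvals, eigvecs = np.linalg.eigh(diff)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def gaussian_demix(mixture, cov_a, cov_b, ridge=1e-8):
    """Split mixture rows into two estimates under a zero-mean Gaussian model.

    Assumption (not taken from the letter): x = a + b with independent
    a ~ N(0, C_a) and b ~ N(0, C_b), so the estimate of a is
    C_a (C_a + C_b)^-1 x. The ridge term only guards against a singular sum.
    """
    total = cov_a + cov_b + ridge * np.eye(cov_a.shape[0])
    gain = cov_a @ np.linalg.inv(total)
    est_a = mixture @ gain.T  # rows are x^T, so apply the gain transposed
    return est_a, mixture - est_a
```

A typical use would estimate `cov_a` and `cov_b` from training recordings of each known speaker, then call `gaussian_demix` on frames of the mixed recording. Both steps are single matrix operations per frame, which is in keeping with the abstract's emphasis on simplicity and suitability for real-time use.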

Publication Types