Reconstructing spectral vectors with uncertain spectrographic masks for robust speech recognition

Bhiksha Raj, Rita Singh. DOI: 10.1109/ASRU.2005.1566472

Citations: 4
Missing-feature methods improve automatic recognition of noisy speech by removing unreliable, noise-corrupted spectrographic components from the signal. Recognition is performed either by modifying the recognizer to work from incomplete spectra or by estimating the missing components to reconstruct complete spectra. While the former approach performs optimal classification with incomplete spectrograms, the latter permits recognition with cepstral features derived from reconstructed spectra. Traditionally, spectral components are considered unequivocally reliable or unreliable. Research has shown that using soft masks, which instead assign each spectral component a probability of being reliable, can improve the performance of missing-feature methods that modify the recognizer. However, soft masks have not been employed by methods that reconstruct the spectrogram. In this paper we present a new MMSE algorithm for spectrogram reconstruction. Experiments show that the use of soft masks results in significantly improved performance compared to reconstruction methods that use binary masks.
    • ...In [1] and [2], underlying clean speech spectral components are estimated using a Bayesian framework...
    • ...A study was performed to compare DWT-based imputation to an implementation of a well-known MMSE-based data imputation approach [1] [2] on the Aurora 2 speech-in-noise task...

    Shirin Badiezadegan et al. A wavelet-based data imputation approach to spectrogram reconstruction...

    • ...In more recent work, Raj derived a bounded minimum mean squared error estimator [6], where the distribution of speech spectra was modeled as a diagonal covariance Gaussian mixture...
    • ...Note that the use of joint Gaussian distributions with diagonal covariance matrix Σ = diag(σ_1, ..., σ_D) – as in [6] – inevitably leads to mean imputation...
    • ...The oracle experiments shown in Table 1 give a comparison of the proposed bounded conditional mean imputation (BCMI) method with standard conditional mean imputation (CMI) as well as diagonal covariance bounded mean imputation (DBMI) [6, 7]. Note that for CMI we enforced the upper bound by resetting those imputed components that exceeded the upper bound to the upper bound, as otherwise it consistently performed worse than the baseline...

    Friedrich Faubel et al. Bounded conditional mean imputation with Gaussian mixture models: A re...
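The remark about diagonal covariances in the snippets above can be checked with a small sketch. It assumes a single bivariate Gaussian with one observed and one missing component, which is a simplification of the mixture models the snippets refer to; the function names and numbers are hypothetical and chosen only for illustration. With zero cross-covariance the conditional mean of the missing component collapses to its prior mean, and the upper bound is enforced by clipping, as the snippet describes for CMI.

```python
def conditional_mean(mu_obs, mu_mis, var_obs, cov_mis_obs, x_obs):
    """E[x_mis | x_obs] for a bivariate Gaussian (one observed, one missing component)."""
    return mu_mis + (cov_mis_obs / var_obs) * (x_obs - mu_obs)


def enforce_upper_bound(estimate, upper_bound):
    """Reset an imputed value that exceeds the upper bound to the bound itself."""
    return min(estimate, upper_bound)


# Diagonal covariance: the cross-covariance is zero, so the observation carries
# no information about the missing component and the estimate is the prior mean.
diagonal = conditional_mean(mu_obs=1.0, mu_mis=2.0, var_obs=4.0,
                            cov_mis_obs=0.0, x_obs=3.0)

# Full covariance: the observation shifts the estimate away from the prior mean.
correlated = conditional_mean(mu_obs=1.0, mu_mis=2.0, var_obs=4.0,
                              cov_mis_obs=2.0, x_obs=3.0)

# Hypothetical upper bound (for example, the noisy observation of that component).
bounded = enforce_upper_bound(correlated, upper_bound=2.5)
```

In a mixture model the observed components still influence the mixture weights, so this single-Gaussian case only illustrates what happens within each mixture component.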

    • ...Given these masks, a GMM signal model can be used to fill in the missing spectral regions that were labelled as unreliable and reconstruct the clean signal as in [5, 6, 7]...
    • ...The soft mask reconstruction process is described in [7]...
    • ...Reconstruction was performed by refiltering as in [2], where each cell of the mixed signal STFT is multiplied by the corresponding cell in the mask, and by MMSE reconstruction as in [7]...

    Ron J. Weiss et al. Estimating Single-Channel Source Separation Masks: Relevance Vector Ma...
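The refiltering step mentioned in the snippets above, where each cell of the mixed-signal STFT is multiplied by the corresponding mask cell, can be sketched as follows. This is a minimal illustration, assuming the STFT and the mask are nested lists of equal shape (frames by frequency bins); the function name and the toy values are hypothetical, and the MMSE reconstruction of [7] is not shown.

```python
def refilter(mixed_stft, mask):
    """Multiply each STFT cell by the corresponding mask cell (element-wise)."""
    if len(mixed_stft) != len(mask):
        raise ValueError("STFT and mask must have the same number of frames")
    filtered = []
    for stft_row, mask_row in zip(mixed_stft, mask):
        if len(stft_row) != len(mask_row):
            raise ValueError("STFT and mask must have the same number of bins")
        filtered.append([m * cell for cell, m in zip(stft_row, mask_row)])
    return filtered


# Toy 2x2 example: a mask value of 1.0 keeps a cell, 0.0 removes it,
# and an intermediate (soft) value attenuates it.
mixed = [[1 + 1j, 2 + 0j],
         [0 + 2j, 4 + 0j]]
mask = [[1.0, 0.0],
        [0.5, 1.0]]
separated = refilter(mixed, mask)
```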

    • ...In the remaining part of this section we give an alternate and slightly generalized derivation of Raj’s soft-decision mean imputation approach [8], which can be regarded as a refinement of the soft-decision mean imputation method devised in [9]...
    • ...As there might be uncertainty in d, the use of soft-decision masks or simply soft-masks has been proposed in [9], [10] and [8], whereby the decision in (2) is replaced by the probability of x_d being observable: d = P(x_d n_d): (3)...
    • ...Truncating a Gaussian also changes its mean, which was not taken into account in [7, 9], but was in [8]...
    • ...If we further assume that n_d is uniformly distributed on [0; y_d] we obtain Raj’s result [8]:...
    • ...This might be applied to approaches like [8] where the distribution of the noise is known...
    • ...A variety of different methods has been proposed for mask estimation, including computational auditory scene analysis (CASA) [7], spectral subtraction [6], the difference between cube root signal and noise energy [10], Bayesian classifiers as well as the Max-VQ algorithm [8]...

    Friedrich Faubel et al. PARTICLE FILTER BASED SOFT-MASK ESTIMATION FOR MISSING FEATURE RECONST...
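Two points from the snippets above lend themselves to a short sketch: truncating a Gaussian changes its mean, and a soft mask replaces a hard reliable/unreliable decision with a probability. The code below is an illustrative simplification, not the derivation in [8]: it assumes a single Gaussian prior per spectral component, uses the observation as the upper truncation bound, and blends the observation with the truncated mean using the mask probability. All function and parameter names are hypothetical.

```python
import math


def upper_truncated_mean(mu, sigma, upper):
    """Mean of N(mu, sigma^2) restricted to values <= upper."""
    beta = (upper - mu) / sigma
    pdf = math.exp(-0.5 * beta * beta) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(beta / math.sqrt(2.0)))
    if cdf == 0.0:
        # Numerically, all remaining probability mass sits at the bound.
        return upper
    return mu - sigma * pdf / cdf


def soft_impute(observed, p_reliable, mu, sigma):
    """Blend the observation with a bounded estimate using a soft mask value."""
    if not 0.0 <= p_reliable <= 1.0:
        raise ValueError("p_reliable must be a probability in [0, 1]")
    bounded = upper_truncated_mean(mu, sigma, observed)
    return p_reliable * observed + (1.0 - p_reliable) * bounded


# A standard normal truncated at its own mean has a mean below zero,
# which is the shift that plain mean imputation ignores.
shifted = upper_truncated_mean(mu=0.0, sigma=1.0, upper=0.0)

# A soft mask value of 0.5 lands halfway between the observation and
# the bounded estimate.
blended = soft_impute(observed=0.0, p_reliable=0.5, mu=0.0, sigma=1.0)
```

A mask value of 1.0 returns the observation unchanged and a value of 0.0 returns the bounded estimate, so a binary mask is the special case in which only those two values occur.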
