Data-driven temporal filters and alternatives to GMM in speaker verification

Malayath, Narendranath; Hermansky, Hynek; Kajarekar, Sachin; Yegnanarayana, B. (2000) Data-driven temporal filters and alternatives to GMM in speaker verification. Digital Signal Processing, 10 (1-3). pp. 55-74. ISSN 1051-2004

Full text not available from this repository.

Official URL: http://www.sciencedirect.com/science/article/pii/S...

Related URL: http://dx.doi.org/10.1006/dspr.1999.0363

Abstract

This paper discusses the research directions pursued jointly at the Anthropic Signal Processing Group of the Oregon Graduate Institute and at the Speech and Vision Laboratory of the Indian Institute of Technology Madras. Current methods for speaker verification are based on modeling the speaker characteristics using Gaussian mixture models (GMM). The performance of these systems degrades significantly if the target speakers use a telephone handset that differs from the one used during training. Conventional methods for channel normalization include utterance-based mean subtraction (MS) and RelAtive SpecTrAl (RASTA) filtering. In this paper we introduce a novel method for designing filters that are capable of normalizing the variability introduced by different telephone handsets. The design of the filter is based on the estimated second-order statistics of handset variability. This filter is applied to the logarithmic energy outputs of Mel-spaced filter banks. We also demonstrate the effectiveness of the proposed channel normalizing filter in improving speaker verification performance in mismatched conditions. GMM-based systems often use thousands of mixture components and hence require a large number of parameters to characterize each target speaker. To address this issue, we propose an alternative to GMM for modeling speaker characteristics. The alternative is based on speaker-specific mapping and relies on a speaker-independent representation of speech.
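
As an illustration of the conventional baseline named in the abstract, the following is a minimal sketch of utterance-based mean subtraction on log filter-bank energies. It is not the data-driven filter proposed in the paper. The toy data, the per-band gains, and the function name `mean_subtract` are hypothetical and chosen only for illustration. The sketch assumes the handset acts as a fixed gain per band, which becomes a constant offset in the log domain and is therefore removed when the utterance mean is subtracted.

```python
import math
import random


def mean_subtract(frames):
    """Subtract the per-band utterance mean from every frame.

    frames: list of frames, each a list of log filter-bank energies.
    """
    n_frames = len(frames)
    n_bands = len(frames[0])
    means = [sum(f[b] for f in frames) / n_frames for b in range(n_bands)]
    return [[f[b] - means[b] for b in range(n_bands)] for f in frames]


# Hypothetical toy utterance: 50 frames, 4 bands of log energies.
rng = random.Random(0)
clean = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]

# Assumed handset model: a fixed gain per band, i.e. an additive
# offset per band once the energies are in the log domain.
handset_offset = [math.log(g) for g in (0.5, 2.0, 1.5, 0.8)]
distorted = [[x + o for x, o in zip(f, handset_offset)] for f in clean]

normalized_clean = mean_subtract(clean)
normalized_distorted = mean_subtract(distorted)

# Largest difference between the two normalized utterances.
max_diff = max(
    abs(x - y)
    for fa, fb in zip(normalized_clean, normalized_distorted)
    for x, y in zip(fa, fb)
)
```

Under this fixed-gain assumption the two normalized utterances agree up to floating-point rounding, so `max_diff` is negligible. Real handsets also introduce effects that are not constant over an utterance, which this sketch does not model.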

Item Type: Article
Source: Copyright of this article belongs to Elsevier Science.
Keywords: Modulation Spectrum; Temporal Processing; Speaker Verification; Channel Variability; Data-driven Filter Design
ID Code: 57722
Deposited On: 29 Aug 2011 11:52
Last Modified: 29 Aug 2011 11:52
