Speaker localization using excitation source information in speech

Raykar, V. C.; Yegnanarayana, B.; Prasanna, S. R. M.; Duraiswami, R. (2005) Speaker localization using excitation source information in speech. IEEE Transactions on Speech and Audio Processing, 13 (5). pp. 751-761. ISSN 1063-6676

Full text not available from this repository.

Official URL: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arn...

Related URL: http://dx.doi.org/10.1109/TSA.2005.851907

Abstract

This paper presents the results of simulation and real room studies for localization of a moving speaker using information about the excitation source of speech production. The first step in localization is the estimation of the time-delay from speech collected by a pair of microphones. Methods for time-delay estimation generally use spectral features that correspond mostly to the shape of the vocal tract during speech production. Spectral features are affected by degradations due to noise and reverberation. This paper proposes a method for localizing a speaker using features that arise from the excitation source during speech production. Experiments were conducted by simulating different noise and reverberation conditions to compare the performance of time-delay estimation and source localization using the proposed method with the results obtained using the spectrum-based generalized cross correlation (GCC) methods. The results show that the proposed method yields fewer discrepancies in the estimated time-delays. The bias, variance and root mean square error (RMSE) of the proposed method are consistently equal to or less than those of the GCC methods. The locations of a moving speaker estimated using the time-delays obtained by the proposed method are closer to the actual values than those obtained by the GCC method.
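For readers unfamiliar with the spectrum-based baseline mentioned in the abstract, the following is a minimal sketch of time-delay estimation between two microphone signals using generalized cross correlation. It is not the excitation-source method proposed in the paper. The PHAT weighting is an assumption (the abstract does not say which GCC weightings were used), and the function name `gcc_phat_delay`, its parameters, and the synthetic test signal are hypothetical, chosen only for illustration.

```python
import numpy as np

def gcc_phat_delay(x1, x2, fs, max_delay=None):
    """Estimate the delay (seconds) of x2 relative to x1 with GCC-PHAT.

    Illustrative baseline only; PHAT weighting is an assumed choice.
    A positive result means x2 lags x1.
    """
    n = len(x1) + len(x2)            # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)         # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)    # generalized cross correlation

    max_shift = n // 2
    if max_delay is not None:
        # Restrict the search to physically plausible lags.
        max_shift = max(1, min(int(round(max_delay * fs)), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs

# Synthetic example: white noise delayed by 12 samples at 16 kHz.
fs = 16000
rng = np.random.default_rng(0)
x1 = rng.standard_normal(4096)
d = 12
x2 = np.concatenate((np.zeros(d), x1[:-d]))
print(gcc_phat_delay(x1, x2, fs) * fs)  # estimated delay in samples
```

With several microphone pairs, delays estimated this way can be combined geometrically to locate the source. The abstract's comparison concerns how reliably such delays are estimated under noise and reverberation.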

Item Type: Article
Source: Copyright of this article belongs to IEEE.
ID Code: 57772
Deposited On: 29 Aug 2011 11:58
Last Modified: 29 Aug 2011 11:58
