Improving recognition accuracy on CVSD speech under mismatched conditions

Ganapathiraju, Madhavi K. ; Balakrishnan, N. ; Reddy, Raj (2003) Improving recognition accuracy on CVSD speech under mismatched conditions WSEAS Transactions on Computers, 2 (4). pp. 887-892. ISSN 1109-2750

PDF - Author Version
140kB

Official URL: http://www.cs.cmu.edu/~madhavi/publications/Ganapa...

Abstract

Emerging mobile communication technology is increasingly accepted as a preferred choice for last-mile communication. A wide range of techniques compress signals to suit the smaller bandwidths available on mobile communication channels, but speech recognition methods have succeeded mostly in controlled speech environments. Designing speech recognition systems for mobile communications is nevertheless crucial for providing voice-enabled command and control and for applications such as Mobile Voice Commerce. Continuously Variable Slope Delta (CVSD) modulation, a technique for low-bit-rate coding of speech, has been in use for over 30 years, particularly in military wireless environments, and has now also been adopted by Bluetooth. CVSD is particularly suitable for Internet and mobile environments because of its robustness against transmission errors, its simplicity of implementation, and the absence of a need for synchronization. In this paper, we study some characteristics of CVSD speech in the context of robust recognition of compressed speech, and present two methods of improving recognition accuracy in Automatic Speech Recognition (ASR) systems. First, we study the features extracted for ASR and how they relate to the corresponding features computed from Pulse Code Modulation (PCM) speech, and we apply this relation to correct the CVSD features and so improve recognition accuracy. Second, we show that ASR performed directly on bit-streams gives good recognition accuracy and, when combined with our correction approach, gives better accuracy.
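
For readers unfamiliar with the coding scheme named in the abstract, the following is a minimal sketch of a generic 1-bit CVSD encoder and decoder. It is not the configuration used in the paper: the function names, the parameter names, and every parameter value (run length, step limits, step decay, integrator leak) are assumptions chosen only for illustration. The sketch shows the two properties the abstract relies on: each sample is coded as a single bit, and the decoder rebuilds the waveform from the bit-stream alone by running the same step-size adaptation as the encoder.

```python
import math

# Illustrative parameters only (assumed, not taken from the paper).
PARAMS = {
    "run_len": 3,        # identical bits in a row that signal slope overload
    "step_min": 0.005,   # smallest step size, also the per-sample step increment
    "step_max": 0.08,    # largest step size
    "step_decay": 0.95,  # multiplicative step decay when there is no overload
    "leak": 0.999,       # leaky integrator coefficient for the estimate
}


def _advance(bit, estimate, step, history, p):
    """Update the state shared by encoder and decoder after one bit."""
    history = (history + [bit])[-p["run_len"]:]
    # Slope overload: the last run_len bits are all identical.
    overload = len(history) == p["run_len"] and len(set(history)) == 1
    if overload:
        step = min(step + p["step_min"], p["step_max"])
    else:
        step = max(step * p["step_decay"], p["step_min"])
    # Leaky integration: move the estimate up for a 1 bit, down for a 0 bit.
    estimate = p["leak"] * estimate + (step if bit else -step)
    return estimate, step, history


def cvsd_encode(samples, p=PARAMS):
    """Encode float samples (roughly in [-1, 1]) as a list of 0/1 bits."""
    estimate, step, history = 0.0, p["step_min"], []
    bits = []
    for x in samples:
        bit = 1 if x >= estimate else 0
        bits.append(bit)
        estimate, step, history = _advance(bit, estimate, step, history, p)
    return bits


def cvsd_decode(bits, p=PARAMS):
    """Rebuild the waveform estimate from the bit-stream alone."""
    estimate, step, history = 0.0, p["step_min"], []
    out = []
    for bit in bits:
        estimate, step, history = _advance(bit, estimate, step, history, p)
        out.append(estimate)
    return out


if __name__ == "__main__":
    # A 100 Hz tone sampled at 16 kHz (both values are arbitrary test choices).
    tone = [0.5 * math.sin(2 * math.pi * 100 * n / 16000) for n in range(1600)]
    bits = cvsd_encode(tone)
    rebuilt = cvsd_decode(bits)
    mae = sum(abs(a - b) for a, b in zip(tone, rebuilt)) / len(tone)
    print("bits:", len(bits), "mean absolute error:", round(mae, 4))
```

Because the decoder needs only the bits and no frame boundaries, a flipped bit perturbs the estimate for a short time rather than desynchronizing the stream, which is consistent with the robustness and no-synchronization properties the abstract mentions.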

Item Type:	Article
Source:	Copyright of this article belongs to The World Scientific and Engineering Academy and Society.
Keywords:	CVSD; Bitstream; Speech Recognition; Corrected MFCC Estimation
ID Code:	64451
Deposited On:	10 Oct 2011 07:26
Last Modified:	18 May 2016 12:52
