Word boundary hypothesization in Hindi speech

Ramana Rao, G. V.; Yegnanarayana, B. (1991) Word boundary hypothesization in Hindi speech. Computer Speech & Language, 5 (4). pp. 379-392. ISSN 0885-2308

Full text not available from this repository.

Official URL: http://www.sciencedirect.com/science/article/pii/0...

Related URL: http://dx.doi.org/10.1016/0885-2308(91)90005-B

Abstract

This paper proposes a method for hypothesizing word boundaries in Hindi speech. The method rests on the observation that function words such as case markers, pronouns and conjunctions occur frequently in Hindi text; spotting these frequently occurring patterns is proposed as a means of hypothesizing word boundaries in a speech-to-text conversion system for Hindi. The idea was first tested on correct text with all word boundaries (except sentence boundaries) removed; the results showed that nearly 67% of the word boundaries were correctly hypothesized. Later experiments, with input containing errors simulated to represent a speech environment, showed that the proposed method is effective even at error levels as high as 50%. The implications of these results for the development of a speech-to-text conversion system for Hindi are discussed.
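The function-word spotting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not give the word list, the matching rule, or the scoring, so everything below is an assumption made for illustration. The function words are hypothetical romanized stand-ins, matching is plain substring search, and "correctly hypothesized" is interpreted as the fraction of true boundaries that were found.

```python
# Hypothetical, romanized stand-ins for frequent Hindi function words
# (case markers, pronouns, conjunctions). This is NOT the paper's list.
FUNCTION_WORDS = ["ne", "ko", "se", "mein", "aur", "ki", "ka"]


def hypothesize_boundaries(text, function_words=FUNCTION_WORDS):
    """Return sorted character offsets where a word boundary is hypothesized.

    Each occurrence of a function word contributes a boundary at its start
    and at its end. Offsets 0 and len(text) are skipped because sentence
    boundaries are assumed to be known already.
    """
    boundaries = set()
    for word in function_words:
        start = text.find(word)
        while start != -1:
            end = start + len(word)
            for pos in (start, end):
                if 0 < pos < len(text):
                    boundaries.add(pos)
            # Continue searching one character later to catch every occurrence.
            start = text.find(word, start + 1)
    return sorted(boundaries)


def boundary_recall(hypothesized, reference):
    """Fraction of reference boundaries that appear among the hypotheses.

    Assumed scoring rule; the abstract does not define its measure.
    """
    if not reference:
        return 0.0
    hits = len(set(hypothesized) & set(reference))
    return hits / len(reference)


# Toy input: "ram ne sita ko kitab di" with the spaces removed.
unsegmented = "ramnesitakokitabdi"
true_boundaries = [3, 5, 9, 11, 16]

found = hypothesize_boundaries(unsegmented)
print(found)
print(boundary_recall(found, true_boundaries))
```

On this toy input the sketch hypothesizes offsets 3, 5, 9, 11 and 13, recovering four of the five true boundaries. Offset 13 is a false hypothesis, caused by the pattern "ki" occurring inside "kitab", and the boundary at 16 is missed because neither neighbouring word is in the list. Both effects are inherent to spotting short, frequent patterns without further constraints.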

Item Type:	Article
Source:	Copyright of this article belongs to Elsevier Science.
ID Code:	57741
Deposited On:	29 Aug 2011 11:49
Last Modified:	29 Aug 2011 11:49
