Spectral mapping using artificial neural networks for voice conversion

Desai, S. ; Black, A. W. ; Yegnanarayana, B. ; Prahallad, K. (2010) Spectral mapping using artificial neural networks for voice conversion IEEE Transactions on Audio, Speech and Language Processing, 18 (5). pp. 954-964. ISSN 1558-7916

Full text not available from this repository.

Official URL: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arn...

Related URL: http://dx.doi.org/10.1109/TASL.2010.2047683

Abstract

In this paper, we use artificial neural networks (ANNs) for voice conversion (VC) and exploit the mapping abilities of an ANN model to map the spectral features of a source speaker to those of a target speaker. A comparative study of voice conversion using an ANN model and the state-of-the-art Gaussian mixture model (GMM) is conducted. The results of voice conversion, evaluated using subjective and objective measures, confirm that an ANN-based VC system performs as well as a GMM-based VC system, and that the transformed speech is intelligible and possesses the characteristics of the target speaker. We also address the dependency of voice conversion techniques on parallel data between the source and the target speakers. While there have been efforts to use nonparallel data and speaker adaptation techniques, it is important to investigate techniques that capture speaker-specific characteristics of a target speaker and avoid any need for the source speaker's data, either for training or for adaptation. We propose a voice conversion approach that uses an ANN model to capture speaker-specific characteristics of a target speaker, and demonstrate that such an approach can perform monolingual as well as cross-lingual voice conversion of an arbitrary source speaker.
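As a rough illustration of the spectral-mapping idea described in the abstract, the following minimal sketch trains a small feed-forward network to map "source" feature frames to "target" feature frames. It is not the authors' architecture or training setup: the layer sizes, learning rate, and function names (`init_layer`, `forward`, `train`, `mse`) are assumptions made for illustration, and the frames are synthetic stand-ins for time-aligned spectral features from parallel recordings.

```python
import math
import random

DIM, HIDDEN = 4, 8  # assumed toy sizes; real spectral features are larger

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, params):
    """Map one source frame to (hidden activations, predicted target frame)."""
    (w1, b1), (w2, b2) = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    return hidden, output

def mse(pairs, params):
    """Mean squared error between predicted and true target frames."""
    total = 0.0
    for x, y in pairs:
        _, out = forward(x, params)
        total += sum((o - t) ** 2 for o, t in zip(out, y)) / len(y)
    return total / len(pairs)

def train(pairs, params, epochs=200, lr=0.05):
    """Plain stochastic gradient descent with backpropagation (in place)."""
    (w1, b1), (w2, b2) = params
    for _ in range(epochs):
        for x, y in pairs:
            hidden, out = forward(x, params)
            d_out = [o - t for o, t in zip(out, y)]  # error at linear output
            # Backpropagate through tanh before the output weights change.
            d_hid = [(1.0 - h * h) *
                     sum(d_out[k] * w2[k][j] for k in range(len(d_out)))
                     for j, h in enumerate(hidden)]
            for k, dk in enumerate(d_out):
                for j, h in enumerate(hidden):
                    w2[k][j] -= lr * dk * h
                b2[k] -= lr * dk
            for j, dj in enumerate(d_hid):
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * dj * xi
                b1[j] -= lr * dj
    return params

rng = random.Random(0)
# Synthetic "parallel data": each target frame is a fixed nonlinear
# function of the source frame (an assumption standing in for real speech).
sources = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(50)]
pairs = [(x, [math.tanh(0.8 * x[k] + 0.3 * x[(k + 1) % DIM])
              for k in range(DIM)]) for x in sources]

params = (init_layer(DIM, HIDDEN, rng), init_layer(HIDDEN, DIM, rng))
loss_before = mse(pairs, params)
train(pairs, params)
loss_after = mse(pairs, params)
print(f"MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

In an actual voice conversion system, the input and output frames would be spectral features extracted from speech, and the converted features would be passed to a vocoder for resynthesis; those stages are outside the scope of this sketch.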

Item Type: Article
Source: Copyright of this article belongs to IEEE.
ID Code: 57793
Deposited On: 29 Aug 2011 12:10
Last Modified: 29 Aug 2011 12:10
