Title :
Web based and voice enabled IVRS for large scale Malayalam speech data collection
Author :
Shobana Devi, P. ; Das, Divya ; Stephen, Jose ; Bhadran, V.K.
Author_Institution :
Centre for Development of Advanced Computing, Trivandrum, India
Abstract :
Speech corpora are a vital resource in the development and evaluation of automatic speech recognition systems, as well as for acoustic-phonetic studies. Collecting a large corpus is not an easy task, and the lack of such resources is one of the reasons for the absence of good-quality speech recognition systems in Indian languages. Here we have automated this process by developing a web-based tool for collecting broadband speech data and an IVR system with speech recognition for collecting narrowband speech data. The main features include full support for the typical recording, annotation, and project administration workflow, easy editing of the speech content, and a fully localizable user interface. This paper describes in detail the development of the web-based speech collection tool and the IVR system, which together enable end-to-end building of a speech corpus with minimal manual effort.
Keywords :
Internet; computer aided instruction; natural language processing; speech processing; speech recognition; Indian languages; Web based enabled IVRS; acoustic phonetic studies; automatic speech recognition systems; large scale Malayalam speech data collection; project administration workflow; speech corpora; voice enabled IVRS; Acoustics; Buildings; Data collection; Databases; Servers; Speech; Speech recognition; Automatic Speech Recognizer; IVRS; Malayalam speech database collection; Speech corpus; TTS; Web interface; content management systems; web-based recording;
Conference_Title :
Contemporary Computing and Informatics (IC3I), 2014 International Conference on
Conference_Location :
Mysore
DOI :
10.1109/IC3I.2014.7019717