DocumentCode :
3142405
Title :
Automatic speech recognition for closed-captioning of Filipino news broadcasts
Author :
Ang, Federico ; Burgos, Maria Czarina ; De Lara, Marvin
Author_Institution :
Digital Signal Process. Lab., Univ. of the Philippines-Diliman, Quezon City, Philippines
fYear :
2011
fDate :
27-29 Nov. 2011
Firstpage :
328
Lastpage :
333
Abstract :
This paper discusses the development of a closed-captioning system for Filipino TV news programs. The system was tested for offline captioning, and its performance was evaluated by word error rate (WER). Carnegie Mellon University's open-source speech recognition system, Sphinx-III, was used as the primary training and recognition engine. A Filipino News Corpus was built from speech and text data obtained from Filipino news videos. Training and testing sets were generated from this corpus, and different Sphinx training and decoding parameters were evaluated on them. Based on the WER computation, the highest average recognition accuracy achieved on the test set was 57.36%, using flat-start context-dependent models and a language model with absolute discounting applied. This project is a first step towards establishing a baseline accuracy for future development of the system.
Keywords :
natural language processing; public domain software; speech recognition; Carnegie Mellon University; Filipino TV news programs; Filipino news broadcast closed captioning; Filipino news videos; Sphinx-III; absolute discounting; automatic speech recognition; flat start context dependent models; open source speech recognition system; word error rate; Adaptation models; Analytical models; Cepstral analysis; Computational modeling; DH-HEMTs; Data models; Erbium; Filipino Speech Recognition; closed-captioning
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2011 7th International Conference on
Conference_Location :
Tokushima
Print_ISBN :
978-1-61284-729-0
Type :
conf
DOI :
10.1109/NLPKE.2011.6138219
Filename :
6138219