DocumentCode :
3101913
Title :
On modeling non-word events in Large Vocabulary Continuous Speech Recognition
Author :
Sarosi, Gellert ; Tarjan, Balazs ; Balog, Andras ; Mozsolics, T. ; Mihajlik, Peter ; Fegyo, Tibor
Author_Institution :
Budapest University of Technology and Economics, Hungary
fYear :
2012
fDate :
2-5 Dec. 2012
Firstpage :
649
Lastpage :
653
Abstract :
This paper focuses on the integration of non-word acoustic events into LVCSR (Large Vocabulary Continuous Speech Recognition). Non-word events may play an important role in cognitive, paraverbal infocommunication; however, they are often not modeled explicitly due to computational difficulties. In our experiments, a serial and a loopback WFST (Weighted Finite State Transducer) architecture were built to recognize and/or print out certain non-word events on the output. We used a Hungarian Broadcast News corpus to evaluate the results. No degradation in normal word recognition accuracy was observed compared to the baseline, in which no non-word event modeling was applied. The non-word event recognition accuracy was, however, lower than expected. One of the most likely reasons is that the manual transcription of non-word events is less consistent than that of normal words. Nonetheless, some of the non-word events were recognized correctly in most cases. The loopback architecture has a higher memory requirement but gives significantly better non-word event accuracies without any increase in recognition time.
Keywords :
Broadcast News; LVCSR; WFST; cognitive infocommunication; non-word event recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cognitive Infocommunications (CogInfoCom), 2012 IEEE 3rd International Conference on
Conference_Location :
Kosice, Slovakia
Print_ISBN :
978-1-4673-5187-4
Electronic_ISBN :
978-1-4673-5186-7
Type :
conf
DOI :
10.1109/CogInfoCom.2012.6421932
Filename :
6421932