DocumentCode :
735034
Title :
A grapheme and phone rescoring combination system for Malay broadcast news recognition
Author :
Khalaf, Zainab A. ; Tien-Ping Tan ; Li-Pei Wong
Author_Institution :
Sch. of Comput. Sci., Univ. Sains Malaysia, Minden, Malaysia
fYear :
2015
fDate :
12-15 July 2015
Firstpage :
353
Lastpage :
357
Abstract :
The main motivation of this paper is to improve automatic speech recognition (ASR) hypotheses for the Malay language. Manual news transcription is expensive and time-consuming. Hence, without an ASR system, access to and search within audio archives would be restricted to the limited number of documents that have been manually transcribed or indexed with keywords. Multiple hypotheses are useful because the single best recognition output still contains numerous errors, even for state-of-the-art systems. In this paper, we propose an approach to reduce the word error rate (WER) of an ASR hypothesis: a three-pass combination method using parallel ASR systems. The three-pass combination system, based on grapheme rescoring and phone rescoring, re-evaluates all of the hypotheses produced by the ASR systems to produce a more accurate hypothesis. To evaluate the proposed approach, we used Malay broadcast news recorded from Malaysian local news channels, containing speech from newscasters, reporters and interviewers in noisy environments. The approach reduced the WER by 4.4 percentage points (absolute), from 34.5% to 30.1%. Its performance was compared with six approaches that are frequently used for ASR rescoring and combination.
Keywords :
error statistics; natural language processing; speech recognition; ASR rescoring; Malay broadcast news recognition; Malay language; Malaysia local news channels; WER; audio archives; audio searches; automatic speech recognition; grapheme rescoring; manual news transcription; noisy environments; parallel ASR systems; phone rescoring combination system; textual documents; three-pass combination method; word error rate; Decision support systems; Frequency modulation; Indexes; ASR Combination; Automatic Speech Recognition; Broadcast News; Language Model; Malay Language;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
Conference_Location :
Chengdu
Type :
conf
DOI :
10.1109/ChinaSIP.2015.7230423
Filename :
7230423