Title :
Dynamic vocabulary adaptation for a daily and real-time broadcast news transcription system
Author :
Martins, C. ; Teixeira, A. ; Neto, J.
Author_Institution :
Dept. Electron., Aveiro Univ., Aveiro
Abstract :
The daily, real-time transcription of broadcast news (BN) is a challenging task for both acoustic and language modeling, and several problems must be overcome to achieve optimal performance. In particular, when transcribing BN data in highly inflected languages, vocabulary growth leads to high out-of-vocabulary (OOV) word rates. To address this problem, we propose a daily vocabulary and language model (LM) adaptation framework that extracts new words directly from contemporary written news available on the Internet, using some linguistic knowledge about the words found in that news. Experiments were carried out on a European Portuguese BN transcription system. Preliminary results, computed on seven shows, yield a relative reduction of 61% in OOV word rate and 2.1% in word error rate (WER).
Keywords :
Internet; natural language processing; speech processing; speech recognition; user interfaces; acoustic modeling; broadcast news transcription system; contemporary written news; dynamic vocabulary adaptation; language modeling; linguistic knowledge; natural language interfaces; Broadcasting; Error analysis; Frequency; Informatics; Lattices; Natural languages; Performance analysis; Real time systems; Vocabulary
Conference_Title :
Spoken Language Technology Workshop, 2006. IEEE
Conference_Location :
Palm Beach
Print_ISBN :
1-4244-0872-5
DOI :
10.1109/SLT.2006.326839