DocumentCode :
1858313
Title :
Dynamic vocabulary adaptation for a daily and real-time broadcast news transcription system
Author :
Martins, C. ; Teixeira, A. ; Neto, J.
Author_Institution :
Dept. Electron., Aveiro Univ., Aveiro
fYear :
2006
fDate :
10-13 Dec. 2006
Firstpage :
146
Lastpage :
149
Abstract :
The daily, real-time transcription of broadcast news (BN) is a challenging task for both acoustic and language modeling. To achieve optimal performance, several problems have to be overcome. In particular, when transcribing BN data in highly inflected languages, vocabulary growth leads to high out-of-vocabulary (OOV) word rates. To address this problem, we propose a daily vocabulary and language model (LM) adaptation framework that extracts new words directly from contemporary written news available on the Internet, using some linguistic knowledge about the words found in those news texts. Experiments were carried out for a European Portuguese BN transcription system. Preliminary results, computed on seven shows, yield a relative reduction of 61% in OOV word rate and 2.1% in word error rate (WER).
Keywords :
Internet; natural language processing; speech processing; speech recognition; user interfaces; acoustic modeling; broadcast news transcription system; contemporary written news; dynamic vocabulary adaptation; language modeling; linguistic knowledge; natural language interfaces; speech recognition; Broadcasting; Error analysis; Frequency; Informatics; Internet; Lattices; Natural languages; Performance analysis; Real time systems; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language Technology Workshop, 2006. IEEE
Conference_Location :
Palm Beach
Print_ISBN :
1-4244-0872-5
Type :
conf
DOI :
10.1109/SLT.2006.326839
Filename :
4123383