DocumentCode :
2009349
Title :
Language identification in code-switching speech using word-based lexical model
Author :
Lyu, Dou-Cheng ; Zhu, Cing-lei ; Lyu, Ren-Yuan ; Ko, Ming-Tat
Author_Institution :
Temasek Labs., Univ., Singapore, Singapore
fYear :
2010
fDate :
Nov. 29 2010-Dec. 3 2010
Firstpage :
460
Lastpage :
464
Abstract :
This paper describes a language identification (LID) task on Mandarin/Taiwanese code-switching utterances. The proposed LID system uses a word-based lexical model and integrates acoustic, phonetic and lexical cues. The first two cues are obtained from a large vocabulary continuous speech recognition (LVCSR) system, and the last is captured by a trained word-based lexical model. Given a sequence of words recognized by the LVCSR system, the lexical model identifies languages according to the frequency and context of each word. Because the switching unit in code-switching speech is the word, the experiments showed that the word-based lexical model achieved a 16% relative reduction in classification errors compared with LVCSR-based LID systems.
Keywords :
linguistics; speech recognition; switching; vocabulary; classification error; code switching speech; language identification; large vocabulary continuous speech recognition; switching unit; word based lexical model; Code-switching; Speech Recognition; Taiwanese Mandarin;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location :
Tainan
Print_ISBN :
978-1-4244-6244-5
Type :
conf
DOI :
10.1109/ISCSLP.2010.5684483
Filename :
5684483