DocumentCode :
2910235
Title :
Applying Grapheme, Word, and Syllable Information for Language Identification in Code Switching Sentences
Author :
Yeong, Yin-Lai ; Tan, Tien-Ping
Author_Institution :
Sch. of Comput. Sci., Univ. Sains Malaysia, Minden, Malaysia
fYear :
2011
fDate :
15-17 Nov. 2011
Firstpage :
111
Lastpage :
114
Abstract :
In this paper, we propose an automatic language identification approach for code-switching sentences that uses the morphological structure and sequence of syllables. The approach was tested on Malay-English code-switching sentences. The proposed approach achieves 90.75% accuracy on the vocabularies. We further improved the approach by combining knowledge from other levels of the sentence: word and alphabet. This additional information raises the accuracy of our language identification method to 96.36%.
Keywords :
natural language processing; vocabulary; word processing; Malay-English code switching sentence; automatic language identification; morphological structure; syllable information; vocabulary; word information; Accuracy; Context; Interpolation; Probability; Speech; Switches; Vocabulary; Language identification; alphabet; code switching; discounting strategy; grapheme; interpolation; n-gram; syllable structure information; word;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Asian Language Processing (IALP), 2011 International Conference on
Conference_Location :
Penang
Print_ISBN :
978-1-4577-1733-8
Type :
conf
DOI :
10.1109/IALP.2011.34
Filename :
6121482