DocumentCode :
1638446
Title :
Automatic identification of Assamese and Bodo multiword expressions
Author :
Barman, A.K. ; Sarmah, J. ; Sarma, S.K.
Author_Institution :
Dept. of Inf. Technol., Gauhati Univ., Guwahati, India
fYear :
2013
Firstpage :
26
Lastpage :
30
Abstract :
Multiword Expressions (MWEs) are sequences of words, separated by spaces or delimiters, that together convey a single meaning distinct from the words' individual meanings. Our work concentrates on the automatic identification of MWEs for two less computationally aware languages, Assamese and Bodo, spoken in the north-eastern part of India. Statistical measures and language-specific knowledge help us to extract MWEs from a raw corpus. Natural Language Processing tasks in the Assamese and Bodo languages have started in recent years, and this is the first organised approach to exploit MWEs in both languages. Linguistic aspects have been considered in analysing the results, and we have found the results quite satisfactory.
Keywords :
linguistics; natural language processing; pattern recognition; statistical analysis; text analysis; Assamese language; Bodo language; MWE; computationally aware languages; language specific knowledge; linguistics; multiword expressions automatic identification; natural language processing; north eastern India; statistical measure; Bismuth; Informatics; Assamese; Bodo; MWEs; NLP; Statistical measure;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in Computing, Communications and Informatics (ICACCI), 2013 International Conference on
Conference_Location :
Mysore
Print_ISBN :
978-1-4799-2432-5
Type :
conf
DOI :
10.1109/ICACCI.2013.6637141
Filename :
6637141