DocumentCode :
172571
Title :
Automatic acquisition of morphological resources for Melanau language
Author :
Saee, Suhaila ; Lay-Ki Soon ; Tek-Yong Lim ; Ranaivo-Malancon, Bali ; Juk, Jovianna ; Tang, Enya Kong
Author_Institution :
Fac. of Comput. & Inf., Multimedia Univ., Cyberjaya, Malaysia
fYear :
2014
fDate :
20-22 Oct. 2014
Firstpage :
203
Lastpage :
206
Abstract :
Computational morphological resources are a crucial component for supplying the morphological information needed to build a morphological analyser. Acquiring these resources manually involves two main components, preprocessing and morphology induction, and raises two issues from the perspective of under-resourced languages: i) the process is time-consuming and ii) managing the resources is prone to ambiguity. To overcome these issues, we propose a tool for the automatic acquisition of morphological resources that extends the manual approach. The tool has three main modules: i) tokenization, which tokenises raw text and generates a wordlist; ii) conversion, which converts a softcopy of morphological resources into the required formats; and iii) integration of segmentation tools, which integrates two established segmentation tools, Linguistica and Morfessor, to obtain morphological information from the generated wordlist. Two testing methods were applied: component testing and integration testing. The results show that the proposed tool was built as designed and demonstrate its effectiveness, allowing linguists to obtain their wordlist and segmented data easily. We believe the proposed tool will help other researchers construct computational morphological resources for under-resourced languages in an automated way.
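The tokenization module described in the abstract (raw text in, wordlist out) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `build_wordlist` and `to_frequency_lines`, the lowercasing step, the token pattern, and the count-then-word output layout are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import Counter


def build_wordlist(raw_text):
    """Tokenise raw text and return (word, frequency) pairs sorted by word.

    Hypothetical helper: the tokenisation rules here are an assumption,
    not the rules used by the tool described in the abstract.
    """
    # Lowercase the text, then keep runs of letters, allowing an internal
    # apostrophe or hyphen (e.g. "well-known"). Digits and punctuation
    # are dropped.
    tokens = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", raw_text.lower())
    counts = Counter(tokens)
    return sorted(counts.items())


def to_frequency_lines(wordlist):
    """Render the wordlist as 'count word' lines (an assumed layout)."""
    return [f"{count} {word}" for word, count in wordlist]


if __name__ == "__main__":
    # Placeholder English text; no Melanau data is assumed here.
    sample = "The cat sat. The cat ran!"
    for line in to_frequency_lines(build_wordlist(sample)):
        print(line)
```

A wordlist produced this way would then be passed to the conversion and segmentation stages; the exact input formats those stages expect are not given in the abstract.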
Keywords :
natural language processing; resource allocation; Linguistica tool; Melanau language; Morfessor tool; automatic morphological resource acquisition; component testing; computational morphological resources; conversion module; integration module; integration testing; morphological analyser; morphological information; tokenization module; under-resourced languages perspective; wordlist generation; Computer science; Conferences; Educational institutions; Manuals; Morphology; Testing; pre-processing; under-resourced language
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Asian Language Processing (IALP), 2014 International Conference on
Conference_Location :
Kuching
Type :
conf
DOI :
10.1109/IALP.2014.6973523
Filename :
6973523