DocumentCode :
2364447
Title :
Name extraction for unstructured Malay text
Author :
Sharum, Mohd Yunus ; Abdullah, Muhamad Taufik ; Sulaiman, Md Nasir ; Murad, Masrah Azrifah Azmi ; Hamzah, Zaitul Azma Zainon
Author_Institution :
Fac. of Comp. Sci. & Info. Technol., UPM, Serdang, Malaysia
fYear :
2011
fDate :
20-23 March 2011
Firstpage :
787
Lastpage :
791
Abstract :
Names are categorized as proper nouns. Identifying nouns in unstructured text is very challenging because their number is almost unlimited. Name recognition can be used for part-of-speech (POS) tagging or automatic term acquisition in natural language processing (NLP). In this paper, we propose a general approach to recognizing names in Malay text. Using the proposed approach, we implement a free-indexing application that indexes names from a collection of Malay texts. Our evaluation shows that the application reaches 92% precision, 54% recall, and a 68% F-score in indexing names from news articles.
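The three evaluation figures in the abstract can be cross-checked against each other. The sketch below assumes the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_score` and its `beta` parameter are illustrative, not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall.

    Assumption: the abstract's "F-score" is the balanced F1 variant.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures reported in the abstract.
reported_precision = 0.92
reported_recall = 0.54

f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.6805"
```

Under that assumption the result rounds to 68%, which agrees with the reported F-score. The gap between precision and recall suggests the application rarely indexes a non-name but misses close to half of the names present.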
Keywords :
feature extraction; natural language processing; text analysis; automatic term acquisition; indexing; name extraction; name recognition; natural language processing; part of speech tagging; unstructured Malay text; Bridges; Communities; Indexing; Natural language processing; Roads; Text recognition; Malay text processing; Name extraction; Name recognition; Pattern recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computers & Informatics (ISCI), 2011 IEEE Symposium on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-61284-689-7
Type :
conf
DOI :
10.1109/ISCI.2011.5959017
Filename :
5959017