DocumentCode :
3760818
Title :
Text chunker for Malayalam using Memory-Based Learning
Author :
Rekha Raj C. T.; Reghu Raj P. C.
Author_Institution :
Department of Computer Science and Engineering, Government Engineering College, Sreekrishnapuram, Kerala, India 678633
fYear :
2015
Firstpage :
595
Lastpage :
599
Abstract :
Text chunking consists of dividing a text into syntactically correlated groups of words. Given the words and their morphosyntactic classes, a chunker decides which words can be grouped into chunks. Malayalam is a free word order language with relatively unrestricted phrase structures, which makes chunking quite challenging. This paper develops a text chunker for Malayalam using the Memory-Based Learning (MBL) approach. Memory-Based Learning is a machine learning methodology based on the idea that directly reusing examples through analogical reasoning is better suited to language processing problems than applying rules extracted from those examples. The chunker was trained with the Memory-Based Tagger (MBT) tool, using words and their POS tags as features. The chunker achieved an accuracy of 97.14%.
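To illustrate the memory-based idea described in the abstract, the following is a minimal sketch, not the authors' MBT setup. It assumes IOB-style chunk tags, a toy English training set, a feature window of previous POS, current word, current POS, and next POS, and a plain overlap distance with nearest-neighbour voting. All function names, the data, and the feature window are hypothetical choices made for illustration only.

```python
from collections import Counter

# Hypothetical toy training data: (word, POS tag, chunk tag) per token.
# The paper's Malayalam corpus and tag set are not reproduced here.
TRAIN = [
    [("the", "DT", "B-NP"), ("dog", "NN", "I-NP"), ("barked", "VBD", "B-VP")],
    [("a", "DT", "B-NP"), ("cat", "NN", "I-NP"), ("slept", "VBD", "B-VP")],
    [("she", "PRP", "B-NP"), ("saw", "VBD", "B-VP"),
     ("the", "DT", "B-NP"), ("bird", "NN", "I-NP")],
]


def features(tokens, i):
    """Assumed feature window: previous POS, current word, current POS, next POS."""
    prev_pos = tokens[i - 1][1] if i > 0 else "<s>"
    next_pos = tokens[i + 1][1] if i < len(tokens) - 1 else "</s>"
    return (prev_pos, tokens[i][0], tokens[i][1], next_pos)


def build_memory(sentences):
    """'Training' just stores every example as (feature vector, chunk tag)."""
    memory = []
    for sent in sentences:
        for i, token in enumerate(sent):
            memory.append((features(sent, i), token[2]))
    return memory


def overlap_distance(a, b):
    """Number of feature positions on which two instances disagree."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory, feats, k=1):
    """Majority vote among the k stored instances closest to feats."""
    ranked = sorted(memory, key=lambda inst: overlap_distance(inst[0], feats))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def chunk(memory, tagged_sentence, k=1):
    """Assign a chunk tag to each (word, POS) pair by analogy to stored examples."""
    return [classify(memory, features(tagged_sentence, i), k)
            for i in range(len(tagged_sentence))]


if __name__ == "__main__":
    memory = build_memory(TRAIN)
    print(chunk(memory, [("the", "DT"), ("cat", "NN"), ("barked", "VBD")]))
```

No rules are extracted at any point: new tokens are labelled directly from the stored examples, which is the contrast the abstract draws. Memory-based tools typically add feature weighting and more refined distance metrics, which this sketch omits.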
Keywords :
"Measurement","Compounds","Tagging","Training","Speech","Context","Hidden Markov models"
Publisher :
IEEE
Conference_Title :
Control Communication & Computing India (ICCC), 2015 International Conference on
Type :
conf
DOI :
10.1109/ICCC.2015.7432966
Filename :
7432966