DocumentCode
3761941
Title
PTokenizer: POS tagger Tokenizer
Author
Saeed Rahmani Seyyed;Mostafa Fakhrahmad;Mohammad Hadi Sadredini
Author_Institution
Department of Computer and IT Engineering, Shiraz University, Shiraz, Iran
fYear
2015
Firstpage
252
Lastpage
256
Abstract
With the advent of new information sources and the expansion of text data, natural language processing (NLP) has become a key part of systems that deal with human-written text, and part-of-speech (POS) tagging is an integral part of most NLP tasks. As a result, it is of paramount importance to enhance the accuracy of POS tagging. In this paper, using a language model and statistical information, we introduce a new approach that tokenizes sentences and prepares them to be labeled by POS taggers. An evaluation shows that the proposed method yields a precision of 98 percent for tokenizing, and that applying it to Maximum Likelihood and TnT POS taggers improves the accuracy of Persian POS tagging.
Keywords
"Decision support systems","Power capacitors","Speech","Speech processing","Tagging","Probabilistic logic","Compounds"
Publisher
IEEE
Conference_Titel
Knowledge-Based Engineering and Innovation (KBEI), 2015 2nd International Conference on
Type
conf
DOI
10.1109/KBEI.2015.7436056
Filename
7436056
Link To Document