• DocumentCode
    3761941
  • Title

    PTokenizer: POS tagger Tokenizer

  • Author

    Saeed Rahmani;Seyyed Mostafa Fakhrahmad;Mohammad Hadi Sadredini

  • Author_Institution
    Department of Computer and IT Engineering, Shiraz University, Shiraz, Iran
  • fYear
    2015
  • Firstpage
    252
  • Lastpage
    256
  • Abstract
    With the advent of new information sources and the expansion of text data, natural language processing (NLP) has become a key part of systems that deal with human-written text, and part-of-speech (POS) tagging is an inseparable part of all NLP tasks. As a result, it is of paramount importance to enhance the accuracy of POS tagging. In this paper, applying a language model and statistical information, we introduce a new approach to tokenizing sentences and preparing them to be labeled by POS taggers. An evaluation shows that the proposed method yields a precision of 98 percent for tokenizing, and that applying it to Maximum Likelihood and TnT POS taggers improves the accuracy of Persian POS tagging.
  • Keywords
    "Decision support systems","Power capacitors","Speech","Speech processing","Tagging","Probabilistic logic","Compounds"
  • Publisher
    IEEE
  • Conference_Titel
    Knowledge-Based Engineering and Innovation (KBEI), 2015 2nd International Conference on
  • Type

    conf

  • DOI
    10.1109/KBEI.2015.7436056
  • Filename
    7436056