• DocumentCode
    2349003
  • Title

    Research on sentiment classification of Blog based on PMI-IR

  • Author

    Duan, Xiuting ; He, Tingting ; Song, Le

  • Author_Institution
    Dept. of Comput. Sci., Huazhong Normal Univ., Wuhan, China
  • fYear
    2010
  • fDate
    21-23 Aug. 2010
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The growth of blog text on the internet has brought new challenges to Chinese text classification. To address the lack of semantic information in traditional Chinese text classification methods, this paper implements a method that classifies a blog as joy, anger, sadness, or fear using a simple unsupervised learning algorithm. The class of a blog text is predicted from the maximum semantic orientation (SO) of the phrases in the text that contain adjectives or adverbs. The SO of a phrase is calculated as the mutual information between that phrase and the polar words, and the SO of the blog text is then determined by the maximum mutual information value. For example, a blog text is classified as joy if the SO of its phrases is joy. Two corpora are used to test the method: the Blog corpus collected by Monitor and Research Center for National Language Resource Network Multimedia Sub-branch Center, and the Chinese dataset provided by the COAE2008 task. On both datasets, the method achieves a substantial improvement over traditional methods.
  • Keywords
    Web sites; information retrieval; pattern classification; text analysis; unsupervised learning; Chinese text classification; PMI-IR algorithm; blog corpus; blog texts information; max semantic orientation; point-wise mutual information; sentiment classification; unsupervised learning algorithm; Classification algorithms; Mutual Information; Semantic Classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-6896-6
  • Type

    conf

  • DOI
    10.1109/NLPKE.2010.5587849
  • Filename
    5587849
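
The classification rule described in the abstract (score each phrase against polar words by mutual information, then take the maximum) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the one-seed-word-per-emotion setup, the toy counts, and the small smoothing constant are all assumptions made for illustration. In a PMI-IR setting the counts would come from search-engine or corpus hit counts.

```python
import math

EPS = 0.01  # assumed smoothing constant to avoid log(0); not from the paper


def pmi(cooccur, phrase_count, seed_count, total):
    """Pointwise mutual information (in bits) between a phrase and a polar word."""
    return math.log2(
        ((cooccur + EPS) * total) / ((phrase_count + EPS) * (seed_count + EPS))
    )


def classify(phrases, phrase_counts, seed_counts, cooccur_counts, total):
    """Return the emotion label with the highest PMI over all phrases, and its score."""
    best_label, best_score = None, -math.inf
    for phrase in phrases:
        for label, seed_count in seed_counts.items():
            cooccur = cooccur_counts.get((phrase, label), 0)
            score = pmi(cooccur, phrase_counts[phrase], seed_count, total)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Hypothetical toy counts standing in for hit counts.
total = 10_000
phrase_counts = {"very happy": 50, "so tired": 40}
seed_counts = {"joy": 200, "anger": 150, "sadness": 180, "fear": 100}
cooccur_counts = {
    ("very happy", "joy"): 30,
    ("very happy", "sadness"): 1,
    ("so tired", "sadness"): 10,
    ("so tired", "anger"): 2,
}

label, score = classify(
    ["very happy", "so tired"], phrase_counts, seed_counts, cooccur_counts, total
)
print(label, round(score, 2))
```

With these toy counts the phrase "very happy" has its strongest association with the "joy" seed, so the whole text is labelled joy. Taking the single maximum, rather than averaging over phrases, follows the abstract's description of using the max mutual information value.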