• DocumentCode
    3563652
  • Title

    Harmful comments extraction from a Bulletin Board System using word meaning and impression on thread context

  • Author

    Nishihara, Yoko ; Iwasa, Kazuki ; Fukumoto, Junichi ; Yamanishi, Ryosuke

  • Author_Institution
    Coll. of Inf. Sci. & Eng., Ritsumeikan Univ., Kusatsu, Japan
  • fYear
    2014
  • Firstpage
    1398
  • Lastpage
    1402
  • Abstract
    Harmful documents on the Web make readers feel uncomfortable. To hide such documents from the public, machine learning methods have been proposed that learn the words used in harmful documents and hide those documents automatically. The learned words often have bad meanings. Although the meaning of a word does not change, its impression may change depending on context. Even when a document contains a word with a bad impression, previous learning methods cannot learn the word and so fail to hide the document. Our approach builds on this observation: if a word has been used with other words of good meaning, its impression is considered good; in contrast, if it has been used with words of bad meaning, its impression may be bad. This paper proposes a new method for extracting harmful comments from a thread of a Bulletin Board System (BBS). The proposed method extracts comments using word meanings and word impressions in the thread context. We evaluated the method using comments collected from four threads of the Japanese BBS "2-channel." The average precision of extraction was 0.47, and the average recall was 0.68. We verified that the proposed method was suitable for extracting harmful comments from a thread of a BBS.
  • Keywords
    Internet; document handling; feature extraction; learning (artificial intelligence); Japanese BBS; bad meaning; bulletin board system; harmful comments extraction method; harmful documents; harmful documents hiding; machine learning methods; thread context; word impression; word meaning; word meanings; Context; Data mining; Educational institutions; Postal services; TV; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing and Intelligent Systems (SCIS), 2014 Joint 7th International Conference on and Advanced Intelligent Systems (ISIS), 15th International Symposium on
  • Type

    conf

  • DOI
    10.1109/SCIS-ISIS.2014.7044663
  • Filename
    7044663
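
The idea summarized in the Abstract (a word takes on a good or bad impression from the words it appears with in a thread) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names `word_impressions` and `extract_harmful`, the seed word sets, the per-comment co-occurrence count, and the rule that flags a comment when it contains a bad-meaning word or a word with a negative impression score. It is not the authors' implementation.

```python
from collections import defaultdict


def word_impressions(comments, good_words, bad_words):
    """Score each word by the seed words it shares a comment with in this thread.

    comments: list of token lists, one per comment (hypothetical input format).
    Returns a dict word -> score; a negative score means a bad impression.
    """
    scores = defaultdict(int)
    for tokens in comments:
        unique = set(tokens)
        good = len(unique & good_words)
        bad = len(unique & bad_words)
        for word in unique:
            # A seed word is not counted as co-occurring with itself.
            g = good - (word in good_words)
            b = bad - (word in bad_words)
            scores[word] += g - b
    return dict(scores)


def extract_harmful(comments, good_words, bad_words):
    """Return indices of comments flagged as harmful (assumed decision rule)."""
    scores = word_impressions(comments, good_words, bad_words)
    harmful = []
    for index, tokens in enumerate(comments):
        has_bad_meaning = any(t in bad_words for t in tokens)
        has_bad_impression = any(scores.get(t, 0) < 0 for t in tokens)
        if has_bad_meaning or has_bad_impression:
            harmful.append(index)
    return harmful


# Toy thread: "singer" has no bad meaning, but it picks up a bad impression
# from appearing next to "stupid", so the second comment is flagged as well.
thread = [
    ["that", "singer", "is", "stupid"],
    ["singer", "again"],
    ["nice", "weather", "today"],
]
print(extract_harmful(thread, {"nice"}, {"stupid"}))  # [0, 1]
```

A method that relied on word meaning alone would flag only the first comment in this toy thread; the thread-level impression score is what brings in the second.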