• DocumentCode
    2251626
  • Title
    Investigation of Preprocessing of Multilingual Online Reviews for Automatic Classification
  • Author
    Okada, Makoto ; Hashimoto, Kiyota
  • Author_Institution
    Grad. Sch. of Sci., Osaka Prefecture Univ., Sakai, Japan
  • fYear
    2012
  • fDate
    May 30 2012-June 1 2012
  • Firstpage
    306
  • Lastpage
    309
  • Abstract
    Online reviews on commercial sites are an important source of information and opinions from other customers. However, reviews generally mix several kinds of information, such as the reviewer's purpose and sentiments. Therefore, when reviews are classified automatically using machine learning methods, the data must be prepared adequately beforehand. In this paper, we collected multilingual reviews from the travel information portal site "TripAdvisor" and investigated whether preprocessing is effective for classifying reviews appropriately with a machine learning method, the Support Vector Machine (SVM). We also investigated how this effectiveness differs when the reviews are written in different languages.
  • Keywords
    Web sites; classification; customer services; learning (artificial intelligence); natural languages; portals; reviews; support vector machines; travel industry; SVM; TripAdvisor; automatic classification; commercial sites; data preliminary; machine learning methods; multilingual online review preprocessing; reviewer purpose; reviewer sentiments; support vector machine; travel information portal site; Accuracy; Data mining; Dictionaries; Learning systems; Support vector machine classification; Vectors; Multilingual online reviews; Preprocessing; Support vector machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Science (ICIS), 2012 IEEE/ACIS 11th International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4673-1536-4
  • Type
    conf
  • DOI
    10.1109/ICIS.2012.64
  • Filename
    6211114