• DocumentCode
    3277395
  • Title

    Refine Crude Corpus for Opinion Mining

  • Author

    Bhattacharyya, Debnath ; Das, Poulami ; Mitra, Kheyali ; Mukherjee, Swarnendu ; Ganguly, Debashis ; Bandyopadhyay, Samir Kumar ; Kim, Tai-Hoon

  • Author_Institution
    Comput. Sci. & Eng. Dept., Heritage Inst. of Technol., Kolkata, India
  • fYear
    2009
  • fDate
    23-25 July 2009
  • Firstpage
    17
  • Lastpage
    22
  • Abstract
    This paper presents a heuristic approach, based on regular expressions, to refining a corpus, and discusses its possible applications in the field of opinion mining. A corpus (plural: 'corpora') is a collection of linguistic data. The proposed work is based on a corpus of reviews, specifically product reviews. The reviews are HTML files that are readily available on popular review sites such as Cnet.com. The revolution in information technology has opened a new era in the development of the language industries. The versatility of technological development, along with the availability of translations in different languages, has led to the use of such corpora for specific machine learning mechanisms as well as various automatic translation applications. The prime objective of researchers and naive users alike is a technique for rapidly developing machine learning systems that are both exact and effective. Creating an exact dataset is often tedious because accurate corpora for the research at hand are scarce. We therefore propose an algorithm for creating a corpus for the opinion mining research field.
  • Keywords
    data mining; hypermedia markup languages; learning (artificial intelligence); HTML files; machine learning mechanism; opinion mining; refine crude corpus; Computational intelligence; Corpus; Opinion Mining; crude corpus; language processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence, Communication Systems and Networks, 2009. CICSYN '09. First International Conference on
  • Conference_Location
    Indore
  • Print_ISBN
    978-0-7695-3743-6
  • Type

    conf

  • DOI
    10.1109/CICSYN.2009.12
  • Filename
    5231656