DocumentCode
3277395
Title
Refine Crude Corpus for Opinion Mining
Author
Bhattacharyya, Debnath ; Das, Poulami ; Mitra, Kheyali ; Mukherjee, Swarnendu ; Ganguly, Debashis ; Bandyopadhyay, Samir Kumar ; Kim, Tai-Hoon
Author_Institution
Comput. Sci. & Eng. Dept., Heritage Inst. of Technol., Kolkata, India
fYear
2009
fDate
23-25 July 2009
Firstpage
17
Lastpage
22
Abstract
This paper presents a heuristic approach, based on regular expressions, for refining a corpus, and discusses its possible applications in the field of opinion mining. A corpus (plural: 'corpora') is a collection of linguistic data. The proposed work is based on a corpus of reviews, specifically product reviews. The reviews are HTML files that are readily available on popular review sites such as Cnet.com. The revolution in information technology has opened a new era in the development of the language industries. The versatility of technological development, along with the availability of translations in different languages, has led to the use of such corpora in specific machine learning mechanisms as well as in various automatic translation applications. The prime objective of researchers and naive users alike, however, is a technique for quickly developing machine learning systems that are both exact and effective. Creating an exact dataset is often very tedious because accurate corpora for the research at hand are scarce. We therefore propose an algorithm for creating a corpus for opinion mining research.
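As a rough illustration of the kind of regular-expression refinement the abstract describes, the following Python sketch strips markup from a review page and keeps only the visible text. The patterns, the function name `refine_review_html`, and the sample page are assumptions made for illustration; the abstract does not give the paper's actual expressions or algorithm.

```python
import html
import re

# Hypothetical patterns: the abstract does not list the paper's own expressions.
SCRIPT_STYLE = re.compile(r"<(script|style)\b.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
TAG = re.compile(r"<[^>]+>")
WHITESPACE = re.compile(r"\s+")


def refine_review_html(raw_html: str) -> str:
    """Reduce a crude HTML review page to plain review text (heuristic)."""
    text = SCRIPT_STYLE.sub(" ", raw_html)  # drop script/style blocks with their contents
    text = COMMENT.sub(" ", text)           # drop HTML comments
    text = TAG.sub(" ", text)               # drop the remaining tags, keep their inner text
    text = html.unescape(text)              # decode entities such as &amp;
    return WHITESPACE.sub(" ", text).strip()  # collapse runs of whitespace


# Made-up sample page, not taken from any real review site.
sample = (
    "<html><head><style>p{color:red}</style></head><body><!-- ad -->"
    "<p>Great camera &amp; <b>poor</b> battery.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(refine_review_html(sample))
```

Regex-based stripping of this kind is a heuristic and can misfire on malformed markup, which is consistent with the heuristic framing in the abstract.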
Keywords
data mining; hypermedia markup languages; learning (artificial intelligence); HTML files; machine learning mechanism; opinion mining; refine crude corpus; Computational intelligence; Corpus; Opinion Mining; crude corpus; language processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence, Communication Systems and Networks, 2009. CICSYN '09. First International Conference on
Conference_Location
Indore
Print_ISBN
978-0-7695-3743-6
Type
conf
DOI
10.1109/CICSYN.2009.12
Filename
5231656