DocumentCode :
2753807
Title :
A Novel Approach for Refinement of Corpus in the Field of Opinion Mining
Author :
Bhattacharyya, Debnath ; Das, Poulami ; Mitra, Kheyali ; Ganguly, Debashis ; Mukherjee, Swarnendu ; Bandyopadhyay, S.K. ; Kim, Tai-Hoon
Author_Institution :
Comput. Sci. & Eng. Dept., Heritage Inst. of Technol., Kolkata, India
fYear :
2009
fDate :
7-9 March 2009
Firstpage :
281
Lastpage :
285
Abstract :
In this paper, we present a heuristic approach, based on regular expressions, for refining a corpus, and discuss its possible applications in the field of Opinion Mining. The proposed work is based on a corpus of reviews. The crude corpus consists of the raw HTML files containing the reviews. Each HTML file is refined so that only the required part of the page is retained. The final output is a set of XML files that store precisely the important parts of the review pages, taken from the refined HTML. These XML files are then fed to subsequent language processing for machine learning in the field of Opinion Mining.
Keywords :
XML; data mining; hypermedia markup languages; learning (artificial intelligence); natural language processing; HTML files; XML files; corpus refinement; crude corpus; language processing; machine learning process; opinion mining field; review corpus; Application software; Computer science; Frequency; HTML; Humans; Learning systems; Machine learning; Natural languages; Speech; Corpus; regular expression
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Future Networks, 2009 International Conference on
Conference_Location :
Bangkok
Print_ISBN :
978-0-7695-3567-8
Type :
conf
DOI :
10.1109/ICFN.2009.24
Filename :
5189944