DocumentCode :
589169
Title :
Fine-grained Product Features Extraction and Categorization in Reviews Opinion Mining
Author :
Sheng Huang ; Xinlan Liu ; Xueping Peng ; Zhendong Niu
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
fYear :
2012
fDate :
10 Dec. 2012
Firstpage :
680
Lastpage :
686
Abstract :
With the growth of user-generated content on the Web, opinion mining of product reviews has become a research area of great value to e-commerce, search, and recommendation. However, popular items can attract hundreds or even thousands of reviews, which makes it laborious for potential buyers and manufacturers to read through them before making a decision. Moreover, the free format and uncertain expression of reviews make fine-grained product feature extraction and categorization more difficult than traditional information extraction tasks. In this work, we treat product feature extraction as a sequence labeling task and tackle it with a discriminative learning model based on Conditional Random Fields (CRFs). We incorporate part-of-speech features and sentence structure features into the CRF learning process. For product feature categorization, we introduce semantic knowledge-based and distributional context-based similarity measures to calculate the similarities between product feature expressions, and then propose a graph pruning based categorizing algorithm that classifies the collection of feature expressions into semantic groups. Empirical studies demonstrate the effectiveness and efficiency of our approaches compared with counterpart methods.
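Illustrative sketch (not taken from the paper): the two stages described in the abstract can be pictured roughly as below. Everything here is an assumption made for illustration: the B/I/O label scheme, the feature names in `token_features`, the use of Jaccard overlap as a stand-in for the distributional context-based similarity, the 0.5 threshold, and the reading of "graph pruning" as dropping low-similarity edges and taking connected components. The paper's actual feature templates, similarity measures, and pruning algorithm may differ.

```python
from itertools import combinations


def token_features(tokens, pos_tags, i):
    """Feature dict for token i, in the style commonly fed to CRF toolkits.

    The feature names are hypothetical; 'position' is only a crude stand-in
    for a sentence-structure feature.
    """
    return {
        "word": tokens[i].lower(),
        "pos": pos_tags[i],
        "prev_pos": pos_tags[i - 1] if i > 0 else "BOS",
        "next_pos": pos_tags[i + 1] if i < len(tokens) - 1 else "EOS",
        "position": i / len(tokens),
    }


def decode_bio(tokens, labels):
    """Collect product-feature expressions from per-token B/I/O labels."""
    spans, current = [], []
    for tok, lab in zip(tokens, labels):
        if lab == "B":
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif lab == "I" and current:
            current.append(tok)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


def jaccard(a, b):
    """Overlap of two context-word collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def group_expressions(contexts, threshold=0.5):
    """Keep only edges with similarity >= threshold, then return the
    connected components of the pruned graph as groups."""
    names = list(contexts)
    adj = {n: set() for n in names}
    for u, v in combinations(names, 2):
        if jaccard(contexts[u], contexts[v]) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    groups, seen = [], set()
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(component))
    return groups


if __name__ == "__main__":
    tokens = ["The", "battery", "life", "is", "great"]
    pos = ["DT", "NN", "NN", "VBZ", "JJ"]
    print(token_features(tokens, pos, 1))
    print(decode_bio(tokens, ["O", "B", "I", "O", "O"]))
    contexts = {
        "battery": ["life", "charge", "lasts"],
        "battery life": ["charge", "lasts", "hours"],
        "lens": ["zoom", "sharp", "focus"],
    }
    print(group_expressions(contexts))
```

In a real pipeline the labels passed to `decode_bio` would come from a trained CRF rather than being written by hand, and the similarity would combine a knowledge-based measure with the context-based one.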
Keywords :
data mining; decision making; electronic commerce; feature extraction; graph theory; information retrieval; knowledge based systems; learning (artificial intelligence); random processes; CRF learning process; conditional random fields; decision making; discriminative learning model; distributional context-based similarity measures; e-commerce; fine-grained product feature categorization; fine-grained product feature extraction; graph pruning based categorizing algorithm; part-of-speech features; product feature expression collection classification; product review opinion mining; semantic groups; semantic knowledge-based similarity measures; sentence structure features; sequence labeling task; user-generated contents; Batteries; Context; Entropy; Feature extraction; Lenses; Semantics; Syntactics; conditional random fields; extraction and categorization; product features; similarity calculation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on
Conference_Location :
Brussels
Print_ISBN :
978-1-4673-5164-5
Type :
conf
DOI :
10.1109/ICDMW.2012.53
Filename :
6406505