DocumentCode :
3777326
Title :
Research of integrated algorithm establishment of a spam detection system
Author :
Ruxi Yin; Hanshi Wang; Lizhen Liu
Author_Institution :
Information and Engineering College, Capital Normal University, Beijing, China
Volume :
1
fYear :
2015
Firstpage :
584
Lastpage :
589
Abstract :
Nowadays, more and more people are engaged in the construction of the Internet, consciously or not, by posting their individual comments on it. In today's big data era, mining customers' opinions has become one of the most effective ways to make full use of this great amount of information. Opinion mining, a new branch of unstructured information mining, mainly involves sentiment analysis, feature mining, and subjective comment recognition. It is also an important part of knowledge discovery, often used to extract hidden information from unstructured or semi-structured data. Among the key algorithms for opinion mining and integration, an opinion integration algorithm is a computational method that ignores the non-significant internal parts of the comments. That is, it skips the minor issues in users' comments, focuses on the useful information, and then sums up valuable conclusions for practical application. Research on opinion integration algorithms consists of four parts, namely opinion spam detection, opinion summarization, opinion visualization, and opinion assessment. This paper focuses on opinion spam detection methods. Spam refers to fake user reviews, that is, well-designed fake comments written by an individual or an organization to promote or damage a specific product. Therefore, identifying spam comments is an important task for improving the authenticity and accuracy of opinion mining. We regard this task as a classification problem. Using web crawlers, a segmentation system, and manual labeling, we acquired a large number of online comments. By training on these data and selecting the relevant features, we finally build a classifier. The experimental results show that the methods provided herein can achieve preliminary comment spam detection.
Keywords :
"Classification algorithms","Feature extraction","Decision trees","Algorithm design and analysis","Data mining","Training","Prediction algorithms"
Publisher :
IEEE
Conference_Title :
Computer Science and Network Technology (ICCSNT), 2015 4th International Conference on
Type :
conf
DOI :
10.1109/ICCSNT.2015.7490814
Filename :
7490814