DocumentCode
480701
Title
A Wavelet-Based Model to Recognize High-Quality Topics on Web Forum
Author
Chen, You ; Cheng, Xue-Qi ; Huang, Yu-Lan
Author_Institution
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing
Volume
1
fYear
2008
fDate
9-12 Dec. 2008
Firstpage
343
Lastpage
351
Abstract
Web forums have become an important resource on the Web because of the rich information contributed by millions of Internet users every day. At the same time, they contain thousands of junk or valueless messages. Recognizing high-quality topics should be a fundamental task in search engines and Web mining systems. However, quantifying what makes a topic high-quality is not trivial, and users face a daunting challenge in identifying the small subset of topics worthy of their attention. In this paper, we present several characteristics for measuring high-quality topics and, based on these characteristics, propose a novel model to recognize high-quality topics on Web forums. Our model consists of three steps. First, time series signals that capture characteristics distinguishing high-quality topics from non-high-quality topics are extracted from the topics. Second, features are obtained from the signals using the wavelet packet transform (WPT). Third, based on these features, high-quality topics are recognized using a backpropagation neural network. In experiments on Tencent Message Boards, with 2,710,994 messages and 189,962 authors from Jan 1, 2005 to Nov 12, 2007, we demonstrate the efficiency of our model: the average accuracy of high-quality topic recognition is 95%, and nearly 50,000 topics can be recognized in one second.
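The second step of the model (features from signals via the wavelet packet transform) can be illustrated with a minimal sketch. The abstract does not say which wavelet, decomposition depth, or feature definition the authors use, so everything here is an assumption: a Haar wavelet, a two-level decomposition, and the energy of each leaf node as the feature. The `replies_per_day` signal is hypothetical data, not taken from the paper.

```python
import math


def haar_step(signal):
    """Split an even-length signal into Haar approximation and detail halves."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_leaves(signal, levels):
    """Build a full wavelet packet tree and return its leaf nodes.

    Unlike the plain wavelet transform, every node is split at each level,
    detail nodes included, so the result has 2**levels leaves.
    Leaves are returned in natural (not frequency) order.
    """
    if levels < 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def energy_features(signal, levels=2):
    """Assumed feature definition: sum of squared coefficients per leaf node."""
    return [sum(c * c for c in leaf) for leaf in wavelet_packet_leaves(signal, levels)]


# Hypothetical time series for one topic: number of replies per day.
replies_per_day = [0, 3, 5, 8, 6, 4, 1, 0]
features = energy_features(replies_per_day, levels=2)
print(features)
```

Because the Haar transform is orthonormal, the leaf energies sum to the energy of the original signal, which gives a quick sanity check. In the third step, a feature vector like this would be the input to a classifier such as a backpropagation neural network.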
Keywords
Internet; backpropagation; data mining; neural nets; search engines; social sciences computing; Internet users; Tencent Message Boards; Web forum; Web mining systems; backpropagation neural network; search engine; wavelet-based model; Character recognition; Computers; Data mining; Discussion forums; Intelligent agent; Neural networks; Search engines; Wavelet packets; Wavelet transforms; Web mining; Feature Extraction; High-Quality topic; Wavelet Packet Transform; Web Forum;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location
Sydney, NSW
Print_ISBN
978-0-7695-3496-1
Type
conf
DOI
10.1109/WIIAT.2008.17
Filename
4740470