• DocumentCode
    480701
  • Title
    A Wavelet-Based Model to Recognize High-Quality Topics on Web Forum
  • Author
    Chen, You; Cheng, Xue-Qi; Huang, Yu-Lan
  • Author_Institution
    Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing
  • Volume
    1
  • fYear
    2008
  • fDate
    9-12 Dec. 2008
  • Firstpage
    343
  • Lastpage
    351
  • Abstract
    Web forums have become an important resource on the Web because of the rich information contributed by millions of Internet users every day. At the same time, they contain thousands of junk or valueless messages. Recognizing high-quality topics is a fundamental task for search engines and Web mining systems. However, quantifying topic quality on Web forums is not trivial, and users face a daunting challenge in identifying the small subset of topics worthy of their attention. In this paper, we present several characteristics for measuring high-quality topics and, based on these characteristics, propose a novel model to recognize high-quality topics on Web forums. Our model consists of three steps. First, time series signals that carry characteristics distinguishing high-quality topics from non-high-quality topics are extracted from the topics. Second, features are obtained from the signals using the wavelet packet transform (WPT). Third, high-quality topics are recognized from these features using a backpropagation neural network. In experiments on Tencent Message Boards, covering 2,710,994 messages from 189,962 authors posted between Jan 1, 2005 and Nov 12, 2007, we demonstrate the efficiency of our model: the average accuracy of high-quality topic recognition is 95%, and nearly 50,000 topics can be recognized in one second.
  • Keywords
    Internet; backpropagation; data mining; neural nets; search engines; social sciences computing; Internet users; Tencent Message Boards; Web forum; Web mining systems; backpropagation neural network; search engine; wavelet-based model; Character recognition; Computers; Data mining; Discussion forums; Intelligent agent; Neural networks; Search engines; Wavelet packets; Wavelet transforms; Web mining; Feature Extraction; High-Quality topic; Wavelet Packet Transform; Web Forum;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3496-1
  • Type
    conf
  • DOI
    10.1109/WIIAT.2008.17
  • Filename
    4740470
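
The second step named in the abstract, obtaining features from a time series with a wavelet packet transform, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the decomposition depth, the use of normalized sub-band energies as features, and the sample signal `replies_per_day`. The signal-extraction and backpropagation neural network steps are not shown.

```python
import math


def haar_step(signal):
    """Split an even-length signal into Haar approximation and detail halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Return the energy of each leaf of a full Haar wavelet packet tree.

    Unlike the plain wavelet transform, both the approximation and the
    detail branch are split again at every level. Leaves are returned in
    natural (tree) order, not frequency order.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]


def normalized_features(signal, levels):
    """Scale leaf energies to sum to 1 (assumed feature definition)."""
    energies = wavelet_packet_energies(signal, levels)
    total = sum(energies)
    return [e / total if total > 0 else 0.0 for e in energies]


# Hypothetical time series for one topic: number of replies per day.
replies_per_day = [3, 5, 8, 13, 9, 6, 4, 2]
features = normalized_features(replies_per_day, levels=2)
print(features)
```

A depth of 2 yields four features per topic. A vector of this kind could then be passed to a classifier, such as the backpropagation network the abstract mentions.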