• DocumentCode
    1679733
  • Title

    Research and Implementation of Web Structure-Based News Gathering System

  • Author

    Chen, Jianguo ; Lu, Minrong ; Ke, XiaoYu

  • Author_Institution
    Software Sch., Hunan Univ., Changsha, China
  • fYear
    2011
  • Firstpage
    502
  • Lastpage
    505
  • Abstract
    Building on an in-depth study of web information gathering technology, this paper proposes a web structure-based news gathering model. The model first loads the gathering entry address and locates the news list page using the Information Gathering and Filter Algorithm. It then automatically identifies and refines the link addresses of news content pages according to the configured gathering rules, combined with regular expressions. Next, it loads each target page (the news content page) and automatically gathers the news information with the algorithm. At the same time, it can filter out any information on that page that has been configured for removal, such as embedded advertising messages. Practical results show that the proposed model works well and can gather news information efficiently and automatically.
  • Keywords
    Internet; information filtering; Web structure based news gathering system; filter algorithm; information gathering; news content page; Algorithm design and analysis; Data mining; Educational institutions; Filtering algorithms; Information filters; Web pages; News Gathering; Regular Expressions; Web Gathering; Web Structure;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Future Computer Science and Education (ICFCSE), 2011 International Conference on
  • Conference_Location
    Xi'an
  • Print_ISBN
    978-1-4577-1562-4
  • Type

    conf

  • DOI
    10.1109/ICFCSE.2011.128
  • Filename
    6041745
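
The gathering steps described in the abstract (find content-page links on a list page with regular expressions, then extract the news text while filtering embedded advertising) can be illustrated with a minimal Python sketch. This is not the authors' Information Gathering and Filter Algorithm, which the record does not specify. The rule patterns `LINK_RULE` and `AD_RULE`, the function names, and the example URLs are all hypothetical, and treating link "improvement" as resolving relative links to absolute ones is an assumption. The sketch works on HTML strings only, so page loading is left out.

```python
import re
from urllib.parse import urljoin

# Hypothetical gathering rule: content pages look like .../news/<digits>.html
LINK_RULE = re.compile(r'href="([^"]*/news/\d+\.html)"')
# Hypothetical filter rule: drop any <div class="ad">...</div> block
AD_RULE = re.compile(r'<div class="ad">.*?</div>', re.DOTALL)
# Matches any remaining HTML tag so it can be stripped from the text
TAG_RULE = re.compile(r"<[^>]+>")


def extract_content_links(list_html, base_url):
    """Return absolute, de-duplicated content-page URLs found in a list page."""
    seen = set()
    links = []
    for href in LINK_RULE.findall(list_html):
        # Assumption: "improving" a link means resolving it against the page URL
        url = urljoin(base_url, href)
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


def extract_news_text(content_html):
    """Remove configured ad blocks, strip tags, and collapse whitespace."""
    without_ads = AD_RULE.sub("", content_html)
    text = TAG_RULE.sub(" ", without_ads)
    return " ".join(text.split())
```

In use, `extract_content_links` would be applied to the HTML of a news list page, and `extract_news_text` to each content page fetched from the returned URLs. A real system would likely load its rules per site rather than hard-coding them, and an HTML parser would be more robust than regular expressions for nested markup.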