• DocumentCode
    3403991
  • Title

    An Incremental Crawler for Web Video Based on Content Longevity

  • Author

    Feng Lu ; Zaiyang Tang ; Xiaofei Liao ; Hai Jin

  • Author_Institution
    Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • fYear
    2013
  • fDate
    22-23 Aug. 2013
  • Firstpage
    98
  • Lastpage
    102
  • Abstract
    The explosive growth of online videos is crucial to the development of video search engines. Search engines use crawlers to retrieve pages and then discover new ones by extracting the pages' outgoing links. However, the ephemeral and persistent content that web crawlers distinguish on ordinary pages also exists on online video pages, where it is rarely noticed by video search engines. Based on this observation, we characterize the longevity of content found on video pages and develop an incremental crawler. As part of the crawling policy, we give a practical method for estimating the utility threshold. As we show via experiments over real web data, our refresh policy obtains better freshness at lower cost than previous approaches.
  • Keywords
    data mining; search engines; video retrieval; Web data; Web video; content longevity; crawling policy; ephemeral contents; incremental crawler; online video pages; page outgoing link extraction; page retrieval; persistent contents; refresh policy; video search engines; Bandwidth; Crawlers; Educational institutions; Fingerprint recognition; Search engines; Web pages; Incremental crawling; content longevity; web video;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    ChinaGrid Annual Conference (ChinaGrid), 2013 8th
  • Conference_Location
    Changchun
  • Print_ISBN
    978-0-7695-5058-9
  • Type

    conf

  • DOI
    10.1109/ChinaGrid.2013.16
  • Filename
    6623874