• DocumentCode
    3461683
  • Title
    Building domain-specific Web collections for scientific digital libraries: a meta-search enhanced focused crawling method
  • Author
    Qin, Jialun ; Zhou, Yilu ; Chau, Michael
  • Author_Institution
    Dept. of Manage. Inf. Syst., Arizona Univ., Tucson, AZ, USA
  • fYear
    2004
  • fDate
    7-11 June 2004
  • Firstpage
    135
  • Lastpage
    141
  • Abstract
    Collecting domain-specific documents from the Web using focused crawlers has been considered one of the most important strategies for building digital libraries that serve the scientific community. However, because most focused crawlers use local search algorithms to traverse the Web space, they can easily be trapped within a limited sub-graph of the Web that surrounds the starting URLs, and so build domain-specific collections that are not comprehensive and diverse enough for scientists and researchers. We investigated the problems that local search algorithms cause for traditional focused crawlers and proposed a new crawling approach, meta-search enhanced focused crawling, to address them. We conducted two user evaluation experiments to examine the performance of our proposed approach, and the results showed that it could build domain-specific collections of higher quality than traditional focused crawling techniques.
  • Keywords
    Internet; digital libraries; meta data; search engines; URL; crawling method; domain-specific Web collection; local search algorithm; meta-search; scientific digital library; Algorithm design and analysis; Crawlers; Information retrieval; Management information systems; Metasearch; Search engines; Software libraries; Uniform resource locators; Web pages; Web search
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Libraries, 2004. Proceedings of the 2004 Joint ACM/IEEE Conference on
  • Print_ISBN
    1-58113-832-6
  • Type
    conf
  • DOI
    10.1109/JCDL.2004.1336110
  • Filename
    1336110