• DocumentCode
    1619541
  • Title
    The Design and Implementation of a Spider in Local Network
  • Author
    Meixia Qu ; Xiue Jiang ; Junfeng Luan ; Xingjian Ren
  • Author_Institution
    Sch. of Mech., Electr. & Inf. Eng., Shandong Univ. at Weihai, Weihai, China
  • fYear
    2012
  • Firstpage
    1849
  • Lastpage
    1851
  • Abstract
    This paper introduces the principles and methods of search engines and presents the design and implementation of a multi-threaded concurrent spider for a local network. The spider uses a BloomFilter to detect duplicate URLs and a thread pool to manage concurrent threads. It uses the IoC technique in Spring to support different file formats such as DOC, PDF, and XLS, which demonstrates the scalability of the whole application. The spider improves I/O performance by storing data in a lightweight database. The paper closes with a comparison and analysis of the local search engine and a general business search engine in terms of efficiency and performance.
  • Keywords
    Internet; data structures; document handling; input-output programs; multi-threading; search engines; software performance evaluation; storage management; BloomFilter; DOC; I/O performance; IoC technique; PDF; URL duplicate; XLS; concurrent thread management; data storage; file formats; general business search engine; local network; local search engine; multithread concurrent spider; thread pool; Educational institutions; Indexes; Instruction sets; Scalability; Search engines; Web pages; Local Network; Search engine; Spider
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Control and Electronics Engineering (ICICEE), 2012 International Conference on
  • Conference_Location
    Xi'an
  • Print_ISBN
    978-1-4673-1450-3
  • Type
    conf
  • DOI
    10.1109/ICICEE.2012.490
  • Filename
    6322780
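
The abstract names two techniques: a BloomFilter for detecting duplicate URLs and a thread pool for managing concurrent fetches. The following is a minimal sketch of how the two can fit together, not the authors' implementation. The class `BloomFilter`, the function `crawl`, the `LINKS` graph, the `http://intranet/...` URLs, and the sizing parameters are all hypothetical, and fetching is simulated by a dictionary lookup rather than network I/O.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class BloomFilter:
    """Fixed-size Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        # Sizing values are illustrative assumptions, not taken from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)
        self.lock = Lock()

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add_if_new(self, item):
        """Test and insert under a lock; return True if item was not seen before."""
        with self.lock:
            positions = list(self._positions(item))
            seen = all(self.bits[p // 8] & (1 << (p % 8)) for p in positions)
            for p in positions:
                self.bits[p // 8] |= 1 << (p % 8)
            return not seen


# Hypothetical in-memory link graph standing in for pages on a local network.
LINKS = {
    "http://intranet/a": ["http://intranet/b", "http://intranet/c"],
    "http://intranet/b": ["http://intranet/a", "http://intranet/c"],
    "http://intranet/c": ["http://intranet/a"],
}


def crawl(seed, workers=4):
    """Breadth-first crawl; each frontier level is fetched by a thread pool."""
    seen = BloomFilter()
    seen.add_if_new(seed)
    visited, frontier = [], [seed]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            visited.extend(frontier)
            # "Fetching" is a dictionary lookup here; a real spider would do I/O.
            results = pool.map(lambda url: LINKS.get(url, []), frontier)
            # Keep only URLs the filter has not recorded yet.
            frontier = [u for links in results for u in links if seen.add_if_new(u)]
    return visited
```

Calling `crawl("http://intranet/a")` visits each page of the toy graph once, even though the pages link back to one another. A Bloom filter trades a small false-positive rate (a new URL occasionally treated as already seen) for constant memory, which is why the bit-array size and hash count need tuning to the expected number of URLs.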