• DocumentCode
    2992968
  • Title
    Design and Implementation of Basic Educational Web Resources Gathering System
  • Author
    Chaojun, Xu
  • Author_Institution
    Data Min. Lab., Nanjing Normal Univ., Nanjing, China
  • fYear
    2011
  • fDate
    24-28 Sept. 2011
  • Firstpage
    51
  • Lastpage
    55
  • Abstract
    This paper introduces a topic-specific web crawling system that gathers basic educational resources from the web and indexes them for basic education users. Unlike other theme-based crawling systems, the crawler integrates a fuzzy rule based algorithm with vector space model (VSM) text analysis to predict each URL's relevance to basic education while parsing the HTML code of the currently downloaded page. As a result, the system does not need to save and retrieve low-relevance URLs, which greatly improves its overall efficiency.
  • Keywords
    Internet; educational computing; fuzzy reasoning; hypermedia markup languages; indexing; relevance feedback; search engines; text analysis; HTML code; VSM text analysis technology; Web crawling system; basic educational Web resources gathering system design; fuzzy rule based algorithm; indexing; low relevant URL retrieval; Accuracy; Cognition; Crawlers; Educational institutions; Internet; Mathematical model; Basic educational resources; Fuzzy rule reasoning; Topic specific crawling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Complexity and Data Mining (IWCDM), 2011 First International Workshop on
  • Conference_Location
    Nanjing, Jiangsu
  • Print_ISBN
    978-1-4577-2007-9
  • Type
    conf
  • DOI
    10.1109/IWCDM.2011.20
  • Filename
    6128416