• DocumentCode
    2927197
  • Title

    Distributed Web2.0 crawling for ontology evolution

  • Author

    Juffinger, Andreas ; Neidha, Thomas ; Weichselbraun, Albert ; Wohlgenannt, Gerhard ; Granitzer, Michael ; Kern, Roman ; Scharl, Arno

  • Author_Institution
    Graz University of Technology, 8010, Austria
  • Volume
    2
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    615
  • Lastpage
    620
  • Abstract
    Semantic Web technologies in general and ontology-based approaches in particular are considered the foundation for the next generation of information services. While ontologies enable software agents to exchange knowledge and information in a standardised, intelligent manner, describing today's vast amount of information in terms of ontological knowledge and tracking the evolution of such ontologies remain a challenge. In this paper we describe Web2.0 crawling for ontology evolution. The World Wide Web, or Web for short, is, due to its evolutionary properties and social network characteristics, a perfectly suited data source for evolving an ontology. The decentralised structure of the Internet, the huge amount of data and upcoming Web2.0 technologies raise several challenges for a crawling system. In this paper we present a distributed crawling system with standard browser integration. The proposed system is a high-performance, sitescript-based, noise-reducing crawler, which loads standard-browser-equivalent content from Web2.0 resources. Furthermore, we describe the integration of this spider into our ontology evolution framework.
  • Keywords
    Crawlers; Humans; Information retrieval; Intelligent agent; Internet; Ontologies; Semantic Web; Software agents; Text mining; Web sites
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information Management, 2007. ICDIM '07. 2nd International Conference on
  • Conference_Location
    Lyon, France
  • Print_ISBN
    978-1-4244-1475-8
  • Electronic_ISBN
    978-1-4244-1476-5
  • Type

    conf

  • DOI
    10.1109/ICDIM.2007.4444293
  • Filename
    4444293