• DocumentCode
    589176
  • Title
    Toward Geographic Information Harvesting: Extraction of Spatial Relational Facts from Web Documents
  • Author
    Loglisci, C.; Ienco, D.; Roche, M.; Teisseire, M.; Malerba, Donato
  • Author_Institution
    Dipt. di Inf., Univ. degli Studi di Bari “Aldo Moro”, Bari, Italy
  • fYear
    2012
  • fDate
    10 Dec. 2012
  • Firstpage
    789
  • Lastpage
    796
  • Abstract
    This paper addresses the problem of harvesting geographic information from Web documents, specifically, extracting facts on spatial relations among geographic places. The motivation is twofold. First, researchers in spatial data mining often assume that spatial data are already available, thanks to current GIS and positioning technologies. This assumption does not hold, however, for spatial information embedded in data that lack explicit spatial modeling, such as documents. Second, despite the huge number of Web documents conveying useful geographic information, little work exists on how to harvest spatial data from them. The problem is particularly challenging because annotated documents are scarce, which prevents the application of supervised learning techniques. In this paper, we propose to harvest facts on geographic places through an unsupervised approach that recognizes spatial relations among geographic places without assuming the availability of annotated documents. The proposed approach combines a spatial ontology with a prototype-based classifier. A case study on topological and directional relations is reported and discussed.
  • Keywords
    Internet; data mining; document handling; geographic information systems; information retrieval; ontologies (artificial intelligence); pattern classification; unsupervised learning; GIS technologies; Web document annotation; directional relations; geographic information harvesting; geographic places; positioning technologies; prototype-based classifier; spatial data mining; spatial ontology; spatial relation recognition; spatial relational facts extraction; supervised learning techniques; topological relations; unsupervised approach; Data mining; Geographic information systems; Natural languages; Ontologies; Prototypes; Spatial databases; Web pages; Geo-spatial Intelligence; Geographic Documents; Relation Extraction; Spatial Relations
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on
  • Conference_Location
    Brussels
  • Print_ISBN
    978-1-4673-5164-5
  • Type
    conf
  • DOI
    10.1109/ICDMW.2012.20
  • Filename
    6406520