DocumentCode :
2707929
Title :
Extracting spatial relations from document for geographic information retrieval
Author :
Yuan, Yecheng
Author_Institution :
State Key Lab. of Resources & Environ. Inf. Syst., CAS, Beijing, China
fYear :
2011
fDate :
24-26 June 2011
Firstpage :
1
Lastpage :
5
Abstract :
Geographic information retrieval (GIR) aims to retrieve geographical information from unstructured text (commonly web documents). Previous research has focused on applying traditional information retrieval (IR) techniques to GIR, such as ranking geographic relevance with the vector space model (VSM). In many cases, these keyword-based methods cannot support spatial queries very well. For example, when searching for documents on “debris flow took place in Hunan last year”, the documents selected in this way may only contain the words “debris flow” and “Hunan” rather than state that a debris flow actually occurred in Hunan. The lack of spatial relations between thematic activities (debris flow) and geographic entities (Hunan) is the key reason for this problem. In this paper, we present a kernel-based approach and apply it in a support vector machine (SVM) to extract spatial relations from free text for further GIS services and spatial reasoning. First, we analyze the characteristics of spatial relation expressions in natural language and identify two types of spatial relations: topology and direction. Both are used to qualitatively describe the positions of spatial objects relative to each other. Then we explore the use of the dependency tree (which represents the grammatical dependencies in a sentence and can be generated by a syntax parser) to identify these spatial relations. We observe that the features required to find a relationship between two spatial named entities in the same sentence are typically captured by the shortest path between the two entities in the dependency tree. Therefore, we construct a shortest path dependency kernel for the SVM to complete the task. The experimental results show that our dependency tree kernel achieves a significant improvement over the previous method.
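The shortest-path idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the kernel's formula, so the code uses a formulation common in the relation-extraction literature (the product, over aligned path positions, of the number of features two tokens share, and zero when the paths differ in length). The function names, the hand-written parse, and the feature sets are hypothetical and are not taken from the paper.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Breadth-first search over an undirected view of a dependency tree.

    edges is a list of (head, dependent) token pairs (hypothetical input format).
    Returns the tokens on the path from source to target, or [] if none exists.
    """
    neighbours = {}
    for head, dependent in edges:
        neighbours.setdefault(head, []).append(dependent)
        neighbours.setdefault(dependent, []).append(head)

    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return []


def path_kernel(path_a, path_b):
    """Assumed kernel: product of shared-feature counts at each path position.

    Each path is a list of feature sets, one per token. Paths of different
    length (or empty paths) score 0.
    """
    if not path_a or len(path_a) != len(path_b):
        return 0
    score = 1
    for feats_a, feats_b in zip(path_a, path_b):
        score *= len(feats_a & feats_b)
    return score


# Hand-written (hypothetical) parse of "The debris flow occurred in Hunan".
edges = [
    ("occurred", "flow"),
    ("flow", "debris"),
    ("flow", "The"),
    ("occurred", "in"),
    ("in", "Hunan"),
]
path = shortest_dependency_path(edges, "flow", "Hunan")

# Illustrative feature sets (word, part of speech, entity type) for two paths.
features_a = [
    {"flow", "NOUN", "EVENT"},
    {"occurred", "VERB"},
    {"in", "ADP"},
    {"Hunan", "PROPN", "LOCATION"},
]
features_b = [
    {"landslide", "NOUN", "EVENT"},
    {"happened", "VERB"},
    {"in", "ADP"},
    {"Sichuan", "PROPN", "LOCATION"},
]
similarity = path_kernel(features_a, features_b)
```

In this sketch the path between "flow" and "Hunan" passes through the verb and the preposition, which is where the spatial relation is expressed. A matrix of such pairwise similarities could, in principle, be passed to an SVM implementation that accepts precomputed kernels; whether the paper does this, and with which features, is not stated in the abstract.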
Keywords :
geographic information systems; inference mechanisms; information retrieval; natural language processing; support vector machines; text analysis; topology; trees (mathematics); GIS service; SVM; Web documents; dependency tree kernel; geographic information retrieval; natural language; shortest path dependency kernel; spatial reasoning; spatial relation expression; spatial relations; support vector machine; syntax parser; unstructured text; Data mining; Feature extraction; Kernel; Natural languages; Rivers; Support vector machines; Syntactics; Dependency tree; Geographic information retrieval; Kernel function; Spatial relations; Support vector machine;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Geoinformatics, 2011 19th International Conference on
Conference_Location :
Shanghai
ISSN :
2161-024X
Print_ISBN :
978-1-61284-849-5
Type :
conf
DOI :
10.1109/GeoInformatics.2011.5980797
Filename :
5980797