DocumentCode :
124206
Title :
A POI Categorization by Composition of Onomastic and Contextual Information
Author :
Su Jeong Choi ; Hyun Je Song ; Seong Bae Park ; Sang Jo Lee
Author_Institution :
Sch. of Comput. Sci. & Engineering, Kyungpook Nat. Univ., Daegu, South Korea
Volume :
2
fYear :
2014
fDate :
11-14 Aug. 2014
Firstpage :
38
Lastpage :
45
Abstract :
Point of interest (POI) categorization is the task of finding the categories of POIs mentioned in a document. Because documents that mention POIs contain clue words for identifying POI categories, the task can be solved as document classification. However, this approach misses two crucial factors for identifying the category of a POI. First, it pays no attention to onomastic information, even though POI names reveal much categorical information in many cases. Second, it ignores the fact that most clue words for identifying a POI category are located near the POI name. This paper proposes a novel method that incorporates both onomastic and local contextual information in POI categorization. The proposed method uses support vector machines (SVMs) to categorize POIs. To exploit the onomastic information of POIs, the proposed method adopts a string kernel that handles variations of POI names efficiently at the character level. The method also applies a Gaussian weighting to the content words in a document. By setting the mean of the Gaussian weighting at the position of a POI name, the method assigns higher weights to words near the POI name and lower weights to words far from it. These two types of information are then combined by a composite kernel of the string kernel and a linear kernel with the Gaussian weighting. A series of experiments shows that SVMs using the combined information outperform those using either type of information alone.
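The kernel composition described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact string kernel, the Gaussian variance, or the way the two kernels are mixed, so everything concrete below is an assumption: a normalized character n-gram (spectrum) kernel stands in for the paper's string kernel, and the names gaussian_weighted_counts, char_ngram_kernel, composite_kernel and the parameters sigma, n and alpha are hypothetical.

```python
import math
from collections import Counter


def gaussian_weighted_counts(tokens, poi_index, sigma=3.0):
    """Bag of words in which each token is weighted by a Gaussian
    centred on the position of the POI name (sigma is an assumed value)."""
    weights = Counter()
    for i, tok in enumerate(tokens):
        if i == poi_index:
            continue  # the POI name itself is handled by the name kernel
        weights[tok] += math.exp(-((i - poi_index) ** 2) / (2 * sigma ** 2))
    return weights


def linear_kernel(u, v):
    """Dot product of two sparse weighted bag-of-words vectors."""
    return sum(w * v[t] for t, w in u.items() if t in v)


def char_ngram_kernel(a, b, n=2):
    """Normalized character n-gram kernel; an assumed stand-in for the
    string kernel, not necessarily the one used in the paper."""
    ga = Counter(a[i:i + n] for i in range(len(a) - n + 1))
    gb = Counter(b[i:i + n] for i in range(len(b) - n + 1))
    dot = sum(c * gb[g] for g, c in ga.items())
    norm = math.sqrt(sum(c * c for c in ga.values())
                     * sum(c * c for c in gb.values()))
    return dot / norm if norm else 0.0


def composite_kernel(x, y, alpha=0.5, sigma=3.0):
    """Convex combination of the name kernel and the context kernel.
    Each example is (poi_name, tokens, poi_index); alpha is assumed."""
    name_x, tokens_x, pos_x = x
    name_y, tokens_y, pos_y = y
    k_name = char_ngram_kernel(name_x, name_y)
    k_ctx = linear_kernel(gaussian_weighted_counts(tokens_x, pos_x, sigma),
                          gaussian_weighted_counts(tokens_y, pos_y, sigma))
    return alpha * k_name + (1 - alpha) * k_ctx
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, a Gram matrix built with composite_kernel could be passed to an SVM that accepts precomputed kernels, for example scikit-learn's SVC with kernel="precomputed".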
Keywords :
Gaussian processes; document handling; pattern classification; support vector machines; Gaussian weighting; POI categorization; SVM; composite kernel; contextual information composition; document classification; linear kernel; onomastic information composition; point of interest categorization; string kernel; support vector machines; Context; Equations; Kernel; Marine animals; Support vector machines; Vectors; Contextual information; Kernel composition; Onomastic information; POI categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Warsaw
Type :
conf
DOI :
10.1109/WI-IAT.2014.78
Filename :
6927605