DocumentCode :
3315194
Title :
Deep Web Data Source Classification Based on Query Interface Context
Author :
Cui, Zilu ; Fu, Yuchen
Author_Institution :
Sch. of Comput. Sci. & Technol., Soochow Univ., Suzhou, China
fYear :
2012
fDate :
17-19 Aug. 2012
Firstpage :
329
Lastpage :
332
Abstract :
To cope with the growing volume of information in the Deep Web, this paper presents a Deep Web data source classification algorithm based on query interface context. Two methods are combined to compute the similarity between search interfaces. The first is based on the vector space model and uses classical TF-IDF statistics. The second computes the semantic similarity of two pages using HowNet. A WDB classification algorithm is then built on the K-NN algorithm. Experimental results show that the algorithm generates high-quality clusters, as measured by both entropy and F-measure, which indicates its practical value.
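The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the two similarities are combined, so the weighted sum with parameter `alpha` is an assumption, and `semantic_stub` is a simple token-overlap placeholder standing in for a HowNet-based semantic similarity. All function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one TF-IDF weight dict per document (each doc is a token list)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = max(len(doc), 1)
        # Smoothed IDF keeps every weight strictly positive.
        vectors.append({
            t: (c / length) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_stub(a, b):
    """ASSUMPTION: Jaccard overlap used as a placeholder for HowNet similarity."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def knn_classify(query, labeled, k=3, alpha=0.5):
    """Label `query` by majority vote among its k most similar interfaces.

    `labeled` is a list of (tokens, label) pairs. `alpha` weights the TF-IDF
    score against the semantic score (an assumed combination rule).
    """
    docs = [tokens for tokens, _ in labeled] + [query]
    vectors = tfidf_vectors(docs)
    query_vec = vectors[-1]
    scored = []
    for (tokens, label), vec in zip(labeled, vectors):
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * semantic_stub(query, tokens)
        scored.append((score, label))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy query-interface vocabularies; purely illustrative data.
    labeled = [
        (["title", "author", "isbn", "publisher"], "books"),
        (["author", "title", "keyword", "isbn"], "books"),
        (["job", "title", "salary", "location"], "jobs"),
        (["company", "location", "salary", "keyword"], "jobs"),
    ]
    print(knn_classify(["isbn", "author", "price"], labeled, k=3))
```

In a fuller version, `semantic_stub` would be replaced by a word-level similarity derived from HowNet, and `alpha` and `k` would be tuned on held-out data.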
Keywords :
Internet; pattern classification; pattern matching; query processing; Deep Web data source classification; F-measure; HowNet; K-NN algorithm; WDB classification algorithm; classical TF-IDF statistics; high-quality clusters; pages semantic similarity; query interface context; search interface similarity; vector space; Catalogs; Classification algorithms; Databases; Entropy; Information systems; Semantics; Web pages; Deep Web; HowNet; K-NN algorithm; data source classification; semantic classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational and Information Sciences (ICCIS), 2012 Fourth International Conference on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4673-2406-9
Type :
conf
DOI :
10.1109/ICCIS.2012.117
Filename :
6300503