DocumentCode :
2484243
Title :
Clustering of short commercial documents for the web
Author :
Carullo, Moreno ; Binaghi, Elisabetta ; Gallo, Ignazio ; Lamberti, Nicola
Author_Institution :
Dipt. di Inf. e Comun., Univ. degli Studi dell'Insubria, Varese
fYear :
2008
fDate :
8-11 Dec. 2008
Firstpage :
1
Lastpage :
4
Abstract :
Document clustering techniques have been applied in several areas, with the Web as one of the most recent and influential. Both general-purpose and text-oriented techniques exist and can be used to cluster a collection of documents in many ways. In this work we propose an online, single-pass document clustering model that can be combined with a variety of text-oriented similarity measures. An experimental evaluation of the proposed model was conducted in the e-commerce domain. Performance was measured using a clustering-oriented metric based on the F-measure and compared with that obtained by other well-known approaches.
Keywords :
Internet; pattern clustering; text analysis; Web search; general-purpose technique; single-pass document clustering; text-oriented technique; Algorithm design and analysis; Clustering algorithms; Clustering methods; Electronic commerce; Encoding; Particle measurements; Performance evaluation
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
Conference_Location :
Tampa, FL
ISSN :
1051-4651
Print_ISBN :
978-1-4244-2174-9
Electronic_ISBN :
1051-4651
Type :
conf
DOI :
10.1109/ICPR.2008.4761554
Filename :
4761554