DocumentCode
2484243
Title
Clustering of short commercial documents for the web
Author
Carullo, Moreno ; Binaghi, Elisabetta ; Gallo, Ignazio ; Lamberti, Nicola
Author_Institution
Dipt. di Inf. e Comun., Univ. degli Studi dell'Insubria, Varese
fYear
2008
fDate
8-11 Dec. 2008
Firstpage
1
Lastpage
4
Abstract
Document clustering techniques have been applied in several areas, with the Web among the most recent and influential. Both general-purpose and text-oriented techniques exist and can be used to cluster a collection of documents in many ways. In this work we propose an online, single-pass document clustering model that can be combined with a variety of text-oriented similarity measures. An experimental evaluation of the proposed model was conducted in the e-commerce domain. Performance was measured using a clustering-oriented metric based on the F-measure and compared with that of other well-known approaches.
Keywords
Internet; pattern clustering; text analysis; Web search; general-purpose technique; single-pass document clustering; text-oriented technique; Algorithm design and analysis; Clustering algorithms; Clustering methods; Electronic commerce; Encoding; Particle measurements; Performance evaluation
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
Conference_Location
Tampa, FL
ISSN
1051-4651
Print_ISBN
978-1-4244-2174-9
Electronic_ISBN
1051-4651
Type
conf
DOI
10.1109/ICPR.2008.4761554
Filename
4761554