DocumentCode :
3299052
Title :
Concept-based clustering of textual documents using SOM
Author :
Amine, Abdelmalek ; Elberrichi, Zakaria ; Bellatreche, Ladjel ; Simonet, Michel ; Malki, Mimoun
Author_Institution :
Djillali Liabes Univ., Sidi Bel Abbes
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
156
Lastpage :
163
Abstract :
The classification of textual documents has been widely studied. The majority of classification approaches use supervised learning methods, which are acceptable for rather small corpora that allow experts to generate representative sets of training data, but are not feasible for significant flows of data. Unsupervised classification methods discover latent (hidden) classes automatically while minimizing human intervention. Many such methods exist, among them Kohonen self-organizing maps (SOM), which group similar objects together without prior information. In this paper, we evaluate and compare the use of SOMs for the classification of textual documents in two situations: a conceptual representation of texts and a representation based on n-grams.
Keywords :
pattern classification; pattern clustering; self-organising feature maps; text analysis; unsupervised learning; Kohonen self-organizing maps; SOM; concept-based clustering; textual documents; unsupervised classification methods; Clustering algorithms; Computer science; Humans; Internet; Laboratories; Learning systems; Self organizing feature maps; Software libraries; Supervised learning; Unsupervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Systems and Applications, 2008. AICCSA 2008. IEEE/ACS International Conference on
Conference_Location :
Doha
Print_ISBN :
978-1-4244-1967-8
Electronic_ISBN :
978-1-4244-1968-5
Type :
conf
DOI :
10.1109/AICCSA.2008.4493530
Filename :
4493530