Title :
Concept-based clustering of textual documents using SOM
Author :
Amine, Abdelmalek ; Elberrichi, Zakaria ; Bellatreche, Ladjel ; Simonet, Michel ; Malki, Mimoun
Author_Institution :
Djillali Liabes Univ., Sidi Bel Abbes
Date :
March 31 2008-April 4 2008
Abstract :
The classification of textual documents has been widely studied. Most classification approaches use supervised learning methods, which work for relatively small corpora, where experts can produce representative training sets, but are not feasible for large flows of data. Unsupervised classification methods discover latent (hidden) classes automatically while minimizing human intervention. Many such methods exist, among them Kohonen self-organizing maps (SOMs), which group similar objects without prior information. In this paper, we evaluate and compare the use of SOMs for the classification of textual documents under two representations: a conceptual representation of texts and a representation based on n-grams.
Keywords :
pattern classification; pattern clustering; self-organising feature maps; text analysis; unsupervised learning; Kohonen self-organizing maps; SOM; concept-based clustering; textual documents; unsupervised classification methods; Clustering algorithms; Computer science; Humans; Internet; Laboratories; Learning systems; Self organizing feature maps; Software libraries; Supervised learning; Unsupervised learning;
Conference_Title :
Computer Systems and Applications, 2008. AICCSA 2008. IEEE/ACS International Conference on
Conference_Location :
Doha
Print_ISBN :
978-1-4244-1967-8
Electronic_ISBN :
978-1-4244-1968-5
DOI :
10.1109/AICCSA.2008.4493530