DocumentCode :
3533143
Title :
Clustering news groups using inverted index based NTSO
Author :
Jo, Taeho
Author_Institution :
Sch. of Comput. & Inf. Eng., Inha Univ., Incheon, South Korea
fYear :
2009
fDate :
28-31 July 2009
Firstpage :
1
Lastpage :
7
Abstract :
This research proposes NTSO (Neural Text Self Organizer) as an approach to text clustering, with an inverted index as the basis for its execution. Traditional approaches require documents to be encoded into numerical vectors, and this encoding causes two main problems: huge dimensionality and sparse distribution. This research proposes instead that documents be encoded into string vectors, as an alternative structured form to numerical vectors, and that NTSO be used as the approach to text clustering. By solving the two main problems, the proposed approach is expected to improve text clustering performance. By comparing the proposed approach with other approaches, we will validate its text clustering performance as a result of solving these problems.
Keywords :
neural nets; text analysis; word processing; inverted index based NTSO; neural text self organizer; news groups clustering; text clustering; Clustering algorithms; Costs; Encoding; Indexing; Kernel; Robustness; Support vector machines; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Networked Digital Technologies, 2009. NDT '09. First International Conference on
Conference_Location :
Ostrava
Print_ISBN :
978-1-4244-4614-8
Electronic_ISBN :
978-1-4244-4615-5
Type :
conf
DOI :
10.1109/NDT.2009.5272194
Filename :
5272194