DocumentCode
3533143
Title
Clustering news groups using inverted index based NTSO
Author
Jo, Taeho
Author_Institution
Sch. of Comput. & Inf. Eng., Inha Univ., Incheon, South Korea
fYear
2009
fDate
28-31 July 2009
Firstpage
1
Lastpage
7
Abstract
This research proposes NTSO (Neural Text Self Organizer) as an approach to text clustering, with an inverted index as the basis for its execution. Traditional approaches require documents to be encoded into numerical vectors, and this encoding causes two main problems: huge dimensionality and sparse distribution. This research proposes instead that documents be encoded into string vectors, an alternative structured form to numerical vectors, and that NTSO be used to cluster them. By solving the two main problems, the proposed approach is expected to improve text clustering performance. By comparing the proposed approach with other approaches, we will validate that solving these problems improves its text clustering performance.
Keywords
neural nets; text analysis; word processing; inverted index based NTSO; neural text self organizer; news groups clustering; text clustering; Clustering algorithms; Costs; Encoding; Indexing; Kernel; Robustness; Support vector machines; Text categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Networked Digital Technologies, 2009. NDT '09. First International Conference on
Conference_Location
Ostrava
Print_ISBN
978-1-4244-4614-8
Electronic_ISBN
978-1-4244-4615-5
Type
conf
DOI
10.1109/NDT.2009.5272194
Filename
5272194