• DocumentCode
    3533143
  • Title
    Clustering news groups using inverted index based NTSO
  • Author
    Jo, Taeho
  • Author_Institution
    Sch. of Comput. & Inf. Eng., Inha Univ., Incheon, South Korea
  • fYear
    2009
  • fDate
    28-31 July 2009
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    This research proposes NTSO (neural text self organizer) as an approach to text clustering, with an inverted index as the basis for its execution. Traditional approaches require documents to be encoded into numerical vectors, and this encoding causes two main problems: huge dimensionality and sparse distribution. This research proposes instead that documents be encoded into string vectors, as an alternative structured form to numerical vectors, and that NTSO be used as the approach to text clustering. By solving these two problems, the proposed approach is expected to improve text clustering performance. By comparing the proposed approach with other approaches, we will validate that solving these problems improves its text clustering performance.
  • Keywords
    neural nets; text analysis; word processing; inverted index based NTSO; neural text self organizer; news groups clustering; text clustering; Clustering algorithms; Costs; Encoding; Indexing; Kernel; Robustness; Support vector machines; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networked Digital Technologies, 2009. NDT '09. First International Conference on
  • Conference_Location
    Ostrava
  • Print_ISBN
    978-1-4244-4614-8
  • Electronic_ISBN
    978-1-4244-4615-5
  • Type
    conf
  • DOI
    10.1109/NDT.2009.5272194
  • Filename
    5272194