• DocumentCode
    3300654
  • Title

    Dimensionality reduction for text using LLE

  • Author

    He, Chuan; Dong, Zhe; Li, Ruifan; Zhong, Yixin

  • Author_Institution
    Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., Beijing
  • fYear
    2008
  • fDate
    19-22 Oct. 2008
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Dimensionality reduction is a necessary preprocessing step in many fields of information processing, such as information retrieval, pattern recognition, and data compression. Its goal is to discover the representative or discriminative information residing in raw data. Locally linear embedding (LLE), one of the effective manifold learning algorithms, addresses this problem by computing low-dimensional, neighborhood-preserving embeddings of high-dimensional data. The embedding is derived from the symmetries of locally linear reconstructions, and, in the implementation, its computation reduces to an eigenproblem. Since LLE was proposed, it has been applied only to image data, because it originated in facial recognition. However, the curse of dimensionality is a very prevalent problem, so we apply the algorithm to text processing. In this paper, we briefly introduce LLE and analyze its advantages and latent disadvantages; the relationship between LSI and LLE in the graph embedding framework is then discussed from a theoretical point of view. Finally, experimental results are shown on the Reuters21578 and TDT2 datasets.
  • Keywords
    text analysis; data compression; dimensionality reduction; discriminative information; eigen-problem; facial recognition; graph embedding framework; image data; information processing; information retrieval; locally linear embedding; locally linear reconstruction; manifold learning algorithm; pattern recognition; raw data; text processing; Covariance matrix; Data compression; Embedded computing; Feature extraction; Image reconstruction; Information retrieval; Large scale integration; Pattern recognition; Principal component analysis; Text processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-4515-8
  • Electronic_ISBN
    978-1-4244-2780-2
  • Type

    conf

  • DOI
    10.1109/NLPKE.2008.4906771
  • Filename
    4906771