• DocumentCode
    1908164
  • Title

    An Empirical Evaluation of Dimensionality Reduction Using Latent Semantic Analysis on Hindi Text

  • Author

    Krishnamurthi, Karthik ; Sudi, Ravi Kumar ; Panuganti, Vijayapal Reddy ; Bulusu, Vishnu Vardhan

  • Author_Institution
    Dept. of IT, SNIST, Hyderabad, India
  • fYear
    2013
  • fDate
    17-19 Aug. 2013
  • Firstpage
    21
  • Lastpage
    24
  • Abstract
    Dimensionality reduction is the process of deriving an approximate representation of a dataset that reflects most of the correlations underlying the dataset. In the context of text processing, dimensionality reduction transforms a text into a precise representation that efficiently identifies the main insights of the original text. Latent Semantic Analysis (LSA) is a technique for finding correlations between words and sentences based on the usage of words within the text. This paper addresses dimensionality reduction in representing relevant data from Hindi text using LSA. An empirical evaluation is performed to determine the influence of language complexity and of various weighting schemes on dimensionality reduction. The results are presented using the standard measures of recall, precision, and F-score.
  • Keywords
    natural language processing; singular value decomposition; text analysis; Hindi text; LSA; dimensionality reduction; language complexity; latent semantic analysis; text processing; extractive summary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asian Language Processing (IALP), 2013 International Conference on
  • Conference_Location
    Urumqi
  • Type

    conf

  • DOI
    10.1109/IALP.2013.11
  • Filename
    6645994