Title :
Generalizing Latent Semantic Analysis
Author :
Olney, Andrew M.
Author_Institution :
Inst. for Intell. Syst., Univ. of Memphis, Memphis, TN, USA
Abstract :
Latent semantic analysis (LSA) is a vector space technique for representing word meaning. Traditionally, LSA consists of two steps: the formation of a word-by-document matrix, followed by singular value decomposition of that matrix. However, forming the matrix along the dimensions of words and documents is somewhat arbitrary. This paper attempts to reconceptualize LSA in more general terms by characterizing the matrix as a feature-by-context matrix rather than a word-by-document matrix. Examples of generalized LSA using n-grams and local context are presented and compared with traditional LSA on paraphrase comparison tasks.
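The two-step pipeline the abstract describes can be sketched in a few lines of numpy. This is a minimal illustration only, not the paper's implementation: the toy vocabulary, counts, and the choice of unigram rows are hypothetical, and in the generalized setting the rows could instead be n-grams and the columns local-context windows rather than documents.

```python
import numpy as np

def lsa_space(matrix, k):
    """Step 2 of LSA: truncated SVD projecting feature-by-context
    counts into a k-dimensional latent space."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] * s[:k]  # one latent vector per feature (row)

def cosine(a, b):
    """Similarity measure used for paraphrase comparison."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Step 1: a toy feature-by-context count matrix (hypothetical data).
# Rows are features (unigrams here; n-grams in the generalized scheme),
# columns are contexts (documents in traditional LSA).
vocab = ["cat", "dog", "pet", "stock", "bond", "market"]
counts = np.array([
    [2, 1, 0, 0],   # cat
    [1, 2, 0, 0],   # dog
    [2, 2, 1, 0],   # pet
    [0, 0, 2, 1],   # stock
    [0, 0, 1, 2],   # bond
    [0, 1, 2, 2],   # market
], dtype=float)

vecs = dict(zip(vocab, lsa_space(counts, k=2)))

# Paraphrase comparison: represent each phrase as the sum of its
# word vectors, then compare candidates by cosine similarity.
s1 = vecs["cat"] + vecs["pet"]
s2 = vecs["dog"] + vecs["pet"]
s3 = vecs["stock"] + vecs["market"]
print(cosine(s1, s2))  # near-paraphrases: high similarity
print(cosine(s1, s3))  # unrelated phrases: lower similarity
```

Swapping the row definition from unigrams to n-grams, or the column definition from documents to local-context windows, changes only how `counts` is built; the SVD and comparison steps are unchanged, which is the sense in which the matrix formation generalizes.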
Keywords :
matrix algebra; natural language processing; singular value decomposition; text analysis; document matrix; latent semantic analysis; n-gram; vector space technique; Dictionaries; Frequency; Functional analysis; Information retrieval; Intelligent systems; Least squares approximation; Matrix decomposition; Sparse matrices; paraphrase; vector space
Conference_Title :
2009 IEEE International Conference on Semantic Computing (ICSC '09)
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-4962-0
Electronic_ISBN :
978-0-7695-3800-6
DOI :
10.1109/ICSC.2009.89