DocumentCode :
2307590
Title :
Textual noise analysis and removal for effective search engines
Author :
Jaber, Tareq ; Amira, Abbes ; Milligan, Peter
Author_Institution :
Fac. of Inf. Technol., Al-Ahliyya Amman Univ., Amman, Jordan
fYear :
2010
fDate :
5-6 July 2010
Firstpage :
129
Lastpage :
133
Abstract :
In the field of intelligent information retrieval (IR), latent semantic indexing (LSI) is a popular technique for retrieving information that is related more by meaning than by lexical matching. A core component of the process is the singular value decomposition (SVD), which is used to remove lexical noise from the term document matrix (TDM). The topic of mathematical modelling for noise reduction in LSI is important and demands attention. In this paper some observations on aspects of this topic are introduced. The work addresses a definition of noise in text processing and seeks to determine the best structure for the TDM, that is, the structure that would facilitate efficient searching within LSI.
Keywords :
information retrieval; pattern matching; search engines; singular value decomposition; text analysis; intelligent information retrieval; latent semantic indexing; lexical matching; lexical noise removal; term document matrix; textual noise removal; image processing; noise modelling and removal
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Visual Information Processing (EUVIP), 2010 2nd European Workshop on
Conference_Location :
Paris
Print_ISBN :
978-1-4244-7288-8
Type :
conf
DOI :
10.1109/EUVIP.2010.5699126
Filename :
5699126