DocumentCode :
3286790
Title :
Comparison Probabilistic Latent Semantic Indexing Model In Chinese Information Retrieval
Author :
Fang, Xie ; Xiaoguang, Liu ; Quan, Hu
Author_Institution :
Coll. of Comput. Sci., Hubei Univ. of Technol., Wuhan, China
Volume :
3
fYear :
2009
fDate :
15-17 May 2009
Firstpage :
559
Lastpage :
562
Abstract :
With the growth of information on the Internet, Web mining has become a focus of information retrieval. Web clustering groups similar Web documents according to a chosen similarity metric. However, classical clustering algorithms search the solution space without direction and lack semantic features. In this paper, probabilistic latent semantic indexing (PLSI) models that use word segmentation, two-grams, and keyword extraction, respectively, are compared. For comparison, vector space models using the different Chinese information retrieval techniques are tested at the same time. The experimental results show that correct word segmentation clearly improves retrieval precision for the PLSI model, but it is not effective for the vector space model. An index based on keyword extraction obtains the highest accuracy for the PLSI model.
Keywords :
Internet; data mining; indexing; information retrieval; Chinese information retrieval; PLSI model; Web clustering; Web documents; Web mining; key words extraction; probabilistic latent semantic indexing model; word segmentation; Application software; Clustering algorithms; Educational institutions; Information analysis; Information technology; Machine assisted indexing; Space technology; N-Grams retrieval; probabilistic latent semantic indexing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology and Applications, 2009. IFITA '09. International Forum on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3600-2
Type :
conf
DOI :
10.1109/IFITA.2009.532
Filename :
5232186