DocumentCode :
3458320
Title :
Research on Web Text Representation and the Similarity Based on Improved VSM in Uyghur Web Information Retrieval
Author :
Tohti, Turdi ; Hamdulla, Askar ; Musajan, Winira
Author_Institution :
Xinjiang Key Lab. of Multilingual Inf. Technol., Xinjiang Univ., Urumqi, China
fYear :
2010
fDate :
21-23 Oct. 2010
Firstpage :
1
Lastpage :
5
Abstract :
In information retrieval based on the vector space model (VSM), Web documents are represented as vectors, the index term weights serve as the main basis for computing the similarity between the user query and the Web documents, and the query results are ranked by similarity. In this paper, the index term weights are adjusted with a position weighting factor, and the similarity between the user query and the Web documents is measured by taking into account term weight, position, mutual distance, and order, as well as the contribution of Uyghur word similarity. Experiments on a Uyghur search engine show that the improved method clearly improves the accuracy, recall, and ranking capability of the Web information retrieval system.
Keywords :
Internet; query processing; search engines; text analysis; word processing; Uighur word similarity contributions; Uyghur Web information retrieval; Uygur search engine; Web documents; Web text representation; improved VSM; position weighting factor; sorting query; user query; vector space model; Accuracy; Computational modeling; Extraterrestrial measurements; Indexes; Search engines; Weight measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition (CCPR), 2010 Chinese Conference on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-7209-3
Electronic_ISBN :
978-1-4244-7210-9
Type :
conf
DOI :
10.1109/CCPR.2010.5659262
Filename :
5659262