Title :
The Characteristic of Chinese Search Based on Sogou Log
Author :
Wang, Xiaochun ; Yang, Muyun ; Li, Sheng ; Zhao, Tiejun ; Zhang, Zhitao
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
fDate :
Nov. 30 2009-Dec. 1 2009
Abstract :
To characterize Chinese search, this paper investigates a one-month user log from Sogou, a well-known Chinese search engine. The query log analysis falls into two categories: analysis of query composition and analysis of query structure. Statistical results reveal that, in addition to Chinese characters, users habitually use both full-width and half-width characters and, occasionally, Japanese in their queries. In terms of query structure, simple queries account for 81.6% of the total, while complex queries account for 18.4%. Finally, the differences in query formulation between Chinese and English search engines are discussed. The findings of this paper are beneficial to the study of information retrieval modeling, as well as to future analysis of search engine logs.
Keywords :
query formulation; query processing; search engines; statistical analysis; Chinese character; Chinese search engine; English search engine; Japanese; Sogou log; full-width character; half-width character; information retrieval modeling; query composition analysis; query formulation; query log analysis; query structure analysis; Computer science; Database languages; Information analysis; Information retrieval; Knowledge acquisition; Large-scale systems; Natural languages; Performance analysis; Search engines; Statistics; Sogou; log; query composition; query structure; search engine
Conference_Title :
Knowledge Acquisition and Modeling, 2009. KAM '09. Second International Symposium on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3888-4
DOI :
10.1109/KAM.2009.270