DocumentCode :
1637530
Title :
Suffix Tree Based Approach for Chinese Information Retrieval
Author :
Huang, Jin Hu ; Powers, David
Author_Institution :
Sch. of Comput. Sci., Flinders Univ. of South Australia, SA
Volume :
3
fYear :
2008
Firstpage :
393
Lastpage :
397
Abstract :
With the spread of the Internet, Chinese-language information retrieval has attracted great research interest in recent years. The absence of word boundaries in Chinese makes Chinese information retrieval (IR) different from European-language IR. To apply traditional IR approaches to Chinese, sentences must first be segmented into words, so word segmentation plays a key role in Chinese IR. Because word segmentation is not straightforward and its results are sometimes ambiguous, n-grams are used as an alternative. Several experimental studies have compared words with n-grams and examined the effect of word segmentation on information retrieval. These studies show that words and n-grams lead to comparable performance, and that higher word segmentation accuracy does not necessarily result in better retrieval performance. In this paper we propose a suffix tree based approach for Chinese information retrieval without word segmentation.
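The abstract contrasts n-gram indexing with matching that needs no word segmentation. The following is a minimal sketch of both ideas and is not the authors' implementation: it uses a naively built suffix array as a simpler stand-in for a suffix tree, and the function names and sample text are illustrative assumptions.

```python
from bisect import bisect_left


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_suffix_array(text):
    """Return suffix start offsets sorted by suffix (naive; fine for short texts)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, query):
    """Return the sorted start offsets at which query occurs in text."""
    if not query:
        return []
    # Truncating sorted suffixes to len(query) keeps them in sorted order,
    # so a binary search locates the first candidate match.
    prefixes = [text[i:i + len(query)] for i in suffix_array]
    pos = bisect_left(prefixes, query)
    hits = []
    while pos < len(prefixes) and prefixes[pos] == query:
        hits.append(suffix_array[pos])
        pos += 1
    return sorted(hits)


# Hypothetical sample text: no segmentation step is applied before indexing.
text = "中文信息检索与信息抽取"
suffix_array = build_suffix_array(text)
print(char_ngrams("信息检索"))                       # character bigrams
print(find_occurrences(text, suffix_array, "信息"))  # offsets of the substring
```

Any substring of the indexed text can be looked up this way, whether or not it corresponds to a dictionary word, which is the property that makes suffix structures attractive when word boundaries are unavailable.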
Keywords :
Internet; information retrieval; natural language processing; Chinese language information retrieval; n-grams; suffix tree; Application software; Computer science; Design engineering; Frequency; Indexing; Intelligent systems; Natural languages; Power engineering and energy
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-3382-7
Type :
conf
DOI :
10.1109/ISDA.2008.365
Filename :
4696497