DocumentCode :
2447574
Title :
A Dictionary-based Multi-language Segmentation and Search Engine
Author :
Weng Yu ; Cheng Wenyi
Author_Institution :
Minority Language Branch, Minzu Univ. of China, Beijing, China
fYear :
2012
fDate :
1-3 Nov. 2012
Firstpage :
275
Lastpage :
277
Abstract :
Because current search engines are mostly built for Chinese and English, few engines offer search services for minority languages, and their accuracy is low. We present a dictionary-based multi-language analyzer and search engine that can be used to analyze and search information on the Internet in minority languages such as Uighur, Tibetan, Mongolian, and Manchu. After preprocessing the corpus, we use Lucene for indexing and segmentation. Segmentation is based on our dictionary and depends heavily on it.
Keywords :
Internet; dictionaries; natural language processing; search engines; Chinese; English; Internet; dictionary-based multilanguage analyzer; dictionary-based multilanguage segmentation; minority languages; search engines; Accuracy; Dictionaries; Educational institutions; Engines; Indexes; Internet; Search engines; Minority Language; Multilanguage; Search Engine; Tibetan Information Processing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Networks and Intelligent Systems (ICINIS), 2012 Fifth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-1-4673-3083-1
Type :
conf
DOI :
10.1109/ICINIS.2012.85
Filename :
6376541