DocumentCode :
3099320
Title :
Theses cluster based on bilingual and synonymous keyword sets using mutual information
Author :
Huang, Chung-yi ; Chen, Rung-Ching
Author_Institution :
Dept. of Inf. Manage., Chaoyang Univ. of Technol., Wufong, Taiwan
Volume :
5
fYear :
2009
fDate :
12-15 July 2009
Firstpage :
2999
Lastpage :
3004
Abstract :
Searching published papers is an essential part of the research process. Because articles are written in various languages, precise queries are hard to achieve. In this paper, we propose an automatic theses clustering method based on bilingual and synonymous keyword sets that include Chinese and English keywords. We also provide a clustering computation to speed up operation. The system first automatically generates the bilingual and synonymous keyword sets and then clusters the theses based on those sets. The method not only overcomes the weakness of using digital dictionaries to solve clustering problems, but also limits the errors that arise when querying by bilingual and synonymous keywords. The system was implemented with a clustering computation technology to address the performance problems of traditional document clustering systems. Using many computer processes, the system not only saves considerable time but also attains high availability and effective load balancing. Preliminary experiments show that the system clusters theses effectively.
Keywords :
data mining; dictionaries; text analysis; word processing; automatic theses clustering method; bilingual keyword; digital dictionary; error problem; mutual information; synonymous keyword sets; Classification tree analysis; Cybernetics; Databases; Dictionaries; Frequency; Machine learning; Mutual information; Natural languages; Wireless LAN; Wireless networks; Bilingual and synonymous keyword; Document clustering; Keyword set;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2009 International Conference on
Conference_Location :
Baoding
Print_ISBN :
978-1-4244-3702-3
Electronic_ISBN :
978-1-4244-3703-0
Type :
conf
DOI :
10.1109/ICMLC.2009.5212598
Filename :
5212598