DocumentCode
467808
Title
Clustering Synonymous English and Chinese Keywords for Cross-Language Queries
Author
Chen, Rung-Ching ; Huang, Chung-yi ; Huang, Yu-Len
Author_Institution
Chaoyang Univ. of Technol., Taichung
Volume
4
fYear
2007
fDate
19-22 Aug. 2007
Firstpage
1875
Lastpage
1880
Abstract
In this paper, we propose an automatic clustering method for finding synonymous terms, including cross-language keywords, in Chinese and English thesis documents. First, Chinese and English keyword pairs are collected from an existing database. Then, the system calculates the support and confidence values of the keyword pairs. Next, keyword pairs with high confidence and support values are selected. Subsequently, a clustering algorithm merges the selected keyword pairs so that pairs with similar meanings fall into the same subset. Finally, applications such as cross-language or synonymous queries can be built on the resulting subsets of collected words. In experiments, the method achieved 98.4% precision in identifying correct terms across 1220 keyword-pair clusters from the collected subsets. The preliminary experimental results show that the system can provide effective information for users when making queries online.
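The pipeline described in the abstract (score keyword pairs, keep the high-scoring ones, merge them into clusters) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define support or confidence, so the usual association-rule definitions are assumed here, and the merge step is a plain union-find over pairs that share a keyword, swapped in because the paper's clustering algorithm is not specified. The names score_pairs, cluster_pairs, min_support, and min_confidence are hypothetical.

```python
from collections import Counter
from itertools import product


def score_pairs(records):
    """Return {(zh, en): (support, confidence)} for co-occurring keywords.

    records: list of (chinese_keywords, english_keywords) tuples, one per thesis.
    Assumed definitions (not taken from the paper):
      support    = records containing both keywords / all records
      confidence = records containing both keywords / records containing zh
    """
    pair_count = Counter()
    zh_count = Counter()
    for zh_terms, en_terms in records:
        zh_set, en_set = set(zh_terms), set(en_terms)
        zh_count.update(zh_set)
        pair_count.update(product(zh_set, en_set))
    n = len(records)
    return {
        pair: (count / n, count / zh_count[pair[0]])
        for pair, count in pair_count.items()
    }


def cluster_pairs(scores, min_support, min_confidence):
    """Merge the retained pairs into clusters of terms linked by shared keywords."""
    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for (zh, en), (support, confidence) in scores.items():
        if support >= min_support and confidence >= min_confidence:
            parent[find(zh)] = find(en)

    clusters = {}
    for term in list(parent):
        clusters.setdefault(find(term), set()).add(term)
    return list(clusters.values())
```

With this sketch, two Chinese keywords that are each strongly paired with the same English keyword end up in one cluster, which is the kind of subset a cross-language or synonym query could then look up. The thresholds are free parameters and would need tuning on real data.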
Keywords
natural language processing; pattern clustering; query processing; text analysis; Chinese keywords; English keywords; automatic clustering; cross-language queries; keyword pairs; synonymous keywords clustering; Abstracts; Clustering algorithms; Cybernetics; Data mining; Databases; Information management; Internet; Machine learning; Natural languages; Text categorization; Cross-language; Keyword clustering; Keyword pairs; Synonymous terms;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location
Hong Kong
Print_ISBN
978-1-4244-0973-0
Electronic_ISBN
978-1-4244-0973-0
Type
conf
DOI
10.1109/ICMLC.2007.4370454
Filename
4370454