DocumentCode
756685
Title
An automatic indexing and neural network approach to concept retrieval and classification of multilingual (Chinese-English) documents
Author
Lin, Chung-hsin; Chen, Hsinchun
Author_Institution
Dept. of Management Information Systems, University of Arizona, Tucson, AZ, USA
Volume
26
Issue
1
fYear
1996
fDate
2/1/1996
Firstpage
75
Lastpage
88
Abstract
An automatic indexing and concept classification approach to a multilingual (Chinese and English) bibliographic database is presented. We introduced a multi-linear term-phrasing technique to extract concept descriptors (terms or keywords) from a Chinese-English bibliographic database. A concept space of related descriptors was then generated using a co-occurrence analysis technique. Like a man-made thesaurus, the system-generated concept space can be used to generate additional semantically relevant terms for search. For concept classification and clustering, a variant of a Hopfield neural network was developed to cluster similar concept descriptors and to generate a small number of concept groups to represent (summarize) the subject matter of the database. The concept space approach to information classification and retrieval has been adopted by the authors in other scientific databases and business applications, but multilingual information retrieval presents a unique challenge. This research reports our experiment on multilingual databases. Our system was initially developed in the MS-DOS environment, running the ETEN Chinese operating system. For performance reasons, it was then tested on a UNIX-based system. Due to the unique ideographic nature of the Chinese language, a Chinese term-phrase indexing paradigm considering the ideographic characteristics of Chinese was developed as a multilingual information classification model. By applying the neural-network-based concept classification technique, the model presents a novel way of organizing unstructured multilingual information.
Keywords
bibliographic systems; classification; indexing; information retrieval; neural nets; Chinese-English bibliographic database; Hopfield neural network; MS-DOS; automatic indexing; concept descriptors; concept retrieval; information classification; keywords; multilingual databases; multilingual documents; multilingual information retrieval; neural network; Databases; Hopfield neural networks; Information retrieval; Machine assisted indexing; Natural languages; Neural networks; Operating systems; Organizing; System testing; Thesauri
fLanguage
English
Journal_Title
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
1083-4419
Type
jour
DOI
10.1109/3477.484439
Filename
484439