Title :
Development of a multilingual text mining approach for knowledge discovery in patents
Author :
Lee, Chung-Hong ; Yang, Hsin-Chang ; Li, Yi-Ju
Author_Institution :
Dept. of Electr. Eng., Nat. Kaohsiung Univ. of Appl. Sci., Kaohsiung, Taiwan
Abstract :
In this paper, we describe a novel technique for discovering implicit knowledge about patents from multilingual patent information sources. We developed a system platform that supports locating similar and relevant multilingual patent documents. The platform was implemented using a multilingual vector space based on the latent semantic indexing (LSI) model, trained on collected professional Chinese-English parallel corpora. Multilingual patent documents can then be mapped into the semantic vector space, where their similarity is evaluated by means of text clustering techniques. Preliminary results show that the platform has potential for retrieval and relatedness evaluation of multilingual patent documents.
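The abstract does not give implementation details, so the following is only a minimal sketch of how a multilingual LSI space is commonly built from a parallel corpus, not the authors' system. It assumes NumPy, a toy four-document corpus with pre-tokenized text, and a rank of 2; the corpus and the helper names `bow`, `fold_in`, and `cosine` are all hypothetical. Each training column merges the English and Chinese versions of one document, so that terms from both languages land in the same latent dimensions.

```python
import numpy as np

# Toy parallel corpus (hypothetical data): (English tokens, Chinese tokens).
parallel = [
    ("battery electrode lithium".split(), "电池 电极 锂".split()),
    ("battery charge circuit".split(), "电池 充电 电路".split()),
    ("image sensor pixel".split(), "图像 传感器 像素".split()),
    ("image display pixel".split(), "图像 显示 像素".split()),
]

# One shared vocabulary covering both languages.
vocab = sorted({t for en, zh in parallel for t in en + zh})
index = {t: i for i, t in enumerate(vocab)}

def bow(tokens):
    """Raw term-count vector over the shared vocabulary."""
    v = np.zeros(len(vocab))
    for t in tokens:
        if t in index:
            v[index[t]] += 1.0
    return v

# Term-by-document matrix: each column is a merged English+Chinese document.
A = np.column_stack([bow(en + zh) for en, zh in parallel])

# Truncated SVD gives the latent semantic space (rank k is an assumption).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k, s_k = U[:, :k], s[:k]

def fold_in(tokens):
    """Project a monolingual document into the k-dimensional LSI space."""
    return (bow(tokens) @ U_k) / s_k

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# An English query compared against two Chinese-only documents.
en_query = fold_in("lithium battery".split())
zh_battery = fold_in("电池 充电".split())
zh_image = fold_in("图像 传感器".split())

print(cosine(en_query, zh_battery), cosine(en_query, zh_image))
```

In this toy setup the English query sits close to the Chinese battery document and far from the Chinese imaging document, even though they share no surface terms. The folded-in vectors could then be passed to any clustering routine to group related patents across languages.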
Keywords :
data mining; indexing; information retrieval; patents; text analysis; document retrieval; knowledge discovery; latent semantic indexing; multilingual patent information sources; multilingual text mining; multilingual vector space; professional Chinese-English parallel corpora; relatedness evaluation; text clustering; Cybernetics; Dictionaries; Information analysis; Information management; Information systems; Large scale integration; Terminology; Text mining; USA Councils; Document clustering; Multilingual patent retrieval; Patent retrieval
Conference_Title :
Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on
Conference_Location :
San Antonio, TX
Print_ISBN :
978-1-4244-2793-2
ISSN :
1062-922X
DOI :
10.1109/ICSMC.2009.5345953