Title :
Automatic Technical Term Extraction Based on Term Association
Author :
Wan, Miao ; Liu, Song ; Liu, Jian-Yi ; Wang, Cong
Author_Institution :
Center for Intell. Sci. & Technol. Res., Beijing Univ. of Posts & Telecommun., Beijing
Abstract :
This paper proposes a new automatic Chinese term extraction algorithm that combines statistics-based and rule-based methods. The algorithm first uses a statistical method to extract two-word candidates from a raw corpus, then extends these candidates forward to obtain multi-word candidate terms. We propose a new metric, term association (TA), that measures how strongly the words in a string combine. A second subsystem then filters these candidates with defined rules to obtain domain-specific technical terms. Our aim is for this hybrid method to achieve higher precision on the domain-specific Chinese term extraction task than previous approaches. We implement an extractor that takes an unprocessed corpus of technical papers on ethanol fuels as input. The experimental results are analyzed and evaluated; precision and recall are 84.26% and 63.86%, respectively.
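The three stages named in the abstract (two-word candidate extraction, forward extension, rule-based filtering) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define the TA metric, so pointwise mutual information (PMI) is swapped in as a stand-in association score. The function names, the thresholds, the stopword rule, and the assumption that the corpus is already segmented into word tokens are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def score_bigrams(tokens, min_count=2, threshold=1.0):
    """Stage 1 (assumed): score adjacent word pairs.

    PMI is used here as a stand-in; it is NOT the paper's TA metric,
    which the abstract does not define.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi = math.log2(p_ab / (p_a * p_b))
        if pmi >= threshold:
            scores[(a, b)] = pmi
    return scores


def extend_forward(tokens, seeds, min_count=2, max_len=4):
    """Stage 2 (assumed): grow each two-word seed to the right while the
    longer string still occurs at least min_count times."""
    counts = Counter()
    for n in range(3, max_len + 1):
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    # Map each prefix to its frequent one-word-longer extensions.
    extensions = defaultdict(list)
    for gram, count in counts.items():
        if count >= min_count:
            extensions[gram[:-1]].append(gram)
    candidates = set()
    frontier = list(seeds)
    while frontier:
        cand = frontier.pop()
        if cand in candidates:
            continue
        candidates.add(cand)
        frontier.extend(extensions.get(cand, []))
    return candidates


def rule_filter(candidates, stopwords):
    """Stage 3 (assumed rule): drop candidates that begin or end with a
    stopword. The paper's actual rules are not given in the abstract."""
    return {c for c in candidates if c[0] not in stopwords and c[-1] not in stopwords}


def extract_terms(tokens, stopwords):
    seeds = score_bigrams(tokens)
    candidates = extend_forward(tokens, seeds)
    return rule_filter(candidates, stopwords)
```

Calling `extract_terms(tokens, stopwords)` on a pre-segmented token list returns a set of word tuples. Any real use would need a Chinese word segmenter in front of it and rules tuned to the target domain.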
Keywords :
data mining; statistical analysis; Chinese term extracting algorithm; automatic technical term extraction; rule-based methods; statistics-based methods; term association; Algorithm design and analysis; Dictionaries; Ethanol; Filters; Frequency; Fuels; Fuzzy systems; Natural languages; Statistical analysis; Training data;
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Shandong
Print_ISBN :
978-0-7695-3305-6
DOI :
10.1109/FSKD.2008.40