DocumentCode :
3309389
Title :
A combined method for automatic domain-specific Terminology extraction
Author :
Li Liu ; Quan Qi
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
Volume :
3
fYear :
2011
fDate :
26-28 July 2011
Firstpage :
1734
Lastpage :
1737
Abstract :
In this paper we present a terminology extraction algorithm that combines machine learning with a corpus-based statistical model. We collect a balanced corpus in which all possible nominal terms of every domain are annotated, and use it as the training corpus. After selecting training features for terms, we use an SVM to recognize terminological candidates in the target corpus. We then calculate the Domain Relevance (DR) and Domain Consensus (DC) scores of the terminological candidates to acquire domain-specific terminology. We run four experiments on a tourism corpus and on short sentences, using two kinds of balanced training corpora. Furthermore, we evaluate the precision and recall of our terminology extraction algorithm by comparing the words in a gold standard with the words extracted by our system. The experiments show that our algorithm can yield improved results in the automatic extraction of nominal domain-specific terminology. A detailed analysis shows the advantages and disadvantages of our algorithm.
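The abstract names the DR and DC scores but does not define them. The following is a minimal sketch, assuming the formulation commonly used in the ontology-learning literature: DR is the term's relative frequency in the target domain divided by its highest relative frequency in any domain, and DC is the entropy of the term's distribution across the target domain's documents. The function names and the toy corpora are illustrative assumptions, not taken from the paper.

```python
import math


def domain_relevance(term, domain, corpora):
    """DR (assumed definition): P(term | domain) / max over domains of P(term | d).

    `corpora` maps a domain name to a list of tokenized documents.
    """
    def relative_frequency(docs):
        tokens = [tok for doc in docs for tok in doc]
        return tokens.count(term) / len(tokens) if tokens else 0.0

    probs = {name: relative_frequency(docs) for name, docs in corpora.items()}
    highest = max(probs.values())
    return probs[domain] / highest if highest > 0 else 0.0


def domain_consensus(term, docs):
    """DC (assumed definition): entropy of the term's spread over the documents."""
    counts = [doc.count(term) for doc in docs]
    total = sum(counts)
    if total == 0:
        return 0.0
    # Sum p * log(1/p) over the documents that contain the term.
    return sum((c / total) * math.log(total / c) for c in counts if c > 0)


# Toy corpora, invented for illustration only.
corpora = {
    "tourism": [["hotel", "tour", "guide"], ["hotel", "beach", "tour"]],
    "finance": [["bank", "loan", "hotel"], ["stock", "bank", "rate"]],
}

dr_tourism = domain_relevance("hotel", "tourism", corpora)
dr_finance = domain_relevance("hotel", "finance", corpora)
dc_hotel = domain_consensus("hotel", corpora["tourism"])
dc_beach = domain_consensus("beach", corpora["tourism"])
```

Under these assumptions, a candidate that is frequent in the target domain relative to other domains scores a high DR, and a candidate spread evenly over many documents of the domain scores a high DC. How the paper weights or thresholds the two scores is not stated in the abstract.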
Keywords :
learning (artificial intelligence); ontologies (artificial intelligence); statistical analysis; support vector machines; SVM; automatic domain-specific terminology extraction algorithm; balanced training corpora; corpus-based statistical model; domain consensus score; domain relevance score; machine learning; ontology learning; support vector machine; tourism corpus; training feature selection; Algorithm design and analysis; Feature extraction; Machine learning; Machine learning algorithms; Support vector machines; Terminology; Training; GATE; SVM; Terminology; domain consensus; domain relevance;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
Type :
conf
DOI :
10.1109/FSKD.2011.6019798
Filename :
6019798