• DocumentCode
    2865778
  • Title

    Research on Method of Extracting Chinese Domain Terms Based on Rough and Fuzzy Clustering

  • Author

    Liu, Jie ; Fan, Xiao-Zhong ; Chen, Kang

  • Author_Institution
    Beijing Inst. of Technol., Beijing
  • fYear
    2007
  • fDate
    29-31 Oct. 2007
  • Firstpage
    366
  • Lastpage
    369
  • Abstract
    Automatic extraction of domain terms is the basis of domain ontology learning. General linguistic resources such as WordNet and HowNet can extract only part of the domain terms from unstructured domain texts. In this paper, we first extract a partial set of terms by using HowNet to calculate the domain relatedness between words. The extracted terms are then semantically clustered with a fuzzy c-means clustering algorithm based on properties of rough sets. Finally, more domain terms are extracted from unknown words according to the clustering results by means of machine learning. The experimental results show that the method can extract as many domain terms as possible while also ensuring higher precision.
  • Keywords
    dictionaries; fuzzy set theory; information retrieval; learning (artificial intelligence); natural language processing; ontologies (artificial intelligence); pattern clustering; rough set theory; semantic Web; text analysis; vocabulary; HowNet semantic dictionary; WordNet; automatic Chinese domain term extraction; domain ontology learning; domain unstructured text; fuzzy c-means clustering algorithm; machine learning; rough set-based clustering; semantic Web; Clustering algorithms; Computer science; Data mining; Dictionaries; Fuzzy sets; Machine learning; Machine learning algorithms; Ontologies; Rough sets; Semantic Web;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantics, Knowledge and Grid, Third International Conference on
  • Conference_Location
    Shan Xi
  • Print_ISBN
    0-7695-3007-9
  • Electronic_ISBN
    978-0-7695-3007-9
  • Type

    conf

  • DOI
    10.1109/SKG.2007.71
  • Filename
    4438571