• DocumentCode
    3374734
  • Title
    An approach of hierarchical concept clustering on Medical Short Text corpus
  • Author
    Wei Li ; Dazhe Zhao ; Jinzhu Yang ; Longbing Cao
  • Author_Institution
    Key Lab. of Med. Image Comput. of Minist. of Educ., Northeastern Univ., Shenyang, China
  • fYear
    2013
  • fDate
    16-18 Dec. 2013
  • Firstpage
    509
  • Lastpage
    518
  • Abstract
    Hierarchical clustering and conceptual clustering are two important types of clustering analysis methods, and a variety of approaches have been proposed in previous work. However, few methods are designed to run on a medical short text database and construct a hierarchical concept taxonomy. This paper proposes a new clustering method, Hierarchical Concept Clustering on Medical Short Text corpus (HCCST), which offers a new solution for constructing an actionable disease taxonomy from actual medical data. Our approach has three advantages. First, HCCST uses a new similarity method that addresses the problems arising in medical short text distance computing. Second, an adaptive clustering method is proposed for synonymous disease names, which does not require predefining the size of clusters. Third, this paper uses a mutual-information-based method for recognizing potential hierarchical concept pairs, which improves on the subsumption method, to create a hierarchical disease taxonomy. The evaluation is conducted on a Chinese medical disease name text data set, and the results show that HCCST achieves satisfactory performance.
  • Keywords
    database management systems; diseases; medical computing; pattern clustering; text analysis; Chinese medical disease name text data set; HCCST; actionable disease taxonomy construction; adaptive clustering method; clustering analysis methods; conceptual clustering; hierarchical concept clustering; hierarchical concept taxonomy; hierarchical disease taxonomy; hierarchy concept pair recognition method; medical short text corpus; medical short text database; medical short text distance computing; mutual information; subsumption method; synonymous disease names; Clustering algorithms; Diabetes; Diseases; Medical diagnostic imaging; Retinopathy; Taxonomy; Hierarchical clustering; concept clustering; medical disease taxonomy; short text clustering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Biomedical Engineering and Informatics (BMEI), 2013 6th International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4799-2760-9
  • Type
    conf
  • DOI
    10.1109/BMEI.2013.6746995
  • Filename
    6746995