• DocumentCode
    1662646
  • Title
    Domain Driven Two-Phase Feature Selection Method Based on Bhattacharyya Distance and Kernel Distance Measurements
  • Author
    Chen, Yibing ; Zhang, Lingling ; Li, Jun ; Shi, Yong
  • Author_Institution
    Sch. of Manage., Grad. Univ. of Chinese Acad. of Sci., Beijing, China
  • Volume
    3
  • fYear
    2011
  • Firstpage
    217
  • Lastpage
    220
  • Abstract
    This paper proposes a two-phase feature selection method for the bioinformatics domain, approached from the classification perspective in data mining. In the first phase, the Bhattacharyya distance is used to filter out the majority of irrelevant genes. In the second phase, a floating sequential search method (FSSM) is applied to the remaining genes to select an informative gene set, using kernel distance as the measure of class separability. Verification on a colon tissue dataset using support vector machines (SVMs) shows that the informative gene set selected by the method is acceptable for disease identification.
  • Keywords
    bioinformatics; data mining; diseases; statistical distributions; support vector machines; Bhattacharyya distance measurement; bioinformatics domain; class separability; colon tissue dataset verification; disease identification; domain driven two-phase feature selection method; floating sequential search method; genes filtering; informative gene set; kernel distance measurements; Colon; Distance measurement; Gene expression; Kernel; Bhattacharyya distance; domain driven data mining; feature selection; kernel distance measurement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-1-4577-1373-6
  • Electronic_ISBN
    978-0-7695-4513-4
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2011.61
  • Filename
    6040844
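
The first phase described in the abstract (filtering genes by Bhattacharyya distance) can be illustrated with a minimal sketch. The record does not say how the paper estimates the distance, so this sketch assumes each gene's expression values are modeled as a univariate Gaussian per class, for which the distance has a well-known closed form. The function names `bhattacharyya_gaussian` and `rank_genes`, the dictionary input layout, and the `keep` parameter are hypothetical and are not taken from the paper. The second phase (floating sequential search with a kernel distance criterion) is not covered here.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_gaussian(x, y):
    """Bhattacharyya distance between two samples.

    Assumption: each sample is modeled as a univariate Gaussian,
    which gives the closed form
        D = (m1 - m2)^2 / (4 (v1 + v2)) + 0.5 * ln((v1 + v2) / (2 sqrt(v1 v2)))
    """
    m1, m2 = mean(x), mean(y)
    v1, v2 = pvariance(x), pvariance(y)
    if v1 <= 0 or v2 <= 0:
        raise ValueError("each class needs non-zero variance")
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def rank_genes(expr, labels, keep):
    """Return the `keep` genes with the largest class-wise distance.

    expr   : dict mapping gene name -> list of expression values (one per sample)
    labels : list of 0/1 class labels, aligned with the expression values
    """
    scores = {}
    for gene, values in expr.items():
        class0 = [v for v, lab in zip(values, labels) if lab == 0]
        class1 = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = bhattacharyya_gaussian(class0, class1)
    # Larger distance means the gene separates the two classes better.
    return sorted(scores, key=scores.get, reverse=True)[:keep]


if __name__ == "__main__":
    # Toy data (made up for illustration): g1 separates the classes, g2 does not.
    expr = {
        "g1": [0, 1, 0, 1, 10, 11, 10, 11],
        "g2": [0, 1, 0, 1, 0, 1, 0, 1],
    }
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    print(rank_genes(expr, labels, keep=1))
```

In a two-phase scheme like the one the abstract outlines, a ranking of this kind would only shrink the candidate pool; the surviving genes would then be passed to the search step.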