• DocumentCode
    1805718
  • Title

    Clustering of SNP data based on SCLIQUE

  • Author

    Jia, Min ; Wu, Yue ; Lei, Zhou ; Liu, Zongtian

  • Author_Institution
    Comput. Eng. & Sci, Shanghai Univ., Shanghai, China
  • Volume
    4
  • fYear
    2011
  • fDate
    24-26 Dec. 2011
  • Firstpage
    2359
  • Lastpage
    2363
  • Abstract
    SNP clustering is an indispensable exploratory tool for biology researchers: it can identify co-expressed or co-regulated genes and predict the functions of unknown genes from the known genes that fall in the same cluster. The CLIQUE clustering algorithm is an effective way to solve high-dimensional clustering problems, but it is not applicable to categorical data. Single nucleotide polymorphisms (SNPs) are single base pair positions in genomic DNA at which different sequence alternatives (alleles) exist in normal individuals in some population(s). SNP data consist of genotype values, which are categorical. In this paper, we adapt the CLIQUE algorithm to SNP clustering in three respects: redefining the grid division, redefining the common face between two units, and redefining the rules for generating high-dimensional candidate dense units. Experiments show that the proposed algorithm, SCLIQUE, not only retains the advantages of the CLIQUE algorithm but also extends it from numerical space to categorical space.
  • Keywords
    DNA; biology computing; genetics; genomics; molecular biophysics; pattern clustering; CLIQUE clustering algorithm; SCLIQUE algorithm; SNP data clustering; biology researchers; categorical data; coexpression genes; coregulated genes; genes cluster; genomic DNA; genotype value; grids division; high-dimensional candidate dense units generation; high-dimensional clustering problems; single base pair positions; single nucleotide polymorph; Accuracy; Algorithm design and analysis; Rocks; SNP clustering; high dimensional clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Network Technology (ICCSNT), 2011 International Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    978-1-4577-1586-0
  • Type

    conf

  • DOI
    10.1109/ICCSNT.2011.6182446
  • Filename
    6182446