• DocumentCode
    260359
  • Title
    An Associative Classification Based Approach for Detecting SNP-SNP Interactions in High Dimensional Genome
  • Author
    Uppu, Suneetha ; Krishna, Aneesh ; Gopalan, Raj P.
  • Author_Institution
    Dept. of Comput., Curtin Univ., Perth, WA, Australia
  • fYear
    2014
  • fDate
    10-12 Nov. 2014
  • Firstpage
    329
  • Lastpage
    333
  • Abstract
    There have been many studies that depict genotype-phenotype relationships by identifying genetic variants associated with a specific disease. Researchers focus more attention on interactions between SNPs that are strongly associated with disease in the absence of main effects. In this context, a number of machine learning and data mining tools have been applied to identify combinations of multi-locus SNPs in higher-order data. However, none of the current models can identify useful SNP-SNP interactions for high-dimensional genome data. Detecting these interactions is challenging due to bio-molecular complexities and computational limitations. The goal of this research was to implement associative classification and study its effectiveness for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated for two-locus epistasis interactions using simulated data. The datasets were generated for 5 different penetrance functions by varying heritability, minor allele frequency, and sample size. In total, 23,400 datasets were generated and several experiments were conducted to identify the disease-causal SNP interactions. The classification accuracy of the proposed approach was compared with that of previous approaches. Though associative classification showed only a relatively small improvement in accuracy for balanced datasets, it outperformed existing approaches in higher-order multi-locus interactions in imbalanced datasets.
  • Keywords
    DNA; biology computing; data mining; diseases; genetics; genomics; learning (artificial intelligence); molecular biophysics; molecular configurations; pattern classification; DNA sequences; SNP-SNP interactions; associative classification-based approach; biomolecular complexity; computational limitations; data mining tools; epistasis; genetic variants; genotype-phenotype relationships; high-dimensional genome; high-dimensional genome data; locus epistasis interactions; machine learning; minor allele frequency; multilocus SNP; penetrance functions; single nucleotide polymorphism; specific disease; Accuracy; Association rules; Bioinformatics; Diseases; Genomics; Epistasis; SNP-SNP interactions; associative classification; multi-locus;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering (BIBE), 2014 IEEE International Conference on
  • Conference_Location
    Boca Raton, FL
  • Type
    conf
  • DOI
    10.1109/BIBE.2014.29
  • Filename
    7033602