• DocumentCode
    260304
  • Title

    A Genetic Algorithm for the Selection of Features Used in the Prediction of Protein Function

  • Author

    Fernandes Leijoto, Larissa ; Assis De Oliveira Rodrigues, Thiago ; Zárate, Luis Enrique ; Nobre, Cristiane Neri

  • Author_Institution
    Dept. of Comput. Sci., Pontifical Catholic Univ. of Minas Gerais, Belo Horizonte, Brazil
  • fYear
    2014
  • fDate
    10-12 Nov. 2014
  • Firstpage
    168
  • Lastpage
    174
  • Abstract
    Proteins are macromolecules of high molecular weight that, along with water, make up most of the composition of cells. They perform extremely important functions, such as the catalysis of biochemical reactions, cytoskeleton formation, and the transportation and storage of substances. With the completion of genome sequencing, protein discovery has been growing exponentially, and laboratory methods for determining protein function have not been able to keep up with this growth. It is therefore necessary to develop methods that aid this function discovery process. This work proposes a methodology for selecting physical-chemical features calculated from the structures that compose the proteins. The goal of this stage is to choose a feature subset from all available features. A feature is considered relevant if the machine can use it to separate the different protein classes. To select this subset, we propose the use of a simple genetic algorithm. The results obtained with the proposed methodology were superior to those found in the literature, reaching a precision of 71% and a sensitivity of 68%.
  • Keywords
    biochemistry; bioinformatics; cellular biophysics; feature selection; genetic algorithms; genomics; macromolecules; molecular biophysics; molecular configurations; molecular weight; proteins; biochemical reactions; catalysis; cell composition; cytoskeleton formation; function discovery process; genetic algorithm; genome sequencing; physical-chemical feature selection; protein discovery; protein function; protein structures; transportation; Amino acids; Databases; Feature extraction; Sensitivity; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering (BIBE), 2014 IEEE International Conference on
  • Conference_Location
    Boca Raton, FL
  • Type

    conf

  • DOI
    10.1109/BIBE.2014.42
  • Filename
    7033576
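
The feature-selection step described in the abstract can be illustrated with a minimal sketch of a simple genetic algorithm over feature bitmasks. This is not the authors' implementation: the record gives no operators, parameters, or data, so everything below is an assumption made for illustration. The synthetic data, the nearest-centroid fitness (swapped in for whatever classifier the paper uses), the per-feature penalty, and the names `make_toy_data`, `fitness`, and `select_features` are all hypothetical.

```python
import random


def make_toy_data(rng, n_per_class=30, n_features=8, informative=(0, 3)):
    """Synthetic two-class data (assumption): only `informative` columns differ by class."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in informative:
                row[j] += 3.0 * label  # shift informative features for class 1
            X.append(row)
            y.append(label)
    return X, y


def fitness(mask, X, y):
    """Nearest-centroid training accuracy on the features where mask is 1 (toy stand-in)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot separate the classes
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += pred == t
    # Hypothetical small penalty per selected feature, to favor compact subsets.
    return correct / len(y) - 0.01 * len(cols)


def select_features(X, y, rng, pop_size=20, generations=30, p_cross=0.8, p_mut=0.05):
    """Simple GA: tournament selection, one-point crossover, bit-flip mutation, elitism."""
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, X, y))

    for _ in range(generations):
        def tournament():
            a, b = rng.sample(pop, 2)
            return a if fitness(a, X, y) >= fitness(b, X, y) else b

        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < p_cross:
                cut = rng.randint(1, n - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Flip each bit independently with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=lambda m: fitness(m, X, y))
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_toy_data(rng)
    mask = select_features(X, y, rng)
    print("selected feature indices:", [j for j, bit in enumerate(mask) if bit])
```

Each individual is a list of 0/1 flags, one per candidate feature, so the selected subset is read directly from the best mask. In a realistic setting the fitness would presumably be a cross-validated score of the chosen classifier, not training accuracy on toy data.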