Gene-finding as an Attribute Selection Task

Author

Borges, Helyane Bronoski ; Nievola, Julio Cesar

Author_Institution

Pontificia Univ. Catolica do Parana, Curitiba

fYear

2007

fDate

11-13 July 2007

Firstpage

537

Lastpage

542

Abstract

For data miners, bioinformatics poses a more demanding challenge than merely creating efficient algorithms. They must work with databases that are more "horizontal" than "vertical": in the case of microarrays, the data consist of a few samples described by a large (sometimes huge) number of attributes. More important, a priori biological knowledge indicates that only a few genes are normally linked to each characteristic exhibited by an individual. This allows one to use attribute selection to determine which attributes are more likely to induce the observable characteristic. This paper studies many configurations of attribute selection schemes on two typical bioinformatics datasets. The results show that sequential subset generation guarantees better results and reaffirm the use of the wrapper approach to achieve better classification, despite its running time being longer than that of the filter approach.

Keywords

DNA; biology computing; data mining; genetics; pattern classification; attribute selection scheme; bioinformatics; microarray technology; Cancer; Databases; Diseases; Filters; Machine learning; Malignant tumors

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer and Information Science, 2007. ICIS 2007. 6th IEEE/ACIS International Conference on

Conference_Location

Melbourne, Vic.

Print_ISBN

0-7695-2841-4

Type

conf

DOI

10.1109/ICIS.2007.104

Filename

4276437