DocumentCode
2914638
Title
Efficient soft relational clustering based on randomized search applied to selection of bio-basis for amino acid sequence analysis
Author
Mahfouz, M.A. ; Ismail, Muhammad Ali
Author_Institution
Dept. of Comput. & Syst. Eng., Univ. of Alexandria, Alexandria, Egypt
fYear
2012
fDate
27-29 Nov. 2012
Firstpage
287
Lastpage
292
Abstract
Protein sequence clustering aims to identify sets of homologous proteins in a protein database. In this paper, two efficient soft c-medoids clustering algorithms for prototype selection for protein sequences are presented. In the proposed techniques, patterns are considered to belong to some, but not necessarily all, clusters. The proposed algorithms judiciously integrate the principles of fuzzy sets, semi-fuzzy or soft clustering models, and the amino acid mutation matrix. Applying randomized search together with a soft clustering model to the fuzzy c-medoids algorithm enables efficient and effective selection of a minimal set of the most informative bio-bases. The efficiency and effectiveness of the proposed algorithms, along with a comparison with other algorithms, are demonstrated on different types of protein data sets.
Keywords
biology computing; fuzzy set theory; matrix algebra; pattern clustering; proteins; search problems; amino acid mutation matrix; amino acid sequence analysis; bio-basis selection; fuzzy set; protein data set; protein sequence clustering; randomized search; semi-fuzzy model; soft c-medoids clustering algorithm; soft clustering model; soft relational clustering; Algorithm design and analysis; Amino acids; Clustering algorithms; Linear programming; Partitioning algorithms; Proteins; Runtime; Cluster Analysis; Data Mining; Fuzzy Clustering; Medoid-Based Clustering; Protein sequences; Relational Clustering; Unsupervised Learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Engineering & Systems (ICCES), 2012 Seventh International Conference on
Conference_Location
Cairo
Print_ISBN
978-1-4673-2960-6
Type
conf
DOI
10.1109/ICCES.2012.6408530
Filename
6408530