DocumentCode
650475
Title
Nonlinear Dimensionality Reduction for Cluster Identification in Metagenomic Samples
Author
Gisbrecht, Andrej ; Hammer, Barbara ; Mokbel, Bassam ; Sczyrba, Alexander
Author_Institution
CITEC Center of Excellence, Bielefeld Univ., Bielefeld, Germany
fYear
2013
fDate
16-18 July 2013
Firstpage
174
Lastpage
179
Abstract
We investigate the potential of modern nonlinear dimensionality reduction techniques for interactive cluster detection in bioinformatics applications. We demonstrate that recent non-parametric techniques such as t-distributed stochastic neighbor embedding (t-SNE) allow cluster identification that is superior to direct clustering of the original data or to cluster detection based on classical parametric dimensionality reduction approaches. Non-parametric approaches, however, display quadratic complexity, which makes them unsuitable for interactive devices. As a speedup, we propose kernel-t-SNE, a fast parametric counterpart based on t-SNE.
Keywords
bioinformatics; computational complexity; data analysis; genomics; nonparametric statistics; pattern clustering; stochastic processes; cluster identification; direct clustering; interactive cluster detection; kernel-t-SNE; metagenomic samples; next generation sequencing; nonlinear dimensionality reduction; nonparametric techniques; quadratic complexity; t-distributed stochastic neighbor embedding; Metagenomics; NGS data; clustering; kernel mapping; t-SNE
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Visualisation (IV), 2013 17th International Conference
Conference_Location
London
ISSN
1550-6037
Type
conf
DOI
10.1109/IV.2013.22
Filename
6676559