Title :
Multi-Layer and Recursive Neural Networks for Metagenomic Classification
Author :
Ditzler, Gregory ; Polikar, Robi ; Rosen, Gail
Author_Institution :
Dept. of Electr. & Comput. Eng., Drexel Univ., Philadelphia, PA, USA
Abstract :
Recent advances in machine learning, specifically in deep learning with neural networks, have made a profound impact on fields such as natural language processing, image classification, and language modeling; however, the feasibility and potential benefits of these approaches for metagenomic data analysis have been largely under-explored. Deep learning exploits many layers of nonlinear feature representations, typically learned in an unsupervised fashion, and recent results have shown outstanding generalization performance on previously unseen data. Furthermore, some deep learning methods can also represent the structure in a data set. Consequently, deep learning and neural networks may prove to be an appropriate approach for metagenomic data. To determine whether such approaches are indeed appropriate for metagenomics, we experiment with two deep learning methods: i) a deep belief network, and ii) a recursive neural network, the latter of which provides a tree representing the structure of the data. We compare these approaches to the standard multi-layer perceptron, which has been well-established in the machine learning community as a powerful prediction algorithm, though its presence is largely missing in the metagenomics literature. We find that traditional neural networks can be quite powerful classifiers on metagenomic data compared to baseline methods, such as random forests. On the other hand, while the deep learning approaches did not result in improvements to the classification accuracy, they do provide the ability to learn hierarchical representations of a data set that standard classification methods do not allow. Our goal in this effort is not to determine the best algorithm in terms of accuracy (as that depends on the specific application), but rather to highlight the benefits and drawbacks of each of the approaches we discuss and provide insight on how they can be improved for predictive metagenomic analysis.
Keywords :
DNA; genomics; image classification; image representation; learning (artificial intelligence); medical image processing; molecular biophysics; multilayer perceptrons; natural language processing; neurophysiology; classification accuracy; data structure; deep belief network; deep learning methods; generalization performance; hierarchical representations; image classification; language modeling; machine learning; machine learning community; metagenomic classification; metagenomic data analysis; metagenomic literature; multilayer perceptron; multilayer-recursive neural networks; natural language processing; neural networks; nonlinear feature representations; prediction algorithm; predictive metagenomic analysis; unsupervised fashion; Feature extraction; Machine learning; Nanobioscience; Neural networks; Organisms; Training; Vegetation; Comparative metagenomics; metagenomics; microbiome; neural networks;
Journal_Title :
NanoBioscience, IEEE Transactions on
DOI :
10.1109/TNB.2015.2461219