DocumentCode
3751980
Title
Online marginalized linear stacked denoising autoencoders for learning from big data stream
Author
Arif Budiman;Mohamad Ivan Fanany;Chan Basaruddin
Author_Institution
Faculty of Computer Science, University of Indonesia, Depok, West Java, Indonesia
fYear
2015
Firstpage
227
Lastpage
235
Abstract
Big non-stationary data that arrives gradually as a stream is an important challenge when training deep learning machines on big data. In this paper, we focus on a variant of the traditional autoencoder called the Marginalized Linear Stacked Denoising Autoencoder (MLSDA). MLSDA uses a simple linear model: it is faster and uses fewer parameters than the traditional SDA, and it benefits from convex optimization. It is also particularly effective on bag-of-words feature representations. However, the traditional SDA trained with stochastic gradient descent remains more widely adopted in practice, because stochastic gradient descent is inherently an online method, which makes the traditional SDA more scalable for streaming big data. This paper proposes a simple modification of MLSDA that uses accumulated matrix multiplications for online learning. Experimental results show accuracy comparable to the batch version of MLSDA while using fewer computational resources. Online MLSDA improves the scalability of MLSDA for handling streaming big data represented as bag-of-words features in natural language processing, information retrieval, and computer vision.
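The abstract's idea of online learning through accumulated matrix multiplications can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the standard closed-form solution of the marginalized denoising autoencoder (a single linear layer solved from corruption expectations) and assumes the online variant works by accumulating the scatter matrix batch by batch; the class and method names are hypothetical.

```python
import numpy as np

class OnlineMLSDALayer:
    """One linear denoising-autoencoder layer with online updates.
    Sketch only: assumes the standard closed-form mDA solution and
    that the online variant accumulates the scatter matrix X X^T
    mini-batch by mini-batch (names here are hypothetical)."""

    def __init__(self, d, p):
        self.d = d                          # input dimensionality
        self.S = np.zeros((d + 1, d + 1))   # accumulated scatter (with a bias row)
        self.q = np.full(d + 1, 1.0 - p)    # per-feature survival probability
        self.q[-1] = 1.0                    # the bias feature is never corrupted

    def partial_fit(self, X):
        """Absorb one mini-batch X of shape (d, n_batch)."""
        Xb = np.vstack([X, np.ones((1, X.shape[1]))])
        self.S += Xb @ Xb.T                 # the sufficient statistic is additive
        return self

    def solve(self):
        """Closed-form reconstruction weights W of shape (d, d+1)."""
        Q = self.S * np.outer(self.q, self.q)           # E[x~ x~^T], off-diagonal terms
        np.fill_diagonal(Q, self.q * np.diag(self.S))   # diagonal scales by q, not q^2
        P = self.S[:self.d, :] * self.q                 # E[x x~^T]
        return P @ np.linalg.pinv(Q)

    def transform(self, X, W):
        """Hidden representation: squashed denoising reconstruction."""
        Xb = np.vstack([X, np.ones((1, X.shape[1]))])
        return np.tanh(W @ Xb)
```

Because the scatter matrix is additive across mini-batches, solving for W after streaming all chunks reproduces the batch solution, which is consistent with the abstract's claim of similar accuracy at lower computational cost.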
Keywords
"Natural language processing","Graphics processing units","Support vector machines"
Publisher
ieee
Conference_Titel
2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS)
Type
conf
DOI
10.1109/ICACSIS.2015.7415181
Filename
7415181
Link To Document