Title :
Text classification dimension reduction algorithm for Chinese web page based on deep learning
Author :
Feng Shen ; Xiong Luo ; Yi Chen
Author_Institution :
School of Computer and Communication Engineering, University of Science and Technology Beijing, 30 Xueyuan Road, Haidian District, 100083, China
Abstract :
With the development of network technology, the Internet has become the main source from which people obtain information. The openness of the network fills it with information of every kind, so text classification techniques that let people quickly find the information they are interested in among this disordered mass are increasingly important. Because network text classification underlies information filtering, search engines, and other fields, it has gradually become a research focus. Traditional text classification techniques cannot effectively support Chinese web page text classification because of the large scale of Chinese web page text. Deep learning neural network structures are an important way to learn data features from massive data. Deep networks have excellent feature learning ability: they can combine an object's low-level features into high-level abstract representations that are better suited to classification. This paper proposes a new deep learning based text classification model to address feature dimension reduction in Chinese web page text classification, and verifies the feasibility of the method experimentally.
Keywords :
Autoencoder; Chinese Web Page Text Classification; Deep Learning; Feature Dimension Reduction
Conference_Title :
Cyberspace Technology (CCT 2013), International Conference on
Conference_Location :
Beijing, China
Electronic_ISBN :
978-1-84919-801-1
DOI :
10.1049/cp.2013.2171