Title :
Classification of Sensitive Web Documents
Author :
Gao, Hui ; Fu, Yan ; Li, Jian-ping
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu
Abstract :
Web document classification is the process of assigning web documents to one or more predefined categories based on their content. It is an important component of a web monitoring system that can help reduce the dissemination of harmful information. This paper proposes a combined approach that builds a decision tree with a multilayer neural network as its categorical value function, and presents a complete approach for automated news categorization. The experimental evaluation demonstrates that this approach provides better classification accuracy than individual traditional text categorization methods.
Keywords :
Internet; multilayer perceptrons; pattern classification; text analysis; Web document classification; Web monitor system; automated news categorization; decision tree; multilayer neural network; sensitive Web documents; text categorization; Classification algorithms; Computer science; Data mining; Decision trees; Frequency; Internet; Monitoring; Multi-layer neural network; Neural networks; Text categorization; Decision tree; Neural network; PCA; Text classification;
Conference_Titel :
Apperceiving Computing and Intelligence Analysis, 2008. ICACIA 2008. International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-3427-5
Electronic_ISBN :
978-1-4244-3426-8
DOI :
10.1109/ICACIA.2008.4770027