Title :
Chinese Web page classification based on self-organizing mapping neural networks
Author_Institution :
Inst. of Comput. Sci., Zhejiang Normal Univ., Jinhua, China
Abstract :
This paper examines the topology and learning algorithm of the self-organizing mapping (SOM) neural network and its application to the automatic classification of Chinese Web pages. The SOM neural network has the advantages of a simple structure, an ordered mapping topology, and low learning complexity. It is suited to complex problems involving multi-class pattern recognition, high-dimensional input vectors, and large quantities of training data. Clustering accuracy can be improved by combining SOM's unsupervised learning algorithm with the LVQ learning algorithm. Finally, the paper presents the classification results of the SOM neural network applied to 5087 HTML pages from the People's Daily Web edition, with an average precision of 90.08% and an average recall of 89.85%.
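The combination of unsupervised SOM training with LVQ (learning vector quantization) fine-tuning described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 2x2 grid, the linear decay schedules, the Gaussian neighbourhood, the use of the LVQ1 variant, and the toy two-dimensional data (standing in for page feature vectors) are all assumptions made for illustration.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def best_matching_unit(weights, x, candidates=None):
    """Index of the unit (optionally restricted to candidates) closest to x."""
    idx = range(len(weights)) if candidates is None else candidates
    return min(idx, key=lambda i: dist2(weights[i], x))

def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Unsupervised SOM training on a rows x cols grid (assumed schedules)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3    # neighbourhood radius shrinks
            br, bc = coords[best_matching_unit(weights, x)]
            for i, (r, c) in enumerate(coords):
                # Gaussian neighbourhood around the winner on the grid
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                weights[i] = [w + lr * h * (v - w) for w, v in zip(weights[i], x)]
            step += 1
    return weights

def label_units(weights, data, labels):
    """Give each unit the majority label of the samples it wins (None if none)."""
    votes = [{} for _ in weights]
    for x, y in zip(data, labels):
        b = best_matching_unit(weights, x)
        votes[b][y] = votes[b].get(y, 0) + 1
    return [max(v, key=v.get) if v else None for v in votes]

def lvq1(weights, unit_labels, data, labels, epochs=10, lr0=0.1):
    """LVQ1 fine-tuning: pull the winner toward correctly labelled samples,
    push it away from incorrectly labelled ones."""
    weights = [list(w) for w in weights]
    labelled = [i for i, l in enumerate(unit_labels) if l is not None]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            lr = lr0 * (1.0 - step / total)
            b = best_matching_unit(weights, x, labelled)
            sign = 1.0 if unit_labels[b] == y else -1.0
            weights[b] = [w + sign * lr * (v - w) for w, v in zip(weights[b], x)]
            step += 1
    return weights

def classify(weights, unit_labels, x):
    """Label of the nearest labelled unit."""
    labelled = [i for i, l in enumerate(unit_labels) if l is not None]
    return unit_labels[best_matching_unit(weights, x, labelled)]

# Toy data (hypothetical): two well-separated classes in two dimensions.
toy_data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
            [0.9, 0.9], [1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]
toy_labels = ["a"] * 4 + ["b"] * 4

som = train_som(toy_data, rows=2, cols=2)
unit_labels = label_units(som, toy_data, toy_labels)
tuned = lvq1(som, unit_labels, toy_data, toy_labels)
print(classify(tuned, unit_labels, [0.05, 0.05]),
      classify(tuned, unit_labels, [0.95, 0.95]))
```

In this sketch the SOM stage orders the units without using class labels, and the labels enter only afterwards, first to name each unit and then to adjust the winning unit's weights during LVQ1. A real text classifier would replace the toy vectors with high-dimensional term-based features and use a much larger grid.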
Keywords :
Web sites; feature extraction; pattern classification; self-organising feature maps; unsupervised learning; vector quantisation; Chinese Web page classification; Chinese Web pages; LVQ learning algorithm; People's Daily Web edition; SOM; automatic classification; high dimension input vector; html pages; learning algorithm; multiclass pattern recognition; neural network topology; ordered mapping topology; self-organizing map; self-organizing mapping neural networks; training data; unsupervised learning algorithm; Clustering algorithms; Dictionaries; Feature extraction; Frequency; Network topology; Neural networks; Support vector machine classification; Support vector machines; Text categorization; Web pages;
Conference_Title :
Computational Intelligence and Multimedia Applications, 2003. ICCIMA 2003. Proceedings. Fifth International Conference on
Print_ISBN :
0-7695-1957-1
DOI :
10.1109/ICCIMA.2003.1238107