DocumentCode :
2203999
Title :
Web Page Classification using Modified Naïve Bayesian Approach
Author :
Tomar, G.S. ; Verma, Shekhar ; Jha, Ashish
Author_Institution :
Jaypee Inst. of Eng. & Technol., Guna
fYear :
2006
fDate :
14-17 Nov. 2006
Firstpage :
1
Lastpage :
4
Abstract :
This paper introduces WebClassify, a classification tool for Web pages that uses a modified naive Bayesian algorithm with a multinomial model to classify pages into various categories. The tool begins by downloading training Web text from the Internet, prepares the hypertext for mining, and then stores the Web data in a local database. The paper also explains why the naive Bayesian approach was chosen over other approaches for Web text mining. Experimental results are presented, along with an analysis of classification accuracy as vocabulary size increases.
Keywords :
Bayes methods; Internet; Web sites; classification; data mining; text analysis; Web data storage; Web page classification; Web text downloading training; Web text mining; WebClassify; classification accuracy analysis; hypertext preparation; multinomial model; naive Bayesian approach; vocabulary size; Bayesian methods; Cleaning; Data mining; Databases; Internet; Ontologies; Uniform resource locators; Vocabulary; Web mining; Web pages; Bayesian algorithm; competency; ontology; stemming
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
TENCON 2006. 2006 IEEE Region 10 Conference
Conference_Location :
Hong Kong
Print_ISBN :
1-4244-0548-3
Electronic_ISBN :
1-4244-0549-1
Type :
conf
DOI :
10.1109/TENCON.2006.344193
Filename :
4142381