Title :
Detecting Web Content Function Using Generalized Hidden Markov Model
Author :
Chen, Jinlin ; Zhong, Ping ; Cook, Terry
Author_Institution :
Dept. of Comput. Sci., City Univ. of New York, NY
Abstract :
Web content function indicates the authors' intention regarding the purpose of the content and therefore plays an important role in Web information processing. In this paper we propose a generalized hidden Markov model, which extends the traditional hidden Markov model, for Web content function detection. By incorporating multiple emission features and detecting the state transition sequence based on layout structure, the generalized hidden Markov model can make effective use of Web-specific information and achieve better performance than the traditional hidden Markov model. Compared with previous approaches to function detection, our approach has the advantages of domain independence and extensibility to other applications. Experiments show promising results with our approach.
Keywords :
Internet; hidden Markov models; Web content function detection; Web information processing; generalized hidden Markov model; state transition sequence detection; Application software; Computer science; Computer vision; Content based retrieval; Hidden Markov models; Information processing; Information retrieval; Natural language processing; Testing; Web pages
Conference_Title :
Machine Learning and Applications, 2006. ICMLA '06. 5th International Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
0-7695-2735-3
DOI :
10.1109/ICMLA.2006.21