DocumentCode :
2682022
Title :
Semantics-Based Extraction of Webpage Main Text
Author :
Fengjiao, Han ; Zhurong, Zhou
Author_Institution :
Coll. of Comput. & Inf. Sci., Southwest Univ., Chongqing, China
fYear :
2012
fDate :
22-24 Oct. 2012
Firstpage :
181
Lastpage :
184
Abstract :
Extracting the main text of a web page is one of the most efficient ways to improve a search engine. The traditional method uses the similarity of DOM sub-trees as the end condition for DOM tree traversal, but its speed is unsatisfactory on complex web page structures. To improve the speed and accuracy of DOM sub-tree traversal, we propose a method called Semantics-Based Extraction of Webpage Main Text.
Keywords :
Web sites; search engines; semantic Web; text analysis; DOM sub-tree; DOM tree traversing; Webpage main text; complex Webpage structure; search engine; semantics-based extraction; Accuracy; Computers; Data mining; Educational institutions; HTML; Navigation; Semantics; Extraction; Webpage
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Semantics, Knowledge and Grids (SKG), 2012 Eighth International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2561-5
Type :
conf
DOI :
10.1109/SKG.2012.47
Filename :
6391827