Regional Information Center for Science and Technology - Semantics-Based Extraction of Webpage Main Text

DocumentCode :

2682022

Title :

Semantics-Based Extraction of Webpage Main Text

Author :

Fengjiao, Han ; Zhurong, Zhou

Author_Institution :

Coll. of Comput. & Inf. Sci., Southwest Univ., Chongqing, China

fYear :

2012

fDate :

22-24 Oct. 2012

Firstpage :

181

Lastpage :

184

Abstract :

Extracting the main text of a web page is one of the most effective ways to improve search engines. The traditional method uses the similarity of DOM sub-trees as the end condition for DOM tree traversal, but it is unsatisfactorily slow on web pages with a complex structure. To raise both the speed and the accuracy of DOM sub-tree traversal, we propose a semantics-based method for extracting the main text of a web page.
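The abstract does not give the details of either the traditional sub-tree similarity test or the proposed semantic one, so the following is only a minimal sketch of the general idea of traversing a DOM tree until an end condition is met. It swaps in a much simpler end condition: descend into whichever child sub-tree holds most of the text, and stop when no single child dominates. The names `Node`, `TreeBuilder`, `extract_main_text` and the `dominance` threshold are hypothetical and chosen for illustration; only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser

SKIP_TAGS = {"script", "style"}            # content never counts as text
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One element of a simplified DOM tree (hypothetical helper)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []                 # Node and str items, in document order

    def text(self):
        parts = [c if isinstance(c, str) else c.text() for c in self.children]
        return " ".join(p for p in parts if p)

    def text_length(self):
        return len(self.text())


class TreeBuilder(HTMLParser):
    """Builds the simplified tree; tolerant of unmatched end tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in VOID_TAGS:
            return
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent             # climb to the nearest matching open tag
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if self.skip_depth:
            return
        data = " ".join(data.split())      # normalise whitespace
        if data:
            self.current.children.append(data)


def extract_main_text(html, dominance=0.6):
    """Descend while one child sub-tree holds at least `dominance` of the text."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    node = builder.root
    while True:
        total = node.text_length()
        elements = [c for c in node.children if isinstance(c, Node)]
        if not elements or total == 0:
            break
        best = max(elements, key=Node.text_length)
        if best.text_length() < dominance * total:
            break                          # end condition: no child dominates
        node = best
    return node.text()
```

Calling `extract_main_text(page_html)` returns the text of the sub-tree where the descent stopped. A practical extractor would also need to account for link density and repeated templates, which this sketch ignores.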

Keywords :

Web sites; search engines; semantic Web; text analysis; DOM sub-tree; DOM tree traversing; Webpage main text; complex Webpage structure; search engine; semantics-based extraction; Accuracy; Computers; Data mining; Educational institutions; HTML; Navigation; Semantics; Extraction; Webpage

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Semantics, Knowledge and Grids (SKG), 2012 Eighth International Conference on

Conference_Location :

Beijing

Print_ISBN :

978-1-4673-2561-5

Type :

conf

DOI :

10.1109/SKG.2012.47

Filename :

6391827

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2682022