Regional Information Center for Science and Technology - Research for Information Extraction Based on Wrapper Model Algorithm

DocumentCode :

2710517

Title :

Research for Information Extraction Based on Wrapper Model Algorithm

Author :

Zhiwei, Xu ; Xinghua, Wang

Author_Institution :

Dept. of Comput. Sci., Chang Chun Univ., Chang Chun, China

fYear :

2010

fDate :

7-10 May 2010

Firstpage :

652

Lastpage :

655

Abstract :

This work reports experiments mainly on data-intensive Web sites. It studies information extraction based on automatically generated wrappers for web pages. The main task is a page tree matching algorithm: the DOM trees of two pages (the sample tree and the wrapper tree) are matched and compared, first to discover the page's patterns and produce a primary template. The primary template is then corrected iteratively as further patterns are found, and finally the page wrapper is generated. Wrapper generation requires no human intervention and is fully automated. The experiments gave satisfactory results.
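
The abstract does not spell out the matching algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes pages that share exactly the same tag structure and differ only in their text, and it assumes well-formed HTML without void elements such as `<br>`. The classic Simple Tree Matching recurrence is swapped in as a stand-in similarity measure. All names (`Node`, `TreeBuilder`, `generalize`, `build_wrapper`, `extract`) are hypothetical and do not come from the paper.

```python
from html.parser import HTMLParser

SLOT = "#DATA"  # marker for a text position that varies between pages


class Node:
    def __init__(self, tag, text=""):
        self.tag = tag
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simplified DOM tree (assumes properly nested tags)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", data.strip()))


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def simple_tree_matching(a, b):
    """Size of the largest top-down, order-preserving match of two trees."""
    if a.tag != b.tag:
        return 0
    m, n = len(a.children), len(b.children)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = max(
                table[i - 1][j],
                table[i][j - 1],
                table[i - 1][j - 1]
                + simple_tree_matching(a.children[i - 1], b.children[j - 1]),
            )
    return table[m][n] + 1


def generalize(template, sample):
    """Merge a sample into the template; text that differs becomes a slot."""
    if template.tag != sample.tag or len(template.children) != len(sample.children):
        raise ValueError("sketch only handles pages with identical structure")
    merged = Node(template.tag, template.text)
    if template.tag == "#text" and template.text != sample.text:
        merged.text = SLOT
    for t, s in zip(template.children, sample.children):
        merged.children.append(generalize(t, s))
    return merged


def build_wrapper(pages):
    """Primary template from the first page, refined by each further page."""
    template = parse(pages[0])
    for page in pages[1:]:
        template = generalize(template, parse(page))
    return template


def extract(template, tree):
    """Collect the text found at every slot position of the template."""
    if template.tag == "#text":
        return [tree.text] if template.text == SLOT else []
    values = []
    for t, s in zip(template.children, tree.children):
        values.extend(extract(t, s))
    return values


PAGE_1 = "<html><body><h1>Catalog</h1><b>Dune</b><i>1965</i></body></html>"
PAGE_2 = "<html><body><h1>Catalog</h1><b>Emma</b><i>1815</i></body></html>"
PAGE_3 = "<html><body><h1>Catalog</h1><b>Ulysses</b><i>1922</i></body></html>"

if __name__ == "__main__":
    wrapper = build_wrapper([PAGE_1, PAGE_2])
    print(extract(wrapper, parse(PAGE_3)))
```

In this sketch the constant heading text stays part of the template, while the two positions whose text differs between the sample pages become slots, so applying the wrapper to a third page returns only the varying fields. A fuller treatment would also need to handle optional and repeated subtrees, which this sketch rejects.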

Keywords :

Internet; Web sites; information retrieval; iterative methods; trees (mathematics); Web pages; data-intensive Web site research experiment; information extraction; page tree matching algorithm; primary template found iterative model; sample tree; tree wrapper DOM tree matching; wrapper model algorithm; Computer science; Data mining; Databases; HTML; Humans; Information technology; Iterative algorithms; Iterative methods; Research and development; Web pages; DOM tree; information extraction; match technology; wrapper;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Research and Development, 2010 Second International Conference on

Conference_Location :

Kuala Lumpur

Print_ISBN :

978-0-7695-4043-6

Type :

conf

DOI :

10.1109/ICCRD.2010.141

Filename :

5489547

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2710517