DocumentCode
2483045
Title
A Methodology for Extracting Head Contents from Meaningful Tables in Web Pages
Author
Chavan, Madhuri M. ; Shirgave, S.K.
Author_Institution
D.Y. Patil Coll. of Eng. & Technol., Kolhapur, India
fYear
2011
fDate
3-5 June 2011
Firstpage
272
Lastpage
277
Abstract
Tables are an important means of presenting information and are widely used on the web. They show relational data in a simple and precise manner. A typical web page consists of many blocks or areas, e.g. main content areas, advertisements, and images. Tables contain meaningful information. Almost all data is arranged in tabular format. Tables describe relational information in a compact manner. So there is a need to identify the tables that contain meaningful structural data. In this paper, a method is introduced for determining the meaningfulness of a table and extracting the head from a meaningful table.
Keywords
Web sites; text analysis; Web pages; head content extraction; meaningful tables; relational data; Data mining; Decision trees; Filtering; HTML; Head; Magnetic heads; DOM Tree; Table mining; Text mining; Web table; information extraction
fLanguage
English
Publisher
IEEE
Conference_Titel
Communication Systems and Network Technologies (CSNT), 2011 International Conference on
Conference_Location
Katra, Jammu
Print_ISBN
978-1-4577-0543-4
Electronic_ISBN
978-0-7695-4437-3
Type
conf
DOI
10.1109/CSNT.2011.66
Filename
5966452