• DocumentCode
    2483045
  • Title
    A Methodology for Extracting Head Contents from Meaningful Tables in Web Pages
  • Author
    Chavan, Madhuri M.; Shirgave, S.K.
  • Author_Institution
    D.Y. Patil Coll. of Eng. & Technol., Kolhapur, India
  • fYear
    2011
  • fDate
    3-5 June 2011
  • Firstpage
    272
  • Lastpage
    277
  • Abstract
    Tables are an important means of presenting information and are widely used on the web. They show relational data in a simple, precise, and compact manner. A typical web page consists of many blocks or areas, e.g. main content areas, advertisements, and images. Tables contain meaningful information, and almost all data is arranged in tabular format. There is therefore a need to identify the tables that contain meaningful structural data. In this paper, a method is introduced for determining the meaningfulness of a table and extracting the head from a meaningful table.
  • Keywords
    Web sites; text analysis; Web pages; head content extraction; meaningful tables; relational data; Data mining; Decision trees; Filtering; HTML; Head; Magnetic heads; DOM Tree; Table mining; Text mining; Web table; information extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communication Systems and Network Technologies (CSNT), 2011 International Conference on
  • Conference_Location
    Katra, Jammu
  • Print_ISBN
    978-1-4577-0543-4
  • Electronic_ISBN
    978-0-7695-4437-3
  • Type
    conf
  • DOI
    10.1109/CSNT.2011.66
  • Filename
    5966452