• DocumentCode
    3680998
  • Title

    Application of Internet Technology and Web Information Extraction Wrapper Based on DOM for Agricultural Data Acquisition

  • Author

    Liming Luo;Wen Lu;Bing Wei;Ye Qin;Yeqing Xiong

  • Author_Institution
    Coll. of Inf. Eng., Capital Normal Univ., Beijing, China
  • fYear
    2015
  • Firstpage
    327
  • Lastpage
    331
  • Abstract
    This paper proposes a method for constructing a DOM-based Web information extraction wrapper. By combining XPath with pattern matching, the wrapper can handle the two types of information at the same time, guided by source and target knowledge libraries. The knowledge libraries also help extract more information that is useful to users. The paper describes in detail the process of building the wrapper and the corresponding algorithms, including DOM-based information judgment, determination of key extraction blocks using hierarchical clustering ideas, and determination of extraction expressions using inductive learning and natural language processing.
  • Keywords
    "Data mining","Web pages","Knowledge based systems","HTML","Internet","Feature extraction"
  • Publisher
    IEEE
  • Conference_Titel
    Network and Information Systems for Computers (ICNISC), 2015 International Conference on
  • Type

    conf

  • DOI
    10.1109/ICNISC.2015.84
  • Filename
    7311897