DocumentCode :
3680998
Title :
Application of Internet Technology and Web Information Extraction Wrapper Based on DOM for Agricultural Data Acquisition
Author :
Liming Luo;Wen Lu;Bing Wei;Ye Qin;Yeqing Xiong
Author_Institution :
Coll. of Inf. Eng., Capital Normal Univ., Beijing, China
fYear :
2015
Firstpage :
327
Lastpage :
331
Abstract :
This paper proposes a method for constructing a DOM-based Web information extraction wrapper. By combining XPath with pattern matching, the wrapper can handle both types of information at the same time, guided by a source knowledge library and a target knowledge library. The knowledge libraries also help to extract more information that is useful to users. The paper describes in detail the process of building the wrapper and the corresponding algorithms, including DOM-based information judgment, determination of the key extraction block using hierarchical clustering ideas, and determination of extraction expressions using inductive learning and natural language processing.
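A minimal sketch of the general idea of combining XPath with pattern matching is given below. It is not the authors' wrapper: the sample page, the `content` block id, the `name` and `price` class names, the `yuan/kg` unit, and the function `extract_prices` are all hypothetical, and the key extraction block is hard-coded rather than found by clustering. It uses only the limited XPath subset of Python's standard `xml.etree.ElementTree`, so it assumes well-formed markup.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment; real-world HTML would need an HTML parser.
PAGE = """
<html><body>
  <div id="nav"><a href="/">Home</a></div>
  <div id="content">
    <table>
      <tr><td class="name">Wheat</td><td class="price">Price: 2.45 yuan/kg</td></tr>
      <tr><td class="name">Corn</td><td class="price">Price: 1.98 yuan/kg</td></tr>
    </table>
  </div>
</body></html>
"""

# Pattern-matching step: pull the numeric value out of free text (assumed format).
PRICE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*yuan/kg")


def extract_prices(page):
    """Return (name, price) pairs found in the assumed key extraction block."""
    root = ET.fromstring(page.strip())
    records = []
    # XPath step: select table rows inside the block assumed to hold the data.
    for row in root.findall(".//div[@id='content']//tr"):
        name = row.findtext("td[@class='name']")
        price_text = row.findtext("td[@class='price']") or ""
        match = PRICE_PATTERN.search(price_text)
        if name and match:
            records.append((name.strip(), float(match.group(1))))
    return records


if __name__ == "__main__":
    for name, price in extract_prices(PAGE):
        print(name, price)
```

Here the XPath expression locates structured nodes in the DOM tree, while the regular expression handles the unstructured text inside each node. In this sketch both are fixed by hand; the paper states that extraction expressions are determined using inductive learning and natural language processing.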
Keywords :
"Data mining","Web pages","Knowledge based systems","HTML","Internet","Feature extraction"
Publisher :
IEEE
Conference_Title :
Network and Information Systems for Computers (ICNISC), 2015 International Conference on
Type :
conf
DOI :
10.1109/ICNISC.2015.84
Filename :
7311897