DocumentCode :
2487296
Title :
Analysis and Improvement of Data Extraction Technology on the Web
Author :
Li Bi
Author_Institution :
Ningxia Univ., Yinchuan, China
fYear :
2010
fDate :
22-23 May 2010
Firstpage :
1
Lastpage :
3
Abstract :
The paper introduces an improved technology and infrastructure to support the effective flow of information among the sources and services on the Web and their interconnection with legacy systems that were designed to operate with traditional relational databases. This technology is designed to work as a relational front-end to semi-structured data sources. It extracts data from web pages using declarative specification files that define extraction rules expressed in regular expressions.
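The approach described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the specification layout (SPEC), the function names extract_rows and load_relation, the sample page, and the use of an in-memory SQLite table as the relational front-end are all assumptions made for illustration only.

```python
import re
import sqlite3

# Hypothetical declarative specification: one relation whose rows are
# described by a regular expression with one named group per attribute.
SPEC = {
    "relation": "quotes",
    "attributes": ["symbol", "price"],
    "row_pattern": (
        r"<tr>\s*<td>(?P<symbol>[A-Z]+)</td>"
        r"\s*<td>(?P<price>\d+\.\d+)</td>\s*</tr>"
    ),
}


def extract_rows(page, spec):
    """Apply the spec's regular expression and return one tuple per match."""
    pattern = re.compile(spec["row_pattern"])
    return [
        tuple(match.group(name) for name in spec["attributes"])
        for match in pattern.finditer(page)
    ]


def load_relation(conn, spec, rows):
    """Store the extracted rows in a table so they can be queried with SQL."""
    columns = ", ".join(spec["attributes"])
    marks = ", ".join("?" for _ in spec["attributes"])
    conn.execute(f"CREATE TABLE {spec['relation']} ({columns})")
    conn.executemany(f"INSERT INTO {spec['relation']} VALUES ({marks})", rows)


# Made-up sample page standing in for a semi-structured web source.
PAGE = """
<table>
<tr><td>ABC</td><td>12.50</td></tr>
<tr><td>XYZ</td><td>7.25</td></tr>
</table>
"""

conn = sqlite3.connect(":memory:")
rows = extract_rows(PAGE, SPEC)
load_relation(conn, SPEC, rows)

# A client written against a relational database can now use plain SQL.
result = conn.execute(
    "SELECT symbol FROM quotes WHERE CAST(price AS REAL) > 10"
).fetchall()
```

Keeping the extraction rule in data rather than in code means that a change in a page's layout would, under these assumptions, require editing only the specification entry and not the extraction logic.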
Keywords :
Web services; knowledge acquisition; online front-ends; relational databases; software maintenance; data extraction technology; declarative specification files; legacy systems; relational front end; semistructured data sources; Bismuth; Data mining; HTML; Humans; Information analysis; Information retrieval; Machine learning algorithms; Paper technology; Search engines; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
e-Business and Information System Security (EBISS), 2010 2nd International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5893-6
Electronic_ISBN :
978-1-4244-5895-0
Type :
conf
DOI :
10.1109/EBISS.2010.5473712
Filename :
5473712