DocumentCode
2247409
Title
Web information processing and extracting
Author
Gao, Kai ; Zong, Bao-qin ; Yang, Xiu-li
Author_Institution
Dept. of Inf. Sci. & Eng., Hebei Univ. of Sci. & Technol., Shijiazhuang, China
Volume
5
fYear
2010
fDate
11-14 July 2010
Firstpage
2350
Lastpage
2355
Abstract
With the rapid growth of the web, search engines have become an important tool for retrieving relevant information from the Internet. Because of limited bandwidth, storage, and other constraints, a general search engine is not suitable for some situations. What is needed instead is a topical search engine that collects domain-specific content by focused crawling. It can provide higher accuracy than general search because the domain collection contains little irrelevant information, which makes web information processing and extraction necessary. This paper presents some strategies for web information processing, together with analysis and extraction based on data content mining. The experimental result validates the suitability of the approach, and some remaining problems are presented at the end.
Keywords
Internet; data mining; information retrieval; search engines; Web information extracting; Web information processing; data content mining; search engine; Accuracy; Data mining; Databases; Materials; Noise; Web pages; Crawling; Information extracting; Information processing; Topical search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location
Qingdao
Print_ISBN
978-1-4244-6526-2
Type
conf
DOI
10.1109/ICMLC.2010.5580664
Filename
5580664