Title :
Natural Language Processing in Web data mining
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
Abstract :
This paper describes research on Web data mining using natural language processing. The system accepts arbitrary data from a Web document as input and then extracts information from that document. A new method for implementing Web data mining is proposed, consisting of three steps. First, the Web document is decomposed to the paragraph, sentence, and phrase levels. Second, information is extracted from every sentence. Finally, the extracted information is added to the knowledge model. Experiments on the experimental corpus indicate that the methods used are efficient for Web data mining.
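The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how decomposition, extraction, or the knowledge model are actually implemented, so everything here is an assumption made for illustration: the regex-based splitting, the naive "subject is/are/has/have object" pattern, the dictionary used as a knowledge model, and the function names `decompose`, `extract`, and `build_knowledge_model` are all hypothetical and are not the authors' method.

```python
import re
from collections import defaultdict


def decompose(document):
    """Step 1 (assumed approach): split text into paragraphs, sentences, phrases.

    Returns a list of paragraphs; each paragraph is a list of sentences;
    each sentence is a list of phrase strings.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Split after sentence-ending punctuation followed by whitespace.
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
        result.append([
            # Treat comma- or semicolon-separated chunks as phrases.
            [ph.strip() for ph in re.split(r"[,;]", s.rstrip(".!?")) if ph.strip()]
            for s in sentences
        ])
    return result


# Hypothetical extraction rule: "<subject> is|are|has|have <object>".
PATTERN = re.compile(r"^(?P<subject>.+?)\s+(?P<relation>is|are|has|have)\s+(?P<object>.+)$")


def extract(phrase):
    """Step 2 (assumed approach): return a (subject, relation, object) triple or None."""
    m = PATTERN.match(phrase)
    return (m["subject"], m["relation"], m["object"]) if m else None


def build_knowledge_model(document):
    """Step 3 (assumed approach): collect the triples in a dict keyed by subject."""
    model = defaultdict(list)
    for paragraph in decompose(document):
        for sentence in paragraph:
            for phrase in sentence:
                triple = extract(phrase)
                if triple:
                    subject, relation, obj = triple
                    model[subject].append((relation, obj))
    return dict(model)


if __name__ == "__main__":
    text = (
        "Beijing is the capital of China. It has many universities.\n\n"
        "Web pages are semi-structured documents, and they vary widely."
    )
    for subject, facts in build_knowledge_model(text).items():
        print(subject, facts)
```

A real system would presumably replace the regular expressions with proper HTML cleaning, sentence segmentation, and parsing; the sketch only shows how the three stages hand data to one another.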
Keywords :
Internet; data mining; information retrieval; natural language processing; text analysis; Web data mining; Web document; information extraction; knowledge model; natural language processing; paragraph level; phrase level; sentence level; Data mining; Data models; Feature extraction; Image color analysis; Knowledge engineering; Semantics; Web pages;
Conference_Title :
Web Society (SWS), 2010 IEEE 2nd Symposium on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6356-5
DOI :
10.1109/SWS.2010.5607419