Title :
Confidence on approximate query in large datasets
Author :
Ford, Charles Wesley ; Chiang, Chia-Chu ; Wu, Hao ; Chilka, Radhika R. ; Talburt, John
Author_Institution :
Dept. of Comput. Sci., Arkansas Univ., Little Rock, AR, USA
Abstract :
The evolution of the World Wide Web has made enormous amounts of information available for business and research use. Designing and implementing an automated system for Web data mining has become important for companies that want to use information from the Web. We attempt to characterize confidence in approximate queries over large datasets, in the context of an automated system for Web data mining. The system has been designed to identify, extract, filter, and analyze data from Web resources. An approach to evaluating the quality of the extracted Web data is also discussed. This is an exploratory study of Web data retrieval and Web data analysis.
Keywords :
Internet; Web sites; data analysis; data mining; information filters; query processing; very large databases; Web data analysis; Web data extraction; Web data filtering; Web data identification; Web data mining; World Wide Web; approximate query confidence; automated system design; large datasets; Application software; Databases; Information filtering; Information retrieval; Search engines; Statistics
Conference_Title :
Information Technology: Coding and Computing, 2004. Proceedings. ITCC 2004. International Conference on
Print_ISBN :
0-7695-2108-8
DOI :
10.1109/ITCC.2004.1286700