Title :
Exploiting parallelism in knowledge discovery systems to improve scalability
Author :
Galal, Gehad ; Cook, Diane J. ; Holder, Lawrence B.
Author_Institution :
Dept. of Comput. Sci. & Eng., Texas Univ., Arlington, TX, USA
Abstract :
The large amount of data collected today is quickly overwhelming researchers' abilities to interpret the data and discover interesting patterns. Knowledge discovery and data mining approaches hold the potential to automate the interpretation process, but these approaches frequently utilize computationally expensive algorithms. In particular, scientific discovery systems focus on the utilization of richer data representation, sometimes without regard for scalability. This research outlines a general approach for scaling KDD systems using parallel and distributed resources and applies the suggested strategies to the SUBDUE knowledge discovery system. SUBDUE has been used to discover interesting and repetitive concepts in graph-based databases from a variety of domains, but requires a substantial amount of processing time. Experiments that demonstrate the scalability of parallel versions of the SUBDUE system are performed using CAD circuit databases and artificially generated databases, and potential achievements and obstacles are discussed.
Keywords :
circuit CAD; deductive databases; distributed databases; knowledge acquisition; parallel processing; very large databases; visual databases; SUBDUE; circuit CAD database; computationally expensive algorithms; data mining; data representation; graph-based databases; knowledge discovery systems; parallelism; pattern discovery; scalability; scientific discovery systems; very large database; Circuits; Concurrent computing; Data engineering; Data mining; Databases; Degradation; Parallel processing; Polynomials; Scalability; Shape
Conference_Title :
Proceedings of the Thirty-First Hawaii International Conference on System Sciences, 1998
Conference_Location :
Kohala Coast, HI
Print_ISBN :
0-8186-8255-8
DOI :
10.1109/HICSS.1998.648320