DocumentCode
2384600
Title
A pipelined data-parallel algorithm for ILP
Author
Fonseca, Nuno A. ; Silva, Fernando ; Costa, Vitor Santos ; Camacho, Rui
Author_Institution
DCC-FC, Univ. do Porto
fYear
2005
fDate
Sept. 2005
Firstpage
1
Lastpage
10
Abstract
The amount of data collected and stored in databases is growing considerably in almost all areas of human activity. Processing this amount of data is very expensive, both in human effort and computationally. This justifies the increased interest both in the automatic discovery of useful knowledge from databases and in using parallel processing for this task. Multi-relational data mining (MRDM) techniques, such as inductive logic programming (ILP), can learn rules from relational databases consisting of multiple tables. However, ILP systems are designed to run in main memory and can have long running times. We propose a pipelined data-parallel algorithm for ILP. The algorithm was implemented and evaluated on a commodity PC cluster with 8 processors. The results show that our algorithm yields excellent speedups while preserving the quality of learning.
Keywords
data mining; inductive logic programming; parallel algorithms; pipeline processing; relational databases; knowledge discovery; multirelational data mining; parallel processing; pipelined data-parallel algorithm; Clustering algorithms; Data mining; Humans; Logic programming; Parallel processing; Parallel programming; Relational databases; Scalability; Sequential analysis; System testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing, 2005. IEEE International
Conference_Location
Burlington, MA
ISSN
1552-5244
Print_ISBN
0-7803-9486-0
Electronic_ISBN
1552-5244
Type
conf
DOI
10.1109/CLUSTR.2005.347059
Filename
4154102