Title :
Classification modeling on distributed environment
Author :
Ting Ah Choo ; Bakar, Afarulrazi Abu ; Talebi, Amin Benjavad ; Sundararajan, Elankovan ; Rahmany, Mahathir
Author_Institution :
Comput. Sci. Programme, Nat. Univ. of Malaysia, Bangi, Malaysia
Abstract :
High Performance Computing (HPC) is usually used in science and technology to solve problems that cannot be solved on a single machine because of constraints on computing resources such as memory and the number of processors. HPC can improve processing speed. However, using a high-powered supercomputer for this type of problem involves huge cost. In some circumstances, High-Throughput Computing (HTC) on distributed environments performs parallel processing at speeds comparable to those of a supercomputer. In this work, we improve the time and speed of the mining process for developing a classification model for a large data file on distributed environments via a web-based portal that provides various classification methods. The web-based application was built using the PHP language and adapts a combination of classification techniques from the data mining software WEKA version 3.6.0, with a percentage split of training and testing data. HTCondor middleware is used to control and run all jobs on the distributed environment. The results show significant improvement in processing time.
Keywords :
data mining; microprocessor chips; middleware; parallel processing; HPC; HTCondor middleware; PHP language; Web based portal; classification modeling; computing resources; data mining software; distributed environment; high performance computing; high throughput computing; supercomputer; Classification algorithms; Computational modeling; Computers; Java; Servers; Visualization; Classification; High Performance Computing (HPC); High-Throughput Computing condor (HTCondor); PHP; WEKA
Conference_Title :
Open Systems (ICOS), 2013 IEEE Conference on
Conference_Location :
Kuching
Print_ISBN :
978-1-4799-3152-1
DOI :
10.1109/ICOS.2013.6735076