Title :
Analytical models combining methodology with classification model example
Author :
Gorawski, Marcin ; Pluciennik, E.
Author_Institution :
Inst. of Comput. Sci., Silesian Univ. of Technol., Gliwice
Abstract :
Distributed computing is nowadays almost ubiquitous. So is data mining, a time- and hardware-resource-consuming process of building analytical models of data. The authors propose a methodology for combining local analytical models (built in parallel in the nodes of a distributed computer system) into a global one without the need to construct a distributed version of the data mining algorithm. The basic assumptions of the proposed solution are (i) complete horizontal data fragmentation and (ii) a model form understandable to humans. All steps of the combining methodology are presented using the example of a classification model in the form of a rule set. The authors define and consider the problems that arise when combining the rules of local classification models into one final set of global model rules, including conflicting rules, sub-rules, partial sub-rules and unclassified objects. Algorithms for different combining strategies are also presented, together with their test results. Tests were conducted with data sets from the UCI Machine Learning Repository.
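The combining step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the rule format or the combining strategies, so everything here is an assumption made for illustration. Rules are assumed to be conjunctions of attribute = value tests, a "conflict" is assumed to mean identical conditions with different class labels (resolved by the larger summed support), and a "sub-rule" is assumed to mean a rule whose conditions are a strict superset of a more general rule with the same label. The names Rule, combine and classify are hypothetical, and partial sub-rules are not handled.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Tuple

Conditions = FrozenSet[Tuple[str, str]]  # assumed form: set of (attribute, value) tests


class Rule(NamedTuple):
    conditions: Conditions
    label: str
    support: int  # assumed: number of local training objects the rule covers


def combine(local_models: List[List[Rule]]) -> List[Rule]:
    """Merge rule sets built independently on horizontal data fragments."""
    # 1. Pool identical rules from all nodes, summing their support.
    pooled: Dict[Tuple[Conditions, str], int] = {}
    for model in local_models:
        for rule in model:
            key = (rule.conditions, rule.label)
            pooled[key] = pooled.get(key, 0) + rule.support
    rules = [Rule(c, lbl, s) for (c, lbl), s in pooled.items()]

    # 2. Conflicting rules (same conditions, different labels):
    #    keep the one with the highest support (one possible strategy).
    best: Dict[Conditions, Rule] = {}
    for rule in rules:
        current = best.get(rule.conditions)
        if current is None or rule.support > current.support:
            best[rule.conditions] = rule
    rules = list(best.values())

    # 3. Sub-rules: drop a rule if a strictly more general rule
    #    with the same label is already present.
    kept = [
        r for r in rules
        if not any(o.label == r.label and o.conditions < r.conditions for o in rules)
    ]
    return sorted(kept, key=lambda r: -r.support)


def classify(rules: List[Rule], obj: Dict[str, str]) -> Optional[str]:
    """Return the label of the first matching rule, or None if unclassified."""
    for rule in rules:
        if all(obj.get(attr) == val for attr, val in rule.conditions):
            return rule.label
    return None


# Two hypothetical local models, one per node.
node_a = [
    Rule(frozenset({("outlook", "sunny")}), "no", 5),
    Rule(frozenset({("outlook", "rain"), ("wind", "weak")}), "yes", 3),
]
node_b = [
    Rule(frozenset({("outlook", "sunny")}), "yes", 2),  # conflicts with node_a
    Rule(frozenset({("outlook", "rain")}), "yes", 4),   # generalizes node_a's rain rule
]
global_rules = combine([node_a, node_b])
```

With this toy input the global model keeps two rules: the conflict on outlook = sunny is resolved in favor of "no", and the more specific rain rule is absorbed by the general one. An object matching no rule, such as one with outlook = overcast, stays unclassified and would need a separate fallback strategy.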
Keywords :
data mining; distributed algorithms; pattern classification; UCI Machine Learning Repository; analytical models combining methodology; classification model; distributed computing; horizontal data fragmentation; Analytical models; Buildings; Concurrent computing; Hardware; Humans; Machine learning; Machine learning algorithms; Testing
Conference_Title :
Information Technology, 2008. IT 2008. 1st International Conference on
Conference_Location :
Gdansk
Print_ISBN :
978-1-4244-2244-9
Electronic_ISBN :
978-1-4244-2245-6
DOI :
10.1109/INFTECH.2008.4621623