Title :
Distributed boosting algorithm for classification of text documents
Author :
Sarnovsky, Martin ; Vronc, Michal
Author_Institution :
Dept. of Cybern. & Artificial Intell., Tech. Univ. in Kosice, Kosice, Slovakia
Abstract :
This paper focuses on the analysis and classification of text documents. We classify documents using a boosting method applied to the decision tree algorithm. The main objective of the paper is to present an implementation of a distributed boosting algorithm based on the MapReduce paradigm. We used the GridGain framework as the platform for distributed data processing and tested the implemented solution on two different datasets within our testing environment.
Keywords :
data mining; decision trees; learning (artificial intelligence); pattern classification; text analysis; GridGain framework; Map Reduce; boosting method; decision tree algorithm; distributed boosting algorithm; distributed data processing; text document classification; textual document analysis; Algorithm design and analysis; Boosting; Classification algorithms; Computational modeling; Informatics; Text mining; Training;
Conference_Title :
Applied Machine Intelligence and Informatics (SAMI), 2014 IEEE 12th International Symposium on
Conference_Location :
Herľany
Print_ISBN :
978-1-4799-3441-6
DOI :
10.1109/SAMI.2014.6822410