Title :
A set of classical factors for rule mining in pruning methods
Author :
Ravichandran, M. ; Sengottuvelan, P. ; Shanmugam, A.
Author_Institution :
Dept. of Inf. Technol., Bannari Amman Inst. of Technol., Sathyamangalam
Abstract :
Classification is an important task in both the data mining and machine learning communities. In this paper we present a method for building a categorization system that merges the association rule mining task with the classification problem. A classifier is built by applying a learning method to a training set of objects; here the learning method is association rule mining, which is used to build the classification models. A good text classifier efficiently categorizes large sets of text documents within a specified time and with acceptable accuracy. Most approaches to automatic text categorization come from machine learning research; association rule mining, from the data mining field, proves to be efficient and effective for constructing such classifiers. We focus on two major problems: (1) finding the best term association rules in a textual database by generating and pruning rules; and (2) using the rules to build a text classifier. In addition, training as well as classification is fast. The main idea behind this approach is to discover strong patterns that are associated with the class labels. The problem of discovering all association rules from a set of transactions consists of generating the rules whose support and confidence are greater than given thresholds. These rules are called strong rules. The pruning methods eliminate the more specific rules, keeping only those that are more general and have high confidence, and prune unnecessary rules by database coverage. We present the association rule-based classifier with all categories (ARC-AC) algorithm for building the classifier, and the association rule-based classifier by category (ARC-BC), which considers categories one at a time. The algorithm assumes a transaction-based model for the training documents. The introduction of the dominance factor allows multi-class categorization.
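The support and confidence thresholds and the two pruning steps named in the abstract can be illustrated with a minimal Python sketch. It is not the ARC-AC or ARC-BC algorithm: it enumerates term sets by brute force, treats the thresholds as inclusive minimums, and uses hypothetical function names (`mine_rules`, `prune_specific`, `prune_by_coverage`) and a made-up four-document training set, all of which are assumptions for illustration only.

```python
from itertools import combinations

def mine_rules(docs, min_support, min_confidence, max_len=2):
    """Brute-force class association rules (term set -> label).

    docs is a list of (set_of_terms, label) transactions. A rule is kept
    when support and confidence reach the thresholds (inclusive here,
    which is an assumption of this sketch).
    """
    n = len(docs)
    vocab = sorted({t for terms, _ in docs for t in terms})
    labels = sorted({label for _, label in docs})
    rules = []
    for k in range(1, max_len + 1):
        for combo in combinations(vocab, k):
            antecedent = frozenset(combo)
            covered = [label for terms, label in docs if antecedent <= terms]
            if not covered:
                continue
            for label in labels:
                both = covered.count(label)
                support = both / n                # share of all documents
                confidence = both / len(covered)  # share of matching documents
                if support >= min_support and confidence >= min_confidence:
                    rules.append((antecedent, label, support, confidence))
    return rules

def prune_specific(rules):
    """Drop a rule when a more general rule (proper subset antecedent,
    same label) has confidence at least as high."""
    kept = []
    for ant, label, sup, conf in rules:
        dominated = any(
            o_ant < ant and o_label == label and o_conf >= conf
            for o_ant, o_label, _, o_conf in rules
        )
        if not dominated:
            kept.append((ant, label, sup, conf))
    return kept

def prune_by_coverage(rules, docs):
    """Keep a rule only if it correctly covers a not-yet-covered document."""
    ranked = sorted(
        rules, key=lambda r: (-r[3], -r[2], len(r[0]), sorted(r[0]), r[1])
    )
    uncovered = set(range(len(docs)))
    kept = []
    for rule in ranked:
        ant, label = rule[0], rule[1]
        hits = {i for i in uncovered
                if ant <= docs[i][0] and docs[i][1] == label}
        if hits:
            kept.append(rule)
            uncovered -= hits
    return kept

# Hypothetical toy training set: each document is a transaction of terms.
docs = [
    ({"ball", "goal", "team"}, "sport"),
    ({"ball", "goal"}, "sport"),
    ({"vote", "team"}, "politics"),
    ({"vote", "law"}, "politics"),
]

strong = mine_rules(docs, min_support=0.5, min_confidence=0.8)
general = prune_specific(strong)
final = prune_by_coverage(general, docs)
for ant, label, sup, conf in final:
    print(sorted(ant), "->", label, f"support={sup:.2f} confidence={conf:.2f}")
```

On this toy data the rule {ball, goal} -> sport is dropped as more specific than {ball} -> sport, and {goal} -> sport is dropped by database coverage because the documents it matches are already covered.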
Keywords :
data mining; pattern classification; text analysis; association rule based classifier by category; association rule mining task; automatic categorization method; machine learning; multiclass categorization; pruning methods; text classifier; textual database; transaction-based model; Association rules; Classification algorithms; Electronic mail; Information technology; Learning systems; Production facilities; Text categorization; Transaction databases; Data mining rule mining classifier; instance-centric; prune
Conference_Title :
Computing, Communication and Networking, 2008. ICCCn 2008. International Conference on
Conference_Location :
St. Thomas, VI
Print_ISBN :
978-1-4244-3594-4
Electronic_ISBN :
978-1-4244-3595-1
DOI :
10.1109/ICCCNET.2008.4787670