Title :
Text classification based on a novel ensemble multi-label learning method
Author :
Tao Zhang ; Jiansheng Wu ; Haifeng Hu
Author_Institution :
Sch. of Telecommun. & Inf. Eng., Nanjing Univ. of Posts & Telecommun., Nanjing, China
Abstract :
Text classification is one of the most important tasks in natural language processing research. In most real-world cases, text classification is a multi-label learning task. Currently, three mainstream attribute measures (information gain, document frequency and chi-square test values) are commonly used to describe documents. All three have been applied successfully to text classification tasks, but each focuses on different information. It is therefore valuable to design ensemble methods that combine these measures to improve the prediction performance of text classification. In this paper, we propose En-MLKNN, a novel ensemble multi-label learning method built on the state-of-the-art multi-label learning method MLKNN. In addition, to make better use of our approach, we construct a complete framework for text classification. Experiments on two classic datasets show that our En-MLKNN algorithm outperforms most state-of-the-art multi-label learning algorithms.
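As an illustration only, the three attribute measures named in the abstract can be computed for one term and one binary label as in the Python sketch below. It assumes the standard textbook definitions (document frequency as a document count, chi-square from the 2x2 term/label contingency table, information gain as the reduction in label entropy given term presence). The function names, toy documents and labels are hypothetical and not taken from the paper, and the sketch does not implement MLKNN or En-MLKNN.

```python
import math

def contingency(docs, labels, term):
    """Return the 2x2 counts (a, b, c, d) for one term and one binary label.

    a: term present, label on    b: term present, label off
    c: term absent,  label on    d: term absent,  label off
    """
    a = b = c = d = 0
    for tokens, y in zip(docs, labels):
        has = term in tokens
        if has and y:
            a += 1
        elif has and not y:
            b += 1
        elif not has and y:
            c += 1
        else:
            d += 1
    return a, b, c, d

def document_frequency(a, b, c, d):
    """Number of documents that contain the term."""
    return a + b

def chi_square(a, b, c, d):
    """Chi-square statistic of the 2x2 table (0.0 if a margin is empty)."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def entropy(counts):
    """Shannon entropy in bits of a list of counts."""
    total = sum(counts)
    if not total:
        return 0.0
    return -sum(k / total * math.log2(k / total) for k in counts if k)

def information_gain(a, b, c, d):
    """Label entropy minus label entropy conditioned on term presence."""
    n = a + b + c + d
    h_label = entropy([a + c, b + d])
    h_cond = (a + b) / n * entropy([a, b]) + (c + d) / n * entropy([c, d])
    return h_label - h_cond

# Hypothetical toy corpus: four tokenized documents and one label ("sports").
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "match"}]
labels = [1, 1, 0, 0]

for term in ("goal", "match"):
    table = contingency(docs, labels, term)
    print(term, document_frequency(*table), chi_square(*table), information_gain(*table))
```

On this toy data both terms appear in the same number of documents, yet only "goal" scores above zero on chi-square and information gain, which is one simple way the measures can emphasize different information.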
Keywords :
classification; learning (artificial intelligence); natural language processing; statistical analysis; text analysis; En-MLKNN; chi-square test value; document frequency; ensemble multilabel learning method; information gain; text classification; Classification algorithms; Current measurement; Educational institutions; Prediction algorithms; Text categorization; Training; Vectors; multi-label learning
Conference_Title :
Systems and Informatics (ICSAI), 2014 2nd International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4799-5457-5
DOI :
10.1109/ICSAI.2014.7009425