Title :
News topics categorization using latent Dirichlet allocation and sparse representation classifier
Author :
Yuan-Shan Lee ; Rocky Lo ; Chia-Yen Chen ; Po-Chuan Lin ; Jia-Ching Wang
Abstract :
Recently, subscribing to news from websites has become a trend among Internet users. In a news-reading browser, it is essential that all news documents be properly categorized. To categorize news topics automatically, this paper presents a news categorization method that uses latent Dirichlet allocation (LDA) and a sparse representation classifier (SRC). LDA serves as the feature learning method: the multinomial distribution over topics inferred for each news document is taken as that document's feature vector. These feature vectors are stacked to form an over-complete dictionary, which permits SRC-based categorization. Experimental results show that the proposed method outperforms the traditional method.
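The SRC step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a solver or a decision rule, so the sketch assumes a coordinate-descent lasso solver with a hypothetical penalty `lam` and the common SRC rule of picking the class with the smallest reconstruction residual. The topic vectors and labels are hand-written stand-ins for LDA output, chosen only so that the dictionary has more atoms (6) than dimensions (3).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale v to unit l2 norm (left unchanged if it is all zeros)."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v] if n > 0 else list(v)

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the l1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sparse_code(atoms, y, lam=0.01, n_sweeps=200):
    """Coordinate descent for min_x 0.5*||y - sum_j x[j]*atoms[j]||^2 + lam*||x||_1."""
    x = [0.0] * len(atoms)
    residual = list(y)  # equals y - D x, and x starts at zero
    sq_norms = [dot(a, a) for a in atoms]
    for _ in range(n_sweeps):
        for j, atom in enumerate(atoms):
            if sq_norms[j] == 0.0:
                continue
            # Correlation of atom j with the residual, excluding atom j's own term.
            rho = dot(atom, residual) + sq_norms[j] * x[j]
            new_xj = soft_threshold(rho, lam) / sq_norms[j]
            delta = new_xj - x[j]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, atom)]
                x[j] = new_xj
    return x

def src_classify(topic_vectors, labels, query, lam=0.01):
    """Return the label whose atoms alone best reconstruct the query."""
    atoms = [normalize(v) for v in topic_vectors]
    y = normalize(query)
    x = sparse_code(atoms, y, lam)
    best_label, best_error = None, math.inf
    for label in sorted(set(labels)):
        # Rebuild y using only the coefficients that belong to this class.
        recon = [0.0] * len(y)
        for atom, atom_label, coef in zip(atoms, labels, x):
            if atom_label == label:
                recon = [r + coef * a for r, a in zip(recon, atom)]
        error = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, recon)))
        if error < best_error:
            best_label, best_error = label, error
    return best_label

# Hypothetical per-document topic distributions (3 topics); in practice these
# would come from an LDA model fitted on the training news documents.
TOPIC_VECTORS = [
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],   # sports
    [0.1, 0.8, 0.1], [0.2, 0.7, 0.1],   # politics
    [0.1, 0.1, 0.8], [0.1, 0.2, 0.7],   # technology
]
LABELS = ["sports", "sports", "politics", "politics", "technology", "technology"]

predicted = src_classify(TOPIC_VECTORS, LABELS, [0.75, 0.15, 0.10])
```

Here `predicted` holds the label assigned to a query whose topic mixture lies between the two sports atoms. A larger `lam` yields sparser codes at the cost of reconstruction accuracy; the value used above is arbitrary.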
Keywords :
Internet; Web sites; learning (artificial intelligence); pattern classification; LDA; SRC-based categorization; feature learning method; latent Dirichlet allocation; news categorization method; news topics multinomial distribution; over-complete dictionary; sparse representation classifier; topics categorization method; Computer science; Dictionaries; Resource management; Support vector machines; Testing; Training data; Vocabulary;
Conference_Title :
Consumer Electronics - Taiwan (ICCE-TW), 2015 IEEE International Conference on
Conference_Location :
Taipei
DOI :
10.1109/ICCE-TW.2015.7216819