DocumentCode
1679532
Title
A new geometric approach to latent topic modeling and discovery
Author
Ding, Weicong ; Rohban, Mohammad Hossein ; Ishwar, Prakash ; Saligrama, Venkatesh
Author_Institution
Dept. of Electr. & Comput. Eng., Boston Univ., Boston, MA, USA
fYear
2013
Firstpage
5568
Lastpage
5572
Abstract
A new geometrically-motivated algorithm for topic modeling is developed and applied to the discovery of latent “topics” in text and image “document” corpora. The algorithm is based on robustly finding and clustering extreme-points of empirical cross-document word-frequencies that correspond to novel words unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets.
Keywords
approximation theory; data mining; document image processing; optimisation; pattern clustering; text analysis; empirical cross-document word-frequency; geometrically-motivated algorithm; image document corpora; latent topic discovery; latent topic modeling; locally-optimal method; nonconvex optimization problem; polynomial complexity; suboptimal approximation; Abstracts; Games; Integrated circuits; Logic gates; Nominations and elections; Support vector machines; Topic modeling; extreme points; nonnegative matrix factorization (NMF); subspace clustering
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location
Vancouver, BC
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2013.6638729
Filename
6638729