Title :
Representative selection for big data via sparse graph and geodesic Grassmann manifold distance
Author :
Dang, Chinh ; Al-Qizwini, Mohammed ; Radha, Hayder
Author_Institution :
Dept. of Electr. & Comput. Eng., Michigan State Univ., East Lansing, MI, USA
Abstract :
This paper addresses the problem of identifying a very small subset of data points from a significantly larger dataset (i.e., Big Data). The small number of selected data points must adequately represent and faithfully characterize the full dataset. This identification process is known as representative selection [19]. We propose a novel representative selection framework that generates an ℓ1-norm sparse graph for a given Big Data dataset. The dataset is partitioned recursively into clusters by applying a spectral clustering algorithm to the generated sparse graph. We consider each cluster as one point on a Grassmann manifold and measure the geodesic distances among these points. The distances are then analyzed using a min-max algorithm [1] to extract an optimal subset of clusters. Finally, by considering a sparse subgraph of each selected cluster, we detect a representative using principal component centrality [11]. We refer to the proposed representative selection framework as the Sparse Graph and Grassmann Manifold (SGGM) based approach. To validate the proposed SGGM framework, we apply it to the problem of video summarization, where only a few video frames, known as key frames, are selected from a much longer video sequence. A comparison of the results obtained by the proposed algorithm with the ground truth, which was agreed upon by multiple human judges, and with some state-of-the-art methods clearly indicates the viability of the SGGM framework.
Keywords :
Big Data; differential geometry; directed graphs; minimax techniques; video signal processing; SGGM based approach; geodesic Grassmann manifold distance; min-max algorithm; principal component centrality; representative selection framework; sparse graph; sparse graph and Grassmann manifold based approach; spectral clustering algorithm; video frames; video summarization; Clustering algorithms; Image processing; Manifolds; Partitioning algorithms; Sparse matrices; Video sequences
Conference_Title :
Signals, Systems and Computers, 2014 48th Asilomar Conference on
Print_ISBN :
978-1-4799-8295-0
DOI :
10.1109/ACSSC.2014.7094591
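
Illustrative note on the abstract: the step that treats each cluster as a point on a Grassmann manifold and measures geodesic distance can be sketched with the standard principal-angle formula, where the distance is the Euclidean norm of the vector of principal angles between two subspaces. The abstract does not say how a cluster is turned into a subspace, so this sketch assumes each cluster is already given as a full-column-rank matrix whose columns span its subspace. The function name `grassmann_geodesic_distance` and the use of NumPy are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def grassmann_geodesic_distance(A, B):
    """Geodesic (arc-length) distance between span(A) and span(B).

    A and B are (n, p) arrays with full column rank; their columns span
    two p-dimensional subspaces of R^n. (Hypothetical helper, not taken
    from the paper.)
    """
    # Orthonormal bases for the two column spaces (reduced QR).
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing values outside [0, 1].
    thetas = np.arccos(np.clip(cosines, 0.0, 1.0))
    # Geodesic distance is the 2-norm of the principal-angle vector.
    return float(np.sqrt(np.sum(thetas ** 2)))
```

Under these assumptions, a pairwise matrix of such distances between clusters is the kind of input that a subsequent selection step, such as the min-max algorithm [1] mentioned in the abstract, would operate on.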