Title :
How to put it into words - using random forests to extract symbol level descriptions from audio content for concept detection
Author :
Huang, Po-Sen ; Mertens, Robert ; Divakaran, Ajay ; Friedland, Gerald ; Hasegawa-Johnson, Mark
Author_Institution :
ECE Dept., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Abstract :
This paper presents a system that describes audio tracks using symbolic representations of audio concepts as words. This enables it to go beyond the state of the art, which is audio event classification of a small number of audio classes in constrained settings, to large-scale classification in the wild. These audio words might be less meaningful for an annotator, but they are descriptive for computer algorithms. We devise a random-forest vocabulary learning method with an audio word weighting scheme based on TF-IDF and TD-IDD, so as to combine the computational simplicity and accurate multi-class classification of the random forest with the data-driven discriminative power of the TF-IDF/TD-IDD methods. The proposed random forest clustering with text-retrieval methods significantly outperforms two state-of-the-art methods on the dry-run set and the full set of the TRECVID MED 2010 dataset.
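As a rough illustration of the audio word weighting idea mentioned in the abstract, the sketch below applies the textbook TF-IDF formula to tracks that have already been converted into sequences of audio-word IDs. It is not the authors' implementation: the function name `tfidf_vectors`, the IDs such as `leaf_3` (imagined here as random-forest leaf indices), and the use of an unsmoothed natural-log IDF are all assumptions made for illustration. The TD-IDD weighting is not shown because the abstract does not define it.

```python
import math
from collections import Counter


def tfidf_vectors(tracks):
    """Weight audio words per track with textbook TF-IDF.

    tracks: list of lists of audio-word IDs, one list per audio track.
    Returns one dict per track mapping word ID -> TF-IDF weight.
    (Hypothetical helper; the paper's exact weighting may differ.)
    """
    n_docs = len(tracks)

    # Document frequency: in how many tracks each word occurs at least once.
    df = Counter()
    for words in tracks:
        df.update(set(words))

    vectors = []
    for words in tracks:
        counts = Counter(words)
        total = len(words)
        vec = {}
        for word, count in counts.items():
            tf = count / total                 # relative frequency in this track
            idf = math.log(n_docs / df[word])  # rarer across tracks -> larger
            vec[word] = tf * idf
        vectors.append(vec)
    return vectors


# Assumed toy input: each track is a sequence of audio-word IDs.
tracks = [
    ["leaf_3", "leaf_3", "leaf_7"],
    ["leaf_7", "leaf_9"],
]
weights = tfidf_vectors(tracks)
```

In this toy input, `leaf_7` occurs in every track, so its weight is zero, while words confined to a single track keep a positive weight. The resulting sparse vectors could then be passed to any classifier.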
Keywords :
audio signal processing; pattern clustering; text detection; vocabulary; TRECVID MED 2010 dataset; accurate multiclass classification; annotator; audio concepts; audio content; audio event classification; audio track descriptions; audio word weighting; computational simplicity; concept detection; extract symbol level descriptions; large-scale classification; random forest clustering; random-forest vocabulary learning; symbolic representations; text-retrieval; Multimedia communication; Radio frequency; Streaming media; Support vector machines; Training; Vegetation; Vocabulary; Audio Classification; Multimedia Event Detection; Random Forests; Term Frequency-Inverse Document Frequency
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6287927