DocumentCode :
2694631
Title :
Classification of sound clips by two schemes: Using onomatopoeia and semantic labels
Author :
Sundaram, Shiva ; Narayanan, Shrikanth
Author_Institution :
Dept. of Electr. Eng.-Syst., Univ. of Southern California, Los Angeles, CA
fYear :
2008
fDate :
June 23-26, 2008
Firstpage :
1341
Lastpage :
1344
Abstract :
Using the recently proposed framework for latent perceptual indexing of audio clips, we present classification of whole clips categorized by two schemes: high-level semantic labels and the mid-level perceptually motivated onomatopoeia labels. First, feature-vectors extracted from the clips in the database are grouped into reference clusters using an unsupervised clustering technique. A unit-document co-occurrence matrix is then obtained by quantizing the feature-vectors extracted from the audio clips into the reference clusters. The audio clips are then mapped to a latent perceptual space by the reduced rank approximation of this matrix. The classification experiments are performed in this representation space using corresponding semantic and onomatopoeic labels of the clips. Using the proposed method, classification accuracy of about sixty percent was obtained when tested on the BBC sound effects library using over twenty categories. Having the two labeling schemes together in a single framework makes the classification system more flexible as each scheme addresses the limitation of the other. These aspects are the main motivation of the work presented here.
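The pipeline described in the abstract (reference clusters, unit-document co-occurrence matrix, reduced-rank mapping, classification in the latent space) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, the classifier, or the feature type, so k-means with farthest-point seeding, a 1-nearest-neighbour classifier, and synthetic 2-D "feature vectors" are stand-ins chosen here. All function names (`kmeans`, `unit_counts`, `fit`, `embed`, `classify`, `frames`) are hypothetical. The rank-r mapping is obtained from the top eigenvectors of A Aᵀ (the left singular vectors of the co-occurrence matrix A), computed by power iteration so that the sketch needs only the standard library.

```python
import math
import random


def sqdist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(v, points):
    """Index of the point in `points` closest to v."""
    return min(range(len(points)), key=lambda j: sqdist(v, points[j]))


def kmeans(vectors, k, iters=10):
    """Reference clusters; k-means with farthest-point seeding (an assumption)."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors, key=lambda v: min(sqdist(v, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        # Recompute each center as its group's mean; keep it if the group is empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers


def unit_counts(clip, centers):
    """One column of the unit-document matrix: cluster counts for one clip."""
    counts = [0.0] * len(centers)
    for v in clip:
        counts[nearest(v, centers)] += 1.0
    return counts


def top_eigenvectors(gram, r, iters=300, seed=0):
    """Top-r eigenvectors of a symmetric PSD matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    n = len(gram)
    g = [row[:] for row in gram]
    basis = []
    for _ in range(r):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iters):
            w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            v = [x / norm for x in w]
        lam = sum(v[i] * g[i][j] * v[j] for i in range(n) for j in range(n))
        basis.append(v)
        for i in range(n):
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]  # deflate the component just found
    return basis


def fit(clips, k, r):
    """Learn reference clusters and a rank-r basis from training clips."""
    centers = kmeans([v for clip in clips for v in clip], k)
    cols = [unit_counts(clip, centers) for clip in clips]
    # gram = A A^T, where A is the (units x clips) co-occurrence matrix.
    gram = [[sum(c[i] * c[j] for c in cols) for j in range(k)] for i in range(k)]
    return centers, top_eigenvectors(gram, r)


def embed(clip, model):
    """Project a clip's count vector onto the rank-r basis (latent coordinates)."""
    centers, basis = model
    counts = unit_counts(clip, centers)
    return [sum(u_i * c for u_i, c in zip(u, counts)) for u in basis]


def classify(clip, model, train_clips, train_labels):
    """1-nearest-neighbour label in the latent space (classifier is an assumption)."""
    points = [embed(c, model) for c in train_clips]
    return train_labels[nearest(embed(clip, model), points)]


def frames(center, n):
    """n synthetic 2-D 'feature vectors' near `center` (toy data, not real audio)."""
    return [(center[0] + 0.1 * i, center[1] - 0.1 * i) for i in range(n)]


# Toy corpus: class "a" clips are mostly T0 frames, class "b" mostly T2 frames.
T0, T1, T2 = (0.0, 0.0), (5.0, 5.0), (10.0, 0.0)
train = [frames(T0, 3) + frames(T1, 1), frames(T0, 2) + frames(T1, 1),
         frames(T2, 3) + frames(T1, 1), frames(T2, 2) + frames(T1, 1)]
labels = ["a", "a", "b", "b"]
model = fit(train, k=3, r=2)
print(classify(frames(T0, 4) + frames(T1, 1), model, train, labels))
```

The same `model` could be reused with a second label list (for example onomatopoeic labels alongside semantic ones), since the latent representation does not depend on the labeling scheme; only the `train_labels` argument changes.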
Keywords :
audio databases; audio signal processing; database indexing; matrix algebra; pattern classification; pattern clustering; audio clips; classification accuracy; database; feature-vectors; latent perceptual indexing; onomatopoeia; reduced rank matrix approximation; semantic labels; sound clip classification; unit-document cooccurrence matrix; unsupervised clustering technique; Acoustical engineering; Automatic speech recognition; Feature extraction; Indexing; Labeling; Music; Robustness; Spatial databases; Speech analysis; Streaming media; audio classification; audio representation; indexing; latent document analysis; onomatopoeia; semantic audio;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia and Expo, 2008 IEEE International Conference on
Conference_Location :
Hannover
Print_ISBN :
978-1-4244-2570-9
Electronic_ISBN :
978-1-4244-2571-6
Type :
conf
DOI :
10.1109/ICME.2008.4607691
Filename :
4607691