DocumentCode :
2212307
Title :
Scalable environmental sounds analysis
Author :
Biatov, Konstantin
Author_Institution :
Fraunhofer IAIS, St. Augustin, Germany
fYear :
2009
fDate :
28-30 Sept. 2009
Firstpage :
1
Lastpage :
6
Abstract :
This paper describes a method for analyzing environmental audio events. The audio events are modeled using a common universal codebook based on the bag-of-frames (BOF) approach. Frame-level features extracted from all audio files are grouped into clusters using the k-means algorithm. Each audio file is then modeled by the normalized distribution of its frames over the cluster bins, so that each file is described by a single vector. The audio data are represented as a feature-file matrix, similar to the term-document representation in latent semantic indexing (LSI). LSI is applied to the feature-file matrix to represent the data in a latent semantic space. The primary file description is then converted to a vector of similarities to anchor reference data, for which the training data are used. Each component of this vector is a probabilistic similarity between the target file and the anchor reference file corresponding to that component. LSI is applied once more to the new feature-file matrix, mapping the data to a latent semantic space within the anchor reference space. The nearest-neighbor (NN) algorithm is used for audio recognition and audio retrieval. The described data representation improves the results of audio retrieval and recognition.
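The first and last stages named in the abstract (a shared k-means codebook, one normalized bag-of-frames histogram per file, and nearest-neighbor matching) can be illustrated with a minimal sketch. It is not the paper's implementation. All function names are hypothetical, the 2-D "frames" are synthetic stand-ins for real audio features, and cosine similarity is an assumed metric, since the abstract does not state which one is used. The LSI projections and the anchor-reference similarity step are deliberately left out.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(frame, centroids):
    """Index of the codebook entry closest to a frame."""
    return min(range(len(centroids)), key=lambda j: sq_dist(frame, centroids[j]))


def kmeans(frames, k, iters=20, seed=0):
    """Plain Lloyd's k-means; the k centroids act as the shared codebook."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(frames, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for f in frames:
            buckets[nearest(f, centroids)].append(f)
        for j, members in enumerate(buckets):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bof_histogram(frames, centroids):
    """Normalized count of a file's frames per cluster bin (one vector per file)."""
    counts = [0] * len(centroids)
    for f in frames:
        counts[nearest(f, centroids)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def cosine(u, v):
    """Cosine similarity (an assumed metric, not taken from the abstract)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nn_label(query, train_vecs, train_labels):
    """Label of the training file whose histogram is most similar to the query."""
    best = max(range(len(train_vecs)), key=lambda i: cosine(query, train_vecs[i]))
    return train_labels[best]


def toy_file(center, n, rng):
    """Synthetic 'audio file': n two-dimensional frames scattered around a center."""
    return [[center[0] + rng.gauss(0, 0.3), center[1] + rng.gauss(0, 0.3)]
            for _ in range(n)]


rng = random.Random(1)
train_files = [toy_file((0, 0), 30, rng), toy_file((5, 5), 30, rng)]
train_labels = ["a", "b"]

# The codebook is built from the frames of all training files pooled together.
codebook = kmeans([f for frames in train_files for f in frames], k=4)
train_vecs = [bof_histogram(frames, codebook) for frames in train_files]

query_vec = bof_histogram(toy_file((5, 5), 20, rng), codebook)
predicted = nn_label(query_vec, train_vecs, train_labels)
```

In a fuller pipeline, the histograms would be stacked as the columns of the feature-file matrix and reduced with a truncated singular value decomposition before the nearest-neighbor step.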
Keywords :
audio coding; matrix algebra; probability; audio recognition; audio retrieval; bag-of-frames; codebook; environmental audio event analysis; feature-file matrix; k-means algorithm; latent semantic indexing; nearest-neighbor algorithm; scalable environmental sound analysis; Acoustic testing; Birds; Hidden Markov models; Indexing; Information retrieval; Large scale integration; Matrix converters; Neural networks; Production facilities; Spatial databases; Latent Semantic Indexing; anchor reference space; audio events recognition; audio events retrieval; common codebook; environmental sounds;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing and Communication Systems, 2009. ICSPCS 2009. 3rd International Conference on
Conference_Location :
Omaha, NE
Print_ISBN :
978-1-4244-4473-1
Electronic_ISBN :
978-1-4244-4474-8
Type :
conf
DOI :
10.1109/ICSPCS.2009.5306423
Filename :
5306423