Title :
Automatic Classification of Bird Species From Their Sounds Using Two-Dimensional Cepstral Coefficients
Author :
Lee, Chang-Hsing ; Han, Chin-Chuan ; Chuang, Ching-Chien
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Chung Hua Univ., Hsinchu
Abstract :
This paper presents a method for the automatic classification of bird species from audio recordings of their sounds. Each syllable segmented from the continuous recordings is treated as the basic recognition unit. To capture the temporal variations as well as the sharp transitions within a syllable, a feature set derived from static and dynamic two-dimensional Mel-frequency cepstral coefficients is calculated for the classification of each syllable. Since a bird may produce several types of sounds with varying characteristics, a number of representative prototype vectors are used to model the different syllables of a single species. Because the amount of training data differs from species to species, a model selection method is developed to determine, for each species, the better model between a Gaussian mixture model (GMM) and vector quantization (VQ). In addition, a component number selection algorithm finds the most appropriate number of GMM components or VQ clusters for each species. The GMM mean vectors or the VQ cluster centroids then form the prototype vectors of a given species. In the experiments, the best classification accuracy is 84.06% for the classification of 28 bird species.
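For concreteness, the sketch below illustrates one plausible reading of the pipeline in Python. The DCT-along-time construction of the two-dimensional cepstral block, the BIC criterion for choosing the GMM component count, and the data-size rule for falling back from GMM to VQ (vq_threshold) are illustrative assumptions; the abstract does not disclose the authors' exact formulas or selection rules, and the dynamic (delta) counterpart of the features is omitted here.

    import numpy as np
    import librosa
    from scipy.fft import dct
    from sklearn.cluster import KMeans
    from sklearn.mixture import GaussianMixture

    def static_tdmfcc(syllable, sr, n_mfcc=15, n_temporal=5):
        # Per-frame MFCCs give an (n_mfcc, n_frames) matrix for the syllable.
        mfcc = librosa.feature.mfcc(y=syllable, sr=sr, n_mfcc=n_mfcc)
        # A second DCT along the time axis compresses each coefficient's
        # temporal trajectory; truncating to a low-order block yields a
        # fixed-length feature vector regardless of syllable duration.
        temporal = dct(mfcc, type=2, norm='ortho', axis=1)
        return temporal[:, :n_temporal].ravel()

    def species_prototypes(feats, max_k=8, vq_threshold=50):
        # feats: (n_syllables, n_features) training matrix for one species.
        n = len(feats)
        if n < vq_threshold:
            # Too little data for reliable GMM covariance estimates:
            # fall back to VQ and use the cluster centroids as the
            # prototype vectors (assumed rule, not the authors' criterion).
            k = min(max_k, n)
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(feats)
            return km.cluster_centers_
        # Otherwise pick the GMM component count by BIC (a stand-in
        # criterion) and use the component means as the prototypes.
        best_bic, best_means = np.inf, None
        for k in range(1, max_k + 1):
            gmm = GaussianMixture(n_components=k, covariance_type='diag',
                                  random_state=0).fit(feats)
            bic = gmm.bic(feats)
            if bic < best_bic:
                best_bic, best_means = bic, gmm.means_
        return best_means

Under this reading, a test syllable would be assigned to the species whose closest prototype vector has the smallest distance to the syllable's feature vector.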
Keywords :
Gaussian processes; audio signal processing; biology computing; cepstral analysis; vector quantization; 2D cepstral coefficients; Gaussian mixture models (GMMs); birdsong classification; animals; audio recording; birds; prototypes; spectrogram; two-dimensional Mel-frequency cepstral coefficients
Journal_Title :
IEEE Transactions on Audio, Speech, and Language Processing
DOI :
10.1109/TASL.2008.2005345