Regional Information Center for Science and Technology (RICEST) - Detection of nonlinguistic vocalizations using ALISP sequencing

DocumentCode :

1691298

Title :

Detection of nonlinguistic vocalizations using ALISP sequencing

Author :

Pammi, Sathish ; Khemiri, Houssemeddine ; Petrovska-Delacretaz, Dijana ; Chollet, Gerard

Author_Institution :

Inst. Mines-Telecom, Telecom ParisTech, Paris, France

Year :

2013

Firstpage :

7557

Lastpage :

7561

Abstract :

In this paper, we present a generic methodology to detect nonlinguistic vocalizations using ALISP (Automatic Language Independent Speech Processing), a data-driven audio segmentation approach. Using Maximum Likelihood Linear Regression (MLLR) and Maximum A Posteriori (MAP) techniques, the proposed method adapts ALISP models, which then facilitate detection of local regions of nonlinguistic vocalizations with the standard Viterbi decoding algorithm. We also illustrate how a simple majority voting scheme, using a sliding window on ALISP sequences, can help eliminate outliers from the Viterbi-predicted sequence automatically. We evaluate the performance of our method on detection of laughter, a nonlinguistic vocalization, in comparison with global acoustic models such as GMMs, left-to-right HMMs and ergodic HMMs. The results indicate that adapted ALISP acoustic models perform better than global acoustic models in terms of F-measure. Moreover, our majority voting scheme on ALISP sequences further improves the performance, yielding, in total, an increase of 19.6%, 8.1% and 5.6% on the F-measure against global acoustic models GMMs, left-to-right HMMs, and ergodic HMMs respectively.
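The sliding-window majority vote mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no window length, label set, or tie-breaking rule, so the function name `majority_vote_smooth`, the `window` parameter, the "laugh"/"other" labels, and the tie-breaking choice below are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote_smooth(labels, window=5):
    """Relabel each position with the most frequent label in a centered window.

    Hypothetical helper: the window size and tie-breaking rule are
    assumptions, not values taken from the paper.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i, current in enumerate(labels):
        # The window is truncated at the sequence boundaries.
        start = max(0, i - half)
        end = min(len(labels), i + half + 1)
        segment = labels[start:end]
        counts = Counter(segment)
        top = max(counts.values())
        if counts[current] == top:
            # On a tie, keep the label the decoder originally predicted.
            smoothed.append(current)
        else:
            # Otherwise take the first label in the window with the top count.
            smoothed.append(next(l for l in segment if counts[l] == top))
    return smoothed


# Toy decoded sequence with two isolated outliers (positions 2 and 8).
decoded = ["other", "other", "laugh", "other", "other",
           "laugh", "laugh", "laugh", "other", "laugh", "laugh"]
cleaned = majority_vote_smooth(decoded, window=3)
```

With a window of 3, the isolated "laugh" inside the "other" run and the isolated "other" inside the "laugh" run are both overwritten by their neighbours, which is the kind of outlier removal the abstract describes. A larger window smooths more aggressively at the cost of blurring short genuine events.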

Keywords :

Gaussian processes; Viterbi decoding; audio signal processing; hidden Markov models; maximum likelihood decoding; maximum likelihood estimation; regression analysis; speech coding; speech recognition; ALISP sequencing; GMM; Gaussian mixture models; MAP technique; MLLR technique; Viterbi decoding algorithm; automatic language independent speech processing; data driven audio segmentation approach; ergodic HMM; global acoustic model; hidden Markov models; laughter detection method; left to right HMM; majority voting scheme; maximum a posteriori technique; maximum likelihood linear regression technique; nonlinguistic vocalization detection; sliding window; Acoustics; Adaptation models; Hidden Markov models; Speech; Training; Vectors; Viterbi algorithm; ALISP sequencing; acoustic models; audio segmentation; model adaptation

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location :

Vancouver, BC

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2013.6639132

Filename :

6639132

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1691298