Segmentation of continuous speech by using multidimensional scaling techniques

Author

Charbonneau, Gérard R. ; Moussa, Tarek

Author_Institution

Université de Paris XI, Orsay Cedex, France

fYear

1982

fDate

May 1982

Firstpage

2012

Lastpage

2014

Abstract

Continuous speech is digitized at a sampling rate of 20480 Hz. Power spectra are computed on blocks of 1024 samples, with successive blocks shifted by 256 samples. Each spectrum is divided into 25 channels chosen to best discriminate spectral peaks and valleys. For each spectrum i and each channel j, the sum Pij of the spectral components is computed. A multidimensional scaling analysis is then performed on the matrix Pij. This yields a 7-dimensional space in which the variation of the spectra over time is represented by a moving point. The main result is that a transition between two spoken sounds induces a significant variation along at least one, and generally several, axes. This property can be used for segmenting and recognizing continuous speech at the acoustic level. The first attempts have given modest results, but great improvements are expected soon.
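
The processing chain in the abstract can be illustrated with a short NumPy sketch. It is not the authors' implementation. With the stated figures, each 1024-sample block spans 50 ms, the 256-sample shift is 12.5 ms, and the spectral resolution is 20 Hz per bin. The abstract does not give the window function, the channel boundaries, or the scaling variant, so the Hann window, the equal-width channels, the classical (Torgerson) scaling on Euclidean distances, and the function names `channel_matrix`, `classical_mds` and `transition_score` are all assumptions made for illustration.

```python
import numpy as np

FS = 20480        # sampling rate in Hz (from the abstract)
BLOCK = 1024      # samples per analysis block (from the abstract)
HOP = 256         # shift between successive blocks (from the abstract)
N_CHANNELS = 25   # number of spectral channels (from the abstract)
N_DIMS = 7        # dimension of the scaled space (from the abstract)


def channel_matrix(signal, edges=None):
    """Return P, where P[i, j] is the summed power of block i in channel j."""
    n_blocks = 1 + (len(signal) - BLOCK) // HOP
    window = np.hanning(BLOCK)  # assumption: the abstract names no window
    frames = np.stack([signal[i * HOP:i * HOP + BLOCK] * window
                       for i in range(n_blocks)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_blocks, 513)
    if edges is None:
        # assumption: equal-width channels; the paper selects its channels
        # to separate spectral peaks and valleys, with edges not given here
        edges = np.linspace(0, power.shape[1], N_CHANNELS + 1).astype(int)
    return np.stack([power[:, a:b].sum(axis=1)
                     for a, b in zip(edges[:-1], edges[1:])], axis=1)


def classical_mds(P, n_dims=N_DIMS):
    """Classical (Torgerson) scaling of the rows of P into n_dims axes."""
    diff = P[:, None, :] - P[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)            # squared Euclidean distances
    n = len(P)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ d2 @ J                    # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_dims]  # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))


def transition_score(coords):
    """Distance moved by the point between consecutive blocks."""
    return np.linalg.norm(np.diff(coords, axis=0), axis=1)


# Synthetic stand-in for speech: two steady "sounds" of 0.5 s each.
t = np.arange(FS // 2) / FS
sound_a = np.sin(2 * np.pi * 300 * t) + np.sin(2 * np.pi * 2500 * t)
sound_b = np.sin(2 * np.pi * 700 * t) + np.sin(2 * np.pi * 1200 * t)
signal = np.concatenate([sound_a, sound_b])

P = channel_matrix(signal)
coords = classical_mds(P)
score = transition_score(coords)
print("block index with the largest movement:", int(np.argmax(score)))
```

On this synthetic input the point stays nearly still within each steady sound and moves only in the few blocks that straddle the boundary, which is the behaviour the abstract proposes to exploit for segmentation. Real speech would need a threshold or peak-picking rule on the score, which the abstract does not specify.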

Keywords

Acoustic signal processing; Attenuation; Low pass filters; Microphones; Multidimensional signal processing; Multidimensional systems; Sampling methods; Spectral analysis; Speech processing; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '82.

Type

conf

DOI

10.1109/ICASSP.1982.1171834

Filename

1171834

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=388452