DocumentCode
1908921
Title
Use of PLP Cepstral Features for Phonetic Segmentation
Author
Vachhani, Bhavik B. ; Patil, Hemant A.
Author_Institution
Dhirubhai Ambani Inst. of Inf. & Commun. Technol. (DA-IICT), Gandhinagar, India
fYear
2013
fDate
17-19 Aug. 2013
Firstpage
143
Lastpage
146
Abstract
Phonetic segmentation has potential applications in Text-to-Speech (TTS) synthesis and Automatic Speech Recognition (ASR) systems. In this paper, we propose the use of Perceptual Linear Prediction Cepstral Coefficients (PLPCC) features for the phonetic segmentation task. To detect phonetic boundaries, we use the spectral transition measure (STM). Using the proposed approach, we achieve 85 % accuracy (i.e., 3 % better than state-of-the-art Mel-frequency Cepstral Coefficients (MFCC) for a 20 ms agreement duration) and a 15 % over-segmentation rate (i.e., 8 % less than MFCC) for automatic boundary detection of 234,925 phone boundaries corresponding to the 630 speakers of the entire TIMIT database.
Keywords
natural language processing; speech synthesis; ASR; MFCC; PLP cepstral features; PLPCC; STM; TIMIT database; TTS; automatic boundary detection; automatic speech recognition systems; mel-frequency cepstral coefficients; perceptual linear prediction cepstral coefficients feature; phone boundaries; phonetic boundaries; phonetic segmentation task; spectral transition measure; text-to-speech synthesis; Accuracy; Databases; Feature extraction; Mel frequency cepstral coefficient; Speech; Training; Phonetic segmentation; mel cepstrum; perceptual linear prediction cepstrum; spectral transition measure; unsupervised approach;
fLanguage
English
Publisher
IEEE
Conference_Titel
Asian Language Processing (IALP), 2013 International Conference on
Conference_Location
Urumqi
Type
conf
DOI
10.1109/IALP.2013.47
Filename
6646023