Regional Information Center for Science and Technology (RICeST) - Bootstrapping Text-to-Speech for speech processing in languages without an orthography

DocumentCode :

1693764

Title :

Bootstrapping Text-to-Speech for speech processing in languages without an orthography

Author :

Sitaram, Sunayana ; Palkar, Shrikant ; Chen, Yun-Nung ; Parlikar, Alok ; Black, Alan W.

Author_Institution :

Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA

fYear :

2013

Firstpage :

7992

Lastpage :

7996

Abstract :

Speech synthesis technology has reached the stage where, given a well-designed corpus of audio and accurate transcription, an at least understandable synthesizer can be built without necessarily resorting to new innovations. However, many languages do not have a well-defined writing system, yet such languages could still greatly benefit from speech systems. In this paper, we consider the case where we have a (potentially large) single-speaker database but no transcriptions and no standardized way to write transcriptions. To address this scenario, we propose a method that allows us to bootstrap synthetic voices purely from speech data. We use a novel combination of automatic speech recognition and automatic word segmentation for the bootstrapping. Our experimental results on speech corpora in two languages, English and German, show that synthetic voices built using this method are close to understandable. Our method is language-independent and can thus be used to build synthetic voices from a speech corpus in any new language.

Keywords :

bootstrapping; natural language processing; speech processing; speech recognition; speech synthesis; English language; German language; automatic speech recognition; automatic word segmentation; single speaker database; speech data; speech processing; speech synthesis technology; synthetic voices; text-to-speech bootstrapping; well-defined writing system; Data models; Decoding; Speech; Speech processing; Speech recognition; Synthesizers; Languages without an Orthography; Speech Synthesis; Synthesis without Text;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location :

Vancouver, BC

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2013.6639221

Filename :

6639221

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1693764