ASCII based transcription systems for languages with the Arabic script: the case of Persian

Author

Ganjavi, Shadi ; Georgiou, Panayiotis G. ; Narayanan, Shrikanth

Author_Institution

Dept. of Linguistics, Univ. of Southern California, CA, USA

fYear

2003

fDate

30 Nov.-3 Dec. 2003

Firstpage

595

Lastpage

600

Abstract

We discuss transcription systems needed for automated spoken language processing in languages, such as Persian, that are written in the Arabic script. The work is described in the context of developing a speech-to-speech translation system for English and Persian. The transcription system can be readily adapted to Arabic, Dari, Urdu and other languages that use the Arabic script. It has two components. The first is a phonemic transcription of sounds, for acoustic modeling in automatic speech recognizers and for text-to-speech synthesizers, that uses ASCII-based symbols rather than International Phonetic Alphabet symbols. The second is a hybrid system that provides a minimally ambiguous lexical representation with explicit vocalic information; such a representation is needed for language modeling and machine translation.
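
The two components described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the symbol table, the function names and the example words are hypothetical and are not the symbol set or method defined in the paper. The sketch shows (1) an ASCII phonemic transcription mapped to IPA by longest-match tokenization, and (2) why explicit vocalic information reduces ambiguity: once short vowels are dropped, as they usually are in the written script, distinct words collapse to the same consonant skeleton.

```python
# Hypothetical ASCII-to-IPA table for a few Persian sounds.
# This is an illustrative assumption, NOT the inventory from the paper.
ASCII_TO_IPA = {
    "sh": "ʃ", "ch": "tʃ", "zh": "ʒ", "kh": "x",
    "a": "æ", "A": "ɒ", "e": "e", "o": "o", "i": "i", "u": "u",
    "b": "b", "d": "d", "k": "k", "m": "m",
    "n": "n", "r": "r", "s": "s", "t": "t",
}

# Short vowels, which are normally left unwritten in the Arabic script.
SHORT_VOWELS = {"a", "e", "o"}

# Try longer symbols first so that "sh" is not read as "s" followed by "h".
_SYMBOLS = sorted(ASCII_TO_IPA, key=len, reverse=True)


def tokenize(ascii_form):
    """Split an ASCII transcription into symbols by greedy longest match."""
    tokens = []
    i = 0
    while i < len(ascii_form):
        for sym in _SYMBOLS:
            if ascii_form.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"unknown symbol at position {i}: {ascii_form[i:]!r}")
    return tokens


def to_ipa(ascii_form):
    """Convert an ASCII transcription to an IPA string."""
    return "".join(ASCII_TO_IPA[sym] for sym in tokenize(ascii_form))


def skeleton(ascii_form):
    """Drop short vowels, approximating what the unvocalized script records."""
    return "".join(sym for sym in tokenize(ascii_form) if sym not in SHORT_VOWELS)


if __name__ == "__main__":
    print(to_ipa("shab"))                       # ASCII form rendered in IPA
    print(skeleton("kerm"), skeleton("karam"))  # both reduce to the same skeleton
```

In this sketch the forms "kerm" and "karam" stay distinct while the vowels are present and become indistinguishable once they are removed, which is the kind of ambiguity a lexical representation with explicit vocalic information is meant to avoid. Greedy longest-match tokenization is one simple design choice; it assumes no multi-character symbol can also be read as a sequence of shorter symbols in the same table.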

Keywords

language translation; linguistics; natural language interfaces; natural languages; signal representation; speech recognition; speech synthesis; speech-based user interfaces; text analysis; ASCII based transcription systems; Arabic script; English language; International Phonetic Alphabet symbols; Persian language; automated spoken language processing; automatic speech recognition; language modeling; machine translation; sound transcription; speech-to-speech translation system; text-to-speech synthesizers; Automatic speech recognition; Computer aided software engineering; Laboratories; Natural languages; Speech analysis; Speech recognition; Speech synthesis; Synthesizers; Text recognition; Writing

fLanguage

English

Publisher

IEEE

Conference_Titel

2003 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '03)

Print_ISBN

0-7803-7980-2

Type

conf

DOI

10.1109/ASRU.2003.1318507

Filename

1318507