• DocumentCode
    2280324
  • Title

    Speech data retrieval system constructed on a universal phonetic code domain

  • Author

    Tanaka, Kazuyo ; Itoh, Yoshiaki ; Kojima, Hiroaki ; Fujimura, Nahoko

  • Author_Institution
    Nat. Inst. of Adv. Ind. Sci. & Technol., Japan
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    323
  • Lastpage
    326
  • Abstract
    We propose a novel speech processing framework in which all speech data are encoded into universal phonetic code (UPC) sequences, and speech processing systems, such as speech recognition, retrieval, and digesting, are constructed on this UPC domain. As a first step, we introduce a sub-phonetic segment (SPS) set, based on the IPA (International Phonetic Alphabet), to deal with multilingual speech, and we develop a procedure to estimate acoustic models of the SPS from IPA-like phone models. The key point of the framework is to incorporate environment adaptation into the SPS encoding stage. This makes it possible to normalize acoustic variations and to extract the language factor contained in speech signals as encoded SPS sequences. We confirm these characteristics by constructing a speech retrieval system on the SPS domain. The system can retrieve key phrases, given as speech, from speech data recorded in different environments under a vocabulary-free condition. We show several preliminary experimental results for this system, using Japanese and English sentence speech sets.
  • Keywords
    acoustic signal processing; feature extraction; information retrieval; linguistics; natural languages; parameter estimation; sequential codes; speech coding; speech recognition; English sentence speech sets; Japanese sentence speech sets; acoustic model estimation; data retrieval system; encoded sequences; feature extraction; international phonetic alphabet; language factor; multilingual speech; speech processing; speech recognition; speech retrieval system; subphonetic segment set; universal phonetic code; Computer architecture; Data mining; Encoding; Information retrieval; Natural languages; Speech coding; Speech processing; Speech recognition; Speech synthesis; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
  • Print_ISBN
    0-7803-7343-X
  • Type

    conf

  • DOI
    10.1109/ASRU.2001.1034652
  • Filename
    1034652