  • DocumentCode
    177472
  • Title
    Transcribing code-switched bilingual lectures using deep neural networks with unit merging in acoustic modeling
  • Author
    Ching-Feng Yeh ; Lin-Shan Lee
  • Author_Institution
    Grad. Inst. of Commun. Eng., Nat. Taiwan Univ., Taipei, Taiwan
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    220
  • Lastpage
    224
  • Abstract
    This paper considers the transcription of the widely observed yet less investigated bilingual code-switched speech: the words or phrases of the guest language are inserted within the utterances of the host language, so the languages are switched back and forth within an utterance, and much less data are available for the guest language. Two approaches utilizing the deep neural network (DNN) were tested and analyzed, including using DNN bottleneck features in HMM/GMM (BF-HMM/GMM) and modeling context-dependent HMM senones by DNN (CD-DNN-HMM). In both cases the unit merging (and recovery) techniques in acoustic modeling were used to handle the data imbalance problem. Improved recognition accuracies were observed with unit merging (and recovery) for the two approaches under different conditions.
  • Keywords
    hidden Markov models; speech recognition; BF-HMM/GMM; acoustic modeling; bilingual code-switched speech; code-switched bilingual lectures; context-dependent HMM senones; data imbalance problem; deep neural networks; guest language; host language; unit merging; Accuracy; Acoustics; Hidden Markov models; Merging; Neural networks; Speech; Speech recognition; Bilingual; Code-switching; Deep Neural Networks; Speech Recognition; Unit Merging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6853590
  • Filename
    6853590