• DocumentCode
    1689809
  • Title
    Language diarization for code-switch conversational speech
  • Author
    Dau-Cheng Lyu ; Eng-Siong Chng ; Haizhou Li
  • Author_Institution
    Temasek Labs., Nanyang Technol. Univ., Singapore, Singapore
  • fYear
    2013
  • Firstpage
    7314
  • Lastpage
    7318
  • Abstract
    This paper examines language diarization, the process of performing language segmentation and recognition, in code-switched speech. Towards this task, we have developed a 63-hour conversational code-switch corpus recorded from Singapore/Malaysia speakers. We show that code-switching can occur frequently and that the average language interval may be as short as one second. As such, language diarization is a challenging task. To process such short segments, we propose a language diarization system that uses long-term context features across several phone-based segments and combines acoustic and phonotactic information. We achieved a frame error rate of 14.7% for language diarization on a Mandarin-English code-switch corpus. To evaluate our system, we measured language recognition performance on monolingual segments extracted from the code-switch corpus against published LID techniques: we obtained relative equal error rate reductions of 5.2%, 13.8%, 15.1% and 17.9% on speech durations of 0.1 to 0.5 s, 0.5 to 1 s, 1 to 3 s and 3 to 9 s, respectively.
  • Keywords
    error statistics; natural language processing; speaker recognition; speech coding; LID system; Malaysia speaker; Mandarin-English code switch corpus; Singapore speaker; acoustics information; code-switch conversational speech; frame error rate; language diarization process; language diarization system; language recognition; language segmentation; monolingual segment; phone-based segment; phonotactic information; time 0.1 s to 9 s; time 63 hour; Acoustics; Error analysis; Feature extraction; Speech; Speech coding; Speech processing; Speech recognition; code-switch; conversational speech; language diarization; language recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639083
  • Filename
    6639083