• DocumentCode
    691716
  • Title

    Analysis of cross-gender adaptation using MAP and MLLR in speech recognition systems

  • Author

    Mahiba, S. Magdalene ; Christina, S. Lilly ; Vijayalakshmi, P. ; Nagarajan, T.

  • Author_Institution
    SSN Coll. of Eng., Chennai, India
  • fYear
    2013
  • fDate
    25-27 July 2013
  • Firstpage
    387
  • Lastpage
    392
  • Abstract
    A speech recognition system developed with context-dependent phonemes captures the co-articulation effect and performs better than systems developed with context-independent units. However, the performance of the system also depends on the speaker. Speaker dependence of the recognition system arises from the speaker-dependent speech features. Variation in vocal tract length and shape is the major cause of this inter-speaker variation. Thus, the performance of speaker-independent (SI) systems is surpassed by that of speaker-dependent (SD) systems. It is well established in the literature that the recognition performance of an SI system can be improved to the standard of an SD system by speaker adaptation. This paper analyzes the amount and ratio of male and female training data for which cross-gender speaker adaptation gives higher performance. The speaker adaptation techniques MAP and MLLR are implemented using the TIMIT speech corpus. It is observed that MLLR adapts the model parameters better than MAP even with 24 s of adaptation data. It is also inferred that training the system with both male and female data results in better cross-gender adaptation performance than training it with either male or female data alone, primarily because the system parameters differ greatly for male and female speakers. The overall recognition performance of the context-dependent system is improved by 0.55% for MAP adaptation and 2.75% for MLLR adaptation over the unadapted recognition system, for the minimal amount of data.
  • Keywords
    maximum likelihood estimation; regression analysis; speaker recognition; MAP adaptation; MLLR adaptation; SD system; SI system; coarticulation effect; context-dependent phonemes; cross-gender speaker adaptation analysis; female training data; inter-speaker variation; male training data; maximum a posteriori algorithm; maximum likelihood linear regression; speaker-dependent systems; speaker-dependent speech features; speaker-independent systems; speech recognition systems; vocal tract length; vocal tract shape; Adaptation models; Data models; Market research; Silicon; Speech; Speech recognition; Training; MAP; MLLR; Speaker adaptation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Recent Trends in Information Technology (ICRTIT), 2013 International Conference on
  • Conference_Location
    Chennai
  • Type

    conf

  • DOI
    10.1109/ICRTIT.2013.6844235
  • Filename
    6844235