• DocumentCode
    3723931
  • Title

    Enhancing the recognition of children's speech on acoustically mismatched ASR system

  • Author

    S Shahnawazuddin;Hemant Kumar Kathania;Rohit Sinha

  • Author_Institution
    Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, 781039, India
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    The work presented in this paper explores the issues of recognizing children's speech using acoustic models trained on adults' speech data. In such conditions, on account of the large acoustic mismatch between training and test data, a severe degradation in recognition performance is noted. In our earlier work, a binary weighting of cepstral features as well as of acoustic model parameters was explored to address this mismatch. In this paper, a soft weighting (SW) is proposed to overcome the information loss incurred by the simple binary weighting scheme. This is achieved through a low-rank projection learned using adults' training data. The transform so derived emphasizes the principal dimensions of acoustic variation in adults' speech. During testing, the transform maps children's test data to the space of the training data and thus suppresses the mismatched dimensions. The proposed scheme is verified experimentally using a recognition system trained on adults' data only as well as another system trained using adults' and children's data pooled together. The effectiveness of acoustic model adaptation is also explored to further enhance system performance. Combining SW with cluster model interpolation leads to a relative improvement of 14% over the baseline.
  • Keywords
    "Hidden Markov models","Covariance matrices","Computational modeling","Indexes","Measurement","Mel frequency cepstral coefficient","Matrix decomposition"
  • Publisher
    IEEE
  • Conference_Titel
    TENCON 2015 - 2015 IEEE Region 10 Conference
  • ISSN
    2159-3442
  • Print_ISBN
    978-1-4799-8639-2
  • Electronic_ISBN
    2159-3450
  • Type

    conf

  • DOI
    10.1109/TENCON.2015.7373176
  • Filename
    7373176