• DocumentCode
    312179
  • Title
    Iterative unsupervised speaker adaptation for batch dictation
  • Author
    Homma, Shigeru ; Takahashi, Junji ; Sagayama, Shigeki
  • Author_Institution
    NTT Human Interface Labs., Kanagawa, Japan
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    1141
  • Abstract
    Describes an automatic batch-style dictation paradigm in which the entire dictated speech is used for speaker adaptation and is then recognized using the speaker adaptation results. The key point is that the same speech data serve both as the recognition target and as the adaptation data. Two steps are iterated to maximize the advantage of closed-data speaker adaptation: speech recognition, and speaker adaptation that uses the recognition results as supervision. In a practical application, batch-style speech-to-text conversion of recorded dictation of Japanese medical diagnoses, recognition errors are reduced by 37% compared with speaker-independent recognition. To select only reliable recognition results, a supervision improvement procedure eliminates erroneous recognition results from the supervision. This procedure extracts 59-74% of the data from the tentative recognition results, with a reliability of 89-93%, and also reduces recognition errors by 45%.
  • Keywords
    batch processing (computers); dictation; iterative methods; medical diagnostic computing; reliability; speech recognition; unsupervised learning; Japanese medical diagnoses; batch dictation; closed-data speaker adaptation; dictated speech; iterative unsupervised speaker adaptation; recognition error reduction; reliability; speech recognition; speech-to-text conversion; supervision improvement procedure; Automatic speech recognition; Context modeling; Databases; Hidden Markov models; Humans; Laboratories; Loudspeakers; Maximum likelihood estimation; Speech recognition; Target recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607808
  • Filename
    607808