• DocumentCode
    775135
  • Title

    Modularity and scaling in large phonemic neural networks

  • Author

    Waibel, Alexander ; Sawai, Hidefumi ; Shikano, Kiyohiro

  • Author_Institution
    Carnegie-Mellon Univ., Pittsburgh, PA, USA
  • Volume
    37
  • Issue
    12
  • fYear
    1989
  • fDate
    12/1/1989
  • Firstpage
    1888
  • Lastpage
    1898
  • Abstract
    The authors train several small time-delay neural networks aimed at all phonemic subcategories (nasals, fricatives, etc.) and report excellent fine phonemic discrimination performance for all cases. Exploiting the hidden structure of these small phonemic subcategory networks, they propose several techniques that make it possible to grow larger nets in an incremental and modular fashion without loss in recognition performance and without the need for excessive training time or additional data. The techniques include class discriminatory learning, connectionist glue, selective/partial learning, and all-net fine tuning. A set of experiments shows that stop consonant networks (BDGPTK) constructed from subcomponent BDG- and PTK-nets achieved up to 98.6% correct recognition, compared to 98.3% and 98.7% correct for the BDG- and PTK-nets, respectively. Similarly, an incrementally trained network aimed at all consonants achieved recognition scores of about 96% correct. These results are comparable to the performance of the subcomponent networks and significantly better than that of several alternative speech recognition methods.
  • Keywords
    neural nets; speech recognition; all-net fine tuning; class discriminatory learning; connectionist glue; fricatives; modularity; nasals; phonemic discrimination; phonemic neural networks; phonemic subcategories; scaling; selective/partial learning; stop consonant networks; time-delay; Computer networks; Databases; Hardware; Intelligent networks; Large-scale systems; Neural networks; Supercomputers; Telephony; Training data
  • fLanguage
    English
  • Journal_Title
    Acoustics, Speech and Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0096-3518
  • Type

    jour

  • DOI
    10.1109/29.45535
  • Filename
    45535