  • DocumentCode
    3526755
  • Title
    Filtering web text to match target genres
  • Author
    Marin, M.A.; Feldman, S.; Ostendorf, M.; Gupta, M.
  • Author_Institution
    Dept. of Electr. Eng., Univ. of Washington, Seattle, WA
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    3705
  • Lastpage
    3708
  • Abstract
    In language modeling for speech recognition, both the amount of training data and the match to the target task impact the goodness of the model, with the trade-off usually favoring more data. For conversational speech, having some genre-matched text is particularly important, but also hard to obtain. This paper proposes a new approach for genre detection and compares different alternatives for filtering Web text for genre to improve language models for use in automatic transcription of broadcast conversations (talk shows).
  • Keywords
    Internet; information filtering; speech recognition; Web text filtering; genre detection; genre-matched text; language modeling; Adaptation model; Information filters; Information retrieval; Matched filters; Natural languages; Search engines; Statistics; Training data; genre
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960431
  • Filename
    4960431