• DocumentCode
    353715
  • Title
    Syntactic heads in statistical language modeling
  • Author
    Wu, Jun; Khudanpur, Sanjeev

  • Author_Institution
    Center for Language & Speech Process., Johns Hopkins Univ., Baltimore, MD, USA
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1699
  • Abstract
    The use of syntactic structure in general and heads of syntactic constituents in particular has recently been shown to be beneficial for statistical language modeling. The paper provides an insightful analysis of this role of syntactic structure. It is shown that the predictive power of syntactic heads is mostly complementary to the predictive power of N-grams: they help in positions where an intervening phrase or clause separates the heads from the word being predicted, making the N-gram a poor predictor. Furthermore, a significant portion of this predictive power comes in the form of a more sophisticated back-off effect via the syntactic categories (nonterminal tags) of the heads. Finally, it is shown that using the categories of the syntactic heads is better than using the categories (part-of-speech tags) of the two preceding words, confirming that it is the syntactic analysis and not just the improved back-off strategy which leads to improvements over N-gram models. Experimental results for perplexity and word error rate are presented on the Switchboard corpus to support this analysis.
  • Keywords
    computational linguistics; maximum entropy methods; modelling; speech recognition; word processing; N-gram models; Switchboard corpus; back-off effect; intervening phrase; nonterminal tags; part-of-speech tags; perplexity; predictive power; statistical language modeling; syntactic analysis; syntactic categories; syntactic constituents; syntactic heads; syntactic structure; word error rate; Contracts; Entropy; Error analysis; History; Interpolation; Natural languages; Predictive models; Speech processing; Speech recognition; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2000.862078
  • Filename
    862078