  • DocumentCode
    1879283
  • Title
    Domain-oriented two-stage aggregation: Generating baseball play-by-play narratives
  • Author
    Baldwin, James; Channarukul, Songsak
  • Author_Institution
    Dept. of Comput. Sci., Assumption Univ., Bangkok, Thailand
  • fYear
    2015
  • fDate
    28-31 Jan. 2015
  • Firstpage
    42
  • Lastpage
    47
  • Abstract
    This paper presents an end-to-end natural language generation system that performs aggregation in two stages. The first stage uses information implicit in the source knowledge base to aggregate event components into complex sentences. The second stage examines the developing context of the text to aggregate similar adjacent events into more fluent text. The source knowledge base is the Retrosheet collection of play-by-play baseball scoresheets, encoded in machine-readable form. The output consists of reasonably fluent and natural human-readable play-by-play narratives of historical baseball games. The system was tested against all regular-season major league games played from 1950 to 1969, taking less than a second to produce three to five pages of text for each game. The aggregation substantially improved native-speaker judgments of fluency and readability.
  • Keywords
    history; natural language processing; sport; text analysis; Retrosheet collection; baseball play-by-play narratives; domain-oriented two-stage aggregation; end-to-end natural language generation system; event component aggregation; historical baseball games; play-by-play baseball scoresheets; Abstracts; Dictionaries; Encoding; Games; Knowledge based systems; Skeleton; Sports equipment; aggregation; natural language generation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2015 7th International Conference on Knowledge and Smart Technology (KST)
  • Conference_Location
    Chonburi
  • Print_ISBN
    978-1-4799-6048-4
  • Type
    conf
  • DOI
    10.1109/KST.2015.7051463
  • Filename
    7051463