  • DocumentCode
    682620
  • Title
    News search using discourse analytics
  • Author
    Thompson, Paul ; Nawaz, R. ; Korkontzelos, Ioannis ; Black, William ; McNaught, John ; Ananiadou, Sophia
  • Author_Institution
    Nat. Centre for Text Min., Univ. of Manchester, Manchester, UK
  • Volume
    1
  • fYear
    2013
  • fDate
    Oct. 28 2013-Nov. 1 2013
  • Firstpage
    597
  • Lastpage
    604
  • Abstract
    The vast numbers of digitised documents containing historical data constitute a rich research data repository. However, computational methods and tools available to explore this data are still limited in functionality. Research on historical archives is still largely carried out manually. Text mining technologies offer novel methods to analyse digital content to identify various types of semantic information in these documents and to extract it as semantic metadata. Methods range from the automatic identification of named entities (e.g., people, places, organisations) to more sophisticated methods to extract information about events (e.g., births, deaths, arrests), allowing users to greatly increase the specificity of their search. We have created an extended model of event interpretation to allow searches to be refined based on various discourse facets, including isolating definite information about events from more speculative details, distinguishing positive and negative opinions and categorising events according to information source. We present ISHER as an example of a multifaceted, semantically oriented system for searching news articles from the New York Times, dating back to 1987. We explain how our extended event interpretation model can enhance search capabilities in systems such as ISHER, including the identification of contrasting and contradictory information in news articles.
  • Keywords
    information filtering; publishing; search engines; text analysis; ISHER; New York Times; discourse analytics; historical archives; news articles search; research data repository; semantic metadata; text mining technologies; Abstracts; Context; Filtering; Semantics; Text mining; Training; discourse analysis; event interpretation; event-based search; events; social history
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Heritage International Congress (DigitalHeritage), 2013
  • Conference_Location
    Marseille
  • Print_ISBN
    978-1-4799-3168-2
  • Type
    conf
  • DOI
    10.1109/DigitalHeritage.2013.6743801
  • Filename
    6743801