• DocumentCode
    480739
  • Title
    Predicting News Story Importance Using Language Features
  • Author
    Krestel, Ralf ; Mehta, Bhaskar
  • Author_Institution
    L3S Res. Inst., Univ. Hannover, Hannover
  • Volume
    1
  • fYear
    2008
  • fDate
    9-12 Dec. 2008
  • Firstpage
    683
  • Lastpage
    689
  • Abstract
    In this age of awareness, people have access to information like never before. Hundreds of newspapers and millions of bloggers present news and their interpretations in an openly accessible manner. With globalization, distant events can have an impact on people thousands of miles away. While expert humans can recognize a potentially important piece of news, this is still a difficult problem for an automatic system. Since people increasingly rely on multiple online sources of information, it is important to support users in filtering news automatically. In this work, we consider the problem of anticipating news story importance, i.e., given a news item, predicting whether it will be of interest to a majority of users. Such ranking is currently done manually for newspapers, and we explore automatic approaches and indicative features for this task. Our main conclusion is that importance prediction is a hard problem, and purely textual features are not sufficient for classifiers with 90% accuracy.
  • Keywords
    information resources; pattern classification; text analysis; automatic system; expert humans; globalization; information multiple online sources; language features; news story importance; textual features; Economic forecasting; Feedback; Information filtering; Information filters; Information resources; Intelligent agent; Internet; Natural languages; Stock markets; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3496-1
  • Type
    conf
  • DOI
    10.1109/WIIAT.2008.193
  • Filename
    4740530