• DocumentCode
    1594428
  • Title
    Clustering to Improve Microblog Stream Summarization
  • Author
    Olariu, A.
  • Author_Institution
    Faculty of Mathematics and Computer Science, University of Bucharest, Bucharest, Romania
  • fYear
    2012
  • Firstpage
    220
  • Lastpage
    226
  • Abstract
    Microblogging has grown massively over the past couple of years. According to recent statistics, Twitter (the most popular microblogging platform) receives over 340 million posts per day from its 140 million active users. To help users manage this information overload, or to assess the full information potential of such microblogging streams (sequences of posts), a few summarization algorithms have been proposed. However, they are designed to work on a stream of posts filtered on a particular keyword, whereas most streams suffer from noise or contain posts referring to more than one topic. As a result, the generated summary is incomplete or even meaningless. We address the problem of summarizing such a stream and propose adding a layer of text clustering as a preprocessing step. We show that clustering posts into related groups before applying a summarization algorithm improves the quality of the summary.
  • Keywords
    information filtering; pattern clustering; social networking (online); text analysis; Twitter; information overload management; microblog stream summarization; microblogging platform; post sequences; preprocessing step; summary quality improvement; text clustering; Algorithm design and analysis; Blogs; Clustering algorithms; Event detection; Indexes; Noise; microblog; summarization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2012 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)
  • Conference_Location
    Timisoara
  • Print_ISBN
    978-1-4673-5026-6
  • Type
    conf
  • DOI
    10.1109/SYNASC.2012.10
  • Filename
    6481033