• DocumentCode
    1918128
  • Title

    Identifying Themes in Social Media and Detecting Sentiments

  • Author

    Pal, Jayanta Kumar ; Saha, Abhisek

  • fYear
    2010
  • fDate
    9-11 Aug. 2010
  • Firstpage
    452
  • Lastpage
    457
  • Abstract
    Recently, a huge wave of social media has had a significant impact on people's perceptions of technological domains. These perceptions are captured in blogs and forums, where the themes relate to the products of several companies. A company may be interested in tracking them as a resource for understanding customer perceptions and detecting user sentiments. Keyword-based approaches to identifying such themes fail to give a satisfactory level of accuracy. Here, we address these problems using statistical text mining of blog entries. The crux of the analysis lies in mining quantitative information from textual entries. Once the blog entries relevant to the company or its competitors are filtered out, theme identification is performed using a highly accurate novel technique termed the 'Best Separators Algorithm'. Logistic regression coupled with a dimension reduction technique (singular value decomposition) is used to identify the tonality of those blogs. The final analysis shows a significant improvement in accuracy over popular approaches.
  • Keywords
    consumer behaviour; customer profiles; data mining; logistics data processing; pattern clustering; singular value decomposition; social networking (online); text analysis; best separator algorithm; customer perception; logistic regression; sentiment detection; social media; statistical text mining; theme identification; web blog; Accuracy; Blogs; Classification algorithms; Ink; Media; Particle separators; Training; Best Separators Algorithm; Logistic Regression; Sentiment Analysis; Singular Value Decomposition; Theme identification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Social Networks Analysis and Mining (ASONAM), 2010 International Conference on
  • Conference_Location
    Odense
  • Print_ISBN
    978-1-4244-7787-6
  • Electronic_ISBN
    978-0-7695-4138-9
  • Type

    conf

  • DOI
    10.1109/ASONAM.2010.25
  • Filename
    5563059