• DocumentCode
    1780480
  • Title

    Detecting malicious tweets in trending topics using clustering and classification

  • Author

    Soman, Saini Jacob ; Murugappan, S.

  • Author_Institution
    Fac. of CSE, Sathyabama Univ., Chennai, India
  • fYear
    2014
  • fDate
    10-12 April 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Detecting spam on the Twitter social network is a significant research area aimed at discovering unauthorized user accounts. A number of studies have addressed this problem, but most existing techniques neither consider a wide range of features nor group similar users' trending topics, which is their major limitation. Trending topics capture current Internet trends and the topics discussed by every user. To overcome the feature-extraction problem, this work first extracts several kinds of features: user profile features, user activity features, location-based features, and text and content features. The Jensen-Shannon Divergence (JSD) measure is then applied to the extracted text features to characterize each labeled tweet using natural language models. Different features are extracted from the trending-topics data collected from Twitter. After feature extraction, clusters are formed to group the similar trending topics of tweet user profiles. The Fuzzy K-means (FKM) algorithm first clusters similar user profiles that tweet on the same trending topics, and the cluster centers for these profiles are determined from the fuzzy membership function. The Extreme Learning Machine (ELM) algorithm is then applied to the clustering result to analyze the growing characteristics of spam with similar topics on Twitter and to acquire the knowledge needed for spam detection. The results are evaluated using F-measure, True Positive Rate (TPR), False Positive Rate (FPR), and classification accuracy, and show improved detection.
  • Keywords
    Internet; feature extraction; learning (artificial intelligence); pattern classification; pattern clustering; social networking (online); text analysis; ELM algorithm; FKM algorithm; FPR; Internet trends; JSD measure; Jensen-Shannon divergence measure; TPR; Twitter social networks; classification accuracy; content features; extreme learning machine algorithm; f-measure; false positive rate; feature extraction; fuzzy k-means algorithm; fuzzy membership function; location based features; malicious tweet detection; natural language models; similar user profile clustering; spam detection; text features; trending topics; true positive rate; tweet user profile; unauthorized user account discovery; user activity features; user profile features; Accuracy; Clustering algorithms; Feature extraction; Market research; Support vector machines; Twitter; Extreme learning machine algorithm; Fuzzy KMeans Clustering algorithm; Social network; Spam detection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Recent Trends in Information Technology (ICRTIT), 2014 International Conference on
  • Conference_Location
    Chennai
  • Type

    conf

  • DOI
    10.1109/ICRTIT.2014.6996188
  • Filename
    6996188
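
The abstract names the Jensen-Shannon Divergence as the measure used to characterize tweets through language models. As a minimal sketch of that measure only, and not the authors' implementation, the Python below compares two texts by their unigram word distributions. The whitespace tokenizer, the shared-vocabulary construction, the base-2 logarithm, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def unigram_distributions(text_a, text_b):
    """Return two word-probability lists over the shared vocabulary.

    Assumption: a simple lowercase whitespace tokenizer stands in for
    whatever language model the paper actually uses.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    if not counts_a or not counts_b:
        raise ValueError("both texts must contain at least one token")
    vocabulary = sorted(set(counts_a) | set(counts_b))
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    p = [counts_a[word] / total_a for word in vocabulary]
    q = [counts_b[word] / total_b for word in vocabulary]
    return p, q


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def text_divergence(text_a, text_b):
    """Convenience wrapper: JSD between the unigram distributions of two texts."""
    p, q = unigram_distributions(text_a, text_b)
    return jensen_shannon_divergence(p, q)
```

With a base-2 logarithm the value lies between 0 (identical distributions) and 1 (no shared words), so a tweet's divergence from a reference language model could serve as one numeric text feature alongside the profile, activity, and location features.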