• DocumentCode
    589059
  • Title

    Detecting Offensive Language in Social Media to Protect Adolescent Online Safety

  • Author

    Ying Chen ; Yilu Zhou ; Sencun Zhu ; Heng Xu

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2012
  • fDate
    3-5 Sept. 2012
  • Firstpage
    71
  • Lastpage
    80
  • Abstract
    Because textual content on online social media is highly unstructured, informal, and often misspelled, existing message-level offensive language detection methods cannot accurately detect offensive content. User-level offensiveness detection seems a more feasible approach, but it is an under-researched area. To bridge this gap, we propose the Lexical Syntactic Feature (LSF) architecture to detect offensive content and identify potentially offensive users in social media. We distinguish the contributions of pejoratives/profanities and obscenities in determining offensive content, and introduce hand-authored syntactic rules for identifying name-calling harassment. In particular, we incorporate a user's writing style, structure, and specific cyberbullying content as features to predict the user's potential to send offensive content. Experimental results show that our LSF framework performs significantly better than existing methods in offensive content detection. It achieves a precision of 98.24% and a recall of 94.34% in sentence-level offensiveness detection, as well as a precision of 77.9% and a recall of 77.8% in user-level offensiveness detection. Moreover, LSF processes each sentence in approximately 10 ms, suggesting the potential for effective deployment in social media.
  • Keywords
    computational linguistics; content management; social networking (online); text analysis; LSF; adolescent online safety protection; cyberbullying content; hand authoring syntactic rule; lexical syntactic feature; message level offensive language detection; name calling harassment identification; offensive content detection; online social media; textual content; user level offensiveness detection; user writing style; Context; Educational institutions; Feature extraction; History; Media; Syntactics; Text mining; adolescent safety; cyberbullying; offensive languages; social media;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Privacy, Security, Risk and Trust (PASSAT), 2012 International Conference on and 2012 International Conference on Social Computing (SocialCom)
  • Conference_Location
    Amsterdam
  • Print_ISBN
    978-1-4673-5638-1
  • Type

    conf

  • DOI
    10.1109/SocialCom-PASSAT.2012.55
  • Filename
    6406271