• DocumentCode
    2554853
  • Title

    Detecting Hoaxes, Frauds, and Deception in Writing Style Online

  • Author

    Afroz, S.; Brennan, Michael; Greenstadt, Rachel

  • Author_Institution
    Dept. of Comput. Sci., Drexel Univ., Philadelphia, PA, USA
  • fYear
    2012
  • fDate
    20-23 May 2012
  • Firstpage
    461
  • Lastpage
    475
  • Abstract
    In digital forensics, questions often arise about the authors of documents: their identity, demographic background, and whether they can be linked to other documents. The field of stylometry uses linguistic features and machine learning techniques to answer these questions. While stylometry techniques can identify authors with high accuracy in non-adversarial scenarios, their accuracy is reduced to random guessing when faced with authors who intentionally obfuscate their writing style or attempt to imitate that of another author. While these results are good for privacy, they raise concerns about fraud. We argue that some linguistic features change when people hide their writing style and by identifying those features, stylistic deception can be recognized. The major contribution of this work is a method for detecting stylistic deception in written documents. We show that using a large feature set, it is possible to distinguish regular documents from deceptive documents with 96.6% accuracy (F-measure). We also present an analysis of linguistic features that can be modified to hide writing style.
  • Keywords
    Internet; data privacy; document handling; forensic science; fraud; learning (artificial intelligence); linguistics; F-measure; deceptive documents; digital forensics; frauds; hoaxes detection; linguistic features; machine learning techniques; nonadversarial scenarios; random guessing; regular documents; stylistic deception; stylometry techniques; writing style online; written documents; Accuracy; Blogs; Context; Feature extraction; Pragmatics; Privacy; Writing; deception; machine learning; privacy; stylometry
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Security and Privacy (SP), 2012 IEEE Symposium on
  • Conference_Location
    San Francisco, CA
  • ISSN
    1081-6011
  • Print_ISBN
    978-1-4673-1244-8
  • Electronic_ISBN
    1081-6011
  • Type
    conf
  • DOI
    10.1109/SP.2012.34
  • Filename
    6234430