• DocumentCode
    1797361
  • Title

    Deobfuscation based on edit distance algorithm for spam filtering

  • Author

    Xinwang Zhong

  • Author_Institution
    Dept. of Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou, China
  • Volume
    1
  • fYear
    2014
  • fDate
    13-16 July 2014
  • Firstpage
    109
  • Lastpage
    114
  • Abstract
    The spam problem has grown rapidly on the Internet. Adversaries obfuscate spam messages by misspelling words or inserting useless characters to mislead the spam filter's decision. Humans can still understand the original meaning of the camouflaged words, but the spam filter cannot recognize them. This paper focuses on the well-known obfuscation problem that uses non-alphabetical characters, e.g. "Viagra" is modified to "V!@gr@". The string edit distance algorithm is revised to handle non-alphabetical characters. In the experiments, the proposed deobfuscation method outperforms the traditional string edit distance algorithm.
  • Keywords
    formal languages; information filtering; support vector machines; unsolicited e-mail; SVM; backtrack algorithm; deobfuscation method; nonalphabetical character handling; obfuscation problem; spam filtering; spam message; spamming problem; string edit distance algorithm; spam filter
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2014 International Conference on
  • Conference_Location
    Lanzhou
  • ISSN
    2160-133X
  • Print_ISBN
    978-1-4799-4216-9
  • Type

    conf

  • DOI
    10.1109/ICMLC.2014.7009101
  • Filename
    7009101
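
  • Illustration
    The abstract says the string edit distance algorithm is revised to handle non-alphabetical characters but does not give the revision itself. The sketch below is therefore not the paper's algorithm: it is a plain weighted Levenshtein distance under the assumption that substituting or deleting a non-alphabetical character costs less than a letter edit. The names `weighted_edit_distance` and `deobfuscate`, and the `symbol_cost` value of 0.5, are hypothetical choices made for illustration only.

```python
from typing import Iterable


def weighted_edit_distance(obfuscated: str, word: str, symbol_cost: float = 0.5) -> float:
    """Levenshtein distance where edits on non-alphabetical characters are cheaper.

    Assumption (not from the paper): a non-alphabetical character in the
    obfuscated token can be substituted or deleted at `symbol_cost` instead of 1.
    """
    a, b = obfuscated.lower(), word.lower()
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [float(j) for j in range(len(b) + 1)]
    for ca in a:
        is_symbol = not ca.isalpha()
        delete_cost = symbol_cost if is_symbol else 1.0
        curr = [prev[0] + delete_cost]
        for j, cb in enumerate(b, 1):
            if ca == cb:
                substitute_cost = 0.0
            elif is_symbol:
                substitute_cost = symbol_cost
            else:
                substitute_cost = 1.0
            curr.append(min(
                prev[j] + delete_cost,          # drop ca from the obfuscated token
                curr[j - 1] + 1.0,              # insert cb, a letter that is missing
                prev[j - 1] + substitute_cost,  # match or substitute ca with cb
            ))
        prev = curr
    return prev[-1]


def deobfuscate(token: str, vocabulary: Iterable[str]) -> str:
    """Return the vocabulary word closest to the token under the weighted distance."""
    return min(vocabulary, key=lambda w: weighted_edit_distance(token, w))
```

    With these assumed costs, `deobfuscate("V!@gr@", ["viagra", "vigor", "grade"])` selects "viagra", because the three symbol substitutions cost 0.5 each rather than 1. Setting `symbol_cost=1.0` recovers the traditional unit-cost edit distance.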