• DocumentCode
    2643093
  • Title

    Latent semantic analysis and keyword extraction for phishing classification

  • Author

    L'Huillier, Gaston ; Hevia, Alejandro ; Weber, Richard ; Rios, Sebastian

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Chile, Santiago, Chile
  • fYear
    2010
  • fDate
    23-26 May 2010
  • Firstpage
    129
  • Lastpage
    131
  • Abstract
    Phishing email fraud has been considered one of the main cyber-threats in recent years. Its development has been closely related to social engineering techniques, where different fraud strategies are used to deceive a naïve email user. In this work, a latent semantic analysis and text mining methodology is proposed for the characterisation of such strategies and their subsequent classification using supervised learning algorithms. The results show that the feature set obtained in this work is competitive with previous phishing feature extraction methodologies, achieving promising results across different benchmark machine learning classification techniques.
  • Keywords
    Algorithm design and analysis; Data mining; Feature extraction; Linear discriminant analysis; Logistics; Machine learning; Machine learning algorithms; Support vector machine classification; Support vector machines; Text mining; Latent Semantic Analysis; Phishing detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligence and Security Informatics (ISI), 2010 IEEE International Conference on
  • Conference_Location
    Vancouver, BC, Canada
  • Print_ISBN
    978-1-4244-6444-9
  • Type

    conf

  • DOI
    10.1109/ISI.2010.5484762
  • Filename
    5484762