• DocumentCode
    2780606
  • Title

    Multi-layer features based personalized spam filtering

  • Author

    Xu, Weiran ; Wang, Zhanyi ; Liu, Dongxin ; Guo, Jun ; Hu, Rile

  • Author_Institution
    Sch. of Inf. & Commun. Eng., Beijing Univ. of Posts & Telecommun., Beijing, China
  • fYear
    2009
  • fDate
    6-8 Nov. 2009
  • Firstpage
    368
  • Lastpage
    373
  • Abstract
    In this paper, we address a new challenge: the filter is expected to converge much faster, e.g. within 10 labeled SMSs or fewer. Topic-model-based dimension reduction can minimize the structural risk with limited training data, but it works against the completeness of the feature space. It is very difficult to achieve both a fast convergence rate and completeness with only one kind of feature. This paper uses supervised dual-PLSA for dimensionality reduction and presents a multi-layer features model, which employs two layers of features and adopts a novel method to combine them. Experiments show that the multi-layer features model has the best performance.
  • Keywords
    e-mail filters; learning (artificial intelligence); unsolicited e-mail; convergence rate; dimensionality reduction; feature space completeness; multilayer features; personalized spam filtering; supervised dual-PLSA; Convergence; Information filtering; Information filters; Probability distribution; Statistical learning; Text categorization; Training data; Unsolicited electronic mail; Multi-layer features; PLSA; Personalized Filtering; Spam Filtering; dual-PLSA;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Network Infrastructure and Digital Content, 2009. IC-NIDC 2009. IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-4898-2
  • Electronic_ISBN
    978-1-4244-4900-6
  • Type

    conf

  • DOI
    10.1109/ICNIDC.2009.5360803
  • Filename
    5360803