• DocumentCode
    85911
  • Title
    Global and Local Structure Preservation for Feature Selection
  • Author
    Xinwang Liu; Lei Wang; Jian Zhang; Jianping Yin; Huan Liu
  • Author_Institution
    Sch. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha, China
  • Volume
    25
  • Issue
    6
  • fYear
    2014
  • fDate
    June 2014
  • Firstpage
    1083
  • Lastpage
    1095
  • Abstract
    The recent literature indicates that preserving global pairwise sample similarity is of great importance for feature selection and that many existing selection criteria essentially work in this way. In this paper, we argue that besides global pairwise sample similarity, the local geometric structure of data is also critical and that these two factors play different roles in different learning scenarios. In order to show this, we propose a global and local structure preservation framework for feature selection (GLSPFS) which integrates both global pairwise sample similarity and local geometric data structure to conduct feature selection. To demonstrate the generality of our framework, we employ methods that are well known in the literature to model the local geometric data structure and develop three specific GLSPFS-based feature selection algorithms. Also, we develop an efficient optimization algorithm with proven global convergence to solve the resulting feature selection problem. A comprehensive experimental study is then conducted in order to compare our feature selection algorithms with many state-of-the-art ones in supervised, unsupervised, and semisupervised learning scenarios. The results indicate that: 1) our framework consistently achieves statistically significant improvement in selection performance when compared with the currently used algorithms; 2) in supervised and semisupervised learning scenarios, preserving global pairwise similarity is more important than preserving local geometric data structure; 3) in the unsupervised scenario, preserving local geometric data structure becomes clearly more important; and 4) the best feature selection performance is always obtained when the two factors are appropriately integrated. In summary, this paper not only validates the advantages of the proposed GLSPFS framework but also offers more insight into the information to be preserved in different feature selection tasks.
  • Keywords
    convergence; data handling; data structures; learning (artificial intelligence); GLSPFS; feature selection; geometric data structure; global and local structure preservation framework for feature selection; global convergence; global pairwise sample similarity; local geometric data structure; semisupervised learning scenarios; Algorithm design and analysis; Convergence; Data models; Data structures; Laplace equations; Matrix decomposition; Optimization; Feature selection; global similarity preservation; local geometric structure; similarity preservation
  • fLanguage
    English
  • Journal_Title
    Neural Networks and Learning Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2162-237X
  • Type
    jour
  • DOI
    10.1109/TNNLS.2013.2287275
  • Filename
    6657801