DocumentCode :
1765852
Title :
Automated Feature Design for Numeric Sequence Classification by Genetic Programming
Author :
Harvey, Dustin Y. ; Todd, Michael D.
Author_Institution :
Dept. of Struct. Eng., Univ. of California, San Diego, La Jolla, CA, USA
Volume :
19
Issue :
4
fYear :
2015
fDate :
Aug. 2015
Firstpage :
474
Lastpage :
489
Abstract :
Pattern recognition methods rely on maximum-information, minimum-dimension feature sets to reliably perform classification and regression tasks. Many methods exist to reduce feature set dimensionality and construct improved features from an initial set; however, there are few general approaches for the design of features from numeric sequences. Any information lost in preprocessing or feature measurement cannot be recreated during pattern recognition. General approaches are needed to extend pattern recognition to include feature design and selection for numeric sequences, such as time series, within the learning process itself. This paper proposes a novel genetic programming (GP) approach to automated feature design called Autofead. In this method, a GP variant evolves a population of candidate features built from a library of sequence-handling functions. Numerical optimization methods, included through a hybrid approach, ensure that the fitness of candidate algorithms is measured using optimal parameter values. Autofead represents the first automated feature design system for numeric sequences to leverage the power and efficiency of both numerical optimization and standard pattern recognition algorithms. Potential applications include the monitoring of electrocardiogram signals for indications of heart failure, network traffic analysis for intrusion detection systems, vibration measurement for bearing condition determination in rotating machinery, and credit card activity for fraud detection.
Keywords :
data reduction; feature selection; genetic algorithms; learning (artificial intelligence); pattern classification; regression analysis; time series; Autofead; GP approach; automated feature design system; bearing condition determination; candidate algorithms; credit card activity; electrocardiogram signal monitoring; feature measurement; feature selection; feature set dimensionality reduction; fraud detection; genetic programming; heart failure; information lost; intrusion detection systems; learning process; maximum-information feature sets; minimum-dimension feature sets; network traffic analysis; numeric sequence classification; numerical optimization; numerical optimization methods; optimal parameter values; pattern recognition methods; regression tasks; rotating machinery; sequence-handling functions; time series; Algorithm design and analysis; Classification algorithms; Genetic programming; Pattern recognition; Standards; Time series analysis; Vegetation; Feature design; genetic programming; machine learning; pattern recognition; sequence classification; time series classification; time series data mining;
fLanguage :
English
Journal_Title :
Evolutionary Computation, IEEE Transactions on
Publisher :
IEEE
ISSN :
1089-778X
Type :
jour
DOI :
10.1109/TEVC.2014.2341451
Filename :
6861439