Title :
Predicting flight arrival times with a multistage model
Author_Institution :
Dept. of Math. & Comput. Sci., Szechenyi Istvan Univ., Győr, Hungary
Abstract :
Airlines are constantly looking for ways to cut flight delays in order to enhance service quality and reduce operational costs. The goal of the GE Flight Quest data science contest (https://www.gequest.com/c/flight) was to make flights more efficient by improving the accuracy of arrival time estimates. The contest data set was 128 GB in size and contained 252 data columns arranged in 34 tables. This paper presents my solution, which won third prize under the team name Taki. The solution employs a 6-stage model consisting of successive ridge regressions and gradient boosting machines, built on 56 features constructed from the raw data. The model was trained and run on a 64-core machine with 1 terabyte of memory.
Keywords :
Big Data; air traffic; data analysis; learning (artificial intelligence); regression analysis; traffic engineering computing; airlines; flight arrival times prediction; gradient boosting machines; multistage model; real-time big data analysis; ridge regressions; Airports; Atmospheric modeling; Delays; Logic gates; Meteorology; Predictive models; Training; GE Flight Quest; gradient boosting machine; parallelization; ridge regression;
Conference_Title :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/BigData.2014.7004435