Abstract:
Summary form only given. This presentation illustrates how big data forces changes in algorithmic techniques and in the goals of machine learning, bringing along both challenges and opportunities.

1. The theoretical foundations of statistical machine learning traditionally assume that training data is scarce. If one assumes instead that data is abundant and that the bottleneck is computation time, stochastic algorithms with poor optimization performance become very attractive learning algorithms (see the sketch below). These algorithms have quickly become the backbone of large-scale machine learning and are the object of very active research.

2. Increasing the training set size cannot reduce the average error indefinitely. However, this diminishing-returns problem vanishes if we instead measure the diversity of conditions under which the trained system performs well. In other words, big data is not an opportunity to increase average accuracy, but an opportunity to increase coverage.

Machine learning research must broaden its statistical framework in order to embrace all the (changing) aspects of real big data problems. Transfer learning, causal inference, and deep learning are successful steps in this direction.
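The abstract does not name a specific algorithm for point 1, but the canonical example of a stochastic method that trades per-step optimization quality for cheap updates is stochastic gradient descent. The sketch below is a minimal, hypothetical illustration on a synthetic least-squares problem; the data sizes, learning rate, and iteration budget are assumptions chosen only to show that each update costs time proportional to the number of features, independent of the number of examples.

    # Minimal sketch (not the presenter's code): stochastic gradient descent
    # for least-squares linear regression. Each update touches one example,
    # so a fixed compute budget can sweep arbitrarily large data sets even
    # though each individual step is a crude optimization move.
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic "abundant" data: many examples, few features (assumed sizes).
    n_examples, n_features = 100_000, 10
    true_w = rng.normal(size=n_features)
    X = rng.normal(size=(n_examples, n_features))
    y = X @ true_w + 0.1 * rng.normal(size=n_examples)

    w = np.zeros(n_features)
    learning_rate = 0.01                      # assumed step size
    for t in range(50_000):                   # fixed compute budget
        i = rng.integers(n_examples)          # draw one example at random
        grad = (X[i] @ w - y[i]) * X[i]       # gradient of 0.5*(x_i.w - y_i)^2
        w -= learning_rate * grad             # cost per step: O(n_features)

    print("parameter error:", np.linalg.norm(w - true_w))

The design point is the one made in the abstract: when data is abundant and time is the bottleneck, the relevant question is not how precisely the optimizer converges, but how many examples it can process within the available computation time.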