Title :
Distributed delayed stochastic optimization
Author :
Agarwal, Alekh ; Duchi, John C.
Author_Institution :
Microsoft Research, New York, NY, USA
Abstract :
We analyze the convergence of gradient-based optimization algorithms that base their updates on delayed stochastic gradient information. The main application of our results is to gradient-based distributed optimization algorithms in which a master node performs parameter updates while worker nodes compute stochastic gradients based on local information in parallel, which may give rise to delays due to asynchrony. We take motivation from statistical problems where the data are too large to fit on one computer; with the advent of huge datasets in biology, astronomy, and on the Internet, such problems are now common. Our main contribution is to show that, for smooth stochastic problems, the delays are asymptotically negligible and we can achieve order-optimal convergence rates. We exhibit n-node architectures whose optimization error on stochastic problems, in spite of asynchronous delays, scales asymptotically as O(1/√(nT)) after T iterations. This rate is known to be optimal for a distributed system with n nodes even in the absence of delays. We complement our theoretical results with numerical experiments on a logistic regression task.
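Code_Sketch :
To make the delayed-update scheme concrete, here is a minimal Python sketch. It is illustrative only: the fixed delay model, the 1/√t step size, the synthetic data, and the serial simulation of a master-worker loop are our assumptions, not the authors' algorithm or experimental setup. The gradient applied at iteration t is evaluated at the iterate from tau steps earlier, the serial analogue of a worker whose gradient arrives at the master after a delay.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic logistic regression data (illustrative, not the paper's dataset).
n_samples, dim = 5000, 20
X = rng.normal(size=(n_samples, dim))
w_true = rng.normal(size=dim)
y = np.where(rng.random(n_samples) < 1.0 / (1.0 + np.exp(-(X @ w_true))), 1.0, -1.0)

def stochastic_grad(w, idx):
    # Gradient of the average logistic loss log(1 + exp(-y_i * x_i.w)) on a minibatch.
    A, b = X[idx], y[idx]
    coeff = -b / (1.0 + np.exp(b * (A @ w)))
    return (A * coeff[:, None]).mean(axis=0)

def delayed_sgd(T=2000, tau=10, batch=8):
    # Serial simulation of delayed gradients: the step at iteration t uses a
    # gradient evaluated at the iterate from tau steps earlier.
    w = np.zeros(dim)
    iterates = [w.copy()]                 # history the simulated worker reads from
    w_avg = np.zeros(dim)
    for t in range(1, T + 1):
        stale = iterates[max(0, t - 1 - tau)]
        g = stochastic_grad(stale, rng.integers(n_samples, size=batch))
        w = w - g / np.sqrt(t)            # diminishing step size alpha_t = 1/sqrt(t)
        iterates.append(w.copy())
        w_avg += (w - w_avg) / t          # running average of the iterates
    return w_avg

w_hat = delayed_sgd()
print("average logistic loss:", np.log1p(np.exp(-y * (X @ w_hat))).mean())

Returning the averaged iterate rather than the last one is the standard device behind O(1/√T)-type guarantees for stochastic gradient methods; with n workers supplying minibatch gradients, the variance reduction is what yields the O(1/√(nT)) rate the abstract states.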
Keywords :
convergence; delays; distributed control; regression analysis; stochastic systems; asynchronous delays; delayed stochastic gradient information; distributed delayed stochastic optimization; gradient-based distributed optimization; logistic regression; order-optimal convergence; statistical problems; Computer architecture; Conferences; Convergence; Convex functions; Delay; Optimization; Stochastic processes
Conference_Titel :
2012 IEEE 51st Annual Conference on Decision and Control (CDC)
Conference_Location :
Maui, HI
Print_ISBN :
978-1-4673-2065-8
Print_ISSN :
0743-1546
DOI :
10.1109/CDC.2012.6426626