Conference record number:
5263
Article title:
A Novel Decay Step Size For Stochastic Gradient Descent
Authors:
Soheil Shamaee, Mahsa (soheilshamaee@kashanu.ac.ir), Department of Computer Science, Faculty of Mathematical Science, University of Kashan, Kashan, Iran; Fathi Hafshejani, Sajad (sajad.fathihafshejan@uleth.ca), Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge, Canada.
Keywords:
Stochastic gradient descent, decay step size, convergence rate
Conference title:
54th Iranian Mathematics Conference
Persian abstract:
In this paper, we propose a modified version of the $\frac{1}{\sqrt{t}}$ step size and demonstrate its convergence rate of $O(\frac{\ln T}{\sqrt{T}})$ for smooth non-convex functions without the Polyak-Łojasiewicz condition. Through experiments on the FashionMNIST, CIFAR10, and CIFAR100 datasets, we show that the modified step size significantly improves both test accuracy and training loss in comparison to the original $\frac{1}{\sqrt{t}}$ step size.
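
For orientation, the baseline $\frac{1}{\sqrt{t}}$ step size mentioned in the abstract can be sketched as follows. This is only an illustration of the original schedule, not the authors' method: the abstract does not state the form of the modified step size, so it is not reproduced here. The function names, the initial step size `eta0`, and the toy objective $f(x) = \frac{1}{2}x^2$ (which is convex, unlike the non-convex setting of the paper) are all assumptions made for this example.

```python
import math
import random


def inv_sqrt_step_size(t, eta0=0.1):
    """Baseline schedule eta_t = eta0 / sqrt(t) for iteration t >= 1.

    eta0 is a hypothetical initial step size chosen for illustration.
    """
    if t < 1:
        raise ValueError("t must be >= 1")
    return eta0 / math.sqrt(t)


def sgd_quadratic(num_steps=1000, eta0=0.1, noise_std=0.1, seed=0):
    """Run SGD on the toy objective f(x) = 0.5 * x**2 with noisy gradients."""
    rng = random.Random(seed)
    x = 5.0  # arbitrary starting point
    for t in range(1, num_steps + 1):
        # The exact gradient of 0.5 * x**2 is x; Gaussian noise makes it stochastic.
        grad = x + rng.gauss(0.0, noise_std)
        x -= inv_sqrt_step_size(t, eta0) * grad
    return x


final_x = sgd_quadratic()
```

With this schedule the step size shrinks slowly (for example, it halves between iterations 1 and 4), which is the behavior the paper's modified step size is compared against.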