New paper: Asynchronous SGD with arbitrary delays

Last updated on Aug 1, 2022 · 1 min read

My first ever optimization project was an ICML paper about an asynchronous gradient method. At the time, I was quite confused by the fact that, no matter what I did, asynchronous gradient descent still converged. Five years later, I can finally give an answer: because Asynchronous SGD doesn’t care about the delays, as we proved in our new paper (https://arxiv.org/abs/2206.07638). For a short summary, you can read my Twitter thread about the paper or check my slides.

Konstantin Mishchenko

Research Scientist