Corelab Seminar

Vaggelis Chatziafratis
On the Computational Power of Online Gradient Descent

How efficiently can we compute the weight vector of online gradient descent after T steps? We prove that the evolution of weight vectors in online gradient descent can encode arbitrary polynomial-space computations, even in very simple learning settings. Our results imply that, under weak complexity-theoretic assumptions, it is impossible to reason efficiently about the fine-grained behavior of online gradient descent. Specifically, during the talk, we will see how even in the extremely simple learning setting of soft-margin SVMs (support vector machines), the weight updates can encode an instance of the PSPACE-complete C-Path problem. Our reduction and our results also extend to simple ReLU neural networks.
(Full paper:
Based on joint work with Tim Roughgarden (Columbia) and Josh Wang (Google).