From shallow to deep: rigorous guarantees for training neural networks

Friday, July 27, 2018, 11:00 am - 12:00 pm PST
6th Floor Conf Rm (#689)
This event is open to the public.
AI Seminar
Mahdi Soltanolkotabi, USC

Neural network architectures (a.k.a. deep learning) have recently emerged as powerful tools for automatic knowledge extraction from data, leading to major breakthroughs in applications ranging from visual object classification to speech recognition and natural language processing. Despite their wide empirical use, the mathematical reasons for the success of these architectures remain a mystery. One challenge is that training neural networks corresponds to extremely high-dimensional and nonconvex optimization problems, and it is not clear how to provably solve them to global optimality. While training neural networks is known to be intractable in general, simple local search heuristics are often surprisingly effective at finding global or high-quality optima on real or randomly generated data. In this talk I will discuss some results explaining the success of these heuristics. First, I will discuss results characterizing the training landscape of single hidden layer networks, demonstrating that when the number of hidden units is sufficiently large, the optimization landscape has favorable properties that guarantee global convergence of (stochastic) gradient descent to a model with zero training error. Second, I will introduce a de-biased variant of gradient descent called Centered Gradient Descent (CGD). I will show that, unlike gradient descent, CGD enjoys fast convergence guarantees for arbitrarily deep convolutional neural networks with large stride lengths.


Mahdi Soltanolkotabi is currently an assistant professor in the Ming Hsieh Department of Electrical Engineering at the University of Southern California. Prior to joining USC, he completed his PhD in electrical engineering at Stanford in 2014. He was a postdoctoral researcher in the EECS department at UC Berkeley during the 2014-2015 academic year. Mahdi is a recipient of the 2017 Google Faculty Research Award and the 2018 AFOSR Young Investigator Award. His research focuses on the design and mathematical understanding of computationally efficient algorithms for optimization, high-dimensional statistics, machine learning, signal processing, and computational imaging. A main focus of his research has been developing and analyzing algorithms for nonconvex optimization with provable guarantees of convergence to global optima.
