DSC 243 – Advanced Optimization

Lectures: MWF 10:00–10:50 AM in PCYNH 121

Office Hour: Monday 11:00 AM in HDSI 330

Course summary
Pre-requisites
Course outline
Grading policy
Textbooks and references

Course summary

This course develops the foundations of the modern optimization methods that form the backbone of machine learning and AI. Building on the prerequisite introductory course in optimization (DSC 211), this course focuses on the design, analysis, and practical behavior of algorithms for large-scale, high-dimensional problems. The emphasis is on gaining both theoretical insight and algorithmic intuition—understanding why methods work, how to choose between them, and how they scale to real-world settings. By the end of the course, students will be equipped to think critically about optimization in contemporary applications, from training machine learning models to solving complex decision and inference problems.

Pre-requisites

Introductory course on optimization (DSC 211 or equivalent)
Comfort with advanced linear algebra and multivariate calculus

Course outline

Week(s)	Topics	Lecture notes and HW
1	Gradient descent Acceleration by Chebyshev stepsizes Conjugate Gradient method	Week 1 Notes HW1
2–3	Proximal gradient descent and accelerated gradient descent for convex optimization Lower complexity bounds for convex optimization Mirror descent and accelerated mirror descent Optimization of relatively smooth functions
4–5	Mirror descent for nonsmooth convex functions Frank-Wolfe algorithm as subgradient method in the dual
7	Smoothing algorithms Monotone operators and variational inequalities Extragradient and Chambolle-Pock algorithms
8	Stochastic gradient method Coordinate descent Variance reduction: SVRG, SPIDER, STORM Adaptive algorithms: ADAGRAD, ADAM, RMSPROP
9–10	SGD and STORM for nonconvex stochastic optimization Polyak–Łojasiewicz condition and convergence of gradient systems Examples in deep learning, sampling, control, and reinforcement learning

Grading policy

All grades will be based on homework sets, which will be due roughly every two weeks.

Canvas

I will use Canvas only for communicating by email with the class. All course material will appear on this webpage.

Textbooks and references

This course will largely be self-contained with no required textbook. Relevant sources and follow-up reading material will be cited during the lectures. In particular, I will draw on some material in these sources:

A. Beck. Introduction to Nonlinear Optimization – Theory, Algorithms and Applications. SIAM, 2014.
S. Bubeck. Convex Optimization: Algorithms and Complexity.
S. Boyd and L. Vandenberghe. Convex Optimization.
D. Drusvyatskiy. Convex Analysis and Nonsmooth Optimization.
J. Duchi. Introductory Lectures on Stochastic Convex Optimization.
Y. T. Lee and S. Vempala. Techniques in Optimization and Sampling.
G. Lan. First-order and Stochastic Optimization Methods for Machine Learning. Springer, 2020.
Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Springer, 2004.
B. Recht and S. J. Wright. Optimization for Modern Data Analysis. Cambridge University Press, 2022.
A. Sidford. Optimization Algorithms.