PhD Defense of Mehmet Fatih Sahin
On April 27, 2023, Mehmet Fatih Sahin, a PhD student at the LIONS lab, successfully defended his PhD thesis. The thesis, entitled "Augmented Lagrangian Methods for Provable and Scalable Machine Learning", was supervised by Prof. Volkan Cevher.
Non-convex constrained optimization has become a powerful framework for modeling a wide range of machine learning problems, with applications in k-means clustering, large-scale semidefinite programs (SDPs), and various other tasks. As these methods have shown promising results on real-world tasks, developing efficient algorithms and understanding their theoretical properties has become increasingly important.
Augmented Lagrangian methods provide powerful tools for solving constrained problems by breaking them down into a sequence of relatively easier unconstrained subproblems. The theoretical properties of these methods have been extensively studied for problems in which both the objective function and the constraints are convex, and there have been efforts to establish convergence guarantees to first-order stationary points of constrained problems when only the objective function is non-convex. The scenario in which both the objective function and the constraints are non-convex, however, has yet to be sufficiently explored.
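The idea of trading a constrained problem for a sequence of unconstrained subproblems can be sketched in a few lines. The sketch below is a toy illustration under assumed parameters (problem, step sizes, and penalty value are all ours, not the thesis algorithm): it solves min ||x − a||² subject to sum(x) = 1 by alternating an approximate minimization of the augmented Lagrangian with a dual update.

```python
import numpy as np

# Toy augmented Lagrangian loop for
#   min_x f(x)  s.t.  c(x) = 0
# with f(x) = ||x - a||^2 and c(x) = sum(x) - 1.
# All parameter choices here are illustrative assumptions.

def augmented_lagrangian(a, beta=10.0, outer_iters=20, inner_iters=200, lr=0.01):
    n = len(a)
    x = np.zeros(n)
    lam = 0.0  # dual variable for the single equality constraint
    for _ in range(outer_iters):
        # Approximately minimize L(x, lam) = ||x-a||^2 + lam*c(x) + (beta/2)*c(x)^2
        # by plain gradient descent (the "relatively easier" subproblem).
        for _ in range(inner_iters):
            c = x.sum() - 1.0
            grad = 2 * (x - a) + (lam + beta * c) * np.ones(n)
            x -= lr * grad
        # Dual update: lam <- lam + beta * c(x)
        lam += beta * (x.sum() - 1.0)
    return x, lam

a = np.array([0.3, 0.8, 0.1])
x, lam = augmented_lagrangian(a)
# x should approach the analytic projection a - ((sum(a) - 1)/n) * ones
```

Each outer iteration tightens feasibility: the dual update penalizes the remaining constraint violation, so the next subproblem is pulled closer to the feasible set without ever requiring an exact projection.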
This thesis is dedicated to the development and theoretical analysis of algorithms that are grounded in the Augmented Lagrangian method, with an emphasis on efficiency and practicality.
First, we introduce a generic inexact augmented Lagrangian framework that allows either first-order or second-order sub-solvers for the subproblems, attaining convergence guarantees to first-order (or second-order) stationary points of the original constrained problem.
Second, we turn to a more specialized algorithm for solving semidefinite programming (SDP) problems by factorizing the decision variable. We establish the convergence rate of an augmented Lagrangian method that combines aspects of both linearized and inexact augmented Lagrangian methods.
Finally, we present a linearized alternating direction method of multipliers (ADMM) framework that is more practical, requiring only two consecutive proximal gradient steps on the primal variables and a gradient ascent step on the dual.
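Such a per-iteration structure can be sketched on a small toy problem (the problem, step sizes, and all names below are illustrative assumptions, not the algorithm analyzed in the thesis): min (1/2)||x − b||² + μ||z||₁ subject to x − z = 0, with one proximal-gradient step on each primal block followed by dual ascent.

```python
import numpy as np

# Linearized-ADMM-style sketch for
#   min_{x,z} (1/2)||x - b||^2 + mu*||z||_1   s.t.  x - z = 0.
# Per iteration: a gradient step on the smooth block x, a proximal
# gradient step on the nonsmooth block z, then dual gradient ascent.

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def linearized_admm(b, mu=0.5, beta=1.0, tau=0.5, iters=3000):
    n = len(b)
    x = np.zeros(n); z = np.zeros(n); y = np.zeros(n)
    for _ in range(iters):
        # Step on x: gradient of the smooth part of the augmented Lagrangian
        grad_x = (x - b) + y + beta * (x - z)
        x = x - tau * grad_x
        # Step on z: proximal gradient with prox of mu*||.||_1
        grad_z = -y - beta * (x - z)
        z = soft_threshold(z - tau * grad_z, tau * mu)
        # Dual gradient ascent on the constraint x - z = 0
        y = y + beta * (x - z)
    return x, z, y

b = np.array([1.0, 0.2, -0.8])
x, z, y = linearized_admm(b)
# z should approach the soft-thresholded solution soft_threshold(b, mu)
```

The practical appeal is that no subproblem is solved to accuracy: each iteration costs only two first-order primal steps and one dual step, rather than an inner loop of unknown length.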
Numerical evidence on various machine learning tasks, including causal learning, clustering, and maximum cut problems, demonstrates the practicality of the proposed algorithms.