PhD Defense of Paul Rolland

Defense of Paul Rolland © V. Cevher / 2022 EPFL

On October 14, 2022, Paul Rolland, a PhD student at the LIONS lab, successfully defended his PhD thesis. The thesis, entitled "Predicting in Uncertain Environments: Methods for Robust Machine Learning", was supervised by Prof. Volkan Cevher.

Paul's thesis aims at developing algorithms for training models that are robust to potential distribution shifts between the training and the testing environments. Three different approaches are considered.
1- The first approach involves an adversary that can modify the environment, or the predictor itself, during training. The two models (learner and adversary) are trained simultaneously, leading to a minimax problem. By allowing the players to act probabilistically, this becomes a linear minimax problem over distributions, which ensures the existence of a (mixed-strategy) Nash equilibrium. It can then be solved using sampling-based algorithms that exploit Langevin dynamics (a toy sketch is given after this list).
2- The second approach uses regularizers. Adversarial robustness is directly related to the Lipschitz constant of the classifier, which quantifies the worst-case sensitivity of the output to changes in the input. We first propose an algorithm for upper bounding this constant in the case of neural networks. We then propose an algorithm for penalizing this constant during training via 1-path-norm regularization (see the second sketch after this list).
3- The final approach concerns the choice of features used to train the model. Adversarial attacks are believed to arise from the high dimensionality of the data and the presence of "non-robust" features exploited by the model. Hence, choosing the right variables to train on can be key to robustness. In particular, using causal features leads to models that are unaffected by certain distribution shifts (illustrated in the third sketch after this list). We therefore propose an algorithm for inferring the causal structure from observational data.
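
To make the first approach concrete, here is a minimal sketch, not the thesis implementation: a toy zero-sum game in which each player's mixed strategy is represented by a cloud of particles updated with Langevin dynamics. The payoff, step size, temperature, and particle count are all made-up illustrative choices.

```python
# Sketch: approximating a mixed-strategy equilibrium of a toy zero-sum game
# by running Langevin dynamics on particle populations that represent each
# player's mixed strategy (illustration only, not the thesis algorithm).
import numpy as np

rng = np.random.default_rng(0)

# Toy payoff f(x, y) = x*y + 0.05*x^2 - 0.05*y^2:
# the learner controls x and minimizes f, the adversary controls y and maximizes f.
def grad_x(x, y):
    return y + 0.1 * x          # df/dx

def grad_y(x, y):
    return x - 0.1 * y          # df/dy

n_particles, steps = 256, 2000
eta, temp = 1e-2, 1e-2          # step size and Langevin temperature (made up)

# Each player's mixed strategy is a cloud of particles.
x = rng.normal(size=n_particles)
y = rng.normal(size=n_particles)

for _ in range(steps):
    # Because the problem is linear in the distributions, each particle only
    # needs to react to the opponent's *average* strategy; the Gaussian noise
    # term is what makes the update a Langevin step rather than plain
    # gradient descent/ascent.
    gx = grad_x(x, y.mean())
    gy = grad_y(x.mean(), y)
    x = x - eta * gx + np.sqrt(2 * eta * temp) * rng.normal(size=n_particles)
    y = y + eta * gy + np.sqrt(2 * eta * temp) * rng.normal(size=n_particles)

print("learner strategy mean:", x.mean(), "| adversary strategy mean:", y.mean())
```

For this simple payoff the particle clouds concentrate around the equilibrium at the origin; the point of the sketch is only to show the particle-plus-noise structure of the sampling-based solver.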
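For the second approach, the following sketch shows where a 1-path-norm penalty enters a training step for a two-layer ReLU network. It is written in PyTorch under our own assumptions (architecture, the regularization weight lam, and the random batch are hypothetical); it is not the thesis code. For a two-layer network without biases, the 1-path norm sums, over every input-to-output path, the product of the absolute weights along that path, which upper bounds the Lipschitz constant with respect to the l-infinity norm when the activation is 1-Lipschitz.

```python
# Sketch: penalizing the 1-path norm of a two-layer ReLU network during training.
import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):
    def __init__(self, d_in=20, d_hidden=64, d_out=10):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

    def path_norm_1(self):
        # |W2| @ |W1| accumulates, for every (output, input) pair, the sum of
        # absolute weight products over all paths through the hidden layer.
        return (self.fc2.weight.abs() @ self.fc1.weight.abs()).sum()

model = TwoLayerNet()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
lam = 1e-3                      # regularization strength (hypothetical value)

# One training step on random data, just to show where the penalty enters.
x = torch.randn(32, 20)
y = torch.randint(0, 10, (32,))
opt.zero_grad()
loss = loss_fn(model(x), y) + lam * model.path_norm_1()
loss.backward()
opt.step()
```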
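For the third approach, the sketch below does not implement the causal-discovery algorithm itself; it only illustrates, on synthetic data, why causal features matter for robustness: a regression on the causal parent of the target stays accurate when the data-generating mechanism shifts, while a regression on a non-causal (effect) feature degrades badly. All variable names, coefficients, and noise scales are made up.

```python
# Sketch: causal vs. non-causal features under a distribution shift.
import numpy as np

rng = np.random.default_rng(0)

def sample(n, effect_scale):
    x_cause = rng.normal(size=n)                              # causal parent of y
    y = 2.0 * x_cause + 0.1 * rng.normal(size=n)              # target
    x_effect = effect_scale * y + 0.1 * rng.normal(size=n)    # child of y
    return x_cause, x_effect, y

def fit(x, y):
    # Ordinary least squares with a single feature (slope + intercept).
    A = np.stack([x, np.ones_like(x)], axis=1)
    return np.linalg.lstsq(A, y, rcond=None)[0]

def mse(coef, x, y):
    return np.mean((coef[0] * x + coef[1] - y) ** 2)

# Train in an environment where the effect feature is highly predictive ...
xc, xe, y = sample(5000, effect_scale=1.0)
coef_cause, coef_effect = fit(xc, y), fit(xe, y)

# ... then shift the mechanism that generates the effect feature.
xc_s, xe_s, y_s = sample(5000, effect_scale=-0.5)
print("causal-feature MSE after shift:", mse(coef_cause, xc_s, y_s))   # stays small
print("effect-feature MSE after shift:", mse(coef_effect, xe_s, y_s))  # blows up
```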