Fermat's Theorem Revisited: A Novel Characterization of Optimality

Prof. Thomas Weber © 2016 EPFL

Prof. Thomas Weber © 2016 EPFL

A recent paper by Prof. Weber, forthcoming in the Journal of Optimization Theory and Applications, introduces a new approach to one-dimensional optimization, which leads to a simple and complete characterization of all optimal solutions to the problem of globally maximizing a differentiable function f on an interval [a,b]. Since high school many know that setting the derivative to zero, i.e., using Fermat’s Theorem, does little in the way of finding the global maximum of the function on the interval. The reason is that the derivative f’ is zero wherever f happens to have a local extremum or a saddle point, or wherever the function has a stretch of constancy. Moreover, if the function attains its largest value at an interval endpoint, e.g., at the point b of the interval [a,b], Fermat’s Theorem does not need to hold at all. Additional higher-order optimality conditions, for example those obtained by checking the sign of the function’s second derivative f’’ at a point where f’ is zero— also taught in high school, similarly cannot guarantee GLOBAL optimality, not to mention that they additionally require higher-order differentiability of the function f to even be applicable.

To maximize a differentiable function f on an interval [a,b], Fermat’s Theorem states that any solution x, not equal to either a or b, must be such that the derivative f’ at that point vanishes, i.e., x must be such that f’(x)=0. It is well known that this rule says nothing about the optimality of a point where the derivative vanishes. Indeed, the derivative of the function vanishes at any local minimum, saddle point, or at a local maximum, all of which may not be optimal. It would also be zero whenever f is just constant. Thus, instead of limiting attention to the behavior of the function around a given point, Prof. Weber noticed that thinking “globally” helps in finding a solution to the global optimization problem and, moreover, can be used to describe all possible solutions exactly.

Very simply put, f is globally maximal on [a,b] at a point x if and only if there is no right-sided improvement available on the interval [x,b] and no left-sided improvement on the interval [a,x]. By introducing “adjoint variables” – similar to the ones commonly used in optimal control problems – to quantify the one-sided improvement of the objective function f relative to a given point x, characterizing an optimal solution becomes simple: x is optimal if and only if the adjoint variables are zero. And this simple rule holds also at the interval endpoints. Regarding the question of how to compute the adjoint variables, Prof. Weber’s paper, entitled “Global Optimization on an Interval,” proposes an iterative solution of a differential equation with initial condition. It is shown that this “adjoint equation” has a unique solution which is usually obtained after a small number of iterations. The new characterization of global optimality can be used in the future to prove specific results regarding, for example, the comparative statics of globally optimal solutions with respect to changes in problem parameters. In a related paper, Prof. Weber uses the approach to solve a general cash-flow switching problem, which can be interpreted as a deterministic version of the well-known “multi-armed bandit problem.”

The results were presented at several international academic conferences earlier this year, notably at the 14th EUROPT Workshop on Advances in Continuous Optimization in Warsaw, Poland, and at the Mathematical Optimization Down Under (MODU) Workshop in Melbourne, Australia.

Links