Bennet Gebken
Universität Paderborn
First- and second-order descent methods for nonsmooth optimization based on deterministic gradient sampling
Abstract.
In nonsmooth optimization, it is well known that the classical gradient is not a suitable object for describing the local behavior of a function. Instead, generalized derivatives from nonsmooth analysis, like the Clarke subdifferential, have to be employed. While, in theory, the Clarke subdifferential inherits many useful properties from the classical gradient, there is a large discrepancy in practice: it is unstable, and for a general locally Lipschitz continuous function, it is impossible to compute. Thus, in practice, the Clarke subdifferential has to be approximated. A simple strategy to achieve this, known as gradient sampling, approximates it by the convex hull of classical gradients evaluated at smooth points from a small neighborhood of a given point. In this talk, I will present two descent methods for nonsmooth optimization that are based on this idea. However, in contrast to the standard gradient sampling approach, where the gradients are sampled randomly, both methods will be deterministic. I will begin with a first-order method, where new gradients are computed using a bisection subroutine. Afterwards, I will demonstrate how the gradient sampling methodology can be generalized to second-order derivatives by sampling Hessian matrices in addition to gradients. This will lead to a second-order descent method that, at least in numerical experiments, shows a high speed of convergence.
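For concreteness, the following display sketches the standard gradient sampling construction that the abstract alludes to; the notation (the radius $\varepsilon$, the sample points $x_1, \dots, x_m$, and the set $W$) is illustrative and not taken from the talk. For a locally Lipschitz function $f$, the (Goldstein $\varepsilon$-)subdifferential at $x$ is approximated by the convex hull of gradients at nearby points where $f$ is differentiable,
\[
  \partial_\varepsilon f(x) \;\approx\; W := \operatorname{conv}\{\nabla f(x_1), \dots, \nabla f(x_m)\},
  \qquad x_i \in B_\varepsilon(x),
\]
and a candidate descent direction is obtained from the minimum-norm element of this approximation,
\[
  \bar{v} := \operatorname*{argmin}_{v \in W} \|v\|, \qquad d := -\bar{v}.
\]
If $d$ does not yield sufficient decrease, the approximation $W$ is enriched with further gradients; in the first-order method of the talk, this enrichment is carried out deterministically via the bisection subroutine mentioned above.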