Berliner Oberseminar Optimization, Control and Inverse Problems

Institute
Head: Caroline Geiersbach, René Henrion, Michael Hintermüller, Dietmar Hömberg, Gabriele Steidl, and Andrea Walther
Number of talks: 9
Mon, 22.04.24 at 13:30
WIAS ESH
Approximations of Rockafellians, Lagrangians, and Dual Functions. The case for solving surrogates instead of actual optimization problems
Abstract. Optimization problems are notorious for being unstable in the sense that small changes in their parameters can cause large changes in solutions. However, Rockafellian relaxations, Lagrangian relaxations, and dual problems are typically more stable. While focusing on the nonconvex case, we develop sufficient conditions under which approximations of Rockafellian relaxations, Lagrangian relaxations, and dual problems converge, epigraphically or hypographically, to limiting counterparts, and we quantify the rate of convergence. The conditions are milder than those required by approximations of the actual problems, confirming the importance of these surrogate problems. We illustrate the results in the context of composite problems, stochastic optimization, and Rockafellians constructed by augmentation.
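For orientation, a schematic set of definitions (one common sign convention; conventions vary across references): given an actual problem $\min_{x} f_0(x)$, a Rockafellian is a function $f:\mathbb{R}^m\times\mathbb{R}^n\to(-\infty,+\infty]$ anchored at the actual problem via $f(0,x) = f_0(x)$, with $u \in \mathbb{R}^m$ modelling perturbations of the problem data. Then
\[ \text{Rockafellian relaxation (for fixed } \bar{y}\text{):}\quad \min_{u\in\mathbb{R}^m,\,x\in\mathbb{R}^n} f(u,x) + \langle \bar{y}, u\rangle, \qquad \text{Lagrangian:}\quad l(x,y) = \inf_{u\in\mathbb{R}^m}\big\{ f(u,x) + \langle y, u\rangle \big\}, \]
\[ \text{dual function:}\quad \psi(y) = \inf_{x\in\mathbb{R}^n} l(x,y), \qquad \text{weak duality:}\quad \sup_{y} \psi(y) \;\le\; \inf_{x} f_0(x). \]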
Mon, 13.11.23 at 13:30
WIAS ESH
Differential Inclusions and Optimal Control on Wasserstein spaces
Abstract. Optimal control on Wasserstein spaces addresses the control of systems with a large number of agents. Recently, many models arising in the social sciences have used these metric spaces of Borel probability measures. The aim of this talk is to demonstrate that, for Lipschitz-type dynamics, some cornerstone results of classical control theory known in the Euclidean framework have their analogues in Wasserstein spaces. In this talk I will first discuss an extension of the theory of differential inclusions to the setting of general Wasserstein spaces. Indeed, it is well known that for optimal control of ODEs, the theory of differential inclusions provides useful tools to investigate existence of optimal controls, necessary optimality conditions, and Hamilton-Jacobi-Bellman equations. The same holds in Wasserstein spaces. In particular, I will present necessary and sufficient conditions for the existence of solutions to state-constrained continuity inclusions from [2], building on a suitable notion of contingent cones in Wasserstein spaces and leading to viability and invariance theorems. They were already applied in [5], [6] to investigate stability of controlled continuity equations and uniqueness of solutions to HJB equations, and will be recalled at the end of the talk.
References
[1] BONNET B. & FRANKOWSKA H., Carathéodory Theory and a Priori Estimates for Continuity Inclusions in the Space of Probability Measures, preprint, https://arxiv.org/pdf/2302.00963.pdf, 2023.
[2] BONNET B. & FRANKOWSKA H., On the Viability and Invariance of Proper Sets under Continuity Inclusions in Wasserstein Spaces, SIAM Journal on Mathematical Analysis, to appear.
[3] BONNET B. & FRANKOWSKA H., Differential Inclusions in Wasserstein Spaces: The Cauchy-Lipschitz Framework, Journal of Differential Equations 271: 594-637, 2021.
[4] BONNET B. & FRANKOWSKA H., Mean-Field Optimal Control of Continuity Equations and Differential Inclusions, Proceedings of the 59th IEEE Conference on Decision and Control, Republic of Korea, December 8-11, 2020: 470-475, 2020.
[5] BONNET B. & FRANKOWSKA H., Viability and Exponentially Stable Trajectories for Differential Inclusions in Wasserstein Spaces, Proceedings of the 61st IEEE Conference on Decision and Control, Mexico, December 6-9, 2022: 5086-5091, 2022.
[6] BADREDDINE Z. & FRANKOWSKA H., Solutions to Hamilton-Jacobi Equation on a Wasserstein Space, Calculus of Variations and PDEs 81: 9, 2022.
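As a schematic reminder of the central object (details as in the references above), a continuity inclusion in the Wasserstein space $\mathcal{P}_2(\mathbb{R}^d)$ prescribes the dynamics of a curve of measures $t \mapsto \mu_t$ through a set-valued map $V$:
\[ \partial_t \mu_t \;\in\; -\mathrm{div}\big( V(\mu_t)\,\mu_t \big), \]
meaning that there exists a Borel selection $v_t(\cdot) \in V(\mu_t)(\cdot)$ such that $\partial_t \mu_t + \mathrm{div}(v_t\,\mu_t) = 0$ holds in the sense of distributions; this is the measure-theoretic analogue of the Euclidean differential inclusion $\dot{x}(t) \in F(x(t))$.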
Wed, 21.06.23 at 13:00
WIAS ESH
The Geometry of Adversarial Machine Learning
Abstract. It is well known that, despite their success at complicated tasks like image classification, modern neural networks are prone to imperceptible input perturbations (a.k.a. adversarial attacks) which can lead to severe misclassifications. Adversarial training is a state-of-the-art method to train classifiers that are more robust against these adversarial attacks. The method features minimization of a robust risk and has interpretations as a game-theoretic problem, a distributionally robust optimization problem, the dual of an optimal transport problem, or a nonlocal geometric regularization problem. In this talk I will focus on the last interpretation, which allows for the application of tools from the calculus of variations and geometric measure theory to study existence, regularity, and asymptotic behavior of minimizers. In particular, I will show that adversarial training of binary agnostic classifiers is equivalent to a nonlocal and weighted perimeter regularization of the decision boundary. Furthermore, I will show Gamma-convergence of this perimeter to a local anisotropic perimeter as the strength of the adversary tends to zero, thereby establishing an asymptotic regularization effect of adversarial training. Lastly, I will discuss probabilistic relaxations of adversarial training which exhibit better clean accuracies and also have a perimeter-regularization interpretation.
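Schematically (constants and the precise perimeter functional are as in the talk's setting), adversarial training minimizes the robust risk
\[ R_\varepsilon(f) \;=\; \mathbb{E}_{(x,y)\sim\mu}\Big[ \sup_{\|x' - x\| \le \varepsilon} \ell\big(f(x'), y\big) \Big], \]
and for binary agnostic classifiers with the 0-1 loss this is equivalent to a geometric problem of the form
\[ \min_{A} \; R_0(A) \;+\; \varepsilon\,\mathrm{Per}_\varepsilon(A), \]
where $A$ is the decision region, $R_0$ the clean (unperturbed) risk, and $\mathrm{Per}_\varepsilon$ a nonlocal, data-weighted perimeter of the decision boundary.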
Mon, 30.05.22 at 15:00
Computational Imaging and Sensing: Theory and Applications
Abstract. The revolution in sensing, with the emergence of many new imaging techniques, offers the possibility of gaining unprecedented access to the physical world, but this revolution can only bear fruit through the skilful interplay between the physical and computational realms. This is the domain of computational imaging, which advocates that, to develop effective imaging systems, it will be necessary to go beyond the traditional decoupled imaging pipeline where device physics, image processing and the end-user application are considered separately. Instead, we need to rethink imaging as an integrated sensing and inference model. In the first part of the talk we highlight the centrality of sampling theory in computational imaging and investigate new sampling modalities which are inspired by the emergence of new sensing mechanisms. We discuss time-based sampling, which is connected to event-based cameras where pixels behave like neurons and fire when an event happens. We derive sufficient conditions and propose novel algorithms for the perfect reconstruction of classes of non-bandlimited functions from time-based samples. We then develop the interplay between learning and computational imaging and present a model-based neural network for the reconstruction of video sequences from events. The architecture of the network is model-based and is designed using the unfolding technique; some elements of the acquisition device are part of the network and are learned with the reconstruction algorithm. In the second part of the talk, we focus on the heritage sector, which is experiencing a digital revolution driven in part by the increasing use of non-invasive, non-destructive imaging techniques. These new imaging methods provide a way to capture information about an entire painting and can give us information about features at or below the surface of the painting. We focus on Macro X-Ray Fluorescence (XRF) scanning, which is a technique for the mapping of chemical elements in paintings, and introduce a method that can process XRF scanning data from paintings. The results presented show the ability of our method to detect and separate weak signals related to hidden chemical elements in the paintings. We analyse the results on Leonardo's 'The Virgin of the Rocks' and show that our algorithm is able to reveal, more clearly than ever before, the hidden drawings of a previous composition that Leonardo then abandoned for the painting that we can now see. This is joint work with R. Alexandru, R. Wang, Siying Liu, J. Huang and Y. Su from Imperial College London; C. Higgitt and N. Daly from The National Gallery in London; and Thierry Blu from the Chinese University of Hong Kong.
Bio: Pier Luigi Dragotti is Professor of Signal Processing in the Electrical and Electronic Engineering Department at Imperial College London and a Fellow of the IEEE. He received the Laurea degree (summa cum laude) in Electronic Engineering from the University Federico II, Naples, Italy, in 1997; the Master degree in Communications Systems from the Swiss Federal Institute of Technology of Lausanne (EPFL), Switzerland, in 1998; and the PhD degree from EPFL, Switzerland, in 2002. He has held several visiting positions. In particular, he was a visiting student at Stanford University, Stanford, CA in 1996, a summer researcher in the Mathematics of Communications Department at Bell Labs, Lucent Technologies, Murray Hill, NJ in 2000, a visiting scientist at the Massachusetts Institute of Technology (MIT) in 2011, and a visiting scholar at Trinity College Cambridge in 2020. Dragotti was Editor-in-Chief of the IEEE Transactions on Signal Processing (2018-2020), Technical Co-Chair for the European Signal Processing Conference in 2012, and Associate Editor of the IEEE Transactions on Image Processing from 2006 to 2009. He was also an elected member of the IEEE Computational Imaging Technical Committee and the recipient of an ERC starting investigator award for the project RecoSamp. Currently, he is an IEEE SPS Distinguished Lecturer. His research interests include sampling theory, wavelet theory and its applications, computational imaging and sparsity-driven signal processing.
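As a toy illustration of the time-based sampling idea mentioned in the abstract (not the speaker's algorithm), the following sketch records an event time whenever the running integral of a signal crosses a threshold, which is the basic integrate-and-fire model behind event-driven sampling; the function name and parameters are illustrative only.

    import numpy as np

    def integrate_and_fire(signal, dt, threshold):
        """Emit an event time whenever the running integral of the signal
        crosses the threshold in magnitude, then reset the integrator."""
        event_times, acc = [], 0.0
        for k, x in enumerate(signal):
            acc += x * dt                      # accumulate the signal
            if abs(acc) >= threshold:          # threshold crossing -> event
                event_times.append(k * dt)
                acc = 0.0                      # reset after firing
        return np.array(event_times)

    # Example: sample a slow sinusoid with time-based (non-uniform) events.
    t = np.linspace(0.0, 1.0, 1000)
    x = np.sin(2 * np.pi * 3 * t) + 1.2        # strictly positive test signal
    events = integrate_and_fire(x, dt=t[1] - t[0], threshold=0.05)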
Mon, 06.12.21 at 15:00
Bilevel learning for inverse problems
Abstract. In recent years, novel optimization ideas have been applied to several inverse problems in combination with machine learning approaches, to improve the inversion by optimally choosing different quantities/functions of interest. A fruitful approach in this sense is bilevel optimization, where the inverse problems are considered as lower-level constraints, while on the upper level a loss function based on a training set is used. When confronted with inverse problems with nonsmooth regularizers or nonlinear operators, however, the bilevel optimization problem structure becomes quite involved to analyze, as classical nonlinear or bilevel programming results cannot be directly utilized. In this talk, I will discuss the different challenges that these problems pose and provide some analytical results as well as a numerical solution strategy.
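In schematic form (notation introduced here for illustration), a bilevel learning problem for an inverse problem $y = Au + \text{noise}$, with training pairs $(u_i^\dagger, y_i)$ and a regularizer $R_\theta$ depending on the quantities $\theta$ to be learned, reads
\[ \min_{\theta} \; \frac{1}{N}\sum_{i=1}^{N} \big\| u_i(\theta) - u_i^\dagger \big\|^2 \quad \text{subject to} \quad u_i(\theta) \in \operatorname*{arg\,min}_{u} \; \tfrac{1}{2}\| A u - y_i \|^2 + R_\theta(u), \quad i = 1,\dots,N. \]
Nonsmooth regularizers $R_\theta$ or a nonlinear forward operator $A$ are precisely what make the lower-level constraints difficult to treat with classical bilevel programming results.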
Mon, 05.07.21 at 15:00
Computing disconnected bifurcation diagrams of partial differential equations
Abstract. Computing the distinct solutions $u$ of an equation $f(u, \lambda) = 0$ as a parameter $\lambda \in \mathbb{R}$ is varied is a central task in applied mathematics and engineering. The solutions are captured in a bifurcation diagram, plotting (some functional of) $u$ as a function of $\lambda$. In this talk I will present a new algorithm, deflated continuation, for this task. Deflated continuation has three advantages. First, it is capable of computing disconnected bifurcation diagrams; previous algorithms only aimed to compute the part of the bifurcation diagram continuously connected to the initial data. Second, its implementation is very simple: it only requires a minor modification to an existing Newton-based solver. Third, it can scale to very large discretisations if a good preconditioner is available; no auxiliary problems need to be solved. We will present applications to hyperelastic structures, liquid crystals, and Bose-Einstein condensates, among others.
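A minimal sketch of the deflation idea behind deflated continuation (a scalar toy example under simplifying assumptions, not the talk's PDE implementation): once a solution $r$ is known, Newton's method is applied to the deflated residual $G(u) = M(u; r)\,F(u)$ with $M(u; r) = 1/\|u - r\|^p + \sigma$, so the same initial guess can converge to a different solution.

    import numpy as np

    def deflate(F, known_roots, p=2, shift=1.0):
        """Deflated residual G(u) = [prod_r (1/|u - r|^p + shift)] * F(u)."""
        def G(u):
            m = 1.0
            for r in known_roots:
                m *= 1.0 / abs(u - r) ** p + shift
            return m * F(u)
        return G

    def newton(G, u, tol=1e-10, maxit=200, h=1e-7):
        """Plain Newton iteration with a central finite-difference derivative."""
        for _ in range(maxit):
            g = G(u)
            if abs(g) < tol:
                return u
            u -= g * 2 * h / (G(u + h) - G(u - h))
        raise RuntimeError("Newton did not converge")

    # f(u) = u^3 - u has solutions -1, 0, 1; deflating the ones already found
    # lets the same initial guess u = 0.5 reach all three, one after another.
    F = lambda u: u**3 - u
    roots = []
    for _ in range(3):
        roots.append(newton(deflate(F, roots), 0.5))
    print(sorted(roots))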
Mon, 14.06.21 at 15:00
Data driven large-scale convex optimisation
Abstract. This joint work with Jevgenjia Rudzusika (KTH), Sebastian Banert (Lund University) and Jonas Adler (DeepMind) introduces a framework for using deep learning to accelerate optimisation solvers with convergence guarantees. The approach builds on ideas from the analysis of accelerated forward-backward schemes, like FISTA. Instead of the classical approach of proving convergence for a choice of parameters, such as a step size, we show convergence whenever the update is chosen in a specific set. Rather than picking a point in this set through a handcrafted method, we train a deep neural network to pick the best update. The method is applicable to several smooth and non-smooth convex optimisation problems, and it outperforms established accelerated solvers.
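For context, a minimal NumPy sketch of the classical, handcrafted FISTA iteration for the LASSO problem $\min_x \tfrac12\|Ax - b\|^2 + \lambda\|x\|_1$; the learned approach described above keeps this forward-backward structure but lets a trained network choose the update within a set that preserves the convergence guarantee.

    import numpy as np

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def fista(A, b, lam, n_iter=500):
        """Classical FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
        L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the smooth part's gradient
        x = z = np.zeros(A.shape[1])
        t = 1.0
        for _ in range(n_iter):
            x_new = soft_threshold(z - A.T @ (A @ z - b) / L, lam / L)   # forward-backward step
            t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            z = x_new + (t - 1.0) / t_new * (x_new - x)                  # handcrafted extrapolation
            x, t = x_new, t_new
        return x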
Mon, 03.05.21 at 15:00
A Machine Learning Framework for Mean Field Games and Optimal Control
Abstract. We consider the numerical solution of mean field games and optimal control problems whose state space dimension is in the tens or hundreds. In this setting, most existing numerical solvers are affected by the curse of dimensionality (CoD). To mitigate the CoD, we present a machine learning framework that combines the approximation power of neural networks with the scalability of Lagrangian PDE solvers. Specifically, we parameterize the value function with a neural network and train its weights using the objective function with additional penalties that enforce the Hamilton-Jacobi-Bellman equations. A key benefit of this approach is that no training data is needed; that is, no numerical solutions to the problem need to be computed before training. We illustrate our approach and its efficacy using numerical experiments. To show the framework's generality, we consider applications such as optimal transport, deep generative modeling, mean field games for crowd motion, and multi-agent optimal control.
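In schematic form (signs, the Hamiltonian $H$, and the running and terminal costs $L$, $G$ depend on the problem class), the value function is parameterized by a network $\Phi_\theta$ and trained on the control objective plus penalties enforcing the Hamilton-Jacobi-Bellman equation along sampled trajectories $z_t$:
\[ \min_{\theta} \; \mathbb{E}\Big[ \int_0^T L(z_t, u_t)\,dt + G(z_T) \Big] \;+\; \beta\,\mathbb{E}\Big[ \int_0^T c_{\mathrm{HJB}}\big(\Phi_\theta; z_t, t\big)\,dt \Big], \]
where $c_{\mathrm{HJB}}$ measures the pointwise violation of the HJB equation (e.g. $|\partial_t\Phi_\theta + H(z_t, \nabla_x\Phi_\theta)|$ in a suitable sign convention), so no precomputed solutions are needed as training data.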
Mon, 29.03.21 at 15:00
On a multilevel Levenberg-Marquardt method for the training of artificial neural networks and its application to the solution of partial differential equations
Abstract. We propose a new multilevel Levenberg-Marquardt optimizer for the training of artificial neural networks with a quadratic loss function. When the least-squares problem arises from the training of artificial neural networks, the variables subject to optimization are not related by any geometrical constraints, so the standard interpolation and restriction operators can no longer be employed. A heuristic, inspired by algebraic multigrid methods, is therefore proposed to construct the multilevel transfer operators. We test the new optimizer on an important application: the approximate solution of partial differential equations by means of artificial neural networks. The learning problem is formulated as a least-squares problem, choosing the nonlinear residual of the equation as a loss function, while the multilevel method is employed for training. Numerical experiments show encouraging results regarding the efficiency of the new multilevel optimization method compared to the corresponding one-level procedure in this context.
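As context, a minimal one-level Levenberg-Marquardt sketch for a least-squares training problem $\min_w \tfrac12\|r(w)\|^2$; the multilevel variant discussed in the talk additionally builds coarse-level corrections via algebraically constructed transfer operators, which is not shown here.

    import numpy as np

    def levenberg_marquardt(residual, jacobian, w0, mu=1.0, n_iter=50):
        """Basic one-level Levenberg-Marquardt for min_w 0.5*||r(w)||^2."""
        w = np.asarray(w0, dtype=float)
        for _ in range(n_iter):
            r, J = residual(w), jacobian(w)
            # Damped Gauss-Newton step: (J^T J + mu*I) s = -J^T r
            s = np.linalg.solve(J.T @ J + mu * np.eye(w.size), -J.T @ r)
            if 0.5 * np.sum(residual(w + s) ** 2) < 0.5 * np.sum(r ** 2):
                w, mu = w + s, max(mu / 2.0, 1e-12)   # accept step, relax damping
            else:
                mu *= 2.0                              # reject step, increase damping
        return w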