Research Seminar on Mathematical Statistics

Math. Statistics

Wed, 16.07.25 at 10:00

Weierstrass-Insti...

Claudia Strauch Universität Heidelberg

Statistical analysis of reflected diffusions and random-time reversals in generative modelling

Math. Statistics

Wed, 09.07.25 at 10:00

HVP 11 a, R.313, ...

Vincent Rivoirard Université Dauphine, Paris

Math. Statistics

Wed, 02.07.25 at 10:00

HVP 11 a, R.313, ...

Frank Konietschke Charité Berlin

Math. Statistics

Wed, 02.07.25 at 10:00

Weierstrass-Insti...

Merle Munko Otto-von-Guericke University Magdeburg

Multiple Tests for Mean Functions of Functional Data

Abstract.

Functional data analysis is becoming increasingly popular to study data from real-valued random functions. Nevertheless, there is a lack of multiple testing procedures for such data. These are particularly important in factorial designs for comparing different groups or inferring factor effects. We propose a new class of testing procedures for arbitrary linear hypotheses in general factorial designs with functional data. Our methods allow global as well as multiple inference of both univariate and multivariate mean functions without assuming particular error distributions or homoscedasticity. That is, we allow for different structures of the covariance functions between groups. We analyse the (joint) asymptotic behaviour of suitable test statistics and propose a resampling approach to approximate the limit distributions. The resulting global and multiple testing procedures are asymptotically valid under weak conditions and applicable in general functional MANOVA settings. We evaluate their small-sample performance in extensive simulations and finally illustrate their applicability by analysing a data set.

Math. Statistics

Wed, 25.06.25 at 10:00

HVP 11 a, R.313, ...

Ryan Tibshirani University of California, Berkeley

Math. Statistics

Wed, 18.06.25 at 10:00

HVP 11 a, R.313, ...

Lasse Vuursteen University of Pennsylvania

Adaptive Estimation under Differential Privacy Constraints

Abstract.

Estimation guarantees in nonparametric models typically depend on underlying function classes (or hyperparameters) that are seldom known in practice. Adaptive estimators provide simultaneous near-optimal performance across multiple such function classes. In this talk, I will discuss recent work with co-authors Tony Cai and Abhinav Chakraborty, in which we study adaptation under differential privacy constraints. Differential privacy fundamentally limits the information that can be revealed about each individual datum by each data holder. We develop a general theory for adaptation under differential privacy in the context of estimating linear functionals of a density. Our framework characterizes the difficulty of private adaptation problems through a specific 'between-class modulus of continuity' that exactly describes the optimal achievable performance for private estimators that must adapt across two or more function classes. Our theory reveals and quantifies the extent to which adaptation between specific function classes suffers as a consequence of imposing differential privacy constraints.

Math. Statistics

Wed, 11.06.25 at 10:00

HVP 11 a, R.313

Dimitri Konen University of Cambridge

Data assimilation with the 2D Navier-Stokes equations: Optimal Gaussian asymptotics for the posterior measure

Abstract.

A functional Bernstein–von Mises theorem is proved for posterior measures arising in a data assimilation problem with the two-dimensional Navier-Stokes equation where a Gaussian process prior is assigned to the initial condition of the system. The posterior measure, which provides the update in the space of all trajectories arising from a discrete sample of the dynamics, is shown to be approximated by a Gaussian random function arising from the solution to a linear parabolic PDE with Gaussian initial condition. The approximation holds in the strong sense of the supremum norm on the regression functions, showing that predicting future states of Navier-Stokes systems admits $\sqrt{N}$-consistent estimators even for commonly used nonparametric models. Consequences for credible bands and uncertainty quantification are discussed, and a functional minimax theorem is derived that describes the Cramér–Rao lower bound for estimating the state of the non-linear system, which is attained by the data assimilation algorithm.

Math. Statistics

Wed, 04.06.25 at 10:00

HVP 11 a, R.313, ...

Sebastian Kassing TU Berlin

Stochastic Optimization: From Quadratic Programs to the Training of Neural Networks

Abstract.

Many foundational results in (stochastic) optimization—ranging from convergence guarantees and rates to asymptotic normality—have been derived under strong convexity assumptions or even for quadratic objective functions. However, such assumptions often fail to hold in modern machine learning applications, where objectives are typically non-convex. This talk explores a recent line of research that extends classical results in stochastic gradient-based optimization to broader classes of functions satisfying the Polyak–Łojasiewicz (PL) inequality, a condition that is significantly more relevant for practical deep learning models. We consider typical acceleration techniques such as Polyak’s Heavy Ball and Ruppert-Polyak averaging and use a geometric interpretation of the PL-inequality to show that many algorithmic properties extend to a more general and realistic class of objectives.

Math. Statistics

Wed, 28.05.25 at 10:00

HVP 11 a, R.313, ...

Jason Klusowski Princeton University

Statistical-computational Trade-offs for Recursive Adaptive Partitioning Estimators

Abstract.

Recursive adaptive partitioning estimators, like decision trees and their ensembles, are effective for high-dimensional regression but usually rely on greedy training, which can become stuck at suboptimal solutions. We study this phenomenon in estimating sparse regression functions over binary features, showing that when the true function satisfies a certain structural property, greedy training achieves low estimation error with only a logarithmic number of samples in the feature count. However, when this property is absent, estimation becomes exponentially more difficult. Interestingly, this dichotomy between efficient and inefficient estimation resembles the behavior of two-layer neural networks trained with SGD in the mean-field regime. Meanwhile, ERM-trained recursive adaptive partitioning estimators always achieve low estimation error with logarithmically many samples, revealing a fundamental statistical-computational trade-off for greedy training.

Math. Statistics

Wed, 21.05.25 at 10:00

HVP 11 a, R.313, ...

Yi Yu University of Warwick

Contextual Dynamic Pricing: Algorithms, Optimality and Local Differential Privacy Constraints

Abstract.

We study contextual dynamic pricing problems where a firm sells products to $T$ sequentially-arriving consumers, behaving according to an unknown demand model. The firm aims to minimize its regret over a clairvoyant that knows the model in advance. The demand follows a generalized linear model (GLM), allowing for stochastic feature vectors in $\mathbb{R}^d$ encoding product and consumer information. We first show the optimal regret is of order $\sqrt{dT}$, up to logarithmic factors, improving existing upper bounds by a $\sqrt{d}$ factor. This optimal rate is materialized by two algorithms: an explore-then-commit (ETC) algorithm and a confidence bound-type algorithm. A key insight is an intrinsic connection between dynamic pricing and contextual multi-armed bandit problems with many arms with a careful discretization. We further extend our study to adversarial contexts and propose algorithms that are statistically and computationally more efficient than existing methods in the literature. We further study contextual dynamic pricing under local differential privacy (LDP) constraints. We propose a stochastic gradient descent-based ETC algorithm achieving regret upper bounds of order $d\sqrt{T}/\epsilon$, up to logarithmic factors, where $\epsilon>0$ is the privacy parameter. The upper bounds with and without LDP constraints are matched by newly constructed minimax lower bounds, characterizing costs of privacy. Moreover, we extend our study to dynamic pricing under mixed privacy constraints, improving the privacy-utility tradeoff by leveraging public data. This is the first time such a setting is studied in the dynamic pricing literature and our theoretical results seamlessly bridge dynamic pricing with and without LDP. Extensive numerical experiments and real data applications are conducted to illustrate the efficiency and practical value of our algorithms.

Math. Statistics

Wed, 14.05.25 at 10:00

HVP 11 a, R.313, ...

Vladimir Spokoiny WIAS/HU Berlin

Estimation and inference for Deep Neuronal Networks and inverse problems

Abstract.

The talk discusses two important issues in modern high-dimensional statistics. The success of DNNs in practical applications is at the same time a great challenge for statistical theory due to the 'curse of dimensionality' problem. Manifold-type assumptions are not really helpful and do not explain the double descent phenomenon, in which the DNN accuracy improves with overparametrization. We offer a different view on the problem based on the notion of effective dimension and a calming device. The idea is to decouple the structural DNN relation by extending the parameter space and to use a proper regularization without any substantial increase of the effective dimension. The other related issue is the choice of regularization in inverse problems. We show that a simple ridge penalty (Tikhonov regularization) does a good job in any inverse problem for which the operator is more regular than the unknown signal. In the opposite case, one should use a model reduction technique like spectral cut-off.

Math. Statistics

Wed, 30.04.25 at 10:00

HVP 11 a, R.313

Sebastian Kassing TU Berlin

Math. Statistics

Wed, 30.04.25 at 10:00

HVP 11 a, R.313, ...

Ratmir Miftachov HU Berlin

Early Stopping for Regression Trees

Abstract.

We develop early stopping rules for growing regression tree estimators. The fully data-driven stopping rule is based on monitoring the global residual norm. The best-first search and the breadth-first search algorithms together with linear interpolation give rise to generalized projection or regularization flows. A general theory of early stopping is established. Oracle inequalities for the early-stopped regression tree are derived without any smoothness assumption on the regression function, assuming the original CART splitting rule, yet with a much broader scope. The remainder terms are of smaller order than the best achievable rates for Lipschitz functions in dimension . In real and synthetic data the early stopping regression tree estimators attain the statistical performance of cost-complexity pruning while significantly reducing computational costs.

Math. Statistics

Wed, 12.02.25 at 10:00

WIAS Erhard-Schmi...

Judith Rousseau University of Oxford / Paris Dauphine-PSL University

Convergence of diffusion models under the manifold hypothesis in high-dimensions

Abstract.

Denoising Diffusion Probabilistic Models (DDPM) are powerful state-of-the-art methods used to generate synthetic data from high-dimensional data distributions and are widely used for image, audio and video generation as well as many more applications in science and beyond. The manifold hypothesis states that high-dimensional data often lie on lower-dimensional manifolds within an ambient space of large dimension $D$, and is widely believed to hold in provided examples. While recent results have provided invaluable insight into how diffusion models adapt to the manifold hypothesis, they do not capture the great empirical success of these models. In this work, we study DDPMs under the manifold hypothesis and prove that they achieve rates independent of the ambient dimension in terms of learning the score. In terms of sampling, we obtain rates independent of the ambient dimension w.r.t. the Kullback-Leibler divergence, and $O(\sqrt{D})$ w.r.t. the Wasserstein distance. We do this by developing a new framework connecting diffusion models to the well-studied theory of extrema of Gaussian Processes. This is joint work with I. Azangulov and G. Deligiannidis (University of Oxford).

Math. Statistics

Wed, 05.02.25 at 10:00

WIAS Erhard-Schmi...

Sophie Langer University of Twente

Deep learning theory - what's next?

Abstract.

For several years, deep learning has been emerging as a transformative field, with its theory involving several disciplines such as approximation theory, statistics and optimization. Despite remarkable advances, the rapid evolution of AI-driven methods continually outpaces our theoretical understanding. New challenges, from overparametrization and diffusion models to Transformer learning, arise almost yearly, underscoring the gap between theory and practice. In this talk, we delve into key theoretical breakthroughs, with a particular focus on statistical results. We critically question the prevailing frameworks and introduce a novel statistical approach to image analysis. Rather than treating images as high-dimensional data entities, our framework reconceptualizes them as structured objects shaped by geometric deformations like shifts, scales, and orientations. The goal of the classification rule is then to learn the uninformative deformations, resulting in convergence rates with more favorable tradeoffs between input dimension and sample size. This fresh perspective not only provides new guarantees for approximation and convergence in deep learning-based image classification but also redefines how we approach image analysis, with the potential for broader applications to other learning tasks. We conclude by discussing emerging research directions and reflecting on the role of theory in the field. This talk is based on joint work with Johannes Schmidt-Hieber and Juntong Chen.

Math. Statistics

Wed, 22.01.25 at 10:00

HVP 11 a, R.313

Vincent Rivoirard Université Dauphine, Paris

PCA for point processes

Math. Statistics

Wed, 15.01.25 at 10:00

online event only

Xiaorui Zuo National University of Singapore

Cryptos have rough volatility and correlated jumps

Math. Statistics

Wed, 08.01.25 at 10:00

HVP 11 a, R.313

Johannes Schmidt-Hieber University of Twente

Statistical estimation using zeroth-order optimization

Abstract.

In this talk, we study statistical properties of zeroth-order optimization schemes, which do not have access to the gradient of the loss and rely solely on evaluating the loss function. Such methods are often considered to be suboptimal for high-dimensional problems, as their convergence rates to the minimizer of the objective function are typically slower than those of gradient-based methods. This performance gap becomes more pronounced as the number of parameters increases. Considering the linear model, we show that reusing the same data point for multiple zeroth-order updates can overcome the gap in the estimation rates. Additionally, we demonstrate that zeroth-order optimization methods can achieve the optimal estimation rate when only queries from the linear regression model are available. Special attention will be given to the non-standard minimax lower bound in the query model. This is joint work with Thijs Bos, Niklas Dexheimer and Wouter Koolen.

Math. Statistics

Wed, 11.12.24 at 10:00

HVP 11 a, R.313

Chloé Rouyer Universität Potsdam

Foundations of online learning for easy and worst-case data.

Abstract.

Online learning is a well-studied framework used to represent learning problems where the learner only has access to one data point at a time and has to learn sequentially. This problem is particularly challenging in the bandit framework, which is a repeated game between the learner and the environment. In this game, the learner is faced with a list of actions and the environment generates losses associated with these actions. Then, the learner repeatedly needs to play an action within this list in order to minimize their cumulative loss, but they can only observe the loss associated with the action they played. This means that at each round, the learner has to balance exploration (gathering information on less studied actions) and exploitation (using the already gathered information to play an action with a supposed small loss). Developing learner strategies for this problem depends on the assumptions made on the environment. There have been two major lines of research in this field, one assuming that these losses follow some unknown stochastic distributions and the other only assuming that these losses are bounded and independent of the learner's actions. In this talk, we introduce the recent field of best-of-both-worlds sequential learning, which aims to develop algorithms that are optimal for both types of losses simultaneously.

Math. Statistics

Wed, 27.11.24 at 10:00

HVP 11 a, R.313

Julien Chhor Toulouse School of Economics

Locally sharp goodness-of-fit testing in sup norm for high-dimensional counts

Math. Statistics

Wed, 20.11.24 at 10:00

HVP 11 a, R.313

Bernhard Stankewitz Universität Potsdam

Contraction rates for conjugate gradient and Lanczos approximate posteriors in Gaussian process regression

Math. Statistics

Wed, 06.11.24 at 10:00

WIAS Erhard-Schmi...

Vladimir Spokoiny WIAS and HU Berlin

Math. Statistics

Wed, 30.10.24 at 10:00

HVP 11 a, R.313

Olga Klopp ESSEC Business School, Paris

Adaptive density estimation under low-rank constraints

Abstract.

In this talk, we address the challenge of bivariate probability density estimation under low-rank constraints for both discrete and continuous distributions. For discrete distributions, we model the target as a low-rank probability matrix. In the continuous case, we assume the density function is Lipschitz continuous over an unknown compact rectangular support and can be decomposed into a sum of K separable components, each represented as a product of two one-dimensional functions. We introduce an estimator that leverages these low-rank constraints, achieving significantly improved convergence rates. We also derive lower bounds for both discrete and continuous cases, demonstrating that our estimators achieve minimax optimal convergence rates within logarithmic factors.

Math. Statistics

Wed, 23.10.24 at 10:00

HVP 11 a, R.313

Weining Wang University of Groningen

Conditional nonparametric variable screening by neural factor regression

Abstract.

High-dimensional covariates often admit a linear factor structure. To effectively screen correlated covariates in high dimensions, we propose a conditional variable screening test based on non-parametric regression using neural networks due to their representation power. We ask whether individual covariates have additional contributions given the latent factors or, more generally, a set of variables. Our test statistics are based on the estimated partial derivative of the regression function of the candidate variable for screening and an observable proxy for the latent factors. Hence, our test reveals how much predictors contribute additionally to the non-parametric regression after accounting for the latent factors. Our derivative estimator is the convolution of a deep neural network regression estimator and a smoothing kernel. We demonstrate that when the neural network size diverges with the sample size, unlike estimating the regression function itself, it is necessary to smooth the partial derivative of the neural network estimator to recover the desired convergence rate for the derivative. Moreover, our screening test achieves asymptotic normality under the null after finely centering our test statistics, which makes the biases negligible, as well as consistency for local alternatives under mild conditions. We demonstrate the performance of our test in a simulation study and two real-world applications.

Math. Statistics

Wed, 16.10.24 at 10:00

HVP 11 a, R.313

Botond Szabo Bocconi Milan

Privacy constrained semiparametric inference

Abstract.

For semi-parametric problems, differentially private estimators are typically constructed on a case-by-case basis. In this work we develop a privacy constrained semi-parametric plug-in approach, which can be used in general, over a collection of semi-parametric problems. We derive minimax lower and matching upper bounds for this approach and provide an adaptive procedure in case of irregular (atomic) functionals. Joint work with Lukas Steinberger (Vienna) and Thibault Randrianarisoa (Toronto, Vector Institute).

Math. Statistics

Wed, 10.07.24 at 10:00

WIAS Erhard-Schmi...

Anya Katsevich MIT, Cambridge, MA

Laplace asymptotics in high-dimensional Bayesian inference

Abstract.

Computing integrals against a high-dimensional posterior is the major computational bottleneck in Bayesian inference. A popular technique to reduce this computational burden is to use the Laplace approximation (LA), a Gaussian distribution, in place of the true posterior. We derive a new, leading order asymptotic decomposition of integrals against a high-dimensional Laplace-type posterior which provides valuable insight into the accuracy of the LA in high dimensions. In particular, we determine the tight dimension dependence of the approximation error, leading to the tightest known Bernstein–von Mises result on the asymptotic normality of the posterior. The decomposition also leads to a simple modification to the LA which yields a higher-order accurate approximation to the posterior. Finally, we prove the validity of the high-dimensional Laplace asymptotic expansion to arbitrary order, which opens the door to approximating the partition function, of use in high-dimensional model selection and many other applications beyond statistics.

Math. Statistics

Wed, 03.07.24 at 10:00

WIAS Erhard-Schmi...

Celine Duval Université de Lille

Geometry of excursion sets: Computing the surface area from discretized points

Abstract.

The excursion sets of a smooth random field carry relevant information in their various geometric measures. After an introduction of these geometrical quantities showing how they are related to the parameters of the field, we focus on the problem of discretization. From a computational viewpoint, one never has access to the continuous observation of the excursion set, but rather to observations at discrete points in space. It has been reported that for specific regular lattices of points in dimensions 2 and 3, the usual estimate of the surface area of the excursions remains biased even when the lattice becomes dense in the domain of observation. We show that this limiting bias is invariant to the locations of the observation points and that it only depends on the ambient dimension. (Based on joint works with H. Biermé, R. Cotsakis, E. Di Bernardino and A. Estrade.)

Math. Statistics

Wed, 26.06.24 at 10:00

R. 3.13 in HVP 11a

Clément Berenfeld Universität Potsdam

A theory of stratification learning

Abstract.

Given an i.i.d. sample from a stratified mixture of immersed manifolds of different dimensions, we study the minimax estimation of the underlying stratified structure. We provide a constructive algorithm that estimates each mixture component adaptively at its optimal dimension-specific rate. The method is based on an ascending hierarchical co-detection of points belonging to different layers, which also identifies the number of layers and their dimensions, assigns each data point to a layer accurately, and estimates tangent spaces optimally. These results hold regardless of any ambient assumption on the manifolds or on their intersection configurations. They open the way to a broad clustering framework, where each mixture component models a cluster emanating from a specific nonlinear correlation phenomenon.

Math. Statistics

Wed, 12.06.24 at 10:00

R.406, 4th floor

Marc Hallin Université Libre de Bruxelles

The long quest for quantiles and ranks in $\mathbb{R}^d$ and on manifolds

Abstract.

Quantiles are a fundamental concept in probability, and an essential tool in statistics, from descriptive to inferential. Still, despite half a century of attempts, no satisfactory and fully agreed-upon definition of the concept, and the dual notion of ranks, is available beyond the well-understood case of univariate variables and distributions. The need for such a definition is particularly critical for variables taking values in $\mathbb{R}^d$, for directional variables (values on the hypersphere), and, more generally, for variables with values on manifolds. Unlike the real line, indeed, no canonical ordering is available on these domains. We show how measure transportation brings a solution to this problem by characterizing distribution-specific (data-driven, in the empirical case) orderings and center-outward distribution and quantile functions (ranks and signs in the empirical case) that satisfy all the properties expected from such concepts while reducing, in the case of real-valued variables, to the classical univariate notion.

Math. Statistics

Wed, 05.06.24 at 10:00

WIAS Erhard-Schmi...

Jia-Jie Zhu WIAS Berlin

Wasserstein and beyond: Optimal transport and gradient flows for machine learning and optimization

Abstract.

In the first part of the talk, I will provide an overview of gradient flows over non-negative and probability measures and their application in modern machine learning tasks, such as variational inference, sampling, training of over-parameterized models, and robust optimization. Then, I will present our recent results on the analysis of a couple of particularly relevant gradient flows, including the settings of Wasserstein, Hellinger/Fisher-Rao, and reproducing kernel Hilbert space. The focus is on the global exponential decay of the entropy functionals along the gradient flows such as Hellinger-Kantorovich (a.k.a. Wasserstein-Fisher-Rao) and a new type of gradient flow geometries that guarantee convergence of minimizing a maximum-mean discrepancy, which we term the interaction-force transport.

Math. Statistics

Wed, 29.05.24 at 10:00

WIAS Erhard-Schmi...

Tailen Hsing University of Michigan

A functional-data perspective in spatial data analysis

Abstract.

More and more spatiotemporal data nowadays can be viewed as functional data. The first part of the talk focuses on the Argo data, which is a modern oceanography dataset that provides unprecedented global coverage of temperature and salinity measurements in the upper 2,000 meters of depth of the ocean. I will discuss a functional kriging approach to predict temperature and salinity as a smooth function of depth, as well as a co-kriging approach for predicting oxygen concentration based on temperature and salinity data. In the second part of the talk, I will give an overview of some related topics, including spectral density estimation and variable selection for functional data.

Math. Statistics

Wed, 22.05.24 at 10:00

WIAS Erhard-Schmi...

Vladimir Spokoiny WIAS Berlin

Gaussian variational inference in high dimension

Abstract.

We consider the problem of approximating a high-dimensional distribution by a Gaussian one by minimizing the Kullback-Leibler divergence. The main result extends Katsevich and Rigollet (2023) and claims that the minimiser can be well approximated by the Gaussian distribution with the same mean and variance as the underlying measure. We also describe the accuracy of approximation and the range of applicability for such an approximation in terms of the effective dimension. The obtained results can be used for the analysis of various sampling schemes in optimization.

Math. Statistics

Wed, 15.05.24 at 10:00

WIAS Erhard-Schmi...

Fabian Telschow HU Berlin

Estimation of the expected Euler characteristic of excursion sets of random fields and applications to simultaneous confidence bands

Abstract.

The expected Euler characteristic (EEC) of excursion sets of a smooth Gaussian-related random field over a compact manifold can be used to approximate the distribution of its supremum for high thresholds. Viewed as a function of the excursion threshold, the EEC of a Gaussian-related field is expressed by the Gaussian kinematic formula (GKF) as a finite sum of known functions multiplied by the Lipschitz–Killing curvatures (LKCs) of the generating Gaussian field. In the first part of this talk we present consistent estimators of the LKCs as linear projections of "pinned" Euler characteristic (EC) curves obtained from realizations of zero-mean, unit variance Gaussian processes. As observed data is seldom Gaussian, we generalize these LKC estimators by an unusual use of the Gaussian multiplier bootstrap to obtain consistent estimates of the LKCs of Gaussian limiting fields of non-stationary statistics. In the second part, we explain applications of LKC estimation and the GKF to simultaneous familywise error rate inference, for example, by constructing simultaneous confidence bands and CoPE sets for spatial functional data over complex domains, such as fMRI and climate data, and discuss their benefits and drawbacks compared to other methodologies.

Math. Statistics

Wed, 08.05.24 at 10:00

WIAS Erhard-Schmi...

Georg Keilbar HU Berlin and Ratmir Miftachov HU Berlin

Math. Statistics

Wed, 24.04.24 at 10:00

WIAS Erhard-Schmi...

Nicolas Verzelen INRAE Montpellier

Computational trade-offs in high-dimensional clustering

Math. Statistics

Wed, 17.04.24 at 10:00

WIAS Erhard-Schmi...

Gil Kur ETH Zürich

Connections between minimum norm interpolation and local theory of Banach spaces

Math. Statistics

Wed, 14.02.24 at 10:00

WIAS Erhard-Schmi...

Martin Wahl Universität Bielefeld

Heat kernel PCA with applications to Laplacian eigenmaps

Abstract.

Laplacian eigenmaps and diffusion maps are nonlinear dimensionality reduction methods that use the eigenvalues and eigenvectors of (un)normalized graph Laplacians. Both methods are applied when the data is sampled from a low-dimensional manifold, embedded in a high-dimensional Euclidean space. From a mathematical perspective, the main problem is to understand these empirical Laplacians as spectral approximations of the underlying Laplace-Beltrami operator. In this talk, we study Laplacian eigenmaps through the lens of kernel PCA, and consider the heat kernel as a reproducing kernel feature map. This leads to novel points of view and allows us to leverage results for empirical covariance operators in infinite dimensions.

Math. Statistics

Wed, 07.02.24 at 10:00

WIAS Erhard-Schmi...

Evgenii Chzhen LMO Orsay, Paris

Math. Statistics

Wed, 31.01.24 at 10:00

WIAS 406, 4th floor

Gianluca Finocchio Universität Wien

An extended latent factor framework for ill-posed linear regression

Abstract.

The classical latent factor model for linear regression is extended by assuming that, up to an unknown orthogonal transformation, the features consist of subsets that are relevant and irrelevant for the response. Furthermore, a joint low-dimensionality is imposed only on the relevant features vector and the response variable. This framework allows for a comprehensive study of the partial-least-squares (PLS) algorithm under random design. In particular, a novel perturbation bound for PLS solutions is proven and the high-probability L²-estimation rate for the PLS estimator is obtained. This novel framework also sheds light on the performance of other regularisation methods for ill-posed linear regression that exploit sparsity or unsupervised projection. The theoretical findings are confirmed by numerical studies on both real and simulated data.

Math. Statistics

Wed, 24.01.24 at 10:00

WIAS Erhard-Schmi...

Simon Wood University of Edinburgh

On neighbourhood cross validation

Abstract.

Cross validation comes in many varieties, but some of the more interesting flavours require multiple model fits with consequently high cost. This talk shows how the high cost can be side-stepped for a wide range of models estimated using a quadratically penalized smooth loss, with rather low approximation error. Once the computational cost has the same leading order as a single model fit, it becomes feasible to efficiently optimize the chosen cross-validation criterion with respect to multiple smoothing/precision parameters. Interesting applications include cross-validating smooth additive quantile regression models, and the use of leave-out-neighbourhood cross validation for dealing with nuisance short range autocorrelation. The link between cross validation and the jackknife can be exploited to obtain reasonably well calibrated uncertainty quantification in these cases.

Math. Statistics

Wed, 17.01.24 at 10:00

WIAS Erhard-Schmi...

Matteo Giordano Università degli Studi di Torino

Likelihood methods for low frequency diffusion data

Abstract.

The talk will consider the problem of nonparametric inference in multi-dimensional diffusion models from low-frequency data. Implementation of likelihood-based procedures in such settings is a notoriously delicate task, due to the computational intractability of the likelihood. For the nonlinear inverse problem of inferring the diffusivity in a stochastic differential equation, we propose to exploit the underlying PDE characterisation of the transition densities, which allows the numerical evaluation of the likelihood via standard numerical methods for elliptic eigenvalue problems. A simple Metropolis-Hastings-type MCMC algorithm for Bayesian inference on the diffusivity is then constructed, based on Gaussian process priors. Furthermore, the PDE approach also yields a convenient characterisation of the gradient of the likelihood via perturbation techniques for parabolic PDEs, allowing the construction of gradient-based inference methods including MLE and Langevin-type MCMC. The performance of the algorithms is illustrated via the results of numerical experiments. Joint work with Sven Wang.

Math. Statistics

Wed, 10.01.24 at 10:00

WIAS Erhard-Schmi...

Eric Moulines Ecole Polytechnique

Score-based diffusion models and applications

Abstract.

Deep generative models represent an advanced frontier in machine learning. These models are adept at fitting complex data sets, whether they consist of images, text or other forms of high-dimensional data. What makes them particularly noteworthy is their ability to provide independent samples from these complicated distributions at a cost that is both computationally efficient and resource efficient. However, the task of accurately sampling a target distribution presents significant challenges. These challenges often arise from the high dimensionality, multimodality or a combination of these factors. This complexity can compromise the effectiveness of traditional sampling methods and make the process either computationally prohibitive or less accurate. In my talk, I will address recent efforts in this area that aim to improve traditional inference and sampling algorithms. My major focus will be on score-based diffusion models. By utilizing the concept of score matching and time-reversal of stochastic differential equations, they offer a novel and powerful approach to generating high-quality samples. I will discuss how these models work, their underlying principles and how they are used to overcome the limitations of conventional methods. The talk will also cover practical applications, demonstrating their versatility and effectiveness in solving complex real-world problems.

Math. Statistics

Fri, 15.12.23 at 10:00

WIAS Erhard-Schmi...

Laura Sangalli MOX Milano, Italy

Physics-informed spatial and functional data analysis

Abstract.

Recent years have seen an explosive growth in the recording of increasingly complex and high-dimensional data, whose analysis calls for the definition of new methods, merging ideas and approaches from statistics and applied mathematics. My talk will focus on spatial and functional data observed over non-Euclidean domains, such as linear networks, two-dimensional manifolds and non-convex volumes. I will present an innovative class of methods, based on regularizing terms involving Partial Differential Equations (PDEs), defined over the complex domains being considered. These Physics-Informed statistical learning methods enable the inclusion of the available problem-specific information, suitably encoded in the regularizing PDE. Illustrative applications from environmental and life sciences will be presented.

Math. Statistics

Wed, 13.12.23 at 10:00

WIAS Erhard-Schmi...

Boris Buchmann ANU Canberra, Australia

Weak subordination of multivariate Lévy processes

Abstract.

Subordination is the operation which evaluates a Lévy process at a subordinator, giving rise to a pathwise construction of a "time-changed" process. In probability semigroups, subordination was applied to create the variance gamma process, which is prominently used in financial modelling. However, subordination may not produce a Lévy process unless the subordinate has independent components or the subordinator has indistinguishable components. We introduce a new operation known as weak subordination that always produces a Lévy process by assigning the distribution of the subordinate conditional on the value of the subordinator, and matches traditional subordination in law in the cases above. Weak subordination is applied to extend the class of variance-generalised gamma convolutions and to construct the weak variance-alpha-gamma process. The latter process exhibits a wider range of dependence than is attainable with traditional subordination. Joint work with Kevin W. Lu (Australian National University, Australia) and Dilip B. Madan (University of Maryland, USA).

Math. Statistics

Wed, 29.11.23 at 10:00

R. 3.13 in HVP 11a

Martin Spindler Universität Hamburg

High-dimensional L2-boosting: Rate of convergence (hybrid talk)

Math. Statistics

Wed, 22.11.23 at 10:00

WIAS 406, 4th floor

Marc Hoffmann Université Paris-Dauphine

On estimating multidimensional diffusions from discrete data

Math. Statistics

Wed, 08.11.23 at 10:00

WIAS Erhard-Schmi...

Sven Wang Humboldt-Universität zu Berlin

Math. Statistics

Wed, 01.11.23 at 10:00

WIAS Erhard-Schmi...

Victor Panaretos EPFL Lausanne

Optimal transport for covariance operators

Abstract.

Covariance operators are fundamental in functional data analysis, providing the canonical means to analyse functional variation via the celebrated Karhunen-Loève expansion. These operators may themselves be subject to variation, for instance in contexts where multiple functional populations are to be compared. Statistical techniques to analyse such variation are intimately linked with the choice of metric on covariance operators and the intrinsic infinite-dimensionality of these operators. I will describe how the geometry and tools of optimal transportation can be leveraged to construct natural and effective statistical summaries and inference tools for covariance operators, taking full advantage of the nature of their ambient space. Based on joint work with Valentina Masarotto (Leiden), Leonardo Santoro (EPFL), and Yoav Zemel (EPFL).

Math. Statistics

Wed, 25.10.23 at 10:00

WIAS Erhard-Schmi...

Denis Belomestny Universität Duisburg-Essen

Provable benefits of policy learning from human preferences

Abstract.

A crucial task in reinforcement learning (RL) is reward construction. In practice, there is often no obvious choice of reward function. Thus, a popular approach is to introduce human feedback during training and leverage such feedback to learn a reward function. Among all policy learning methods that use human feedback, preference-based methods have demonstrated substantial success in recent empirical applications such as InstructGPT. In this work, we develop a theory that provably shows the benefits of preference-based methods in tabular and linear MDPs. The main idea of our method is to use KL-regularization with respect to the learned policy to ensure more stable learning.
