This workshop aims to bring together researchers in stochastic analysis, statistics and theoretical machine learning for an exchange of ideas at the forefront of the field. The workshop will coincide with the visit of Professor Gerard Ben Arous, a leading expert in stochastic analysis and high-dimensional statistics, whose insights into deep learning theory offer an exceptional opportunity for meaningful collaboration. The event will feature a series of presentations and discussions on the mathematical underpinnings of modern machine learning, including:

  • Theoretical analysis of deep learning architectures
  • High-dimensional statistics and learning theory
  • Diffusion models
  • Connections between stochastic differential equations and neural networks

Confirmed speakers

Deniz Akyildiz, Gerard Ben Arous, Nicola M. Cirone, Arnaud Doucet, Andrew Duncan, James Foster, Yingzhen Li, Harald Oberhauser, Will Turner

Schedule

9.15 – 9.30  Registration and Welcome

9.30 – 10.00  Andrew Duncan

10.00 – 10.30  Nicola M. Cirone

10.30 – 11.00  Deniz Akyildiz

11.00 – 11.30  Coffee break

11.30 – 12.15  Gerard Ben Arous

12.15 – 12.45  Will Turner

12.45 – 14.00  Lunch @ The Works, Sir Michael Uren Building, London W12 0BZ (by invitation only)

14.00 – 14.45  Arnaud Doucet

14.45 – 15.15  James Foster

15.15 – 15.45  Coffee break

15.45 – 16.30  Harald Oberhauser

16.30 – 17.00  Yingzhen Li

18.30 – Conference dinner @ The Broadcaster, 89 Wood Ln, London W12 7FX (by invitation only)

Titles and abstracts


Deniz Akyildiz

Title: Diffusion-based Learning of Latent Variable Models 

Abstract: In this talk, I will summarize recent progress and challenges in maximum marginal likelihood estimation (MMLE) for learning latent variable models (LVMs), focusing on methods based on Langevin diffusions. I will first introduce the problem and the necessary background on Langevin diffusions, together with recent results on Langevin-based MMLE estimators, detailing the interacting particle Langevin algorithm (IPLA), a recent Langevin-based MMLE method with explicit theoretical guarantees akin to those for Langevin Monte Carlo methods. I will then outline recent progress, specifically accelerated variants and methods for MMLE in non-differentiable statistical models, with convergence and complexity results. Finally, if time permits, I will discuss the application of IPLA to inverse problems.
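
For orientation, the flavour of such interacting-particle Langevin schemes can be conveyed by a minimal sketch on a toy Gaussian latent variable model; the model, step size and particle number below are illustrative assumptions, not taken from the talk.

import numpy as np

# Minimal sketch in the spirit of an interacting-particle Langevin scheme for MMLE,
# on an assumed toy Gaussian LVM: x ~ N(theta, 1), y | x ~ N(x, 1).
# All choices here (model, step size, particle number) are illustrative.

rng = np.random.default_rng(0)
y = 1.5                      # single observation
N = 100                      # number of latent particles
gamma = 1e-2                 # step size
theta = 0.0                  # parameter to be estimated
X = rng.normal(size=N)       # latent particles

def grad_x(theta, x, y):
    # d/dx log p_theta(x, y) = (y - x) - (x - theta)
    return (y - x) - (x - theta)

def grad_theta(theta, x, y):
    # d/dtheta log p_theta(x, y) = x - theta
    return x - theta

for _ in range(5000):
    # Latent update: one Langevin step in x for each particle
    X = X + gamma * grad_x(theta, X, y) + np.sqrt(2 * gamma) * rng.normal(size=N)
    # Parameter update: gradient averaged over particles, noise scaled by 1/sqrt(N)
    theta = (theta + gamma * grad_theta(theta, X, y).mean()
             + np.sqrt(2 * gamma / N) * rng.normal())

print(theta)  # should settle near the marginal likelihood maximiser (theta* = y here)

The parameter chain sees the gradient averaged over the particle cloud with noise damped by 1/sqrt(N), so for large N it concentrates around a maximiser of the marginal likelihood.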


Yingzhen Li

Title: On the Identifiability of Switching Dynamical Systems

Abstract: One of my research dreams is to build a high-resolution video generation model that enables fine-grained control over, e.g., the scene appearance and the interactions between objects. I tried, and then realised that the deep learning tricks I had to invent for this goal were needed because of the non-identifiability of my sequential deep generative models. In this talk I will discuss our research towards developing identifiable deep generative models for sequence modelling, and share some recent and ongoing work on switching dynamical models. In particular, we first show identifiability conditions for Markov Switching Models (or autoregressive HMMs) with non-linear transitions, using a new proof technique that differs from the algebraic approach of the seminal HMM identifiability work by Allman et al. (2009). We then lift the Markov Switching Model to latent space and leverage existing results to show identifiability. If time permits, I will also show recent developments that build more flexible structures into the latent switching dynamical prior.
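
As a point of reference, a Markov switching model with non-linear transitions can be sketched generatively as follows; the regime transition matrix, transition functions and noise level are illustrative assumptions, not taken from the talk.

import numpy as np

# Toy generative sketch of a Markov switching model with non-linear transitions:
# a discrete regime z_t follows a Markov chain, and the observed sequence follows
# x_t = f_{z_t}(x_{t-1}) + noise. Purely illustrative choices throughout.

rng = np.random.default_rng(0)

P = np.array([[0.95, 0.05],           # regime transition matrix
              [0.10, 0.90]])
fs = [lambda x: np.tanh(2.0 * x),     # non-linear transition, regime 0
      lambda x: 0.5 * x + np.sin(x)]  # non-linear transition, regime 1
sigma = 0.1                           # observation noise level

T = 200
z = np.zeros(T, dtype=int)
x = np.zeros(T)
for t in range(1, T):
    z[t] = rng.choice(2, p=P[z[t - 1]])
    x[t] = fs[z[t]](x[t - 1]) + sigma * rng.normal()

Identifiability asks when the regimes, transition functions and noise can be recovered (up to natural symmetries) from the law of the observed sequence x alone.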

James Foster

Title: Efficient, Accurate and Stable Gradients for Neural Differential Equations

Abstract: Neural differential equations (NDEs) sit at the intersection of two dominant modelling paradigms – neural networks and differential equations. One of their key features is that they can be trained with a small memory footprint through adjoint equations. This can be helpful in high-dimensional applications, since the memory usage of standard backpropagation scales linearly with depth (or, in the NDE case, with the number of steps taken by the solver). However, adjoint equations have seen little use in practice, as the resulting gradients are often inaccurate. Fortunately, a class of numerical methods has emerged which allows NDEs to be trained using gradients that are both accurate and memory efficient. These solvers are known as “algebraically reversible” and produce numerical solutions which can be reconstructed backwards in time. Whilst algebraically reversible solvers have seen some success in large-scale applications, they are known to have stability issues. In this talk, we propose a methodology for constructing reversible NDE solvers from non-reversible ones. We show that the resulting reversible solvers converge in the ODE setting, can achieve high-order convergence, and even have stability regions. We conclude with a few examples demonstrating the memory efficiency of our approach. Joint work with Samuel McCallum.
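
As a simple illustration of algebraic reversibility (not the construction proposed in the talk), the classical two-step leapfrog scheme can be run backwards from its final two states, so the forward pass need not store the trajectory; the ODE, step size and step count below are illustrative assumptions.

import numpy as np

# Algebraic reversibility, illustrated with the two-step leapfrog scheme
# y_{n+1} = y_{n-1} + 2h f(y_n), which can be inverted exactly (in exact
# arithmetic) via y_{n-1} = y_{n+1} - 2h f(y_n). Toy example only.

def f(y):
    return -y                # simple linear test ODE dy/dt = -y

h = 0.01
y0 = 1.0
y1 = y0 + h * f(y0)          # one Euler step to start the two-step scheme

# Forward pass: keep only the last two states, not the whole trajectory
prev, curr = y0, y1
for _ in range(1000):
    prev, curr = curr, prev + 2 * h * f(curr)

# Backward pass: reconstruct states from the final pair alone
nxt, mid = curr, prev
for _ in range(1000):
    nxt, mid = mid, nxt - 2 * h * f(mid)

print(mid, y0)  # reconstructed initial state vs the true one (equal up to rounding)

In floating point the reconstruction agrees up to rounding error. Leapfrog also hints at the stability caveat mentioned in the abstract, since it is only weakly stable on dissipative problems.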

Getting here