This workshop aims to bring together researchers in stochastic analysis, statistics, and theoretical machine learning for an exchange of ideas at the forefront of the field. The workshop will coincide with the visit of Professor Gerard Ben Arous, a leading expert in stochastic analysis and high-dimensional statistics, whose insights into deep learning theory offer an exceptional opportunity for collaboration. The event will feature a series of presentations and discussions on the mathematical underpinnings of modern machine learning, including topics such as:
- Theoretical analysis of deep learning architectures
- High-dimensional statistics and learning theory
- Diffusion models
- Connections between stochastic differential equations and neural networks
Confirmed speakers
- Gerard Ben Arous (Courant Institute, NYU)
- Arnaud Doucet (DeepMind)
- Andrew Duncan (Imperial)
- Yingzhen Li (Imperial)
- Deniz Akyildiz (Imperial)
- Nicola M. Cirone (Imperial)
- James Foster (Bath)
- Will Turner (Imperial)
- Harald Oberhauser (Oxford)
Schedule
9.15 – 9.30 Registration and Welcome
9.30 – 10.00 Andrew Duncan
10.00 – 10.30 Nicola M. Cirone
10.30 – 11.00 Deniz Akyildiz
11.00 – 11.30 Coffee break
11.30 – 12.15 Gerard Ben Arous
12.15 – 12.45 Will Turner
12.45 – 14.00 Lunch @ The Works, Sir Michael Uren Building, London W12 0BZ (by invitation only)
14.00 – 14.45 Arnaud Doucet
14.45 – 15.15 James Foster
15.15 – 15.45 Coffee break
15.45 – 16.30 Harald Oberhauser
16.30 – 17.00 Yingzhen Li
18.30 – Conference dinner @ The Broadcaster, 89 Wood Ln, London W12 7FX (by invitation only)
Titles and abstracts
Title: Diffusion-based Learning of Latent Variable Models
Abstract: In this talk, I will summarize recent progress and challenges in maximum marginal likelihood estimation (MMLE) for learning latent variable models (LVMs), focusing on methods based on Langevin diffusions. I will first introduce the problem and the necessary background on Langevin diffusions, together with recent results on Langevin-based MMLE estimators, detailing the interacting particle Langevin algorithm (IPLA), a recent Langevin-based MMLE method with explicit theoretical guarantees akin to those of Langevin Monte Carlo methods. I will then outline recent progress, specifically accelerated variants and methods for MMLE in nondifferentiable statistical models, together with convergence and complexity results. Finally, if time permits, I will discuss the application of IPLA to inverse problems.
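As a rough sketch of the general idea behind interacting-particle Langevin methods for MMLE (the toy model, step size, and particle count below are purely illustrative choices, not the algorithm, assumptions, or guarantees presented in the talk), the NumPy example couples a parameter update with N latent-variable particles, each taking a noisy gradient step on the joint log-density.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent variable model (an illustrative choice, not a model from the talk):
#   x_m ~ N(0, 1),   y_m | x_m ~ N(theta + x_m, 1),   m = 1, ..., M.
# Marginally y_m ~ N(theta, 2), so the maximum marginal likelihood estimate
# of theta is simply the sample mean of the observations.
theta_true = 2.0
M = 200
y = theta_true + rng.normal(size=M) + rng.normal(size=M)

def grad_theta(theta, X):
    # d/dtheta of log p_theta(x, y), summed over data points and particles
    return np.sum(y - theta - X)

def grad_x(theta, X):
    # d/dx of log p_theta(x, y), elementwise in the latent coordinates
    return -X + (y - theta - X)

# Euler-Maruyama discretisation of a coupled (parameter, particles) Langevin system:
# theta receives the particle-averaged gradient plus O(1/sqrt(N)) noise, while each
# particle follows an overdamped Langevin step in the latent variables.
N, gamma, n_iter = 50, 1e-3, 3000
theta = 0.0
X = rng.normal(size=(N, M))          # one latent vector per particle

for _ in range(n_iter):
    g_theta = grad_theta(theta, X) / N
    theta_next = theta + gamma * g_theta + np.sqrt(2 * gamma / N) * rng.normal()
    X = X + gamma * grad_x(theta, X) + np.sqrt(2 * gamma) * rng.normal(size=X.shape)
    theta = theta_next

print(f"particle estimate: {theta:.3f}   sample mean (exact MMLE): {y.mean():.3f}")
```

In this toy Gaussian model the marginal likelihood is available in closed form, so the particle estimate can be checked against the exact MMLE (the sample mean of the observations).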
Title: Efficient, Accurate and Stable Gradients for Neural Differential Equations
Abstract: Neural differential equations (NDEs) sit at the intersection of two dominant modelling paradigms: neural networks and differential equations. One of their features is that they can be trained with a small memory footprint through adjoint equations. This can be helpful in high-dimensional applications, since the memory usage of standard backpropagation scales linearly with depth (or, in the NDE case, with the number of steps taken by the solver). However, adjoint equations have seen little use in practice, as the resulting gradients are often inaccurate. Fortunately, a class of numerical methods has emerged that allows NDEs to be trained using gradients that are both accurate and memory efficient. These solvers are known as “algebraically reversible” and produce numerical solutions that can be reconstructed backwards in time. Whilst algebraically reversible solvers have seen some success in large-scale applications, they are known to have stability issues. In this talk, we propose a methodology for constructing reversible NDE solvers from non-reversible ones. We show that the resulting reversible solvers converge in the ODE setting, can achieve high-order convergence, and even have stability regions. We conclude with a few examples demonstrating the memory efficiency of our approach. Joint work with Samuel McCallum.
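To make the notion of algebraic reversibility concrete, here is a small NumPy illustration using one existing reversible scheme (a reversible variant of Heun's method); it is meant only as an example of a solver whose numerical solution can be reconstructed exactly backwards in time, not as the construction proposed in the talk. The vector field, step count, and interval are illustrative choices. The forward pass stores only the final state pair, and the backward pass recovers the initial state by algebraically inverting each step.

```python
import numpy as np

def f(t, y):
    # Simple nonlinear vector field (a pendulum) standing in for a neural network
    return np.array([y[1], -np.sin(y[0])])

def forward(y0, t0, t1, n_steps):
    """Reversible forward pass: keep only the final state pair (y, z)."""
    h = (t1 - t0) / n_steps
    y, z, t = y0.copy(), y0.copy(), t0
    for _ in range(n_steps):
        fz = f(t, z)
        z_next = 2.0 * y - z + h * fz
        y = y + 0.5 * h * (fz + f(t + h, z_next))
        z, t = z_next, t + h
    return y, z

def backward(yT, zT, t0, t1, n_steps):
    """Invert each step algebraically to recover the initial state pair."""
    h = (t1 - t0) / n_steps
    y, z, t = yT.copy(), zT.copy(), t1
    for _ in range(n_steps):
        fz = f(t, z)
        z_prev = 2.0 * y - z - h * fz
        y = y - 0.5 * h * (f(t - h, z_prev) + fz)
        z, t = z_prev, t - h
    return y, z

y0 = np.array([1.0, 0.0])
yT, zT = forward(y0, 0.0, 10.0, 1000)
y0_rec, _ = backward(yT, zT, 0.0, 10.0, 1000)
print("reconstruction error:", np.max(np.abs(y0_rec - y0)))  # ~ round-off level
```

Because every intermediate state can be recomputed during the reverse sweep, gradients can be backpropagated through the solve without storing the forward trajectory, which is the memory saving referred to in the abstract.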