This workshop aims to bring together researchers in stochastic analysis, statistics and theoretical machine learning for an exchange of ideas at the forefront of the field. The workshop will coincide with the visit of Professor Gerard Ben Arous, a leading expert in stochastic analysis and high-dimensional statistics, whose insights into deep learning theory offer an exceptional opportunity for meaningful collaboration. The event will feature a series of presentations and discussions on the mathematical underpinnings of modern machine learning, including:

  • Theoretical analysis of deep learning architectures
  • High-dimensional statistics and learning theory
  • Diffusion models
  • Connections between stochastic differential equations and neural networks

Confirmed speakers

Deniz Akyildiz, Gerard Ben Arous, Nicola M. Cirone, Arnaud Doucet, Andrew Duncan, James Foster, Yingzhen Li, Harald Oberhauser, Will Turner

Schedule

9.15 – 9.30  Registration and Welcome

9.30 – 10.00  Andrew Duncan

10.00 – 10.30  Nicola M. Cirone

10.30 – 11.00  Deniz Akyildiz

11.00 – 11.30  Coffee break

11.30 – 12.15  Gerard Ben Arous

12.15 – 12.45  Will Turner

12.45 – 14.00  Lunch @ The Works, Sir Michael Uren Building, London W12 0BZ (by invitation only)

14.00 – 14.45  Arnaud Doucet

14.45 – 15.15  James Foster

15.15 – 15.45  Coffee break

15.45 – 16.30  Harald Oberhauser

16.30 – 17.00  Yingzhen Li

18.30 – Conference dinner @ The Broadcaster, 89 Wood Ln, London W12 7FX (by invitation only)

Titles and abstracts


Deniz Akyildiz

Title: Diffusion-based Learning of Latent Variable Models 

Abstract: In this talk, I will summarize recent progress and challenges in maximum marginal likelihood estimation (MMLE) for learning latent variable models (LVMs), focusing on methods based on Langevin diffusions. I will first introduce the problem and the necessary background on Langevin diffusions, together with recent results on Langevin-based MMLE estimators, detailing the interacting particle Langevin algorithm (IPLA), a recent Langevin-based MMLE method with explicit theoretical guarantees akin to those for Langevin Monte Carlo methods. I will then outline recent progress, specifically accelerated variants and methods for MMLE in non-differentiable statistical models, with convergence and complexity results. Finally, if time permits, I will discuss the application of IPLA to inverse problems.
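
For orientation, the flavour of such interacting-particle Langevin schemes can be conveyed by a minimal sketch on a toy Gaussian latent variable model; the model, step size and particle number below are illustrative assumptions, not taken from the talk.

import numpy as np

# Minimal sketch in the spirit of an interacting-particle Langevin scheme for MMLE,
# on an assumed toy Gaussian LVM: x ~ N(theta, 1), y | x ~ N(x, 1).
# All choices here (model, step size, particle number) are illustrative.

rng = np.random.default_rng(0)
y = 1.5                      # single observation
N = 100                      # number of latent particles
gamma = 1e-2                 # step size
theta = 0.0                  # parameter to be estimated
X = rng.normal(size=N)       # latent particles

def grad_x(theta, x, y):
    # d/dx log p_theta(x, y) = (y - x) - (x - theta)
    return (y - x) - (x - theta)

def grad_theta(theta, x, y):
    # d/dtheta log p_theta(x, y) = x - theta
    return x - theta

for _ in range(5000):
    # Latent update: one Langevin step in x for each particle
    X = X + gamma * grad_x(theta, X, y) + np.sqrt(2 * gamma) * rng.normal(size=N)
    # Parameter update: gradient averaged over particles, noise scaled by 1/sqrt(N)
    theta = (theta + gamma * grad_theta(theta, X, y).mean()
             + np.sqrt(2 * gamma / N) * rng.normal())

print(theta)  # should settle near the marginal likelihood maximiser (theta* = y here)

The parameter chain sees the gradient averaged over the particle cloud with noise damped by 1/sqrt(N), so for large N it concentrates around a maximiser of the marginal likelihood.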


Yingzhen Li

Title: On the Identifiability of Switching Dynamical Systems

Abstract: One of my research dreams is to build a high-resolution video generation model that enables fine-grained control over, e.g., the scene appearance and the interactions between objects. I tried, and then realised that the deep learning tricks I had to invent for this goal were needed because of the non-identifiability of my sequential deep generative models. In this talk I will discuss our research towards developing identifiable deep generative models for sequence modelling, and share some recent and ongoing work on switching dynamical models. In particular, we first show identifiability conditions for Markov Switching Models (or autoregressive HMMs) with non-linear transitions, using a new proof technique that differs from the algebraic approach of the seminal HMM identifiability work by Allman et al. (2009). We then lift the Markov Switching Model to latent space and leverage existing results to show identifiability. If time permits, I will also show recent developments that build more flexible structures into the latent switching dynamical prior.
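
As a point of reference, a Markov switching model with non-linear transitions can be sketched generatively as follows; the regime transition matrix, transition functions and noise level are illustrative assumptions, not taken from the talk.

import numpy as np

# Toy generative sketch of a Markov switching model with non-linear transitions:
# a discrete regime z_t follows a Markov chain, and the observed sequence follows
# x_t = f_{z_t}(x_{t-1}) + noise. Purely illustrative choices throughout.

rng = np.random.default_rng(0)

P = np.array([[0.95, 0.05],           # regime transition matrix
              [0.10, 0.90]])
fs = [lambda x: np.tanh(2.0 * x),     # non-linear transition, regime 0
      lambda x: 0.5 * x + np.sin(x)]  # non-linear transition, regime 1
sigma = 0.1                           # observation noise level

T = 200
z = np.zeros(T, dtype=int)
x = np.zeros(T)
for t in range(1, T):
    z[t] = rng.choice(2, p=P[z[t - 1]])
    x[t] = fs[z[t]](x[t - 1]) + sigma * rng.normal()

Identifiability asks when the regimes, transition functions and noise can be recovered (up to natural symmetries) from the law of the observed sequence x alone.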

James Foster

Title: Efficient, Accurate and Stable Gradients for Neural Differential Equations

Abstract: Neural differential equations (NDEs) sit at the intersection of two dominant modelling paradigms – neural networks and differential equations. One of their key features is that they can be trained with a small memory footprint through adjoint equations. This can be helpful in high-dimensional applications, since the memory usage of standard backpropagation scales linearly with depth (or, in the NDE case, with the number of steps taken by the solver). However, adjoint equations have seen little use in practice, as the resulting gradients are often inaccurate. Fortunately, a class of numerical methods has emerged which allows NDEs to be trained using gradients that are both accurate and memory efficient. These solvers are known as “algebraically reversible” and produce numerical solutions which can be reconstructed backwards in time. Whilst algebraically reversible solvers have seen some success in large-scale applications, they are known to have stability issues. In this talk, we propose a methodology for constructing reversible NDE solvers from non-reversible ones. We show that the resulting reversible solvers converge in the ODE setting, can achieve high-order convergence, and even have stability regions. We conclude with a few examples demonstrating the memory efficiency of our approach. Joint work with Samuel McCallum.
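
As a simple illustration of algebraic reversibility (not the construction proposed in the talk), the classical two-step leapfrog scheme can be run backwards from its final two states, so the forward pass need not store the trajectory; the ODE, step size and step count below are illustrative assumptions.

import numpy as np

# Algebraic reversibility, illustrated with the two-step leapfrog scheme
# y_{n+1} = y_{n-1} + 2h f(y_n), which can be inverted exactly (in exact
# arithmetic) via y_{n-1} = y_{n+1} - 2h f(y_n). Toy example only.

def f(y):
    return -y                # simple linear test ODE dy/dt = -y

h = 0.01
y0 = 1.0
y1 = y0 + h * f(y0)          # one Euler step to start the two-step scheme

# Forward pass: keep only the last two states, not the whole trajectory
prev, curr = y0, y1
for _ in range(1000):
    prev, curr = curr, prev + 2 * h * f(curr)

# Backward pass: reconstruct states from the final pair alone
nxt, mid = curr, prev
for _ in range(1000):
    nxt, mid = mid, nxt - 2 * h * f(mid)

print(mid, y0)  # reconstructed initial state vs the true one (equal up to rounding)

In floating point the reconstruction agrees up to rounding error. Leapfrog also hints at the stability caveat mentioned in the abstract, since it is only weakly stable on dissipative problems.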

Getting here