Opening Remarks 

Charline Le Lan (Oxford) 
Invited Talk: On the use of density models for anomaly detection
Thanks to the tractability of their likelihood, some deep generative models show promise for seemingly straightforward but important applications like anomaly detection. However, the likelihood values empirically attributed to anomalies conflict with the expectations these proposed applications suggest.
This talk will review some of the density-based anomaly detection methods that have been widely used in the machine learning literature and question the expectation that density estimation should always enable anomaly detection.
In particular, we will examine the extent of the issues that can arise from these practices and look at some practical consequences.
Finally, the talk will also cover some promising directions for reliably detecting anomalies through density, in particular highlighting the importance of prior knowledge.
This project was joint work with Laurent Dinh.
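
As a rough illustration of the kind of density-based detector the abstract refers to, the sketch below fits a density to training data and flags points whose log-density falls below a threshold. The 1-D Gaussian model, the 1st-percentile threshold rule, and all function names are assumptions made for this example and are not taken from the talk.

```python
import numpy as np

def fit_gaussian(x):
    """Fit a 1-D Gaussian density by maximum likelihood."""
    return x.mean(), x.std()

def log_density(x, mu, sigma):
    """Log-density of N(mu, sigma^2) evaluated at x."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

def flag_anomalies(x, mu, sigma, threshold):
    """Flag points whose log-density falls below the threshold."""
    return log_density(x, mu, sigma) < threshold

rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=1000)
mu, sigma = fit_gaussian(train)

# Hypothetical rule: threshold at the 1st percentile of training log-densities.
threshold = np.percentile(log_density(train, mu, sigma), 1.0)

# One typical point and one point far in the tail of the fitted density.
flags = flag_anomalies(np.array([0.1, 6.0]), mu, sigma, threshold)
```

A detector of this form depends entirely on the fitted density assigning low values to anomalies, which is the expectation the talk sets out to question.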
 
Yingzhen Li (ICL) 
Invited Talk: Inference with scores: slices, diffusions and flows
In this talk I will discuss our recent efforts on developing Stein's method for approximate inference and model learning.
I will start with an introduction to score matching and Stein discrepancy, with a comparison to KL-divergence-based approaches.
Then I will discuss our recent work that tries to address the curse-of-dimensionality issues in existing Stein discrepancies.
The idea is based on slicing, and an important step within the approach is to measure the score difference in a different basis of $\mathbb{R}^d$.
Lastly, we extend the basis modification idea to measuring the score difference with a local basis, and discuss ongoing work that aims to connect this approach with normalising flows.
This talk will also feature Wenbo Gong, a student collaborating with me on the theory and applications of Stein’s method.
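
For readers unfamiliar with the quantities named in the abstract, the standard definitions can be written as follows. The notation ($s_p$, $\mathcal{F}$, $D$) is an assumption of this note rather than the speaker's, and the second equality requires the usual regularity and boundary conditions on $q$ and $f$.

```latex
% Score of a density p on \mathbb{R}^d: the gradient of its log-density
\[
  s_p(x) = \nabla_x \log p(x)
\]

% Stein discrepancy between q and p over a class \mathcal{F} of test functions
% f : \mathbb{R}^d \to \mathbb{R}^d. Integration by parts under q turns the
% divergence term into -s_q(x)^\top f(x), which exposes the score difference.
\begin{align*}
  D(q, p)
    &= \sup_{f \in \mathcal{F}} \mathbb{E}_{x \sim q}
       \left[ s_p(x)^\top f(x) + \nabla_x \cdot f(x) \right] \\
    &= \sup_{f \in \mathcal{F}} \mathbb{E}_{x \sim q}
       \left[ \left( s_p(x) - s_q(x) \right)^\top f(x) \right]
\end{align*}
```

In this form the discrepancy is an expectation of the score difference $s_p - s_q$ measured against test functions, which is the quantity the abstract proposes to measure in a different basis.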
 
Poster Spotlights I 

Poster Session I

Phiala Shanahan (MIT) 
Invited Talk: Flow models for theoretical particle and nuclear physics
I will discuss opportunities for machine learning, in particular approaches based on normalizing flows, to accelerate first-principles lattice quantum field theory calculations in particle and nuclear physics.
Particular challenges in this context include incorporating complex (gauge) symmetries into model architectures, and scaling models to the large number of degrees of freedom of state-of-the-art numerical studies.
I will show the results of proof-of-principle studies that demonstrate that sampling from generative models can be orders of magnitude more efficient than traditional Hamiltonian/hybrid Monte Carlo approaches in this context.
 
Marcus Brubaker (York) 
Invited Talk: Wavelet Flow: Fast Training of High Resolution Normalizing Flows
This talk will introduce Wavelet Flow, a novel normalizing flow architecture which explicitly represents the scale-space structure of signals in the architecture of the normalizing flow through the use of wavelets.
The result is a generative model which automatically includes models of images at resolutions smaller than that used for training and is able to perform super-resolution with no additional effort.
Further, because of the structure of the architecture, each scale can be trained completely independently, leading to significant improvements in training efficiency and enabling the first reported normalizing flow model for 1024x1024 resolution images.
This project is joint work with Jason Yu and Kosta Derpanis.


Break 

Stefano Ermon (Stanford) 
Invited Talk: Maximum Likelihood Training of Score-Based Diffusion Models
Existing generative models are typically based on explicit representations of probability distributions (e.g., autoregressive models or VAEs) or implicit sampling procedures (e.g., GANs).
We propose an alternative approach based on modeling directly the vector field of gradients of the data distribution (scores).
Our framework allows flexible architectures and requires neither sampling during training nor adversarial training methods.
Additionally, score-based generative models enable exact likelihood evaluation through connections with normalizing flows.
We produce samples comparable to GANs, achieving new state-of-the-art Inception scores, and competitive likelihoods on image datasets.
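
To make the idea of working with the score concrete, the sketch below draws samples using only the gradient of a log-density, via unadjusted Langevin dynamics. The abstract does not name a sampler, so the choice of Langevin dynamics, the Gaussian target with a known analytic score (standing in for a learned score network), and the step size and step count are all assumptions of this example.

```python
import numpy as np

def gaussian_score(x, mu, sigma):
    """Score (gradient of the log-density) of N(mu, sigma^2) at x."""
    return -(x - mu) / sigma**2

def langevin_sample(score_fn, x0, step_size, n_steps, rng):
    """Unadjusted Langevin dynamics:
    x <- x + (step_size / 2) * score(x) + sqrt(step_size) * noise.
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step_size * score_fn(x) + np.sqrt(step_size) * noise
    return x

rng = np.random.default_rng(0)

# Start 5000 independent chains far from the target distribution.
x0 = rng.uniform(-10.0, 10.0, size=5000)

# Assumed target: N(3, 1). Only its score is passed to the sampler.
samples = langevin_sample(
    lambda x: gaussian_score(x, 3.0, 1.0), x0,
    step_size=0.01, n_steps=2000, rng=rng,
)
```

The sampler never evaluates the density itself, only its gradient, which is why a model of the score alone is enough to generate samples.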


Ann-Kathrin Dombrowski 
Contributed Talk I: Diffeomorphic Explanations with Normalizing Flows
Normalizing flows are diffeomorphisms which are parameterized by neural networks.
As a result, they can induce coordinate transformations in the tangent space of the data manifold.
In this work, we demonstrate that such transformations can be used to generate interpretable explanations for decisions of neural networks.
More specifically, we perform gradient ascent in the base space of the flow to generate counterfactuals which are classified with great confidence as a specified target class.
We analyze this generation process theoretically using Riemannian differential geometry and establish a rigorous theoretical connection between gradient ascent on the data manifold and in the base space of the flow.
 
Maximilian Nickel (Facebook) 
Invited Talk: Modeling Spatio-Temporal Events via Normalizing Flows
Aditya Ramesh (OpenAI) 
Invited Talk: TBA  
Marylou Gabrié 
Contributed Talk II: Efficient Bayesian Sampling Using Normalizing Flows to Assist Markov Chain Monte Carlo Methods
Normalizing flows can generate complex target distributions and thus show promise in many applications in Bayesian statistics as an alternative or complement to MCMC for sampling posteriors.
Since no data set from the target posterior distribution is available beforehand, the flow is typically trained using the reverse Kullback-Leibler (KL) divergence, which requires only samples from a base distribution.
This strategy may perform poorly when the posterior is complicated and hard to sample with an untrained normalizing flow.
Here we explore a distinct training strategy, using the direct KL divergence as loss, in which samples from the posterior are generated by
(i) assisting a local MCMC algorithm on the posterior with a normalizing flow to accelerate its mixing rate and
(ii) using the data generated this way to train the flow. The method only requires a limited amount of a priori input about the posterior, and can be used to estimate the evidence required for model validation, as we illustrate on examples.
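
The two-step strategy above can be sketched on a toy problem. Everything specific here is an assumption of this note and not the authors' implementation: the bimodal 1-D target, the affine map standing in for a normalizing flow, the random-walk Metropolis local sampler, the walker initialisation near both modes, and all function names.

```python
import numpy as np

def log_target(x):
    """Unnormalised log-density of an equal mixture of N(-3, 1) and N(3, 1)."""
    return np.logaddexp(-0.5 * (x + 3.0) ** 2, -0.5 * (x - 3.0) ** 2)

def flow_log_prob(x, shift, scale):
    """Log-density of the affine 'flow' x = shift + scale * z, z ~ N(0, 1)."""
    return (-0.5 * ((x - shift) / scale) ** 2
            - np.log(scale) - 0.5 * np.log(2 * np.pi))

def local_step(x, rng, step=0.5):
    """Random-walk Metropolis step applied independently to every walker."""
    prop = x + step * rng.standard_normal(x.shape)
    accept = np.log(rng.uniform(size=x.shape)) < log_target(prop) - log_target(x)
    return np.where(accept, prop, x)

def flow_step(x, shift, scale, rng):
    """Independence Metropolis-Hastings step with proposals drawn from the flow."""
    prop = shift + scale * rng.standard_normal(x.shape)
    log_ratio = ((log_target(prop) - flow_log_prob(prop, shift, scale))
                 - (log_target(x) - flow_log_prob(x, shift, scale)))
    accept = np.log(rng.uniform(size=x.shape)) < log_ratio
    return np.where(accept, prop, x)

rng = np.random.default_rng(0)

# Assumed a priori input: walkers are initialised near both modes.
walkers = np.concatenate([rng.normal(-3.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
shift, scale = 0.0, 1.0  # untrained flow parameters

for _ in range(200):
    # (i) local MCMC moves, assisted by a non-local proposal from the flow
    for _ in range(5):
        walkers = local_step(walkers, rng)
    walkers = flow_step(walkers, shift, scale, rng)
    # (ii) train the flow on the chain samples: maximum likelihood, i.e. a
    # Monte Carlo estimate of the direct KL loss, in closed form for an affine map
    shift, scale = walkers.mean(), walkers.std()
```

The local steps alone rarely cross between the two modes. The flow proposals let walkers jump between them, and the refit flow in turn proposes better as the walkers spread over the target.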
 
Poster Spotlights II 

Poster Session II