Multimodal base distributions in conditional flow matching generative models


Shane Josias (Stellenbosch University), Willie Brink (Stellenbosch University)
The 35th British Machine Vision Conference

Abstract

Normalising flows are a flexible class of generative models that provide exact likelihoods, and are often trained through maximum likelihood estimation. Recent work suggests that these models can assign undesirably high likelihood to out-of-distribution data, questioning their reliability for applications where likelihoods are important (e.g. outlier detection). We show that continuous normalising flows trained with the conditional flow matching objective, instead of maximum likelihood, also provide unreliable likelihoods. We then argue for and investigate the utility of incorporating multimodality in the base distribution, through a Gaussian mixture model (GMM) centred at the empirical means of a target distribution’s modes. The GMM has an additional benefit in that samples can be generated from specified modes. We find that the GMM base distribution leads to performance comparable to a standard (unimodal) base distribution for in- and out-of-distribution likelihoods, at little to no extra cost in training and inference times. Interestingly, samples generated by models that use a GMM base have higher precision but significantly lower recall compared to the standard base. We also find support for the hypothesis that continuous flows depend too strongly on pixel values, rather than semantic content.
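
The GMM base distribution described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that modes are identified by class labels, that each mixture component is an isotropic Gaussian with a hypothetical standard deviation `sigma`, and that the conditional path is the straight line between a base sample and a data point (a common conditional flow matching choice that the abstract does not specify). All function names are illustrative.

```python
import numpy as np


def class_means(x, labels, num_classes):
    """Empirical mean of each labelled mode; x has shape (n, d)."""
    return np.stack([x[labels == k].mean(axis=0) for k in range(num_classes)])


def sample_gmm_base(means, labels, sigma, rng):
    """Draw one base sample per label from an isotropic Gaussian
    centred at that label's empirical mean (assumed covariance sigma^2 I)."""
    noise = rng.standard_normal((labels.shape[0], means.shape[1]))
    return means[labels] + sigma * noise


def cfm_pair(x0, x1, t):
    """Point on the straight path from x0 to x1 at time t,
    and the constant velocity x1 - x0 used as the regression target."""
    t = t[:, None]
    xt = (1.0 - t) * x0 + t * x1
    return xt, x1 - x0


def cfm_loss(pred_velocity, target_velocity):
    """Mean squared error between predicted and target velocities."""
    diff = pred_velocity - target_velocity
    return float(np.mean(np.sum(diff ** 2, axis=1)))


# Toy two-mode target in 2D (synthetic data, for illustration only).
rng = np.random.default_rng(0)
labels = np.arange(200) % 2
centres = np.array([[-4.0, 0.0], [4.0, 0.0]])
x1 = centres[labels] + 0.5 * rng.standard_normal((200, 2))

means = class_means(x1, labels, num_classes=2)
x0 = sample_gmm_base(means, labels, sigma=1.0, rng=rng)  # base samples, one per data point
t = rng.uniform(size=200)
xt, ut = cfm_pair(x0, x1, t)  # inputs and targets for a velocity network
```

In a training loop, a velocity network would be evaluated at `(xt, t)` and fitted to `ut` with `cfm_loss`. Generating from a specified mode then amounts to drawing the initial sample from that mode's component before integrating the learned flow.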

Citation

@inproceedings{Josias_2024_BMVC,
author    = {Shane Josias and Willie Brink},
title     = {Multimodal base distributions in conditional flow matching generative models},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0492.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
