On Partial Prototype Collapse in the DINO Family of Self-Supervised Methods


Hariprasath Govindarajan (Qualcomm Inc, QualComm), Per Sidén (Linköping University), Jacob Roll (Qualcomm Inc, QualComm), Fredrik Lindsten (Linköping University)
The 35th British Machine Vision Conference

Abstract

A prominent self-supervised learning paradigm is to model the representations as clusters, or more generally as a mixture model. Learning to map the data samples to compact representations while simultaneously fitting the mixture model leads to the representation collapse problem. Regularizing the distribution of data points over the clusters is the prevalent strategy to avoid this issue. While this is sufficient to prevent full representation collapse, we show that a partial prototype collapse problem persists in the DINO family of methods and leads to significant redundancies in the prototypes. Such prototype redundancies serve as shortcuts for the method to achieve a marginal latent class distribution that matches the prescribed prior. We show that by encouraging the model to use diverse prototypes, the partial prototype collapse can be mitigated. We study the downstream impact of effective utilization of the prototypes during pre-training. We show that it enables the methods to learn more fine-grained clusters, encouraging more informative representations. We demonstrate that this is especially beneficial when pre-training on a long-tailed fine-grained dataset.
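
To make the notion of prototype redundancy in the abstract concrete, the following is a minimal sketch of one way to count how many prototypes are effectively distinct. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `count_unique_prototypes`, the greedy grouping, and the cosine-similarity threshold of 0.99 are all hypothetical choices made here.

```python
import math

def count_unique_prototypes(prototypes, threshold=0.99):
    """Greedily count prototypes that are not near-duplicates of an earlier one.

    Assumption: two prototypes are treated as redundant when the cosine
    similarity of their directions is at least `threshold` (hypothetical value).
    """
    kept = []  # unit-normalized prototypes accepted as distinct so far
    for p in prototypes:
        norm = math.sqrt(sum(x * x for x in p))
        if norm == 0.0:
            continue  # skip degenerate all-zero vectors
        unit = [x / norm for x in p]
        # Cosine similarity reduces to a dot product between unit vectors.
        is_duplicate = any(
            sum(a * b for a, b in zip(unit, k)) >= threshold for k in kept
        )
        if not is_duplicate:
            kept.append(unit)
    return len(kept)

# Toy example: five prototypes, two of which duplicate the direction of another.
prototypes = [
    [1.0, 0.0],
    [2.0, 0.0],  # same direction as the first prototype
    [0.0, 1.0],
    [0.0, 3.0],  # same direction as the third prototype
    [1.0, 1.0],
]
n_unique = count_unique_prototypes(prototypes)
print(f"{n_unique} distinct prototypes out of {len(prototypes)}")
```

In this toy setting, a count well below the number of initialized prototypes would indicate the kind of redundancy the abstract calls partial prototype collapse.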

Citation

@inproceedings{Govindarajan_2024_BMVC,
author    = {Hariprasath Govindarajan and Per Sidén and Jacob Roll and Fredrik Lindsten},
title     = {On Partial Prototype Collapse in the DINO Family of Self-Supervised Methods},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0949.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
