RETRO: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning


Khanh-Binh Nguyen (Deakin University), Chae Jung Park (National Cancer Center)
The 35th British Machine Vision Conference

Abstract

Self-supervised learning (SSL) is gaining attention for its ability to learn effective representations with large amounts of unlabeled data. Lightweight models can be distilled from larger self-supervised pre-trained models using contrastive and consistency constraints, but the different sizes of the projection heads make it challenging for students to accurately mimic the teacher's embedding. We propose RETRO, which reuses the teacher's projection head for students, and our experimental results demonstrate significant improvements over the state-of-the-art on all lightweight models. For instance, when training EfficientNet-B0 using ResNet-50/101/152 as teachers, our approach improves the linear result on ImageNet to 66.9%, 69.3%, and 69.8%, respectively, with significantly fewer parameters.

Citation

@inproceedings{Nguyen_2024_BMVC,
author    = {Khanh-Binh Nguyen and Chae Jung Park},
title     = {RETRO: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0424.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
