SAM Helps SSL: Mask-guided Attention Bias for Self-supervised Learning


Kensuke Taguchi (Kyocera Corporation), Takehiko Kawai (Kyocera Corporation), Wataru Imaeda (Kyocera Corporation), Hironobu Fujiyoshi (DENSO CORPORATION)
The 35th British Machine Vision Conference

Abstract

The vision transformer (ViT) and self-supervised learning (SSL) are key technologies for accelerating data scalability, contributing to the emergence of a foundation model in computer vision. In this paper, we focus on the potential of masks generated by the Segment Anything Model (SAM), a foundation model for image segmentation, and propose a novel method for SSL, named "mask-guided attention bias". Mask-guided attention bias is designed to encode SAM-generated masks, which are spatially and semantically decomposed information about an image. It is applied to the self-attention of ViT as guidance for an SSL process. Since self-attention can capture a wide range of spatial dependencies, mask-guided attention bias effectively adds spatial and semantic guidance to various forms of SSL, thus improving the decodability and labeling efficiency of SSL representations. We show that our method improves the accuracy of linear probing, few-shot learning, and fine-tuning in general. In particular, our method achieves 81.3% linear probing accuracy (outperforming vanilla MAE by 3.2%) and 89.5% fine-tune accuracy (outperforming vanilla DINO by 0.4%) on ImageNet100.
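
The abstract does not specify how the bias is computed, so the following is only a minimal sketch of the general idea of adding a mask-derived bias to self-attention logits, not the authors' implementation. It assumes each patch token carries a single segment id (for example, taken from a SAM mask) and that patches sharing an id receive a constant additive bias. The names `mask_bias`, `biased_self_attention`, and `strength` are hypothetical and chosen for illustration.

```python
import numpy as np

def mask_bias(patch_mask_ids, strength=1.0):
    # ASSUMPTION: bias is +strength where two patches share a mask id, 0 elsewhere.
    # The paper's actual encoding of SAM masks may differ.
    ids = np.asarray(patch_mask_ids)
    same = ids[:, None] == ids[None, :]
    return strength * same.astype(np.float64)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def biased_self_attention(q, k, v, bias):
    # Single-head scaled dot-product attention with an additive bias on the logits.
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d) + bias
    weights = softmax(logits, axis=-1)
    return weights @ v, weights

# Toy example: 6 patch tokens with 8-dim features and made-up segment ids.
rng = np.random.default_rng(0)
n_patches, dim = 6, 8
q = rng.normal(size=(n_patches, dim))
k = rng.normal(size=(n_patches, dim))
v = rng.normal(size=(n_patches, dim))
patch_mask_ids = [0, 0, 0, 1, 1, 2]  # hypothetical patch-to-mask assignment

bias = mask_bias(patch_mask_ids, strength=2.0)
out, weights = biased_self_attention(q, k, v, bias)
_, plain_weights = biased_self_attention(q, k, v, np.zeros_like(bias))
```

Under these assumptions, each token's attention mass on patches from its own segment can only increase relative to unbiased attention, which is one simple way a segmentation prior could guide what self-attention attends to.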

Citation

@inproceedings{Taguchi_2024_BMVC,
author    = {Kensuke Taguchi and Takehiko Kawai and Wataru Imaeda and Hironobu Fujiyoshi},
title     = {SAM Helps SSL: Mask-guided Attention Bias for Self-supervised Learning},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0240.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
