Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes


Dmitry Demidov (Mohamed bin Zayed University of Artificial Intelligence), Abduragim Shtanchaev (Mohamed bin Zayed University of Artificial Intelligence), Mihail Minkov Mihaylov (Mohamed bin Zayed University of Artificial Intelligence), Mohammad Almansoori (Mohamed bin Zayed University of Artificial Intelligence)
The 35th British Machine Vision Conference

Abstract

The emerging task of fine-grained image classification in low-data regimes combines low inter-class variance and large intra-class variation with a highly limited number of training samples per class. However, traditional approaches that deal with fine-grained categorisation and extremely scarce data separately may be inefficient when both of these harsh conditions occur together. In this paper, we present a novel framework, called AD-Net, that aims to enhance deep neural network performance on this challenge by leveraging the power of Augmentation and Distillation techniques. Specifically, our approach is designed to refine learned features through self-distillation on augmented samples, mitigating harmful overfitting. We conduct comprehensive experiments on popular fine-grained image classification benchmarks, where our AD-Net demonstrates consistent improvement over traditional fine-tuning and state-of-the-art low-data techniques. Remarkably, with the smallest data available, our framework shows an outstanding relative accuracy increase of up to 45% compared to standard ResNet-50 and up to 27% compared to the closest SOTA runner-up. We emphasise that our approach is practically architecture-independent and adds zero extra cost at inference time. Additionally, we provide an extensive study of the impact of each component of the framework, highlighting the importance of each in achieving optimal performance. Source code and trained models are publicly available at github.com/demidovd98/fgic_lowd.
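The abstract does not spell out the training objective, so the following is only a minimal sketch of what a self-distillation term on augmented samples could look like, not the AD-Net implementation. It assumes a supervised cross-entropy on the original view plus a KL-divergence term that pulls the prediction on an augmented view towards the softened prediction on the original view. The function names, the direction of the KL term, and the `temperature` and `weight` parameters are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Supervised loss for one sample with an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def self_distillation_loss(logits_clean, logits_aug, label,
                           temperature=2.0, weight=1.0):
    """Hypothetical combined loss for a single training sample.

    logits_clean: model output on the original image
    logits_aug:   output of the same model on an augmented view
    The prediction on the clean view acts as a fixed soft target
    (an assumption; the paper may define the target differently).
    """
    supervised = cross_entropy(logits_clean, label)
    target = softmax(logits_clean, temperature)
    prediction = softmax(logits_aug, temperature)
    distillation = kl_divergence(target, prediction)
    return supervised + weight * distillation
```

Because a term of this kind is used only during training, inference needs a single forward pass on the original image, which is consistent with the zero extra inference cost stated in the abstract.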

Citation

@inproceedings{Demidov_2024_BMVC,
author    = {Dmitry Demidov and Abduragim Shtanchaev and Mihail Minkov Mihaylov and Mohammad Almansoori},
title     = {Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0859.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
