ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search


Inderjeet Singh (Fujitsu Research of Europe Limited), Roman Vainshtein (Fujitsu Research and Development Center Co., Ltd.), Alon Zolfi (Ben-Gurion University of the Negev), Asaf Shabtai (Ben-Gurion University of the Negev), Tu Bui (Fujitsu Research and Development Center Co., Ltd.), Jonathan Brokman (Technion - Israel Institute of Technology), Omer Hofman (Fujitsu Research and Development Center Co., Ltd.), Fumiyoshi Kasahara (Fujitsu Research and Development Center Co., Ltd.), Kentaro Tsuji (Fujitsu Research and Development Center Co., Ltd.), Hisashi Kojima (Fujitsu Research and Development Center Co., Ltd.)
The 35th British Machine Vision Conference

Abstract

Recent content-based image retrieval (CBIR) systems predominantly rely on deep metric learning (DML) to extract representative image features; however, their generalisation is limited by their dependence on large volumes of high-quality, diverse and unbiased training data. We introduce ATLANTIS, a framework with a novel methodology that automatically identifies training data deficiencies and then performs targeted and controlled synthetic data augmentation. Our framework comprises a Data Insight Generator for extracting contextual insights and deficiencies from the existing training data, an Augmentation Protocol Selector to define dynamic, context-aware augmentation strategies, and an Outlier Removal and Diversity Control module to control the synthetic data's semantic coherence and diversity. ATLANTIS leverages image-to-text transformations, large language models, and text-to-image synthesis to iteratively generate and refine synthetic data while ensuring alignment with the original data and augmenting training data diversity in a controlled manner. Our comprehensive empirical evaluations reveal that ATLANTIS surpasses the state of the art in challenging domain-scarce and class-imbalanced data scenarios while also enhancing adversarial robustness, thus underscoring the generalisation gains. ATLANTIS also sets new benchmarks in standard balanced DML tasks, thereby establishing it as a robust and scalable framework for CBIR.
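As a rough illustration of what an outlier removal and diversity control step could look like, the sketch below filters synthetic-image embeddings in two passes: it drops samples that are too dissimilar from the real data (semantic coherence) and samples that nearly duplicate an already accepted one (diversity). This is not the procedure from the paper, which the abstract does not specify; the function names, the centroid-based coherence test, and both thresholds are assumptions made only for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def filter_synthetic(real_embs, synth_embs, min_sim_to_real=0.5, max_sim_to_kept=0.95):
    """Return indices of synthetic embeddings that pass both checks.

    Hypothetical helper, not the paper's method:
      1. coherence: cosine similarity to the real-data centroid >= min_sim_to_real
      2. diversity: cosine similarity to every already kept sample <= max_sim_to_kept
    """
    center = centroid(real_embs)
    kept = []
    for i, emb in enumerate(synth_embs):
        if cosine(emb, center) < min_sim_to_real:
            continue  # treated as an outlier relative to the real data
        if any(cosine(emb, synth_embs[j]) > max_sim_to_kept for j in kept):
            continue  # near-duplicate of an accepted sample, adds no diversity
        kept.append(i)
    return kept


# Toy 2-D embeddings standing in for real image features.
real = [[1.0, 0.0], [0.9, 0.1]]
synth = [[1.0, 0.2], [2.0, 0.4], [-1.0, 0.0], [0.6, 0.8]]
kept_indices = filter_synthetic(real, synth)
print(kept_indices)
```

In this toy run the second sample points in the same direction as the first and is rejected as a near-duplicate, while the third points away from the real data and is rejected as an outlier. In practice the embeddings would come from an image encoder, and both thresholds would need tuning per dataset.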

Citation

@inproceedings{Singh_2024_BMVC,
author    = {Inderjeet Singh and Roman Vainshtein and Alon Zolfi and Asaf Shabtai and Tu Bui and Jonathan Brokman and Omer Hofman and Fumiyoshi Kasahara and Kentaro Tsuji and Hisashi Kojima},
title     = {ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0584.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
