Schedule

BMVC conference papers, supplementary material and video presentations can be found at: BMVC Papers

BMVC workshop papers can be found at: BMVC Workshop Papers

Keynote - Mubarak Shah
09:00 - 10:00
Title: Privacy Preservation and Bias Mitigation in Human Action Recognition

Abstract: Advances in action recognition have enabled a wide range of real-world applications, e.g. elderly person monitoring systems, autonomous vehicles, and sports analysis. As these techniques are deployed in the real world, two important issues have emerged: privacy and bias. Most of these video understanding applications involve extensive computation, for which a user needs to share the video data with a cloud computation server. In doing so, the user also ends up sharing private visual information such as gender, skin color, clothing, and background objects. Therefore, there is a pressing need for privacy-preserving action recognition solutions. Beyond privacy protection, bias in video understanding can lead to unfair and incorrect decision making. Action recognition models may predict specific actions based on gender stereotypes, such as associating a perceived female subject with hands near her face with applying makeup or brushing hair, even with nothing in hand. They may also suffer from background bias (i.e., inferring actions based on background cues) and foreground bias (i.e., relying on subject appearance). In this talk, I will present our recent work on privacy preservation and bias mitigation in human action recognition.

https://www.crcv.ucf.edu/person/mubarak-shah/

Room: M1
Keynote Session - Salauddin Sohag
14:00 - 14:30
Title: Technovative Solutions

Abstract: In this keynote, Sohag will chart the company's journey, recounting major innovations, achievements, and the challenges it has navigated to reach its current position. Sohag will also connect Technovative Solutions' advancements to emerging industry trends, illustrating how the company is strategically positioned to address current and future market demands through cutting-edge solutions and agile methodologies.

https://technovativesolutions.co.uk/

Room: M1
Poster Sessions
10:00 - 11:45 / 15:45 - 17:30
10:00 - 11:45
Papers Presented
9 Federated Learning for Face Recognition via Intra-subject Self-supervised Learning Hansol Kim, Hoyeol choi, Youngjun Kwak
42 Spatial-Temporal NAS for Fast Surgical Segmentation Matthew Lee, Felix John Samuel Bragman, Ricardo Sanchez-Matilla, Imanol Luengo, Danail Stoyanov
76 SagaGAN: Style Applied using Gram matrix Attribution based on StarGAN v2 Yongseon Yoo, Seonggyu Kim, Jong-Min Lee
136 PhysFlow: Skin tone transfer for remote heart rate estimation through conditional normalizing flows Joaquim Comas Martínez, Antonia Alomar, Adria Ruiz, Federico Sukno
145 Privacy-preserving datasets by capturing feature distributions with Conditional VAEs Francesco Di Salvo, David Tafler, Sebastian Doerrich, Christian Ledig
201 Infrared and Visible Image Fusion Using Multi-level Adaptive Fractional Differential Kang Zhang, Xinnian Guo
205 From Black-box to Label-only: a Plug-and-Play Attack Network for Model Inversion Huan Bao, Kaimin Wei, Yao Chen, Hanting Hou, Jinpeng Chen, Yongdong WU
287 A Super-pixel-based Approach to the Stable Interpretation of Neural Networks Shizhan Gong, Jingwei Zhang, Qi Dou, Farzan Farnia
330 Towards Better Zero-Shot Anomaly Detection under Distribution Shift with CLIP Jiyao Gao, Chengxin He, Lei Duan, Jie Zuo
339 FastForensics: Efficient Two-Stream Design for Real-Time Image Manipulation Detection zhangyangxiang, Yuezun Li, Ao Luo, Jiaran Zhou, Junyu Dong
346 Backdoor Defense through Self-Supervised and Generative Learning Ivan Sabolic, Ivan Grubišić, Siniša Šegvić
375 A Novel Divide and Merge Approach for Improved Classification of Functional Data wei zhao, Xiao-Jun Zeng, Chengdong shi, Ching-Hsun Tseng, Yue Chang
420 Layout Free Scene Graph to Image Generation RAMESHWAR MISHRA, A. Subramanyam
427 Lightweight Human Pose Estimation with Enhanced Knowledge Review Hao Xu, Shengye Yan, Wei Zheng
440 Explaining Multi-modal Large Language Models by Analyzing their Vision Perception Loris Giulivi, Giacomo Boracchi
457 LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps Andrey Palaev, Adil Khan, Syed M Ahsan Kazmi
492 Multimodal base distributions in conditional flow matching generative models Shane Josias, Willie Brink
508 Semantic Image Synthesis of Anime Characters Based on Conditional Generative Adversarial Networks Xuhui Zhu, feng jiang, Jing Wen, yi wang, qiang gao
510 ML-2SN: A Hybrid Two-Stream System for Sitting Posture Detection Kehang Jia, Gaorui Zhang, Yixuan Yang, Guangwei Huang, Penghuan Wang, Cheng Cheng
532 Input-dependent Input-Prompts for Adapting Frozen Vision Transformers Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M Asano
595 A Prototype Unit for Image De-raining using Time-Lapse Data Jaehoon Cho, Minjung Yoo, Jini Yang, Sunok Kim
606 VEMIC: View-aware Entropy model for Multi-view Image Compression Susmija Jabbireddy, Davit Soselia, Max Ehrlich, Christopher Metzler, Amitabh Varshney
609 Guidance-base Diffusion Models for Improving Photoacoustic Image Quality Tatsuhiro Eguchi, Shumpei Takezaki, Mihoko Shimano, Takayuki Yagi, Ryoma Bise
686 TransHuPR: Cross-View Fusion Transformer for Human Pose Estimation Using mmWave Radar Niraj Prakash Kini, Ruey-Horng Shiue, ryan chandra, Wen-Hsiao Peng, Ching-Wen Ma, Jenq-Neng Hwang
692 Spatio-Temporal Transformer with Rotary Position Embedding and Bone Priors for 3D Human Pose Estimation Cheng Chen, Jiang Liu, Liaoyuan Zeng, Fang Duan, Sean McGrath, Tian Dan
707 QUD: Unsupervised Knowledge Distillation for Deep Face Recognition Jan Niklas Kolf, Naser Damer, Fadi Boutros
736 Rectifying Shortcut Learning through Cellular Differentiation in Deep Learning Neurons Hongjing Niu, Hanting Li, Guoping Wu, Bin Li, Feng Zhao
738 CosFairNet: A Parameter-Space based Approach for Bias Free Learning Rajeev Ranjan Dwivedi, Priyadarshini Kumari, Vinod K. Kurmi
740 Frequency Decomposition to Tap the Potential of Single Domain for Generalization Hongjing Niu, Qingyue Yang, Pengfei Xia, Wei Zhang, Bin Li, Feng Zhao
745 Task-Related Feature Enhancement Network for Neuronal Morphology Classification Chunli Sun, Feng Zhao
769 Flexible Graph Convolutional Network for 3D Human Pose Estimation Abu Taib Mohammed Shahjahan, Abdessamad Ben Hamza
828 Multi-Scale Semantic Enrichment and Dual Angular Margin Contrast for Few-Shot Class Incremental Learning Riya Verma, Sukhendu Das
833 Anomaly Detection Based on Semi-Formula Driven Pre-training Dataset to Represent Subtle Difference and Anomaly Score Hiroki Kobayashi, Naoki Murakami, Naoto Hiramatsu, Takahiro Suzuki, Manabu Hashimoto
853 Budget-aware Dynamic Spatially Adaptive Inference Georgios Zampokas, Christos-Savvas Bouganis, Dimitris Tzovaras
857 Enhancing Radiology Report Generation: The Impact of Locally Grounded Vision and Language Training Sergio Sanchez Santiesteban, Muhammad Awais, Yi-Zhe Song, Josef Kittler
865 APTPose: Anatomy-aware Pre-Training for 3D Human Pose Estimation Qing-Wen Yang, Kai-Wen Duan, Ting-Yi Lu, Kevin Lin, Cheng-Yen Yang, Lijuan Wang, Jenq-Neng Hwang, Shang-Hong Lai
866 A Deep Belief Network Approach to Scalable Compression of Light Field Data for Auto-Stereoscopic Displays Sally Khaidem, Mansi Sharma
902 Knowledge Distillation with Global Filters for Efficient Human Pose Estimation Kaushik Bhargav Sivangi, Fani Deligianni
922 UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters Kovvuri Sai Gopal Reddy, Saran Bodduluri, A. Mudit Adityaja, Saurabh Shigwan, Nitin Kumar, Snehasis Mukherjee
584 ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search Inderjeet Singh, Roman Vainshtein, Alon Zolfi, Asaf Shabtai, Tu Bui, Jonathan Brokman, Omer Hofman, Fumiyoshi Kasahara, Kentaro Tsuji, Hisashi Kojima
670 Leveraging Inductive Bias in ViT for Medical Image Diagnosis Jungmin Ha, Euihyun-yoon, Sungsik Kim, Jinkyu Kim, Jaekoo Lee
863 CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning Emanuele Frascaroli, Aniello Panariello, Pietro Buzzega, Lorenzo Bonicelli, Angelo Porrello, Simone Calderara
913 Examining the Threat Landscape: Foundation Models and Model Theft Ankita Raj, Deepankar Varma, Chetan Arora
967 CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation Jianyu Zhao, Wei Quan, Bogdan Matuszewski
Room: Hall 2
15:45 - 17:30
Papers Presented
328 DisCoM-KD: Cross-Modal Knowledge Distillation via Disentanglement Representation and Adversarial Learning Dino Ienco, Cassio Fraga Dantas
133 MonoGS++: Fast and Accurate Monocular RGB Gaussian SLAM Ren-Wu Li, Wenjing Ke, Dong Li, Lu Tian, Emad Barsoum
787 MSA2Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation Sina Ghorbani Kolahi, Seyed Kamal Chaharsooghi, Toktam Khatibi, Afshin Bozorgpour, Reza Azad, Moein Heidari, Ilker Hacihaliloglu, Dorit Merhof
188 A self-supervised and adversarial approach to hyperspectral demosaicking and RGB reconstruction in surgical imaging Peichao Li, Oscar MacCormac, Jonathan Shapey, Tom Vercauteren
113 Text Removal In E-Commerce Images: A Comparison Of Inpainting Methods Hiya Roy, Bjorn Stenger
486 DAVINCI: A Single-Stage Architecture for Constrained CAD Sketch Inference Ahmet Serdar Karadeniz, Dimitrios Stefanos Mallis, Nesryne Mejri, Kseniya Cherenkova, Anis Kacem, Djamila Aouada
568 Beyond Static and Dynamic Quantization - Hybrid Quantization of Vision Transformers Piotr Kluska, Florian Scheidegger, Cristiano Malossi, Enrique S. Quintana-Orti
597 FADE: Few-shot/zero-shot Anomaly Detection Engine using Large Vision-Language Model Yuanwei Li, Elizaveta Ivanova, Martins Bruveris
299 RT-GS2: Real-Time Generalizable Semantic Segmentation for 3D Gaussian Representations of Radiance Fields Mihnea-Bogdan Jurca, Remco Royen, Ion Giosan, Adrian Munteanu
303 MixMask: Revisiting Masking Strategy for Siamese ConvNets Kirill Vishniakov, Eric P. Xing, Zhiqiang Shen
305 PEEKABOO: Hiding Parts of an Image for Unsupervised Object Localization Hasib Zunair, Abdessamad Ben Hamza
317 EIANet: A Novel Domain Adaptation Approach to Maximize Class Distinction with Neural Collapse Principles Zicheng Pan, Xiaohan Yu, Yongsheng Gao
358 Trimming the Fat: Efficient Compression of 3D Gaussian Splats through Pruning Muhammad Salman Ali, Maryam Qamar, Sung-Ho Bae, Enzo Tartaglione
361 Seg-HGNN: Unsupervised and Light-Weight Image Segmentation with Hyperbolic Graph Neural Networks Debjyoti Mondal, Rahul Mishra, Chandan Kumar Pandey
374 Text-Guided Mixup Towards Long-Tailed Image Categorization Richard Franklin, Jiawei Yao, Deyang Zhong, Qi Qian, Juhua Hu
391 PatchRot: Self-Supervised Training of Vision Transformers by Rotation Prediction Sachin Chhabra, Hemanth Venkateswara, Baoxin Li
425 GLCM-Adapter: Global-Local Content Matching for Few-shot CLIP Adaptation Shuo Wang, Xieenlong, Jinda Lu, Jinghan Li, Yanbin Hao
437 Difflare: Removing Image Lens Flare with Latent Diffusion Models Tianwen Zhou, Qihao Duan, Zitong YU
452 Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty Saining Zhang, Baijun Ye, Xiaoxue Chen, Yuantao Chen, Zongzheng Zhang, Cheng Peng, Yongliang Shi, Hao Zhao
572 Multi-Scope Representation Learning for Causal Relation Discovery with new Challenging Datasets Jiageng Zhu, Hanchen Xie, Jianhua Wu, Mohamed E. Hussein, Mahyar Khayatkhoei, Jiazhi Li, Wael AbdAlmageed
577 AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field Rong Liu, Rui Xu, Yue Hu, Meida Chen, Andrew Feng
650 Drawing Insights: Sequential Representation Learning in Comics Sam Titarsolej, Neil Cohn, Nanne Van Noord
775 SAE: Single Architecture Ensemble Neural Networks Martin Ferianc, Hongxiang Fan, Miguel R. D. Rodrigues
859 Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes Dmitry Demidov, Abduragim Shtanchaev, Mihail Minkov Mihaylov, Mohammad Almansoori
878 Learning conditionally untangled latent spaces using Fixed Point Iteration Victor Enescu, Hichem Sahbi
945 Gaussian Splatting in Mirrors: Reflection-aware Rendering via Virtual Camera Optimization Zihan Wang, Shuzhe Wang, Matias Turkulainen, Junyuan Fang, Juho Kannala
949 On Partial Prototype Collapse in the DINO Family of Self-Supervised Methods Hariprasath Govindarajan, Per Sidén, Jacob Roll, Fredrik Lindsten
212 InPer: Whole-Process Domain Generalization via Intervention and Perturbation Luyao Tang, Yuxuan Yuan, Chaoqi Chen, Xinghao Ding, Yue Huang
933 SceneSAM: Integrating 2D Labels for Weakly Supervised 3D Scene Understanding Julius Koerner, Dogu Tamgac, David Rozenberszki
936 PV-SLAM: Panoptic Visual SLAM with Loop Closure and Online Bundle Adjustment Ashok Bandyopadhyay, Pranjal Baranwal, Arijit Sur, Rajeev UP
957 Putting the Segment Anything Model to the Test with 3D Knee MRI - A Comparison with State-of-the-Art Performance Oliver Mills, Nishant Ravikumar, Philip G Conaghan, Samuel D Relton
998 MV-Match: Multi-View Matching for Domain-Adaptive Identification of Plant Nutrient Deficiencies Jinhui Yi, Yanan Luo, Marion Deichmann, Gabriel Schaaf, Juergen Gall
929 TrakAthlete4D: Multi-View On-Field Player Position Tracking in Sports Nitish Agarwal, Steven Cadavid
932 Spatiotemporal Vision Transformer for Weakly Supervised Dense Prediction of Dynamic Brain Maps Behnam Kazemivash, Armin Iraji, Sergey M. Plis, Vince Calhoun
939 Deep Learning for GPS-Denied SAR Image Focusing and Vehicle Trajectory Estimation Christopher Beam, Andrew R. Willis, Kevin M Brink
947 Layer-wise Learning of CNNs by Self-tuning Learning Rate and Early Stopping at Each Layer Melika Sadeghi Tabrizi, Ali Karimi, Ahmad Kalhor, Babak N Araabi, Mona Ahmadian
954 Beyond Face Matching: A Facial Traits based Privacy Score for Synthetic Face Datasets Roberto Leyva, Praveen Selvaraj, Andrew Elliott, Gregory Epiphaniou, Carsten Maple
986 Adaptive Weighted Co-Learning for Cross-Domain Few-Shot Learning Abdullah Alchihabi, Marzi Heidari, Yuhong Guo
991 iHAST: Integrating Hybrid Attention for Super-Resolution in Spatial Transcriptomics Xi Li, Jing Zhang, Ziheng Duan, Yi Dai, Siwei Xu
1020 Recovering SLAM Tracking Lost by Trifocal Pose Estimation using GPU-HC++ Chiang-Heng Chien, Ahmad Abdelfattah, Benjamin Kimia
927 GazeHELL: Gaze Estimation with Hybrid Encoders and Localised Losses with weighing Shubham Dokania, Vasudev Singh, Shuaib Ahmed
977 Improving Multimodal Learning with Multi-Loss Gradient Modulation Konstantinos Kontras, Christos Chatzichristos, Matthew B. Blaschko, Maarten De Vos
987 Guided Attention for Interpretable Motion Captioning KARIM RADOUANE, Julien Lagarde, Sylvie RANWEZ, Andon Tchechmedjiev
213 Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis Theodoros Kouzelis, Emmanouil Plitsis, Mihalis Nicolaou, Yannis Panagakis
959 SR+Codec: a Benchmark of Super-Resolution for Video Compression Bitrate Reduction Evgeney Bogatyrev, Ivan Molodetskikh, Dmitriy S. Vatolin
1013 Open-Vocabulary Temporal Action Localization using Multimodal Guidance Akshita Gupta, Aditya Arora, Sanath Narayan, Salman Khan, Fahad Khan, Graham W. Taylor
Room: Hall 2
Oral Session - Real World Applications
11:45 - 13:00
Chair: Carlos Moreno-Garcia
11:45 584
ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search
Inderjeet Singh, Roman Vainshtein, Alon Zolfi, Asaf Shabtai, Tu Bui, Jonathan Brokman, Omer Hofman, Fumiyoshi Kasahara, Kentaro Tsuji, Hisashi Kojima
12:00 670
Leveraging Inductive Bias in ViT for Medical Image Diagnosis
Jungmin Ha, Euihyun-yoon, Sungsik Kim, Jinkyu Kim, Jaekoo Lee
12:15 863
CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning
Emanuele Frascaroli, Aniello Panariello, Pietro Buzzega, Lorenzo Bonicelli, Angelo Porrello, Simone Calderara
12:30 1020
Recovering SLAM Tracking Lost by Trifocal Pose Estimation using GPU-HC++
Chiang-Heng Chien, Ahmad Abdelfattah, Benjamin Kimia
12:45 967
CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation
Jianyu Zhao, Wei Quan, Bogdan Matuszewski
Room: M1
Industrial Session (Sponsored by Technovative Solutions)
14:30 - 15:45
Chair: Chaitanya Kaul
14:30 486
DAVINCI: A Single-Stage Architecture for Constrained CAD Sketch Inference
Ahmet Serdar Karadeniz, Dimitrios Stefanos Mallis, Nesryne Mejri, Kseniya Cherenkova, Anis Kacem, Djamila Aouada
14:45 787
MSA2Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation
Sina Ghorbani Kolahi, Seyed Kamal Chaharsooghi, Toktam Khatibi, Afshin Bozorgpour, Reza Azad, Moein Heidari, Ilker Hacihaliloglu, Dorit Merhof
15:00 188
A self-supervised and adversarial approach to hyperspectral demosaicking and RGB reconstruction in surgical imaging
Peichao Li, Oscar MacCormac, Jonathan Shapey, Tom Vercauteren
15:15 113
Text Removal In E-Commerce Images: A Comparison Of Inpainting Methods
Hiya Roy, Bjorn Stenger
15:30 328
DisCoM-KD: Cross-Modal Knowledge Distillation via Disentanglement Representation and Adversarial Learning
Dino Ienco, Cassio Fraga Dantas
Room: M1
