Schedule
Poster 3
Chair: TBC
10:00 - 11:45 9 Federated Learning for Face Recognition via Intra-subject Self-supervised Learning
42 Spatial-Temporal NAS for Fast Surgical Segmentation
76 SagaGAN: Style Applied using Gram matrix Attribution based on StarGAN v2
136 PhysFlow: Skin tone transfer for remote heart rate estimation through conditional normalizing flows
145 Privacy-preserving datasets by capturing feature distributions with Conditional VAEs
201 Infrared and Visible Image Fusion Using Multi-level Adaptive Fractional Differential
205 From Black-box to Label-only: a Plug-and-Play Attack Network for Model Inversion
287 A Super-pixel-based Approach to the Stable Interpretation of Neural Networks
330 Towards Better Zero-Shot Anomaly Detection under Distribution Shift with CLIP
339 FastForensics: Efficient Two-Stream Design for Real-Time Image Manipulation Detection
346 Backdoor Defense through Self-Supervised and Generative Learning
375 A Novel Divide and Merge Approach for Improved Classification of Functional Data
420 Layout Free Scene Graph to Image Generation
427 Lightweight Human Pose Estimation with Enhanced Knowledge Review
440 Explaining Multi-modal Large Language Models by Analyzing their Vision Perception
457 LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps
492 Multimodal base distributions in conditional flow matching generative models
508 Semantic Image Synthesis of Anime Characters Based on Conditional Generative Adversarial Networks
510 ML-2SN: A Hybrid Two-Stream System for Sitting Posture Detection
532 Input-dependent Input-Prompts for Adapting Frozen Vision Transformers
595 A Prototype Unit for Image De-raining using Time-Lapse Data
606 VEMIC: View-aware Entropy model for Multi-view Image Compression
609 Guidance-base Diffusion Models for Improving Photoacoustic Image Quality
686 TransHuPR: Cross-View Fusion Transformer for Human Pose Estimation Using mmWave Radar
692 Spatio-Temporal Transformer with Rotary Position Embedding and Bone Priors for 3D Human Pose Estimation
707 UKD: Unsupervised Knowledge Distillation for Face Recognition
736 Rectifying Shortcut Learning through Cellular Differentiation in Deep Learning Neurons
738 CosFairNet: A Parameter-Space based Approach for Bias Free Learning
740 Frequency Decomposition to Tap the Potential of Single Domain for Generalization
745 Task-Related Feature Enhancement Network for Neuronal Morphology Classification
769 Flexible Graph Convolutional Network for 3D Human Pose Estimation
828 Multi-Scale Semantic Enrichment and Dual Angular Margin Contrast for Few-Shot Class Incremental Learning
833 Anomaly Detection Based on Semi-Formula Driven Pre-training Dataset to Represent Subtle Difference and Anomaly Score
853 Budget-aware Dynamic Spatially Adaptive Inference
857 Enhancing Radiology Report Generation: The Impact of Locally Grounded Vision and Language Training
865 APTPose: Anatomy-aware Pre-Training for 3D Human Pose Estimation
866 A Deep Belief Network Approach to Scalable Compression of Light Field Data for Auto-Stereoscopic Displays
902 Knowledge Distillation with Global Filters for Efficient Human Pose Estimation
922 UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters
584 ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search
670 Leveraging Inductive Bias in ViT for Medical Image Diagnosis
863 CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning
913 Examining the Threat Landscape: Foundation Models and Model Theft
967 CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation
Oral 3
Chair: TBC
11:45 584 ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search
12:00 670 Leveraging Inductive Bias in ViT for Medical Image Diagnosis
12:15 863 CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning
12:30 913 Examining the Threat Landscape: Foundation Models and Model Theft
12:45 967 CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation
Industry
Chair: TBC
14:30 133 MonoGS++: Fast and Accurate Monocular RGB Gaussian SLAM
14:45 787 MSA2Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation
15:00 188 A self-supervised and adversarial approach to hyperspectral demosaicking and RGB reconstruction in surgical imaging
15:15 113 Text Removal In E-Commerce Images: A Comparison Of Inpainting Methods
15:30 328 DisCoM-KD: Cross-Modal Knowledge Distillation via Disentanglement Representation and Adversarial Learning
Poster 4
Chair: TBC
15:45 - 17:30 328 DisCoM-KD: Cross-Modal Knowledge Distillation via Disentanglement Representation and Adversarial Learning
133 MonoGS++: Fast and Accurate Monocular RGB Gaussian SLAM
787 MSA2Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation
188 A self-supervised and adversarial approach to hyperspectral demosaicking and RGB reconstruction in surgical imaging
113 Text Removal In E-Commerce Images: A Comparison Of Inpainting Methods
486 DAVINCI: A Single-Stage Architecture for Constrained CAD Sketch Inference
568 Beyond Static and Dynamic Quantization - Hybrid Quantization of Vision Transformers
597 FADE: Few-shot/zero-shot Anomaly Detection Engine using Large Vision-Language Model
299 RT-GS2: Real-Time Generalizable Semantic Segmentation for 3D Gaussian Representations of Radiance Fields
303 MixMask: Revisiting Masking Strategy for Siamese ConvNets
305 PEEKABOO: Hiding Parts of an Image for Unsupervised Object Localization
317 EIANet: A Novel Domain Adaptation Approach to Maximize Class Distinction with Neural Collapse Principles
358 Trimming the Fat: Efficient Compression of 3D Gaussian Splats through Pruning
361 Seg-HGNN: Unsupervised and Light-Weight Image Segmentation with Hyperbolic Graph Neural Networks
374 Text-Guided Mixup Towards Long-Tailed Image Categorization
391 PatchRot: Self-Supervised Training of Vision Transformers by Rotation Prediction
425 GLCM-Adapter: Global-Local Content Matching for Few-shot CLIP Adaptation
437 Difflare: Removing Image Lens Flare with Latent Diffusion Models
452 Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty
572 Multi-Scope Representation Learning for Causal Relation Discovery with new Challenging Datasets
577 AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field
650 Drawing Insights: Sequential Representation Learning in Comics
775 SAE: Single Architecture Ensemble Neural Networks
859 Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes
878 Learning conditionally untangled latent spaces using Fixed Point Iteration
945 Gaussian Splatting in Mirrors: Reflection-aware Rendering via Virtual Camera Optimization
949 On Partial Prototype Collapse in the DINO Family of Self-Supervised Methods
212 InPer: Whole-Process Domain Generalization via Intervention and Perturbation
933 SceneSAM: Integrating 2D Labels for Weakly Supervised 3D Scene Understanding
936 PV-SLAM: Panoptic Visual SLAM with Loop Closure and Online Bundle Adjustment
957 Putting the Segment Anything Model to the Test with 3D Knee MRI - A Comparison with State-of-the-Art Performance
998 MV-Match: Multi-View Matching for Domain-Adaptive Identification of Plant Nutrient Deficiencies
929 TrakAthlete4D: Multi-View On-Field Player Position Tracking in Sports
932 Spatiotemporal Vision Transformer for Weakly Supervised Dense Prediction of Dynamic Brain Maps
939 Deep Learning for GPS-Denied SAR Image Focusing and Vehicle Trajectory Estimation
947 Layer-wise Learning of CNNs by Self-tuning Learning Rate and Early Stopping at Each Layer
954 Beyond Face Matching: A Facial Traits based Privacy Score for Synthetic Face Datasets
986 Adaptive Weighted Co-Learning for Cross-Domain Few-Shot Learning
991 iHAST: Integrating Hybrid Attention for Super-Resolution in Spatial Transcriptomics
1020 Recovering SLAM Tracking Lost by Trifocal Pose Estimation using GPU-HC++
927 GazeHELL: Gaze Estimation with Hybrid Encoders and Localised Losses with weighing
977 Improving Multimodal Learning with Multi-Loss Gradient Modulation
987 Guided Attention for Interpretable Motion Captioning
213 Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis
959 SR+Codec: a Benchmark of Super-Resolution for Video Compression Bitrate Reduction
1013 Open-Vocabulary Temporal Action Localization using Multimodal Guidance