Poster 3, 10:00 - 11:45 (Chair: TBC)

| Paper ID | Title |
|---|---|
| 9 | Federated Learning for Face Recognition via Intra-subject Self-supervised Learning |
| 42 | Spatial-Temporal NAS for Fast Surgical Segmentation |
| 76 | SagaGAN: Style Applied using Gram matrix Attribution based on StarGAN v2 |
| 136 | PhysFlow: Skin tone transfer for remote heart rate estimation through conditional normalizing flows |
| 145 | Privacy-preserving datasets by capturing feature distributions with Conditional VAEs |
| 201 | Infrared and Visible Image Fusion Using Multi-level Adaptive Fractional Differential |
| 205 | From Black-box to Label-only: a Plug-and-Play Attack Network for Model Inversion |
| 287 | A Super-pixel-based Approach to the Stable Interpretation of Neural Networks |
| 330 | Towards Better Zero-Shot Anomaly Detection under Distribution Shift with CLIP |
| 339 | FastForensics: Efficient Two-Stream Design for Real-Time Image Manipulation Detection |
| 346 | Backdoor Defense through Self-Supervised and Generative Learning |
| 375 | A Novel Divide and Merge Approach for Improved Classification of Functional Data |
| 420 | Layout Free Scene Graph to Image Generation |
| 427 | Lightweight Human Pose Estimation with Enhanced Knowledge Review |
| 440 | Explaining Multi-modal Large Language Models by Analyzing their Vision Perception |
| 457 | LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps |
| 492 | Multimodal base distributions in conditional flow matching generative models |
| 508 | Semantic Image Synthesis of Anime Characters Based on Conditional Generative Adversarial Networks |
| 510 | ML-2SN: A Hybrid Two-Stream System for Sitting Posture Detection |
| 532 | Input-dependent Input-Prompts for Adapting Frozen Vision Transformers |
| 595 | A Prototype Unit for Image De-raining using Time-Lapse Data |
| 606 | VEMIC: View-aware Entropy model for Multi-view Image Compression |
| 609 | Guidance-base Diffusion Models for Improving Photoacoustic Image Quality |
| 686 | TransHuPR: Cross-View Fusion Transformer for Human Pose Estimation Using mmWave Radar |
| 692 | Spatio-Temporal Transformer with Rotary Position Embedding and Bone Priors for 3D Human Pose Estimation |
| 707 | UKD: Unsupervised Knowledge Distillation for Face Recognition |
| 736 | Rectifying Shortcut Learning through Cellular Differentiation in Deep Learning Neurons |
| 738 | CosFairNet: A Parameter-Space based Approach for Bias Free Learning |
| 740 | Frequency Decomposition to Tap the Potential of Single Domain for Generalization |
| 745 | Task-Related Feature Enhancement Network for Neuronal Morphology Classification |
| 769 | Flexible Graph Convolutional Network for 3D Human Pose Estimation |
| 828 | Multi-Scale Semantic Enrichment and Dual Angular Margin Contrast for Few-Shot Class Incremental Learning |
| 833 | Anomaly Detection Based on Semi-Formula Driven Pre-training Dataset to Represent Subtle Difference and Anomaly Score |
| 853 | Budget-aware Dynamic Spatially Adaptive Inference |
| 857 | Enhancing Radiology Report Generation: The Impact of Locally Grounded Vision and Language Training |
| 865 | APTPose: Anatomy-aware Pre-Training for 3D Human Pose Estimation |
| 866 | A Deep Belief Network Approach to Scalable Compression of Light Field Data for Auto-Stereoscopic Displays |
| 902 | Knowledge Distillation with Global Filters for Efficient Human Pose Estimation |
| 922 | UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters |
| 584 | ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search |
| 670 | Leveraging Inductive Bias in ViT for Medical Image Diagnosis |
| 863 | CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning |
| 913 | Examining the Threat Landscape: Foundation Models and Model Theft |
| 967 | CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation |

Oral 3 (Chair: TBC)

| Time | Paper ID | Title |
|---|---|---|
| 11:45 | 584 | ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search |
| 12:00 | 670 | Leveraging Inductive Bias in ViT for Medical Image Diagnosis |
| 12:15 | 863 | CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning |
| 12:30 | 913 | Examining the Threat Landscape: Foundation Models and Model Theft |
| 12:45 | 967 | CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation |

Industry (Chair: TBC)

| Time | Paper ID | Title |
|---|---|---|
| 14:30 | -- | TBC |
| 14:45 | -- | TBC |
| 15:00 | -- | TBC |
| 15:15 | -- | TBC |
| 15:30 | -- | TBC |

Poster 4, 15:45 - 17:30 (Chair: TBC)

| Paper ID | Title |
|---|---|
| 328 | DisCoM-KD: Cross-Modal Knowledge Distillation via Disentanglement Representation and Adversarial Learning |
| 568 | Beyond Static and Dynamic Quantization - Hybrid Quantization of Vision Transformers |
| 597 | FADE: Few-shot/zero-shot Anomaly Detection Engine using Large Vision-Language Model |
| 299 | RT-GS2: Real-Time Generalizable Semantic Segmentation for 3D Gaussian Representations of Radiance Fields |
| 303 | MixMask: Revisiting Masking Strategy for Siamese ConvNets |
| 305 | PEEKABOO: Hiding Parts of an Image for Unsupervised Object Localization |
| 317 | EIANet: A Novel Domain Adaptation Approach to Maximize Class Distinction with Neural Collapse Principles |
| 358 | Trimming the Fat: Efficient Compression of 3D Gaussian Splats through Pruning |
| 361 | Seg-HGNN: Unsupervised and Light-Weight Image Segmentation with Hyperbolic Graph Neural Networks |
| 374 | Text-Guided Mixup Towards Long-Tailed Image Categorization |
| 391 | PatchRot: Self-Supervised Training of Vision Transformers by Rotation Prediction |
| 425 | GLCM-Adapter: Global-Local Content Matching for Few-shot CLIP Adaptation |
| 437 | Difflare: Removing Image Lens Flare with Latent Diffusion Models |
| 452 | Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty |
| 572 | Multi-Scope Representation Learning for Causal Relation Discovery with new Challenging Datasets |
| 577 | AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field |
| 650 | Drawing Insights: Sequential Representation Learning in Comics |
| 775 | SAE: Single Architecture Ensemble Neural Networks |
| 859 | Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes |
| 878 | Learning conditionally untangled latent spaces using Fixed Point Iteration |
| 945 | Gaussian Splatting in Mirrors: Reflection-aware Rendering via Virtual Camera Optimization |
| 949 | On Partial Prototype Collapse in the DINO Family of Self-Supervised Methods |
| 212 | InPer: Whole-Process Domain Generalization via Intervention and Perturbation |
| 933 | SceneSAM: Integrating 2D Labels for Weakly Supervised 3D Scene Understanding |
| 936 | PV-SLAM: Panoptic Visual SLAM with Loop Closure and Online Bundle Adjustment |
| 957 | Putting the Segment Anything Model to the Test with 3D Knee MRI - A Comparison with State-of-the-Art Performance |
| 998 | MV-Match: Multi-View Matching for Domain-Adaptive Identification of Plant Nutrient Deficiencies |
| 929 | TrakAthlete4D: Multi-View On-Field Player Position Tracking in Sports |
| 932 | Spatiotemporal Vision Transformer for Weakly Supervised Dense Prediction of Dynamic Brain Maps |
| 939 | Deep Learning for GPS-Denied SAR Image Focusing and Vehicle Trajectory Estimation |
| 947 | Layer-wise Learning of CNNs by Self-tuning Learning Rate and Early Stopping at Each Layer |
| 954 | Beyond Face Matching: A Facial Traits based Privacy Score for Synthetic Face Datasets |
| 986 | Adaptive Weighted Co-Learning for Cross-Domain Few-Shot Learning |
| 991 | iHAST: Integrating Hybrid Attention for Super-Resolution in Spatial Transcriptomics |
| 1020 | Recovering SLAM Tracking Lost by Trifocal Pose Estimation using GPU-HC++ |
| 927 | GazeHELL: Gaze Estimation with Hybrid Encoders and Localised Losses with weighing |
| 977 | Improving Multimodal Learning with Multi-Loss Gradient Modulation |
| 987 | Guided Attention for Interpretable Motion Captioning |
| 213 | Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis |
| 959 | SR+Codec: a Benchmark of Super-Resolution for Video Compression Bitrate Reduction |
| 1013 | Open-Vocabulary Temporal Action Localization using Multimodal Guidance |