Keynote - Laura Sevilla
09:00 - 10:00
09:00 - 10:00 | Title: Frontiers of Video Understanding
Abstract: Video Understanding is a fundamental skill of intelligent systems. From autonomous robots to virtual assistants, understanding the world in motion is necessary to move through it and interact with it. The last few years have seen amazing improvements in Video Understanding research. Still, there is a remarkable gap between the almost uncanny performance of models in other modalities, such as language and still images, and their performance on video. In this talk I will discuss what I believe are the current barriers for video, including efficiency, a tricky relationship with language, and finding the right tasks. For each of these topics I will discuss my recent work as well as directions that I find interesting and hope will inspire the community.
Room: M1
|
---|
Poster Sessions
10:00 - 11:45 / 15:15 - 17:00
10:00 - 11:45 |
Papers Presented
Room: Hall 2
|
15:15 - 17:00 |
Papers Presented
Room: Hall 2
|
Doctoral Consortium
10:00 - 13:00
Chairs: Richard Menzies and George Killick | 10:00 - 10:15 | Fatemeh Amerehi | Toward Comprehensive Neural Network Robustness |
---|---|---|---|
10:15 - 10:30 | Zahra Babaiee | Unveiling the Unseen: Identifiable Clusters in Trained Depthwise Convolutional Kernels | |
10:30 - 10:45 | Jack Saunders | Style and Speech in Facial Animation | |
10:45 - 11:00 | Muhammad Akhtar Munir | Exploring Advanced Calibration Loss Techniques for Vision-Language Models | |
11:00 - 11:15 | Break | Break | |
11:15 - 11:30 | Filippos Gouidis | Recognizing object states by combining data-driven and symbolic methods | |
11:30 - 11:45 | Remco Royen | Addressing labelling, complexity, latency, and scalability in deep learning-based processing of point clouds | |
11:45 - 12:15 | Speaker: Md. Mostafa Kamal Sarker (Technovative Solutions LTD) | Dr Sarker is the Lead AI Research Scientist at Technovative Solutions LTD (TVS) and a Visiting Fellow at the University of Oxford. He is an expert in artificial intelligence, computer vision, and deep learning. His research has significantly impacted clinical AI, biomedical image analysis, and digital healthcare, as reflected in his 40+ peer-reviewed publications. At BMVC 2024, he will share his insights, guide aspiring researchers on transitioning from academia to industry, and discuss the opportunities this path offers. | |
12:15 - 13:00 | Mentor Session | ||
Room: M2
|
Workshop Sessions
09:00 - 18:00
09:00 - 18:00 | Robust Recognition in the Open World
https://rrow2024.github.io
Room: M3
|
---|---|
14:00 - 18:00 | DIFA: Deep Learning-based Image Fusion and Its Applications
https://difa2024.github.io
Room: M2
|
Oral Session - Machine Vision in Challenging Scenarios
11:45 - 13:00
Chair: Amey Pore | 11:45 | 103 |
Prompting Diffusion Representations for Cross-Domain Semantic Segmentation
Han Sun, Julio Delgado Mangas, Luc Van Gool, Martin Danelljan, Nikolay Marin, Rui Gong
|
---|---|---|---|
12:00 | 200 |
Towards Generative Class Prompt Learning for Fine-grained Visual Recognition
Emanuele Vivoli, Josep Llados, Sanket Biswas, Soumitri Chattopadhyay
|
|
12:15 | 406 |
When Text and Images Don't Mix: Bias-Correcting Language-Image Similarity Scores for Anomaly Detection
Adam Goodge, Bryan Hooi, Wee Siong Ng
|
|
12:30 | 615 |
Measuring Physical Plausibility of 3D Human Poses Using Physics Simulation
Jason J Corso, Mahzad Khoshlessan, Nathan Louis
|
|
12:45 | 754 |
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Davide Caffagni, Lorenzo Baraldi, Marcella Cornia, Nicholas Moratelli, Rita Cucchiara
|
|
Room: M1
|
Oral Session - Image Quality Algorithms
14:00 - 15:15
Chair: Jefersson A. dos Santos | 14:00 | 14 |
Efficiency-preserving Scene-adaptive Object Detection
Vu Quang Truong, Zekun Zhang, Minh Hoai
|
---|---|---|---|
14:15 | 114 |
Key-point Guided Deformable Image Manipulation Using Diffusion Model
Guil Jung, Hyeonmin Bae, Hyuksool Kwon, Myeong-Gee Kim, Seok-Hwan Oh, Young-Min Kim, Hyeonjik Lee, Sang-yun Kim
|
|
14:30 | 416 |
Taming the Tail: Leveraging Asymmetric Loss and Padé Approximation to Overcome Long-Tailed Class Imbalance
Abhishek Tiwari, Kshitij Sharad Jadhav, Pankhi Kashyap, Pavni Tandon, Ritwik Kulkarni, Sunny Gupta
|
|
14:45 | 517 |
Interpretable Long-term Action Quality Assessment
Andrew Gilbert, Anthony Adeyemi-Ejeye, Wanqing Li, Xinran Liu, Xu Dong
|
|
15:00 | 545 |
Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection
Christian Fruhwirth-Reisinger, Wei Lin, Dušan Malić, Horst Bischof, Horst Possegger
|
|
Room: M1
|