MumPy: Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection


Ying Zhang (Ocean University of China), Yuezun Li (Ocean University of China), Bo Peng (Institute of Automation, Chinese Academy of Sciences), Jiaran Zhou (Ocean University of China), Huiyu Zhou (University of Leicester), Junyu Dong (Ocean University of China)
The 35th British Machine Vision Conference

Abstract

The task of video inpainting detection is to expose inpainted regions at the pixel level within a video sequence. Existing methods usually focus on leveraging spatial and temporal inconsistencies. However, they typically employ fixed operations to combine spatial and temporal clues, which limits their applicability across different scenarios. In this paper, we introduce a novel Multilateral Temporal-view Pyramid Transformer (MumPy) that combines spatial-temporal clues flexibly. Our method uses a newly designed multilateral temporal-view encoder to extract various collaborations of spatial-temporal clues and introduces a deformable window-based temporal-view interaction module to enhance the diversity of these collaborations. Subsequently, we develop a multi-pyramid decoder to aggregate the various types of features and generate detection maps. By adjusting the contribution strength of spatial and temporal clues, our method can effectively identify inpainted regions. We validate our method on existing datasets and also introduce a new, challenging, large-scale video inpainting dataset based on the YouTube-VOS dataset, which employs several more recent inpainting methods. The results demonstrate the superiority of our method in both in-domain and cross-domain evaluation scenarios.
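
To make the pixel-level task concrete, the following minimal sketch shows one common way a predicted detection map could be compared with a ground-truth mask, frame by frame. It is an illustration only and is not taken from the paper: the use of intersection-over-union (IoU) as the score, the binary-mask representation, and the function names `frame_iou` and `video_mean_iou` are all assumptions made here.

```python
from typing import List

# One frame is a grid of 0/1 pixels; 1 marks a pixel flagged as inpainted.
# This representation is an assumption made for illustration.
Mask = List[List[int]]


def frame_iou(pred: Mask, truth: Mask) -> float:
    """Intersection-over-union between two binary masks of the same size."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # If neither mask marks any pixel, treat the frame as a perfect match.
    return 1.0 if union == 0 else inter / union


def video_mean_iou(pred_frames: List[Mask], truth_frames: List[Mask]) -> float:
    """Average the per-frame IoU over all frames of one video."""
    if not pred_frames or len(pred_frames) != len(truth_frames):
        raise ValueError("need the same non-zero number of frames in both inputs")
    scores = [frame_iou(p, t) for p, t in zip(pred_frames, truth_frames)]
    return sum(scores) / len(scores)


# Two 2x2 frames: the first over-predicts by one pixel, the second is clean.
predicted = [[[1, 1], [0, 0]], [[0, 0], [0, 0]]]
ground_truth = [[[1, 0], [0, 0]], [[0, 0], [0, 0]]]
score = video_mean_iou(predicted, ground_truth)
```

Here the first frame scores 0.5 (one shared pixel out of two marked in either mask) and the second scores 1.0, so `score` is 0.75. A real detector would output soft probability maps that are thresholded before a comparison like this one.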

Citation

@inproceedings{Zhang_2024_BMVC,
author    = {Ying Zhang and Yuezun Li and Bo Peng and Jiaran Zhou and Huiyu Zhou and Junyu Dong},
title     = {Mumpy: Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0318.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).