ICAF-4: An Integrated Framework of Category-level Articulated Object Perception and Manipulation for Embodied Intelligence


WenBo Xu (Hefei University of Technology), Li Zhang (University of Science and Technology of China), Qiankun Li (University of Science and Technology of China), Qi Wu (Shanghai Jiaotong University), Lin Yuanbo Wu (Swansea University), Liu Liu (Hefei University of Technology)
The 35th British Machine Vision Conference

Abstract

Articulated objects are common in daily human life. Current research on articulated objects often emphasizes visual understanding of articulations rather than high-level functional manipulation tasks from a single RGB-D or point cloud observation. In this paper, to study the Category-level Visually Articulated object Perception (C-VAP) task, we propose an Integrated Category-level visual Articulated object perception Framework, namely ICAF-4. Given RGB and depth information as input, ICAF-4 processes four mainstream tasks for articulated objects end to end: object detection, part segmentation, pose estimation, and manipulation. To support the C-VAP task, we use an automatic annotation generation pipeline to re-annotate two popular articulation benchmarks, ArtImage and ReArtMix, which cover object-level and scene-level data, with rich functional grasping affordances and grasp poses. Alongside the datasets, ICAF-4 runs its part segmentation, pose estimation, and manipulation prediction branches in a single forward pass. To boost manipulation learning performance, we propose an anchor-based grasp pose estimation strategy in which "anchor" poses serve as references at multiple sizes and the grasp pose is learned through anchor selection followed by refinement. Experiments demonstrate the superior performance of ICAF-4 in integrating these visual tasks for articulation perception. All data and code will be made publicly available.
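
The anchor selection and refinement idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Anchor` fields (an approach angle and a gripper width), the helper names `make_anchors` and `decode_grasp`, and the score and residual inputs are all hypothetical, chosen only to show how a grasp could be decoded from a set of reference poses at several sizes.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Anchor:
    """Hypothetical reference grasp: approach angle (radians), gripper width (cm)."""
    angle: float
    width: float


def make_anchors(angles: Sequence[float], widths: Sequence[float]) -> List[Anchor]:
    # One anchor per (width, angle) pair, so references exist at several sizes.
    return [Anchor(a, w) for w in widths for a in angles]


def decode_grasp(
    anchors: Sequence[Anchor],
    scores: Sequence[float],
    residuals: Sequence[Tuple[float, float]],
) -> Anchor:
    """Pick the best-scoring anchor, then refine it with its predicted offsets."""
    if not (len(anchors) == len(scores) == len(residuals)):
        raise ValueError("anchors, scores and residuals must have equal length")
    if not anchors:
        raise ValueError("at least one anchor is required")
    # Selection: index of the anchor with the highest (assumed) network score.
    best = max(range(len(anchors)), key=lambda i: scores[i])
    # Refinement: add the (assumed) regressed offsets for that anchor only.
    d_angle, d_width = residuals[best]
    return Anchor(anchors[best].angle + d_angle, anchors[best].width + d_width)


# Example with made-up scores and offsets standing in for network outputs.
anchors = make_anchors(angles=[0.0, 1.0], widths=[4.0, 8.0])
scores = [0.1, 0.2, 0.6, 0.1]
residuals = [(0.0, 0.0), (0.0, 0.0), (0.25, -0.5), (0.0, 0.0)]
grasp = decode_grasp(anchors, scores, residuals)
```

In this sketch the classification over anchors handles the coarse choice of pose and size, while the regression only has to correct a small offset from the chosen reference, which is the usual motivation for anchor-based formulations.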

Citation

@inproceedings{Xu_2024_BMVC,
author    = {WenBo Xu and Li Zhang and Qiankun Li and Qi Wu and Lin Yuanbo Wu and Liu Liu},
title     = {ICAF-4: An Integrated Framework of Category-level Articulated Object Perception and Manipulation for Embodied Intelligence},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0667.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
