APTPose: Anatomy-aware Pre-Training for 3D Human Pose Estimation


Qing-Wen Yang (MediaTek Inc.), Kai-Wen Duan (National Tsing Hua University), Ting-Yi Lu (National Tsing Hua University), Kevin Lin (Microsoft), Cheng-Yen Yang (University of Washington), Lijuan Wang (Microsoft), Jenq-Neng Hwang (University of Washington, Seattle), Shang-Hong Lai (National Tsing Hua University)
The 35th British Machine Vision Conference

Abstract

This paper presents APTPose, a novel anatomy-aware pre-training method for accurate 3D human pose estimation. We propose a Hierarchical Masked Pose Modeling (HMPM) subtask that decouples the body skeleton into several distinct body components for hierarchical modeling. HMPM overcomes the limitations of earlier joint-coordinate masking techniques by better capturing the dependencies of the human skeletal structure. Unlike previous methods, whose pre-training tasks focus on 2D pose reconstruction, we leverage a large number of 3D pseudo labels from existing datasets for pre-training. This allows us to better model the skeletal system in 3D space and improve the accuracy and robustness of 3D human pose estimation. Additionally, we introduce a geometric loss into the optimization process to strengthen correlations within the human skeleton. Experimental results show that APTPose achieves superior robustness and generalization across challenging benchmarks while offering a favorable balance between accuracy and computational complexity, making it an appealing option for practical applications.
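
As a rough illustration of the two ideas named in the abstract, the sketch below contrasts joint-level masking with body-part-level masking and computes a simple bone-length consistency term. It is not the authors' implementation. The 17-joint Human3.6M-style layout, the five-part grouping in `BODY_PARTS`, the zero mask token, the `joint_ratio` parameter, and the choice of a bone-length L1 term as the "geometric loss" are all assumptions made for this example; the abstract does not specify them.

```python
import math
import random

# ASSUMPTION: a 17-joint Human3.6M-style skeleton split into five body parts.
# The grouping is hypothetical; the paper's actual decomposition may differ.
BODY_PARTS = {
    "torso": [0, 7, 8, 9, 10],
    "right_leg": [1, 2, 3],
    "left_leg": [4, 5, 6],
    "left_arm": [11, 12, 13],
    "right_arm": [14, 15, 16],
}

# Parent index of each joint (-1 marks the root), for the same assumed layout.
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]


def mask_pose(pose, level, rng, joint_ratio=0.3):
    """Mask a pose at the joint level or the body-part level.

    pose  : list of coordinate tuples, one per joint (2D or 3D)
    level : "joint" masks random individual joints,
            "part" masks every joint of one randomly chosen body part
    Returns (masked_pose, sorted list of masked joint indices).
    """
    n = len(pose)
    if level == "joint":
        k = max(1, round(joint_ratio * n))
        masked = set(rng.sample(range(n), k))
    elif level == "part":
        part = rng.choice(sorted(BODY_PARTS))
        masked = set(BODY_PARTS[part])
    else:
        raise ValueError(f"unknown masking level: {level!r}")
    # ASSUMPTION: masked joints are replaced by an all-zero token.
    masked_pose = [
        tuple(0.0 for _ in p) if j in masked else tuple(p)
        for j, p in enumerate(pose)
    ]
    return masked_pose, sorted(masked)


def bone_lengths(pose3d):
    """Euclidean length of every bone (child joint to its parent)."""
    return [math.dist(pose3d[j], pose3d[p]) for j, p in enumerate(PARENTS) if p >= 0]


def bone_length_loss(pred, target):
    """Mean absolute bone-length difference; one possible geometric term."""
    lp, lt = bone_lengths(pred), bone_lengths(target)
    return sum(abs(a - b) for a, b in zip(lp, lt)) / len(lp)
```

Masking a whole part forces a model to infer a limb from the rest of the body rather than from adjacent joints, which is the kind of skeletal dependency the abstract says joint-only masking captures poorly. A call such as `mask_pose(pose, "part", random.Random(0))` returns the masked pose together with the indices a reconstruction objective would be evaluated on.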

Citation

@inproceedings{Yang_2024_BMVC,
author    = {Qing-Wen Yang and Kai-Wen Duan and Ting-Yi Lu and Kevin Lin and Cheng-Yen Yang and Lijuan Wang and Jenq-Neng Hwang and Shang-Hong Lai},
title     = {APTPose: Anatomy-aware Pre-Training for 3D Human Pose Estimation},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0865.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
