J Shanghai Jiaotong Univ Sci, 2025, Vol. 30, Issue (5): 855-865. doi: 10.1007/s12204-024-2697-0
WU Yalei, LI Jinghua, KONG Dehui, LI Qianxing, YIN Baocai
Received:
2023-08-08
Accepted:
2023-08-29
Online:
2025-09-26
Published:
2024-01-16
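Abstract: Estimating 3D hand pose from a single RGB image is highly challenging because of hand self-occlusion and the high degrees of freedom of hand motion. Graph convolutional networks (GCNs) use a graph to describe the structural relationships between hand joints, which can improve the accuracy of 3D hand pose regression to some extent; however, GCNs cannot effectively describe the relationships between non-adjacent hand joints. Hypergraph convolutional networks, which have recently attracted wide attention, can describe multi-way, high-order relationships among nodes through hyperedges. This paper therefore proposes a 3D hand pose estimation framework based on hypergraph convolutional networks that better captures the correlations among both adjacent and non-adjacent hand joints. To overcome the shortcomings of a predefined hypergraph structure, a dynamic hypergraph convolutional network (DHGCN) is proposed, in which hyperedges are constructed dynamically from the feature similarity of hand joints. To better explore the local semantic relationships between nodes, a semantic dynamic hypergraph convolution (SDHGCN) is proposed. The proposed method is evaluated on two public benchmark datasets, STB and RHD. Both qualitative and quantitative results indicate that hypergraph convolutional networks are better suited to hand pose estimation than graph convolutional networks, and comparisons with existing methods show that the proposed framework performs on par with mainstream approaches.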
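The idea of building hyperedges dynamically from joint-feature similarity can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how similarity is measured or how hyperedges are formed, so the sketch assumes a k-nearest-neighbour rule (one hyperedge per joint, holding that joint and its `k` closest joints in feature space) and the commonly used normalized hypergraph convolution $X' = D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2} X \Theta$ with unit hyperedge weights. The function names `knn_incidence` and `hypergraph_conv`, the 21-joint layout, and the feature sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_incidence(features, k):
    """Build an incidence matrix H (nodes x hyperedges).

    Assumption for illustration: hyperedge j contains node j plus its
    k nearest neighbours under squared Euclidean distance.
    """
    n = features.shape[0]
    diff = features[:, None, :] - features[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # Force each node to rank first in its own row, even if features tie.
    np.fill_diagonal(dist, -1.0)
    nearest = np.argsort(dist, axis=1)[:, : k + 1]
    H = np.zeros((n, n))
    for edge, members in enumerate(nearest):
        H[members, edge] = 1.0
    return H


def hypergraph_conv(features, H, theta):
    """Apply Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta (no nonlinearity)."""
    dv = H.sum(axis=1)  # node degree: number of hyperedges containing the node
    de = H.sum(axis=0)  # hyperedge degree: number of nodes in the hyperedge
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    left = dv_inv_sqrt[:, None] * H / de[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    return (left @ right) @ features @ theta


rng = np.random.default_rng(0)
joints = rng.normal(size=(21, 8))   # 21 joints with 8-dim features (illustrative)
theta = rng.normal(size=(8, 4))     # stand-in for a learnable projection
H = knn_incidence(joints, k=3)      # rebuilt from the current features each call
out = hypergraph_conv(joints, H, theta)
```

Because `H` is recomputed from the current features, the hypergraph changes as the features change, which is the sense in which the structure is "dynamic" rather than predefined. A hyperedge here can join joints that are not adjacent in the hand skeleton, which is the kind of relationship the abstract says an ordinary graph cannot express.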
WU Yalei, LI Jinghua, KONG Dehui, LI Qianxing, YIN Baocai. 3D Hand Pose Estimation Using Semantic Dynamic Hypergraph Convolutional Networks[J]. J Shanghai Jiaotong Univ Sci, 2025, 30(5): 855-865.