A high-precision multi-human body pose estimation approach for near-electricity work in substations
MA Jing, REN Bowen, CHEN Laijun, MA Hengrui, ZHU Suxun, CHEN Tiebin
Abstract:
In substation near-electricity work, human pose estimation is essential for accurately locating human body keypoints. However, occlusion by limbs or equipment often leaves traditional detection methods with low accuracy, missed detections, and false detections. To address this, the paper proposes a high-precision multi-person pose estimation method for near-electricity work in substations. First, a deformable convolutional network (DCN) is embedded in the backbone network, enabling the model to learn human joint features adaptively and strengthening its geometric modeling capability. Second, a feature pyramid network built on the ConvNeXt v2 Block serves as the neck structure, with cross-scale connections that strengthen interactive learning between features. In the prediction head, a coordinate attention (CA) mechanism is introduced to further capture the channel and directional information of the feature maps. Finally, the original loss function is modified to speed up model convergence. The results show that, compared with the baseline model, the proposed model improves the average detection precision P_(0.50), P_(0.75), and P by 2.7%, 7.3%, and 4.2%, respectively, providing important technical support for the safety of near-electricity workers in complex substation environments.
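The coordinate attention (CA) step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only the direction-wise pooling and re-weighting idea, and it assumes an identity mapping in place of the learned 1x1 convolutions and normalization that a trained CA block would apply. The function names `sigmoid` and `coordinate_gate` and the nested-list layout `[C][H][W]` are hypothetical choices for illustration.

```python
import math


def sigmoid(v):
    """Logistic function used to turn a pooled descriptor into a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_gate(x):
    """Re-weight a feature map x (nested lists, shape [C][H][W]) per channel.

    Simplified sketch: a real CA block passes the pooled descriptors through
    learned 1x1 convolutions before gating; here that step is assumed to be
    the identity so the example stays dependency-free.
    """
    out = []
    for ch in x:
        h, w = len(ch), len(ch[0])
        # Pool along the width: one descriptor per row (keeps vertical position).
        row_desc = [sum(row) / w for row in ch]
        # Pool along the height: one descriptor per column (keeps horizontal position).
        col_desc = [sum(ch[i][j] for i in range(h)) / h for j in range(w)]
        # Assumption: identity transform instead of learned convolutions.
        g_h = [sigmoid(v) for v in row_desc]
        g_w = [sigmoid(v) for v in col_desc]
        # Each position is scaled by its row gate and its column gate.
        out.append([[ch[i][j] * g_h[i] * g_w[j] for j in range(w)] for i in range(h)])
    return out
```

Calling `coordinate_gate` on a one-channel 2x2 map of ones scales every element by `sigmoid(1.0) ** 2`, because each row and column descriptor equals 1. Pooling along one axis at a time is what lets the gate keep positional information along the other axis, which plain global average pooling would discard.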
Keywords:
near-electricity work; human body pose estimation; YOLO v7; DCN v2 module; attention mechanism
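Foundation: Science and Technology Project for the Construction of Qinghai Province's Ten National-Level Science and Technology Innovation Platforms and the State Key Laboratory of Multi-Energy Complementary Green Energy Storage (2023-ZJ-J04)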
DOI: 10.19585/j.zjdl.202409011