Computer Science, 2025, Vol. 52, Issue (6A): 240400169-9. doi: 10.11896/jsjkx.240400169

• Image Processing & Multimedia Technology •

Human Pose Estimation Using Millimeter Wave Radar Based on Transformer and PointNet++

LI Yang1, LIU Yi2, LI Hao3,4, ZHANG Gang2, XU Mingfeng1, HAO Chongqing2   

  1 China Academy of Information and Communications Technology, Beijing 100191, China
  2 School of Electrical Engineering, Hebei University of Science and Technology, Shijiazhuang 050018, China
  3 Chinese Academy of Sciences, Xiongan Institute of Innovation, Xiongan, Hebei 070001, China
  4 Hebei Provincial Key Laboratory of Cognitive Intelligence, Xiongan, Hebei 070001, China
  • Online: 2025-06-16 Published: 2025-06-12
  • About author: LI Yang, born in 1993, postgraduate, intermediate engineer. His main research interests include wireless AI for 6G and integrated sensing and communications.
    HAO Chongqing, born in 1981, Ph.D., associate professor. His main research interests include machine vision and biomimetic robots.
  • Supported by:
    National Key R&D Program of China (2022YFB2804402).

Abstract: Human pose estimation is an active research topic in action recognition and is widely applied in medical care, security, and monitoring, where it helps drive the intelligent development of related industries. However, current image-based human pose estimation places high demands on the environment and offers poor privacy protection. To address this, a human pose estimation method based on millimeter wave radar point clouds is proposed. The method uses PointNet++ to extract features from the millimeter wave radar point cloud. Compared with CNN-based pose estimation methods, it achieves lower MSE, MAE, and RMSE values at each joint point. In addition, to address the sparsity of millimeter wave radar point clouds, a multi-frame point cloud stitching strategy is used to increase the number of points. The model that takes three concatenated frames of point clouds as input reduces the MSE and MAE values by 0.22 cm and 0.72 cm respectively compared with the original model, effectively alleviating the problem of excessively sparse point clouds. Finally, to fully exploit the temporal features between different point clouds, Transformer is combined with PointNet++, and ablation experiments demonstrate the effectiveness of both the multi-frame point cloud stitching strategy and the added Transformer structure. The MSE and MAE values reach 0.59 cm and 5.41 cm respectively, providing a new approach to higher-performance RF-based human pose estimation.
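
The multi-frame stitching strategy and the error metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stitch_frames` and `joint_errors`, the representation of a point as an `(x, y, z)` tuple, and the choice to skip missing earlier frames rather than pad them are all assumptions made for illustration. Real radar points may also carry Doppler and intensity values.

```python
import math
from typing import List, Sequence, Tuple

# Assumed point layout: (x, y, z) only; hypothetical, for illustration.
Point = Tuple[float, float, float]


def stitch_frames(frames: Sequence[Sequence[Point]], index: int,
                  window: int = 3) -> List[Point]:
    """Concatenate the points of up to `window` consecutive frames ending at `index`.

    Frames before the start of the sequence are skipped, so the earliest
    frames yield fewer points (an assumption; padding is another option).
    """
    start = max(0, index - window + 1)
    stitched: List[Point] = []
    for frame in frames[start:index + 1]:
        stitched.extend(frame)
    return stitched


def joint_errors(pred: Sequence[float],
                 truth: Sequence[float]) -> Tuple[float, float, float]:
    """Return (MSE, MAE, RMSE) over paired joint-coordinate values."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    diffs = [p - t for p, t in zip(pred, truth)]
    mse = sum(d * d for d in diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return mse, mae, math.sqrt(mse)
```

With `window=3`, each model input gathers the points of the current frame and the two preceding ones, which is one simple way to raise the point count of a sparse radar frame before feature extraction.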

Key words: Human pose estimation, Millimeter wave radar, PointNet++, Point cloud data, Transformer

CLC Number: TP391