Computer Science, 2022, Vol. 49, Issue (11A): 211100290-5. doi: 10.11896/jsjkx.211100290

• Image Processing & Multimedia Technology •

Facial Landmark Fast Detection Based on Improved YOLOv4-tiny

FU Bo-wen1, LI Chuang-chuang1, LIANG Ai-hua2

  1. 1 School of Robotics, Beijing Union University, Beijing 100101, China
    2 Frontier Intelligent Technology Research Institute, Beijing Union University, Beijing 100101, China
  • Online: 2022-11-10 Published: 2022-11-21
  • Corresponding author: LIANG Ai-hua (liangaihua@buu.edu.cn)
  • Author e-mail: 1171460872@qq.com
  • About author: FU Bo-wen, born in 2000, undergraduate. His main research interests include computer vision.
    LIANG Ai-hua, born in 1979, Ph.D, associate professor. Her main research interests include biometric recognition and image processing.
  • Supported by:
    National Natural Science Foundation of China (61502036), Scientific Research Project of Beijing Union University (ZK50202002) and 2021 General Project of Beijing Association of Higher Education (YB202175).

Abstract: Facial landmark detection is an important step in face recognition and has long been a research focus in computer vision. To meet the need for efficient, lightweight facial landmark detection, this paper proposes a fast facial landmark detection algorithm based on an improved YOLOv4-tiny. The model takes a 608×608×3 color image as input. The CSPDarknet53-tiny network extracts backbone features from the input image, and the extracted features are then up-sampled and fused. An attention mechanism is added before feature fusion to improve detection accuracy. The loss function of the YOLOv4-tiny network is adjusted by adding a facial landmark loss term, so that landmarks are located at the same time as faces are detected. The model outputs a face bounding box and five facial landmarks. Experimental results show that, compared with facial landmark detection methods based on other networks, the proposed model achieves higher recognition efficiency and lower configuration requirements while maintaining recognition accuracy. It can therefore meet the need for fast real-time detection and is easier to deploy on edge or mobile devices.

Key words: Facial landmark detection, YOLOv4-tiny, Attention mechanism, Real-time detection, Deep learning
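
The abstract states that a facial landmark loss is added to the YOLOv4-tiny detection loss but does not give its form. The following is a minimal sketch of one way such a combined loss could look, assuming a smooth L1 penalty over the 10 coordinates of the five landmarks, applied only to samples that contain a face. The function names, the smooth L1 choice, and the `landmark_weight` parameter are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def smooth_l1(diff: float, beta: float = 1.0) -> float:
    """Smooth L1 penalty for one coordinate error (assumed loss form)."""
    a = abs(diff)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta


def landmark_loss(
    pred: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
    has_face: Sequence[bool],
) -> float:
    """Mean smooth L1 error over landmark coordinates of face samples only.

    Each row of pred/target holds 10 values: (x, y) for five landmarks.
    """
    total, count = 0.0, 0
    for p, t, is_face in zip(pred, target, has_face):
        if not is_face:
            continue  # samples without a face contribute no landmark loss
        total += sum(smooth_l1(pi - ti) for pi, ti in zip(p, t))
        count += len(p)
    return total / count if count else 0.0


def total_loss(
    detection_loss: float,
    pred: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
    has_face: Sequence[bool],
    landmark_weight: float = 1.0,  # hypothetical balancing factor
) -> float:
    """Detection loss plus a weighted landmark term."""
    return detection_loss + landmark_weight * landmark_loss(pred, target, has_face)
```

Here `detection_loss` stands in for whatever box, objectness, and class terms the detector already computes; the sketch only shows how a landmark term could be masked to face samples and added on top.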

CLC Number: 

  • TP391