Computer Science, 2019, Vol. 46, Issue (6A): 165-168.

• Pattern Recognition and Image Processing •

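Static Gesture Recognition Based on Hybrid Convolution Neural Network

SHI Yu-xin, DENG Hong-min, GUO Wei-lin

  1. College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
  • Online: 2019-06-14 Published: 2019-07-02
  • Corresponding author: DENG Hong-min (born 1969), female, Ph.D., associate professor. Her main research interests include nonlinear dynamics, fuzzy control, and neural networks. E-mail: denghongming@aliyun.com
  • About the authors: SHI Yu-xin (born 1994), female, master's student. Her main research interests include neural networks and pattern recognition. E-mail: shiyuxin2655209@vip.qq.com; GUO Wei-lin (born 1992), male, master's student. His main research interests include neural networks and artificial intelligence.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61174025).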

Abstract: Static gesture recognition has important application value in human-computer interaction, but the complexity of gesture backgrounds and the diversity of hand shapes reduce recognition accuracy. To improve accuracy, this paper proposes a recognition method based on a convolutional neural network (CNN) and a random forest (RF). The method first segments the gesture from the static gesture image, then uses the feature-extraction capability of the convolutional network to extract feature vectors, and finally classifies these feature vectors with a random forest classifier. On the one hand, the CNN learns hierarchically and can capture more representative information from the image. On the other hand, the random forest selects samples and features at random and averages the results of its decision trees, which makes it less prone to overfitting. Experiments on a static gesture data set show that the proposed method recognizes static gestures effectively, with an average recognition rate of 94.56%. The method is further compared with two classical feature extraction methods, principal component analysis (PCA) and local binary patterns (LBP). The results show that classifying CNN-extracted feature vectors performs better: the recognition rate is 2.44% higher than that of the PCA-RF method and 1.74% higher than that of the LBP-RF method. Finally, on the classical MNIST data set, the proposed method reaches a recognition rate of 97.9%, higher than the two traditional feature extraction methods.

Key words: Convolutional neural network, Random forest, Recognition, Static gesture
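
The feature-extraction and classification stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two hand-set edge kernels stand in for learned CNN filters, global max pooling stands in for the network's pooling layers, and a bagged ensemble of one-split decision stumps with random feature selection stands in for a full random forest. The toy 6x6 bar images are assumed to be already segmented, and all function names (`extract_features`, `train_forest`, `predict`, `vertical_bar`, `horizontal_bar`) are hypothetical.

```python
import random
from collections import Counter

# Hand-set 3x3 edge kernels: an assumption standing in for learned CNN filters.
KERNELS = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],    # responds to vertical strokes
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],    # responds to horizontal strokes
]

def conv2d_relu(img, kernel):
    """Valid 2-D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    return [[max(0.0, float(sum(img[r + i][c + j] * kernel[i][j]
                                for i in range(kh) for j in range(kw))))
             for c in range(len(img[0]) - kw + 1)]
            for r in range(len(img) - kh + 1)]

def extract_features(img):
    """One feature per kernel: the global maximum of its ReLU feature map."""
    return [max(v for row in conv2d_relu(img, k) for v in row) for k in KERNELS]

def train_stump(X, y, feat_ids):
    """Pick the single-feature threshold split with the fewest training errors."""
    best = None
    for f in feat_ids:
        for t in sorted({x[f] for x in X}):
            left = [lab for x, lab in zip(X, y) if x[f] <= t]
            right = [lab for x, lab in zip(X, y) if x[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            err = (sum(lab != l_lab for lab in left)
                   + sum(lab != r_lab for lab in right))
            if best is None or err < best[0]:
                best = (err, f, t, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority label
        maj = Counter(y).most_common(1)[0][0]
        return (0, 0.0, maj, maj)
    return best[1:]

def train_forest(X, y, n_trees=25, seed=0):
    """Bagging plus random feature subsets, the two sources of randomness in RF."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = max(1, int(d ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feat_ids = rng.sample(range(d), k)           # random feature subset
        forest.append(train_stump([X[i] for i in idx],
                                  [y[i] for i in idx], feat_ids))
    return forest

def predict(forest, x):
    """Majority vote over the stumps (the classification analogue of averaging)."""
    votes = Counter(l if x[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]

def vertical_bar(col, size=6):
    return [[1.0 if c == col else 0.0 for c in range(size)] for r in range(size)]

def horizontal_bar(row, size=6):
    return [[1.0 if r == row else 0.0 for c in range(size)] for r in range(size)]

# Train on bars at positions 2-4, then classify bars at an unseen position.
train = ([(vertical_bar(p), "vertical") for p in (2, 3, 4)]
         + [(horizontal_bar(p), "horizontal") for p in (2, 3, 4)])
X = [extract_features(img) for img, _ in train]
y = [label for _, label in train]
forest = train_forest(X, y)
print(predict(forest, extract_features(vertical_bar(5))))
print(predict(forest, extract_features(horizontal_bar(5))))
```

In a realistic setting the feature vectors would come from a trained CNN layer and the stump ensemble would be replaced by a full random forest, for example scikit-learn's `RandomForestClassifier`. The sketch only shows how the two stages connect.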

CLC Number:

  • TP183