Computer Science (计算机科学), 2018, Vol. 45, Issue (5): 232-237. doi: 10.11896/j.issn.1002-137X.2018.05.040
LIU Zhuang (刘壮), CHAI Xiu-juan (柴秀娟) and CHEN Xi-lin (陈熙霖)
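Abstract: Hand detection is a critical preprocessing stage in many hand-related vision tasks, such as human-computer interaction and sign language recognition. With the development of RGB-D capture devices, the additional depth data can complement the conventionally used color data and provide a stronger feature representation. Moreover, traditional detection methods rely on hand-crafted features such as skin color and HOG, which cannot represent hands well. Deep-learning-based detection methods avoid this problem by automatically learning effective features from data. To combine the advantages of RGB-D data and deep learning, this paper proposes a two-channel Faster R-CNN detection framework that fuses color and depth data. Building on the original Faster R-CNN framework, the method adds depth-channel information and fuses it with the RGB-channel information at the feature level. Experimental results show that the proposed method clearly outperforms Faster R-CNN frameworks that use RGB only or that fuse the two modalities at the data level. The method can therefore effectively fuse data from the color and depth channels to improve hand detection performance.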
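The abstract distinguishes fusion at the data level from fusion at the feature level. The following is a minimal sketch of that distinction only, not the authors' network. It assumes toy 1x1 convolutions as stand-ins for the real convolutional streams, and every name, shape, and weight in it (`conv1x1_relu`, `concat_channels`, the 4 x 6 image size, 8 feature channels per stream) is hypothetical. The abstract does not say at which layer the two streams are merged, so the sketch simply concatenates the final feature maps.

```python
import random


def conv1x1_relu(fmap, weights):
    """Apply a per-pixel linear map (a 1x1 convolution) followed by ReLU.

    fmap: H x W x C_in nested lists; weights: C_out x C_in nested lists.
    """
    return [
        [[max(0.0, sum(w * x for w, x in zip(w_row, pixel))) for w_row in weights]
         for pixel in row]
        for row in fmap
    ]


def concat_channels(a, b):
    """Concatenate two H x W x C maps along the channel axis."""
    return [[pa + pb for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def random_weights(rng, c_out, c_in):
    """Random C_out x C_in weight matrix (illustrative, untrained)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(c_in)] for _ in range(c_out)]


def shape(fmap):
    """Return (H, W, C) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


rng = random.Random(0)
H, W = 4, 6  # hypothetical image size
rgb = [[[rng.random() for _ in range(3)] for _ in range(W)] for _ in range(H)]
depth = [[[rng.random()] for _ in range(W)] for _ in range(H)]

# Data-level fusion: stack RGB and depth into one 4-channel input
# and pass it through a single stream.
early_input = concat_channels(rgb, depth)
early_features = conv1x1_relu(early_input, random_weights(rng, 8, 4))

# Feature-level fusion: one stream per modality, each with its own
# weights, then concatenate the resulting feature maps.
rgb_features = conv1x1_relu(rgb, random_weights(rng, 8, 3))
depth_features = conv1x1_relu(depth, random_weights(rng, 8, 1))
fused_features = concat_channels(rgb_features, depth_features)

print(shape(early_input))     # (4, 6, 4)
print(shape(early_features))  # (4, 6, 8)
print(shape(fused_features))  # (4, 6, 16)
```

In a Faster R-CNN style detector, a fused map such as `fused_features` would be what the region proposal network and the detection head consume. With feature-level fusion each modality keeps its own filters before the merge, whereas data-level fusion forces one set of filters to handle both color and depth values from the first layer on.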