Computer Science, 2024, Vol. 51, Issue (10): 295-301. doi: 10.11896/jsjkx.230900094
XU Jinlong1,3, DONG Mingrui1,2, LI Yingying1,3, LIU Yanqing1, HAN Lin1
Abstract: In recent years, eye gaze estimation has attracted widespread attention. RGB appearance-based gaze estimation methods use ordinary cameras and deep learning to estimate gaze, avoiding the expensive infrared hardware required by commercial eye trackers and opening the way to more accurate, lower-cost eye gaze estimation. However, RGB appearance images contain many gaze-irrelevant features, such as illumination intensity and skin color, which interfere with deep-learning regression and degrade gaze estimation accuracy. To address this problem, a new architecture named Class Attention Network (CA-Net) is proposed. It contains three kinds of class attention modules, namely channel, scale, and eye attention, which extract and fuse different kinds of attention encodings, thereby reducing the weight assigned to gaze-irrelevant features. Extensive experiments on the GazeCapture dataset show that, among RGB appearance-based gaze estimation methods, CA-Net improves gaze estimation accuracy by about 0.6% on mobile phones and 7.4% on tablets compared with existing state-of-the-art methods.
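The abstract describes class attention modules that reweight features so that gaze-irrelevant cues (illumination, skin color) carry less weight in the regression. The paper's code is not given here, so the following is a minimal PyTorch sketch of what a channel-level attention module of this kind could look like, in the squeeze-and-excitation style; the class name, layer sizes, and reduction ratio are illustrative assumptions, not the authors' implementation.

```python
# A hedged sketch of a channel attention module (squeeze-and-excitation style).
# All names and hyperparameters below are assumptions for illustration only.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Reweights feature channels so that channels dominated by
    gaze-irrelevant cues (e.g. illumination, skin color) get lower weight."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: global spatial average
        self.fc = nn.Sequential(              # excitation: per-channel gate in (0, 1)
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # broadcast gate over spatial dims

# Usage: gate a feature map produced by an eye-image backbone.
feat = torch.randn(8, 64, 28, 28)    # batch of 8 eye-crop feature maps
gated = ChannelAttention(64)(feat)   # same shape, channels reweighted
print(gated.shape)                   # torch.Size([8, 64, 28, 28])
```

A scale or eye attention module would follow the same gating pattern but pool over different groupings (feature-map scales, or left/right eye branches) before fusing the encodings.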