Computer Science ›› 2024, Vol. 51 ›› Issue (10): 295-301. doi: 10.11896/jsjkx.230900094

• Computer Graphics & Multimedia •

Eye Gaze Estimation Network Based on Class Attention

XU Jinlong1,3, DONG Mingrui1,2, LI Yingying1,3, LIU Yanqing1, HAN Lin1   

  1. National Supercomputing Center in Zhengzhou, Zhengzhou 450000, China
    2. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
    3. Information Engineering University, Zhengzhou 450000, China
  • Received: 2023-09-18  Revised: 2024-01-16  Online: 2024-10-15  Published: 2024-10-11
  • About author: XU Jinlong, born in 1985, Ph.D, master's supervisor. His main research interests include high-performance computing and parallel compilation.
    HAN Lin, born in 1978, Ph.D, associate professor, is a senior member of CCF (No.16416M). His main research interests include compiler optimization and high-performance computing.
  • Supported by:
    2022 Henan Province Major Science and Technology Special Project (2211002110600), 22 Qiushi Research Initiation (Natural Science) (32213247) and 2023 Henan Province Science and Technology Research Special Project (232102210185).

Abstract: In recent years, eye gaze estimation has attracted widespread attention. RGB appearance-based gaze estimation uses ordinary cameras and deep learning, avoiding expensive infrared devices such as commercial eye trackers and making accurate, cost-effective gaze estimation possible. However, RGB appearance images contain many features unrelated to gaze, such as lighting intensity and skin color, and these irrelevant features interfere with the deep learning regression process, degrading the accuracy of gaze estimation. To address this issue, this paper proposes a new architecture called class attention network (CA-Net), which comprises three class attention modules: channel, scale, and eye. These modules extract and fuse different types of attention encodings, thereby reducing the weight of gaze-irrelevant features. Extensive experiments on the GazeCapture dataset show that, among RGB-based gaze estimation methods, CA-Net improves gaze estimation accuracy over the state-of-the-art method by approximately 0.6% on mobile phones and 7.4% on tablets.
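The channel class-attention module builds on the squeeze-and-excitation (SE) idea named in the keywords: pool each feature map to a channel descriptor, pass it through a small bottleneck MLP, and rescale channels by the resulting gate. The following is a minimal PyTorch sketch of that generic SE-style mechanism only, not the paper's CA-Net or its light SE variant; the module name, reduction ratio, and tensor shapes are illustrative assumptions.

    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        """Generic SE-style channel attention (illustrative sketch).

        Squeeze: global-average-pool each channel to one scalar.
        Excite:  bottleneck MLP + sigmoid produces per-channel weights.
        Rescale: multiply the input so low-weight channels are suppressed.
        """
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction, bias=False),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels, bias=False),
                nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, _, _ = x.shape
            w = x.mean(dim=(2, 3))           # squeeze: (B, C) channel descriptors
            w = self.fc(w).view(b, c, 1, 1)  # excite: per-channel gates in (0, 1)
            return x * w                     # rescale the feature map

    # usage: re-weight a batch of (hypothetical) eye-image feature maps
    feats = torch.randn(4, 64, 28, 28)
    out = ChannelAttention(64)(feats)
    assert out.shape == feats.shape

Because the gate is a soft sigmoid rather than a hard mask, channels dominated by gaze-irrelevant cues (e.g., lighting or skin tone) are down-weighted rather than discarded, which matches the abstract's stated goal of reducing the weight of gaze-irrelevant features.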

Key words: Class attention, Light squeeze-and-excitation, Self-attention, Multiscale, Eye gaze estimation

CLC Number: TP183