Computer Science, 2020, Vol. 47, Issue (3): 174-181. doi: 10.11896/jsjkx.190800040

• Artificial Intelligence •

Survey of Construction and Application of Reading Eye-tracking Corpus

WANG Xiao-ming, ZHAO Xin-bo

  1. (National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China)
  • Received: 2019-08-05 Online: 2020-03-15 Published: 2020-03-30
  • About author: WANG Xiao-ming, born in 1982, Ph.D, lecturer, is a member of China Computer Federation (CCF). His main research interests include reading eye movement modeling and artificial intelligence. ZHAO Xin-bo, born in 1970, Ph.D, professor, Ph.D supervisor. His main research interests include image processing, computer vision, pattern recognition and artificial intelligence.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61231016, 61871326), the Humanities and Social Science Fund of the Ministry of Education of China (18YJCZH180) and the Social Science Foundation of Shaanxi Province (2019M001).

Abstract: Eye movements in reading reflect the human cognitive process. Reading eye movement data is important basic data in fields such as cognitive psychology, applied linguistics and computer science, yet China lacks such basic research data. In view of this situation, this paper first introduces the background of reading eye-tracking corpora and the related literature at home and abroad. It then presents the contents and eye movement indexes of reading eye-tracking corpora, including single fixation duration, first fixation duration, gaze duration, total fixation duration, regression-in count and regression-out count, from the perspectives of low-level visual factors and high-level visual factors, and analyzes three advantages of the corpus-based research method for reading eye movements over traditional reading eye movement experiments. Next, several influential, completed reading eye-tracking corpora are described in terms of index variables, corpus size, corpus content, corpus language, number of participants, characteristics of participants and data acquisition equipment, in the hope of providing a reference for researchers who work on reading eye movements. On the application side, this paper reviews the major studies based on eye-tracking corpora in cognitive psychology, applied linguistics, computer science and related fields, with emphasis on representative computer science studies in computational models of eye movement, natural language processing and pattern recognition. Studies on the construction and application of eye-tracking corpora in China are also covered. Finally, this paper reviews the current state of relevant studies, analyzes the reasons for the lack of basic data, and proposes solutions and suggestions from the perspectives of the state, scientific research institutions and researchers respectively.

Key words: Artificial intelligence, Computational linguistics, Corpus, Eye movement data, Eye-tracking, Reading eye movement

CLC Number: TP391
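
The word-level indexes named in the abstract (single fixation duration, first fixation duration, gaze duration, total fixation duration, regression-in count and regression-out count) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `word_measures`, the `WordMeasures` record and the input layout (a time-ordered list of `(word_index, duration_ms)` pairs) are hypothetical. It also assumes one common convention, namely that the first pass on a word is its first run of consecutive fixations, counted only if no word further to the right has been fixated yet, and that a regression is any move from a word to an earlier word.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class WordMeasures:
    """Illustrative per-word record; first-pass fields stay None if the
    word was skipped during first-pass reading."""
    first_fixation_duration: Optional[int] = None
    single_fixation_duration: Optional[int] = None
    gaze_duration: Optional[int] = None
    total_fixation_duration: int = 0
    regression_in_count: int = 0
    regression_out_count: int = 0


def word_measures(fixations: List[Tuple[int, int]],
                  n_words: int) -> Dict[int, WordMeasures]:
    """Compute word-level measures from (word_index, duration_ms) pairs
    given in temporal order. Assumed conventions are described above."""
    measures = {w: WordMeasures() for w in range(n_words)}
    rightmost = -1  # index of the furthest word fixated so far
    i = 0
    while i < len(fixations):
        word = fixations[i][0]
        # Collect the run of consecutive fixations on the same word.
        j = i
        while j < len(fixations) and fixations[j][0] == word:
            j += 1
        run = [duration for _, duration in fixations[i:j]]
        m = measures[word]
        m.total_fixation_duration += sum(run)
        if word > rightmost:
            # First-pass visit: no word to the right has been fixated yet.
            m.first_fixation_duration = run[0]
            m.gaze_duration = sum(run)
            if len(run) == 1:
                m.single_fixation_duration = run[0]
            rightmost = word
        # A move to an earlier word counts as one regression out of this
        # word and one regression into the target word.
        if j < len(fixations) and fixations[j][0] < word:
            m.regression_out_count += 1
            measures[fixations[j][0]].regression_in_count += 1
        i = j
    return measures
```

For example, calling `word_measures([(0, 200), (1, 180), (1, 120), (3, 250), (2, 210), (3, 190)], 4)` treats word 2 as skipped in the first pass, so its first-pass fields remain `None` while its total fixation duration and regression-in count are still recorded. Corpora differ in how they define the first pass and regressions, so these conventions should be checked against the documentation of any specific corpus.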