Answering Word Sense Judgement Questions in Chinese Reading Comprehension

Computer Science (计算机科学), 2018, Vol. 45, Issue (6A): 72-74.

TAN Hong-ye¹,², WU Yu-fei¹

¹ School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
² Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China

Online: 2018-06-20  Published: 2018-08-03
About the authors: TAN Hong-ye (born 1971), female, Ph.D., associate professor; her main research interests are natural language processing and information retrieval. E-mail: hytan_2006@126.com. WU Yu-fei (born 1994), male, master's student; his main research interest is Chinese information processing. E-mail: 598974237@qq.com (corresponding author).
Supported by: the National Natural Science Foundation of China (61673248), the National High Technology Research and Development Program of China (863 Program) (2015AA015407), the National Natural Science Foundation of China Youth Program (61100138, 61403238, 61502287), the Shanxi Province Research Project for Returned Overseas Scholars (2013-022), and the Shanxi Province 2012 Merit-Based Science and Technology Activities Project for Returned Overseas Scholars.

Abstract

Abstract: A reading comprehension task requires a computer to answer questions about a single given text according to its content. Taking the reading comprehension section of the Beijing college entrance examination in Chinese as its background, this paper analyzes the word sense judgment questions in that section and proposes an answering framework based on support value computation. Three methods are tried for computing the support value: a language model (n-gram), pointwise mutual information (PMI), and sentence similarity. Experiments show that all three methods are effective to some degree on both a real dataset and an automatically constructed dataset. Among them, the PMI-based support value performs best on the real dataset, reaching an option-level accuracy of 75%.

Key words: Judgment of word meaning, Reading comprehension, Support value
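The abstract names PMI as one way to compute the support value but does not give the formula or the scoring procedure. The following is only a minimal sketch of how such a score could be computed, not the authors' implementation. It assumes that each option supplies a proposed gloss and the context of the target word, both already word-segmented, and that PMI is estimated from sentence-level co-occurrence counts in some background corpus. The function names (`build_counts`, `pmi`, `support`, `score_options`), the toy corpus, and the choice of 0.0 for unseen pairs are all illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(corpus):
    """Count the sentences containing each word and each unordered word pair."""
    word_counts, pair_counts = Counter(), Counter()
    for sentence in corpus:
        vocab = sorted(set(sentence))
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))
    return word_counts, pair_counts, len(corpus)


def pmi(x, y, word_counts, pair_counts, n_sentences):
    """PMI(x, y) = log(p(x, y) / (p(x) * p(y))).

    Assumption: pairs that never co-occur get 0.0 instead of -infinity.
    """
    joint = pair_counts[tuple(sorted((x, y)))]
    if joint == 0:
        return 0.0
    return math.log(joint * n_sentences / (word_counts[x] * word_counts[y]))


def support(gloss, context, stats):
    """Average PMI over all (gloss word, context word) pairs."""
    pairs = [(g, c) for g in gloss for c in context if g != c]
    if not pairs:
        return 0.0
    return sum(pmi(g, c, *stats) for g, c in pairs) / len(pairs)


def score_options(options, stats):
    """Return one support value per (gloss, context) option."""
    return [support(gloss, context, stats) for gloss, context in options]


# Toy, hypothetical data: each sentence is a list of already-segmented words.
corpus = [
    ["bank", "river", "water"],
    ["bank", "river", "shore"],
    ["bank", "money", "loan"],
    ["money", "loan", "interest"],
]
stats = build_counts(corpus)

# Two candidate glosses for "bank" in the context "river ... water".
options = [
    (["shore"], ["river", "water"]),
    (["loan"], ["river", "water"]),
]
scores = score_options(options, stats)
best = max(range(len(scores)), key=scores.__getitem__)
```

In this toy run the gloss "shore" receives the higher support value, because it co-occurs with "river" in the corpus while "loan" never co-occurs with either context word. Whether a question asks for the best-supported or the least-supported option is left to the caller, and a language-model or sentence-similarity score could be swapped in for `support` without changing `score_options`.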

CLC Number: TP391

TAN Hong-ye, WU Yu-fei. Answering Word Sense Judgement Questions in Chinese Reading Comprehension[J]. Computer Science, 2018, 45(6A): 72-74. https://doi.org/


