Rough Set Based Knowledge Predicate Analysis of Chinese Knowledge Based Question Answering

Computer Science, 2018, Vol. 45, Issue (6): 183-186. doi: 10.11896/j.issn.1002-137X.2018.06.032

HAN Zhao^1,3, MIAO Duo-qian^1,2, REN Fu-ji^2,3

College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China¹;
Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China²;
Faculty of Engineering, Tokushima University, Tokushima 7708506, Japan³

Received: 2017-12-18 Online: 2018-06-15 Published: 2018-07-24
About the authors: HAN Zhao (born 1990), female, Ph.D. candidate, CCF member; main research interests: rough sets and question answering systems; E-mail: 1990hanzhao@tongji.edu.cn. MIAO Duo-qian (born 1964), male, Ph.D., professor, CCF member; main research interests: granular computing and rough sets; E-mail: dqmiao@tongji.edu.cn (corresponding author). REN Fu-ji (born 1959), male, Ph.D., professor; main research interests: affective computing and dialogue robots; E-mail: ren@is.tokushima-u.ac.jp.
Supported by:
This work was supported by the National Natural Science Foundation of China (61273304, 61673301, 61573255) and the Specialized Research Fund for the Doctoral Program of Higher Education (20130072130004).

Abstract

In a knowledge based question answering system, the quality of the knowledge predicate analysis of a question affects how well knowledge triples are matched overall. Knowledge predicates in short Chinese questions are expressed in uncertain ways, and this uncertainty makes them harder to analyze. Based on rough set theory, this paper gives a new definition of knowledge predicate analysis for knowledge based question answering and proposes a method for analyzing the knowledge predicate of a question. The method reduces the words in the question that are only weakly related to the knowledge predicate, so that the strongly related words can be matched more effectively against the knowledge predicates in the knowledge triples, which in turn improves the system's overall ability to analyze knowledge predicates. Experimental results verify the effectiveness of the method.

Key words: Information retrieval, Knowledge based question answering, Question answering system, Rough set, Short text similarity
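
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one standard rough set technique that fits the description: classical positive-region attribute reduction, with question words as condition attributes and the knowledge predicate as the decision attribute. The toy questions, the predicate labels, and the function names `positive_region` and `reduce_words` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Toy data (hypothetical): each question is a bag of words plus the
# knowledge predicate it should map to.
rows = [
    {"words": {"who", "wrote", "book"}, "predicate": "author"},
    {"words": {"who", "is", "author"}, "predicate": "author"},
    {"words": {"when", "is", "born"}, "predicate": "birth_date"},
    {"words": {"when", "born", "please"}, "predicate": "birth_date"},
]
words = sorted(set().union(*(r["words"] for r in rows)))


def positive_region(rows, words):
    """Indices of questions whose equivalence class, induced by the
    presence/absence of `words`, maps to exactly one predicate."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        key = tuple(w in row["words"] for w in words)
        classes[key].append(i)
    pos = set()
    for members in classes.values():
        if len({rows[i]["predicate"] for i in members}) == 1:
            pos.update(members)
    return pos


def reduce_words(rows, words):
    """Greedy backward elimination: drop a word whenever removing it
    leaves the positive region unchanged (the word is dispensable)."""
    full = positive_region(rows, words)
    kept = list(words)
    for w in words:
        trial = [x for x in kept if x != w]
        if positive_region(rows, trial) == full:
            kept = trial
    return kept


reduct = reduce_words(rows, words)
print(reduct)
```

The surviving words still separate the predicates as well as the full vocabulary does. Which reduct is found depends on the elimination order, so a real system would presumably rank words by some relevance measure before pruning; that ranking is an assumption here, not something the abstract states.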

CLC Number: TP391

HAN Zhao, MIAO Duo-qian, REN Fu-ji. Rough Set Based Knowledge Predicate Analysis of Chinese Knowledge Based Question Answering[J]. Computer Science, 2018, 45(6): 183-186. https://doi.org/10.11896/j.issn.1002-137X.2018.06.032

