Computer Science (计算机科学), 2019, Vol. 46, Issue 8: 272-276. doi: 10.11896/j.issn.1002-137X.2019.08.045
FENG Xue (冯雪)
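Abstract: The Semantic Web is an important class of resources built on Internet technology. At present, user queries over the Semantic Web support only formal query languages and must strictly follow a specific syntax, so only specialists familiar with Semantic Web systems and formal languages can query it correctly. To address this limitation, this paper proposes an unsupervised natural language query system that automatically converts natural language sentences into statements in the formal language supported by Semantic Web queries, making querying accessible to non-expert (ordinary) users. The system first uses the Semantic Web to automatically extract all entities and properties in a given sentence, then links these entities and properties into a semantic association graph, and finally searches the graph heuristically for an optimal path and converts that path into a SPARQL statement. The most critical factor in the system is the coverage of entities and properties in the Semantic Web, which directly determines the quality of the semantic association graph and therefore the system's final performance. To make the system more practical, knowledge from external Semantic Web resources is further used to complete and enrich the information implied in the natural language sentence, refine the intermediate semantic associations, and obtain more accurate SPARQL statements. The system and the proposed improvement are evaluated on the US geography question set, which contains manually written SPARQL statements for 880 questions and is a widely recognized dataset in work on natural language querying. The experimental results show that the baseline system correctly answers 77.6% of the questions, significantly outperforming the best existing unsupervised system, and that accuracy reaches 78.5% when external semantic knowledge is used for completion.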
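The three stages named in the abstract (extraction, graph construction, path search with SPARQL generation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `geo:` vocabulary and its example.org namespace, exact-token matching for extraction, type compatibility as the linking rule, and a greedy longest-chain search standing in for the heuristic search, whose details the abstract does not give.

```python
# Hypothetical toy ontology (illustrative only, not the paper's data).
# Entities: label -> (IRI, type). Properties: label -> (IRI, domain, range).
ENTITIES = {
    "texas": ("geo:texas", "geo:State"),
}
PROPERTIES = {
    "capital": ("geo:hasCapital", "geo:State", "geo:City"),
    "population": ("geo:population", "geo:City", "xsd:integer"),
}
PREFIX = "PREFIX geo: <http://example.org/geo#>"  # assumed namespace


def extract(sentence):
    """Stage 1: find entities and properties by exact token lookup."""
    tokens = sentence.lower().replace("?", "").split()
    entities = [ENTITIES[t] for t in tokens if t in ENTITIES]
    properties = [PROPERTIES[t] for t in tokens if t in PROPERTIES]
    return entities, properties


def build_graph(entities, properties):
    """Stage 2: link nodes whose types are compatible.

    An entity links to properties whose domain equals its type; a property
    links to other properties whose domain equals its range.
    """
    edges = {}
    for iri, etype in entities:
        edges[iri] = [p for p in properties if p[1] == etype]
    for p in properties:
        edges[p[0]] = [q for q in properties if q[0] != p[0] and q[1] == p[2]]
    return edges


def best_path(entities, edges):
    """Stage 3a: greedy stand-in for the heuristic search.

    From each entity, follow the first unused compatible property until
    none is left, and keep the longest chain found.
    """
    best_start, best_chain = None, []
    for iri, _ in entities:
        chain, node, used = [], iri, set()
        while True:
            candidates = [p for p in edges[node] if p[0] not in used]
            if not candidates:
                break
            prop_iri = candidates[0][0]
            chain.append(prop_iri)
            used.add(prop_iri)
            node = prop_iri
        if best_start is None or len(chain) > len(best_chain):
            best_start, best_chain = iri, chain
    return best_start, best_chain


def to_sparql(start, chain):
    """Stage 3b: turn the chain into a SPARQL SELECT query."""
    if start is None or not chain:
        return None
    triples, subject = [], start
    for i, prop_iri in enumerate(chain, 1):
        triples.append(f"{subject} {prop_iri} ?x{i} .")
        subject = f"?x{i}"
    return f"{PREFIX} SELECT {subject} WHERE {{ " + " ".join(triples) + " }"


def translate(sentence):
    """Run the three stages; return None if no query can be built."""
    entities, properties = extract(sentence)
    edges = build_graph(entities, properties)
    start, chain = best_path(entities, edges)
    return to_sparql(start, chain)


if __name__ == "__main__":
    print(translate("What is the population of the capital of Texas?"))
```

With this toy vocabulary, the question "What is the population of the capital of Texas?" yields a two-triple query that goes from `geo:texas` through `geo:hasCapital` to `geo:population`. A sentence with no recognized entity or property returns `None`, which mirrors the abstract's point that entity and property coverage limits what the graph can express.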