Computer Science ›› 2021, Vol. 48 ›› Issue (11): 276-286. doi: 10.11896/jsjkx.210100218

• Artificial Intelligence •

Automatic Learning Method of Domain Semantic Grammar Based on Fault-tolerant Earley Parsing Algorithm

MA Yi-fan1, MA Tao-tao2, FANG Fang3, WANG Shi2, TANG Su-qin4, CAO Cun-gen2   

  1 School of Computer Science and Information Engineering, Guangxi Normal University, Guilin, Guangxi 541000, China
  2 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  3 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100190, China
  4 Department of Educational Technology, Faculty of Education, Guangxi Normal University, Guilin, Guangxi 541000, China
  • Received:2021-01-28 Revised:2021-05-19 Online:2021-11-15 Published:2021-11-10
  • About author: MA Yi-fan, born in 1996, postgraduate. Her main research interests include natural language processing.
    CAO Cun-gen, born in 1964, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include large-scale knowledge processing.
  • Supported by:
    Key Research and Development Projects of the Ministry of Science and Technology (2017YFC1700302), Beijing NOVA Program (Cross-discipline, Z191100001119014), National Key Research and Development Program of China (2017YFB1002300) and National Natural Science Foundation of China (61967002).

Abstract: Fine-grained analysis of domain text is an important prerequisite for high-quality domain knowledge acquisition. Such analysis usually relies on a large number of semantic grammars in some form, but writing them by hand is time-consuming and labor-intensive. This paper proposes an automatic learning method for semantic grammars based on a fault-tolerant Earley parsing algorithm. Starting from a seed grammar, the method automatically generates new semantic grammars (both lexicons and grammar production rules) to reduce labor costs. It first uses an optimized fault-tolerant Earley parser to parse the input sentences, then generates candidate semantic grammars from the resulting parse trees. Finally, the candidates are filtered or corrected to obtain the final semantic grammars. In experiments on five TCM (traditional Chinese medicine) medical records covering different diseases, the precision of learning new lexicons is 63.88%, and the precision of learning new grammar production rules is 81.78%.
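To make the idea of fault-tolerant parsing concrete, the following is a minimal sketch of an Earley recognizer that tolerates unparseable input by skipping tokens. It is not the optimized parser described in the paper: the skip-based tolerance, the `max_skips` budget, the function name `recognize`, and the toy grammar are all assumptions chosen for illustration, and the sketch assumes a grammar without empty productions.

```python
from collections import namedtuple

# Earley item: lhs -> rhs with a dot position, the chart index where the item
# started (origin), and how many input tokens were skipped inside its span.
Item = namedtuple("Item", "lhs rhs dot origin skips")


def recognize(grammar, start, tokens, max_skips=1):
    """Return the minimum number of skipped tokens needed to parse `tokens`,
    or None if no parse exists within the skip budget.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Symbols that are not keys of `grammar` are treated as terminals.
    Assumption: no right-hand side is empty.
    """
    n = len(tokens)
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add(Item(start, tuple(rhs), 0, 0, 0))

    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            item = agenda.pop()
            new_here = []
            if item.dot == len(item.rhs):
                # Completer: advance every parent that was waiting for item.lhs.
                for parent in list(chart[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        total = parent.skips + item.skips
                        if total <= max_skips:
                            new_here.append(Item(parent.lhs, parent.rhs,
                                                 parent.dot + 1,
                                                 parent.origin, total))
            else:
                symbol = item.rhs[item.dot]
                if symbol in grammar:
                    # Predictor: expand the expected nonterminal at position i.
                    for rhs in grammar[symbol]:
                        new_here.append(Item(symbol, tuple(rhs), 0, i, 0))
                elif i < n and tokens[i] == symbol:
                    # Scanner: the expected terminal matches the next token.
                    chart[i + 1].add(item._replace(dot=item.dot + 1))
            if i < n and item.skips < max_skips:
                # Fault tolerance (illustrative): ignore token i at a cost of 1.
                chart[i + 1].add(item._replace(skips=item.skips + 1))
            for new in new_here:
                if new not in chart[i]:
                    chart[i].add(new)
                    agenda.append(new)

    costs = [it.skips for it in chart[n]
             if it.lhs == start and it.origin == 0 and it.dot == len(it.rhs)]
    return min(costs) if costs else None


# Hypothetical seed grammar, invented for this example.
SEED = {
    "S": [("SYMPTOM", "DURATION")],
    "SYMPTOM": [("cough",), ("headache",)],
    "DURATION": [("for", "NUM", "days")],
    "NUM": [("3",), ("7",)],
}

print(recognize(SEED, "S", "cough for 3 days".split()))
print(recognize(SEED, "S", "cough uh for 3 days".split(), max_skips=1))
print(recognize(SEED, "S", "cough uh for 3 days".split(), max_skips=0))
```

A strict parser (`max_skips=0`) rejects the second sentence, while the tolerant one accepts it and reports how much input it had to ignore. A grammar learner built on a tolerant parser would additionally need the parse tree and the positions of the tolerated errors, which this recognizer does not record.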

Key words: Fault-tolerant Earley parser, Filtering algorithm, Grammar learning, Semantic correction, Semantic grammar

CLC Number: TP391