Computer Science, 2025, Vol. 52, Issue 11: 13-21. doi: 10.11896/jsjkx.241200198
FANG Quan1, ZHANG Jinlong2, WANG Bingqian1, HU Jun3
Abstract: In recent years, the rapid development of large language models (LLMs) has attracted wide attention across society. Although LLMs adapt naturally to a broad range of natural language processing tasks, in domain-specific question answering their generated answers are often unreliable and poorly suited, owing to the lack of training on the vertical domain. To improve the performance of domain knowledge question answering systems, this paper proposes a new LLM-based domain knowledge question answering method built on combined context prompting. The combined context prompt consists of two parts: a domain knowledge context and a QA example context. The domain knowledge context is retrieved from a domain knowledge base by a dense retriever trained with contrastive learning, strengthening the LLM's command of domain expertise. The QA example context is obtained from the training set via semantic similarity retrieval, improving the LLM's understanding of question intent. Finally, the combined context prompt is fed into an LLM fine-tuned on domain knowledge to generate the final domain answer. Extensive experiments and comprehensive comparisons against baseline models show that, on the BERTScore metric, the proposed method improves precision and recall over ChatGPT by 15.91% and 16.14%, respectively, and F1 score by 15.87%.
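The prompt-assembly step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy bag-of-words cosine similarity below stands in for the contrastively trained dense retriever and the semantic-similarity retriever, and all function names (`embed`, `top_k`, `build_combined_prompt`) are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a stand-in for the paper's dense encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, corpus, k=1):
    # Return the k corpus strings most similar to the query.
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_combined_prompt(question, knowledge_base, train_examples, k=1):
    # Domain knowledge context: passages retrieved from the knowledge base.
    knowledge = top_k(question, knowledge_base, k)
    # QA example context: semantically similar (question, answer) pairs
    # retrieved from the training set.
    hit_questions = top_k(question, [q for q, _ in train_examples], k)
    demos = [(q, a) for q, a in train_examples if q in hit_questions]
    parts = ["Domain knowledge:"] + knowledge
    parts += ["Examples:"] + [f"Q: {q}\nA: {a}" for q, a in demos]
    parts += [f"Q: {question}\nA:"]
    return "\n".join(parts)
```

The resulting string would then be passed to the fine-tuned LLM as its input; the two retrieved contexts supply domain facts and intent-matching demonstrations, respectively.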