Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240700190-4. doi: 10.11896/jsjkx.240700190

• Large Language Model Technology and Application •

Study on Named Entity Recognition Algorithms in Audit Domain Based on Large Language Models

HU Caishun   

  1. Naval University of Engineering, Wuhan 430000, China
  • Online: 2025-06-16 Published: 2025-06-12
  • About author: HU Caishun (h8788449@163.com), born in 1989, postgraduate. His main research interests include finance and auditing.

Abstract: Since the emergence of ChatGPT, large language models have begun to play an important role across industries, in both general and specialized domains. Methods that combine artificial intelligence with auditing continue to appear, but the accuracy of traditional artificial intelligence methods is far lower than that of current large language models, so the application of large language models in auditing still needs further study. In auditing, automatically identifying the useful entities in text can greatly improve work efficiency and reduce errors. Conventional entity recognition algorithms for audit text rely mainly on machine learning combined with feature engineering, which generally yields low accuracy. This study therefore investigates several common open-source models (such as Llama) and closed-source models (such as ChatGPT) for entity recognition in audit text, and applies in-context learning to improve recognition performance. With in-context learning that organizes demonstration examples by similarity-based selection, entity recognition accuracy reaches as high as 98.3%.

Key words: Audit, Large language models, ChatGPT, Named entity recognition, In-context learning

CLC Number: 

  • TP391
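
The abstract describes in-context learning in which demonstration examples are chosen by similarity to the input text. The sketch below illustrates that general idea only and is not the paper's implementation: the abstract does not say which similarity measure or prompt format is used, so the character-bigram cosine similarity, the prompt wording, the entity labels (ORG, DATE), and the example sentences are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def bigram_counts(text: str) -> Counter:
    """Character-bigram counts; an assumed stand-in for a sentence embedding."""
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_demos(query: str, pool: list, k: int = 2) -> list:
    """Return the k labeled examples most similar to the query text."""
    q = bigram_counts(query)
    ranked = sorted(pool, key=lambda d: cosine(q, bigram_counts(d["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query: str, demos: list) -> str:
    """Assemble a few-shot prompt: instruction, demonstrations, then the query."""
    lines = ["Extract the named entities from the audit text."]
    for d in demos:
        lines.append(f"Text: {d['text']}")
        lines.append(f"Entities: {d['entities']}")
    lines.append(f"Text: {query}")
    lines.append("Entities:")
    return "\n".join(lines)


# Hypothetical labeled pool; sentences and labels are invented for illustration.
POOL = [
    {"text": "The audit office reviewed the 2022 budget of the city hospital.",
     "entities": "audit office (ORG); 2022 (DATE); city hospital (ORG)"},
    {"text": "Travel expenses exceeded the approved limit in March.",
     "entities": "March (DATE)"},
    {"text": "The weather was sunny during the conference.",
     "entities": "none"},
]

query = "The audit office reviewed the 2023 budget of the county school."
demos = select_demos(query, POOL, k=2)
prompt = build_prompt(query, demos)  # this string would be sent to the language model
```

In practice the bigram similarity would likely be replaced by embeddings from a sentence encoder, and `prompt` would be passed to whichever open-source or closed-source model is under evaluation.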