Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230900030-6.doi: 10.11896/jsjkx.230900030

• Artificial Intelligence •

TCM Named Entity Recognition Model Combining BERT Model and Lexical Enhancement

LI Minzhe, YIN Jibin   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  • Published: 2024-06-06
  • About author: LI Minzhe, born in 1997, postgraduate. His main research interests include deep learning and natural language processing.
    YIN Jibin, born in 1976, Ph.D., associate professor. His main research interests include human-computer interaction and artificial intelligence.

Abstract: Research on named entity recognition (NER) for traditional Chinese medicine (TCM) remains scarce; most existing work targets modern Chinese medical records and performs poorly on TCM case texts. To address the dense named entities and fuzzy entity boundaries characteristic of TCM cases, this paper proposes LEBERT-BILSTM-CRF, a TCM NER method that combines lexical enhancement with a pre-trained model. The method is optimized from the perspective of fusing lexicon enhancement into the pre-trained model: lexicon information is injected into the BERT model during feature learning, helping the model delimit word-class boundaries and distinguish word-class attributes, thereby improving the accuracy of NER on TCM medical cases. Experiments on the TCM case dataset constructed in this paper show that, across ten entity types, the LEBERT-BILSTM-CRF model achieves an overall precision of 88.69%, recall of 87.4%, and F1 of 88.1%, outperforming common NER models such as BERT-CRF and LEBERT-CRF.
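The CRF layer at the top of the LEBERT-BILSTM-CRF stack selects the globally best tag sequence rather than choosing each token's tag independently, which is what enforces consistent entity boundaries (e.g., an I- tag cannot follow O). A minimal sketch of that decoding step is Viterbi search over emission and transition scores; the scores below are illustrative toy values, not the paper's trained parameters:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path given per-token emission
    scores (seq_len x n_tags) and tag-transition scores (n_tags x n_tags)."""
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()                    # best score ending in each tag
    backptr = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        # candidate[i, j]: best path ending in tag i at t-1, then moving to tag j
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    # trace the best path backwards through the stored pointers
    best = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]

# Toy example: 3 tokens, tags O=0, B=1, I=2 (hypothetical scores)
emissions = np.array([[0.1, 2.0, 0.1],
                      [0.2, 0.1, 1.5],
                      [1.0, 0.3, 0.2]])
transitions = np.array([[0.5, 0.2, -2.0],   # O -> O/B/I (O -> I penalized)
                        [0.1, -0.5, 1.0],   # B -> O/B/I
                        [0.3, 0.2, 0.5]])   # I -> O/B/I
print(viterbi_decode(emissions, transitions))  # [1, 2, 0], i.e. B I O
```

In a full BiLSTM-CRF, the emission scores come from the BiLSTM's per-token outputs and the transition matrix is learned jointly with the network; the large negative O-to-I transition above illustrates how the CRF suppresses invalid BIO sequences.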

Key words: Natural language processing, Chinese medicine case, Vocabulary enhancement, BERT, BiLSTM-CRF

CLC Number: TP391