Computer Science, 2025, Vol. 52, Issue (10): 208-216. doi: 10.11896/jsjkx.240200081
王剑1, 王京岭2, 张革1, 王章全1, 郭世远2, 庾桂铭1
WANG Jian1, WANG Jingling2, ZHANG Ge1, WANG Zhangquan1, GUO Shiyuan2, YU Guiming1
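Abstract: In earlier work on multimodal information extraction (MIE), researchers typically trained neural network models for MIE using data-level fusion. However, because the modalities are heterogeneous, this kind of fusion tends to cause feature redundancy, feature incompatibility, and a lack of interpretability, which in turn degrades model training. To address this problem, a decision-level fusion method based on Dempster-Shafer (DS) theory is proposed. The method processes the features of each modality with neural networks and the Dirichlet function to generate evidence; after evidence correction and weight assignment, it applies the Shafer fusion rule to reach the final decision, which effectively improves the accuracy of feature processing and the interpretability of the model. With precision, recall, and F1 score as evaluation metrics, experiments on public and private datasets show that the proposed method improves overall performance by 0.22 to 1.87 percentage points over existing methods.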
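The decision-level pipeline outlined in the abstract (evidence per modality, weighting, then DS combination) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common subjective-logic mapping from Dirichlet parameters to belief masses, classical Shafer discounting as a stand-in for the weight assignment step, and the reduced Dempster rule for opinions over singleton classes plus a single uncertainty mass. The function names, the evidence vectors, and the weights 0.9 and 0.7 are all hypothetical.

```python
from typing import List, Tuple

# An opinion is (belief mass per class, uncertainty mass); the masses sum to 1.
Opinion = Tuple[List[float], float]


def evidence_to_opinion(evidence: List[float]) -> Opinion:
    """Map non-negative per-class evidence to an opinion via Dirichlet parameters.

    Assumed mapping: alpha_k = e_k + 1, S = sum(alpha), b_k = e_k / S, u = K / S.
    """
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    belief = [e / strength for e in evidence]
    return belief, num_classes / strength


def discount(opinion: Opinion, weight: float) -> Opinion:
    """Shafer discounting: scale beliefs by `weight`, move the rest to uncertainty."""
    belief, _ = opinion
    scaled = [weight * b for b in belief]
    return scaled, 1.0 - sum(scaled)


def combine(first: Opinion, second: Opinion) -> Opinion:
    """Dempster's rule restricted to singleton classes plus total uncertainty."""
    b1, u1 = first
    b2, u2 = second
    agreement = sum(x * y for x, y in zip(b1, b2))
    # Conflict is the mass assigned to pairs of different classes.
    conflict = sum(b1) * sum(b2) - agreement
    norm = 1.0 - conflict
    if norm <= 0.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    belief = [(x * y + x * u2 + y * u1) / norm for x, y in zip(b1, b2)]
    return belief, (u1 * u2) / norm


# Hypothetical evidence from a text branch and an image branch over three classes.
text_opinion = discount(evidence_to_opinion([9.0, 1.0, 0.0]), weight=0.9)
image_opinion = discount(evidence_to_opinion([4.0, 2.0, 1.0]), weight=0.7)
fused_belief, fused_uncertainty = combine(text_opinion, image_opinion)
predicted_class = max(range(len(fused_belief)), key=fused_belief.__getitem__)
```

In this sketch the fused uncertainty mass gives a per-decision confidence signal, which is one way a decision-level scheme can be more interpretable than fusing raw features.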