Computer Science (计算机科学), 2024, Vol. 51, Issue (4): 174-181. doi: 10.11896/jsjkx.230400031

• Database & Big Data & Data Science •

Study on Manufacturing Company Automated Chart Analysis Method Based on Natural Language Generation

WANG Xu1, LIU Changhong2, LI Shengchun2, LIU Shuang3, ZHAO Kangting1, CHEN Liang1

  1 School of Computer Science, Xi'an Polytechnic University, Xi'an 710048, China
  2 China Tobacco Chongqing Industrial Co., Ltd., Qianjiang Cigarette Factory, Chongqing 409000, China
  3 School of Mathematics and Statistics, Shaanxi Normal University, Xi'an 710119, China
  • Received: 2023-04-05 Revised: 2023-06-20 Online: 2024-04-15 Published: 2024-04-10
  • Corresponding author: LIU Changhong (liuch02@cncqti.com)
  • About the author: (sxwnfpwx@163.com)
  • Supported by:
    Key Scientific Research Program of the Education Department of Shaanxi Province, China (22JS021).

Abstract: As the wave of digital transformation sweeps the globe, manufacturing enterprises generate large volumes of chart data every day. Traditional chart analysis methods struggle to analyze this data efficiently and accurately, so automated chart analysis has become an important tool. Existing automated methods, however, often fail to meet specific requirements in practical applications. To address this problem, an automated chart analysis method for manufacturing enterprises based on natural language generation is proposed. The method analyzes chart data with an LSTM. To keep redundant data from misleading the LSTM during analysis, a discriminator layer is added after the embedding layer so that the LSTM can perform more targeted semantic understanding and text prediction according to the chart type. To address the poor quality of the generated description sentences, a random beam sampling strategy that draws on beam search and random sampling is proposed to improve the quality of chart analysis, and knowledge distillation is introduced to optimize the LSTM and further improve the quality of the description text. Experiments show that the method improves text quality by 8.9% compared with LSTM. To make the method usable in practice, an automated chart analysis system for manufacturing enterprises is designed and developed, with the method integrated as its chart analysis tool. Experimental results show that the proposed method improves the quality and efficiency of chart analysis in manufacturing enterprises.

Key words: Chart analysis, Natural language generation, LSTM, Knowledge distillation
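
The abstract describes a random beam sampling strategy that draws on both beam search and random sampling but does not spell out the algorithm. The following is a minimal sketch of one plausible reading, not the authors' implementation: at each decoding step the surviving hypotheses are sampled without replacement, weighted by their cumulative probability, instead of being chosen by a deterministic top-k cut. The function names `random_beam_sample` and `toy_model`, the toy vocabulary, and all parameters are assumptions made for illustration.

```python
import math
import random


def random_beam_sample(next_token_probs, start, end,
                       beam_width=3, max_len=10, seed=0):
    """Toy decoder that mixes beam search with random sampling.

    next_token_probs(seq) returns {token: probability}; this interface
    is hypothetical and stands in for a trained language model.
    """
    rng = random.Random(seed)
    beams = [([start], 0.0)]  # (token sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        pool = []
        for seq, score in beams:
            for tok, p in next_token_probs(seq).items():
                if p > 0.0:
                    pool.append((seq + [tok], score + math.log(p)))
        if not pool:
            break
        beams = []
        # Draw up to beam_width hypotheses without replacement,
        # weighted by probability (shifted by the max for stability).
        for _ in range(min(beam_width, len(pool))):
            top = max(s for _, s in pool)
            weights = [math.exp(s - top) for _, s in pool]
            idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
            seq, score = pool.pop(idx)
            if seq[-1] == end:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    # Prefer completed sentences; fall back to unfinished hypotheses.
    results = finished or beams
    return max(results, key=lambda item: item[1])[0]


def toy_model(seq):
    """Hypothetical next-token table keyed on the last token only."""
    table = {
        "<s>": {"output": 0.6, "yield": 0.4},
        "output": {"rose": 0.7, "fell": 0.3},
        "yield": {"rose": 0.5, "fell": 0.5},
        "rose": {"</s>": 1.0},
        "fell": {"</s>": 1.0},
    }
    return table[seq[-1]]


sentence = random_beam_sample(toy_model, "<s>", "</s>", beam_width=2)
print(" ".join(sentence))
```

In this sketch, `beam_width=1` reduces to plain random sampling, while a width large enough to keep every candidate behaves like exhaustive search, so the width controls the trade-off between diversity and likelihood.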

CLC Number:

  • TP391.41
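
The abstract states that knowledge distillation is used to optimize the LSTM but gives no loss function. As a hedged illustration, the sketch below implements the widely used soft-target formulation of distillation: a weighted sum of the hard-label cross-entropy and a temperature-scaled KL divergence between teacher and student distributions. Whether the paper uses this exact form is an assumption; `distillation_loss`, `temperature`, and `alpha` are illustrative names and values.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target,
                      temperature=2.0, alpha=0.5):
    """Soft-target distillation loss for one prediction step.

    alpha weights the hard-label term; (1 - alpha) weights the
    teacher-matching term. Both values are illustrative assumptions.
    """
    # Cross-entropy of the student against the ground-truth token.
    hard = -math.log(softmax(student_logits)[target])
    # KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0.0)
    # The T^2 factor keeps gradient magnitudes comparable across T.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


# A student that agrees with the teacher pays only the hard-label cost.
print(distillation_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0], target=0))
# A student that disagrees with the teacher is penalized further.
print(distillation_loss([0.0, 1.0, 2.0], [2.0, 1.0, 0.0], target=0))
```

In a sequence model this loss would be averaged over the predicted tokens of each description; the single-step form is shown here only to keep the example short.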