Computer Science, 2025, Vol. 52, Issue (5): 227-234. doi: 10.11896/jsjkx.240400035

• Artificial Intelligence •

Intelligent Error Correction Model for Chinese Idioms Fused with Fixed-length Seq2Seq Network

HE Chunhui, GE Bin, ZHANG Chong, XU Hao   

  1. Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, China
  • Received: 2024-04-04 Revised: 2024-08-20 Online: 2025-05-15 Published: 2025-05-12
  • About author: HE Chunhui, born in 1991, master. His main research interests include information processing and artificial intelligence.
    GE Bin, born in 1979, Ph.D., researcher. His main research interests include big data analysis and information extraction.
  • Supported by:
    National Key Research and Development Program of China (2022YFB3103600).

Abstract: Four-character idioms are a special class of words that are widely used in Chinese. As Chinese error correction tasks have developed, intelligent error correction for Chinese idioms has become a research hotspot in natural language processing (NLP). To address the low accuracy of existing methods on this task, this paper proposes an intelligent error correction model for Chinese idioms fused with a fixed-length Seq2Seq network. At the bottom layer, the Seq2Seq network architecture and an attention mechanism are combined with a hybrid dataset construction method to train a Seq2Seq model with fixed input and output sequence lengths, which is used to correct errors in Chinese four-character idioms. Experimental results on a large public Chinese idiom error correction dataset show that the fixed-length Seq2Seq model outperforms existing methods and can correct three types of idiom errors: out-of-order characters, missing characters, and wrong characters. Its overall error correction accuracy reaches 91.3%, which is 11.73% higher than the best baseline model.
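The three error types named in the abstract can be illustrated with a small data-generation sketch. This is not the authors' hybrid dataset construction method, whose details the abstract does not give. It is a minimal illustration under stated assumptions: the function names, the `PAD` placeholder used to keep a missing-character input at length four, the inclusion of unchanged idioms as identity pairs, and the three sample idioms are all hypothetical choices made for this example.

```python
import random

PAD = "_"  # hypothetical placeholder; keeps every input exactly 4 characters long


def shuffle_error(idiom, rng):
    """Out-of-order error: reorder the characters into a different sequence."""
    if len(set(idiom)) < 2:  # nothing to reorder if all characters are identical
        return idiom
    chars = list(idiom)
    while True:
        rng.shuffle(chars)
        if "".join(chars) != idiom:
            return "".join(chars)


def missing_error(idiom, rng):
    """Missing-character error: blank out one position with PAD."""
    i = rng.randrange(len(idiom))
    return idiom[:i] + PAD + idiom[i + 1:]


def wrong_error(idiom, rng, vocab):
    """Wrong-character error: replace one character with a different one."""
    i = rng.randrange(len(idiom))
    candidates = [c for c in vocab if c != idiom[i]]
    return idiom[:i] + rng.choice(candidates) + idiom[i + 1:]


def build_pairs(idioms, seed=0):
    """Return (input, target) pairs mixing correct and corrupted idioms."""
    rng = random.Random(seed)
    vocab = sorted({c for idiom in idioms for c in idiom})
    pairs = []
    for idiom in idioms:
        pairs.append((idiom, idiom))  # assumption: correct idioms map to themselves
        pairs.append((shuffle_error(idiom, rng), idiom))
        pairs.append((missing_error(idiom, rng), idiom))
        pairs.append((wrong_error(idiom, rng, vocab), idiom))
    return pairs


if __name__ == "__main__":
    for source, target in build_pairs(["一心一意", "画蛇添足", "守株待兔"]):
        print(source, "->", target)
```

Because every input and every target has exactly four characters, pairs of this shape fit a sequence-to-sequence model with fixed input and output lengths, which is the setting the abstract describes.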

Key words: Idioms error correction, Fixed length Seq2Seq, BiGRU, Attention mechanism

CLC Number: TP391