Intelligent Error Correction Model for Chinese Idioms Fused with Fixed-length Seq2Seq Network

doi:10.11896/jsjkx.240400035

Abstract

Four-character idioms are a special class of words that are widely used in Chinese. As Chinese error correction has developed, intelligent error correction for Chinese idioms has become a research hotspot in natural language processing (NLP). To address the low accuracy of existing methods on this task, this paper proposes an intelligent error correction model for Chinese idioms fused with a fixed-length Seq2Seq network. At the bottom layer, the Seq2Seq architecture and an attention mechanism are combined with a hybrid dataset construction method to train a Seq2Seq model whose input and output sequences have a fixed length, which is then used to correct Chinese four-character idioms. Experimental results on a large public Chinese idiom error correction dataset show that the fixed-length Seq2Seq model outperforms existing methods and can correct three different types of idiom errors: out-of-order characters, missing characters, and wrong characters. Its comprehensive error correction accuracy reaches 91.3%, which is 11.73% higher than that of the best baseline model.
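The abstract names three error types and a fixed input and output length but does not describe how the hybrid dataset is built. The following is only a minimal sketch of one way such training pairs could be generated, not the authors' procedure. The function names, the `PAD` placeholder, the choice to right-pad after dropping a character, and the way replacement characters are sampled are all assumptions made for illustration.

```python
import random

IDIOM_LEN = 4   # fixed sequence length for four-character idioms
PAD = "□"       # hypothetical padding symbol (assumption, not from the paper)


def make_out_of_order(idiom, rng):
    """Swap two positions that hold different characters."""
    chars = list(idiom)
    pairs = [(i, j)
             for i in range(IDIOM_LEN)
             for j in range(i + 1, IDIOM_LEN)
             if chars[i] != chars[j]]
    if not pairs:               # all four characters identical: nothing to swap
        return idiom
    i, j = rng.choice(pairs)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


def make_missing(idiom, rng):
    """Drop one character and right-pad so the length stays fixed."""
    chars = list(idiom)
    del chars[rng.randrange(IDIOM_LEN)]
    chars.append(PAD)
    return "".join(chars)


def make_wrong(idiom, rng, vocab):
    """Replace one character with a different character from the vocabulary."""
    chars = list(idiom)
    pos = rng.randrange(IDIOM_LEN)
    candidates = [c for c in vocab if c != chars[pos]]
    chars[pos] = rng.choice(candidates)
    return "".join(chars)


def build_pairs(idioms, seed=0):
    """Return (error_type, noisy_input, target) triples mixing all three error types."""
    rng = random.Random(seed)
    idioms = [s for s in idioms if len(s) == IDIOM_LEN]
    # Sorted so that sampling is reproducible for a given seed.
    vocab = sorted({c for s in idioms for c in s})
    pairs = []
    for idiom in idioms:
        pairs.append(("out_of_order", make_out_of_order(idiom, rng), idiom))
        pairs.append(("missing", make_missing(idiom, rng), idiom))
        pairs.append(("wrong", make_wrong(idiom, rng, vocab), idiom))
    return pairs


for kind, noisy, target in build_pairs(["画蛇添足", "守株待兔"]):
    print(kind, noisy, "->", target)
```

Because every noisy input and every target has exactly four symbols, an encoder-decoder trained on such pairs never needs variable-length handling, which is the property the fixed-length Seq2Seq design relies on.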

Keywords: Idiom error correction, Fixed-length Seq2Seq, BiGRU, Attention mechanism

CLC Number: TP391

HE Chunhui, GE Bin, ZHANG Chong, XU Hao. Intelligent Error Correction Model for Chinese Idioms Fused with Fixed-length Seq2Seq Network [J]. Computer Science, 2025, 52(5): 227-234.


