Computer Science ›› 2026, Vol. 53 ›› Issue (4): 377-383. doi: 10.11896/jsjkx.250600032

• Artificial Intelligence •

LLM-augmented Training Framework with Cycle-Consistency Constraints

WU Qiaorui1, LUO Li2, ZHAO Cairong1   

  1. School of Computer Science and Technology, Tongji University, Shanghai 201804, China
    2. State Grid Hunan Electric Power Co., Ltd., Changsha Power Supply Branch, Changsha 410000, China
  • Received: 2025-06-06  Revised: 2025-07-19  Online: 2026-04-15  Published: 2026-04-08
  • About author: WU Qiaorui, born in 2001, postgraduate. His main research interests include large language models and model self-optimization.
    ZHAO Cairong, born in 1981, Ph.D., professor, Ph.D. supervisor, is a member of CCF (No. 24551D). His main research interests include artificial intelligence and computer vision.
  • Supported by:
    National Natural Science Foundation of China (62076184, 62473286) and Shanghai Natural Science Foundation (22ZR1466700).

Abstract: This paper proposes LACC (Large Language Model-Augmented Consistency-Constrained), a training framework designed to address key challenges in patent abstract generation: incomplete coverage of technical features, insufficient legal compliance, and inefficiency in edge deployment. LACC constructs a bidirectional, reversible task structure between abstract generation and claim expansion, incorporating a cycle-consistency constraint that jointly optimizes technical expression and legal formulation. On this basis, LACC integrates a controllable data augmentation strategy powered by large language models (LLMs) to automatically generate high-quality patent text pairs, and further introduces a dynamic verification mechanism to enhance the technical accuracy and regulatory reliability of the generated content. Experimental results on the Chinese patent dataset CPTD demonstrate that LACC achieves a ROUGE-L score of 56.74, outperforming the baseline by 8.99 percentage points, with significant improvements in the recurrence consistency score (RCS). The framework also supports efficient edge deployment, with inference latency within 420 ms and single-GPU memory usage under 4.5 GB. Overall, LACC offers a practical and scalable solution for downstream tasks such as patent drafting assistance, legal text generation, and intelligent intellectual property (IP) management, and shows strong potential for automating the full lifecycle of IP processing.
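To make the cycle-consistency constraint concrete, the sketch below pairs an abstract-generation model (claims to abstract) with a claim-expansion model (abstract to claims) and adds a round-trip reconstruction term on top of the two supervised losses. This is a minimal illustration under stated assumptions, not the authors' LACC implementation: the mT5 backbone, the cycle weight lam, and the hard round trip via generate are all choices made for this example.

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed backbone; LACC's actual models and sizes are not specified here.
tok = AutoTokenizer.from_pretrained("google/mt5-small")
gen_ca = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")  # claims -> abstract
gen_ac = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")  # abstract -> claims

def training_step(claims: str, abstract: str, lam: float = 0.5) -> torch.Tensor:
    c = tok(claims, return_tensors="pt", truncation=True)
    a = tok(abstract, return_tensors="pt", truncation=True)

    # Supervised losses in both task directions.
    l_gen = gen_ca(**c, labels=a.input_ids).loss  # abstract generation
    l_exp = gen_ac(**a, labels=c.input_ids).loss  # claim expansion

    # Cycle term: regenerate the claims from the model-produced abstract
    # and require them to reconstruct the original claims.
    with torch.no_grad():
        abs_ids = gen_ca.generate(**c, max_new_tokens=256)
    l_cyc = gen_ac(input_ids=abs_ids, labels=c.input_ids).loss

    return l_gen + l_exp + lam * l_cyc

In practice the cycle weight lam would be tuned, and stopping gradients through the generated abstract (the torch.no_grad block) is only one of several ways to propagate the consistency signal.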

Key words: Natural language processing, Large language model, Cycle consistency, Data augmentation, Knowledge distillation, Collaborative training

CLC Number: TP391