Computer Science ›› 2026, Vol. 53 ›› Issue (4): 1-23.doi: 10.11896/jsjkx.251000067
• Interdisciplinary Integration of Artificial Intelligence and Theoretical Computer Science • Previous Articles Next Articles
HU Junjie1, CHEN Yujie1, HU Yikun2, WEN Cheng1, CAO Jialun3, MA Zhi2, SU Jie1, SUN Weidi4, TIAN Cong2, QIN Shengchao1
CLC Number:
| [1]ROBINSON J A,VORONKOV A.Handbook of automated reasoning[M].Elsevier,2001. [2]HAVELUND K,LOWRY M,PENIX J.Formal analysis of aspace-craft controller using SPIN[J].IEEE Transactions on Software Engineering,2002,27(8):749-765. [3]SILVA L C,ALMEIDA H O,PERKUSICH A,et al.A model-based approach to support validation of medical cyber-physical systems[J].Sensors,2015,15(11):27625-27670. [4]APPICHARLA S.The Boeing 737 Max 8 Crashes:System-based Approach to Safety-A Different Perspective[J].Safety-Critical Systems eJournal,2023,2(1):93-100. [5]VAL O O,KOLADE T M,GBADEBO M O,et al.Strengthening cybersecurity measures for the defense of critical infrastructure in the United States[J].Asian Journal of Research in Computer Science,2024,17(11):25-45. [6]National Highway Traffic Safety Administration.Part 573 Safety Recall Report 23V-862:Autopilot Controls Insufficient to Prevent Misuse[EB/OL].https://static.nhtsa.gov/odi/rcl/2023/RCLRPT-23V862-5632.PDF. [7]DIJKSTRA E W.Notes on structured programming,Vol.70-WSK-03[R].Technische Hogeschool Eindhoven,1971. [8]HOARE C A R.An axiomatic basis for computer programming[J].Communications of the ACM,1969,12(10):576-580. [9]KLEIN G,ELPHINSTONE K,HEISER G,et al.seL4:Formal verification of an OS kernel[C]//Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles.2009:207-220. [10]JIANG A Q,WELLECK S,ZHOU J P,et al.Draft,sketch,and prove:Guiding formal theorem provers with informal proofs[J].arXiv:2210.12283,2022. [11]CHEN L,GU J,HUANG L,et al.Seed-prover:Deep and broad reasoning for automated theorem proving[J].arXiv:2507.23726,2025. [12]WANG M,TANG Y,WANG J,et al.Premise selection for theorem proving by deep graph embedding[C]//Advances in Neural Information Processing Systems.2017. [13]POLU S,SUTSKEVER I.Generative language modeling for automated theorem proving[J].arXiv:2009.03393,2020. [14]LAMPLE G,LACROIX T,LACHAUX M A,et al.Hypertree proof search for neural theorem proving[J].Advances in Neural Information Processing Systems,2022,35:26337-26349. [15]BROWN T,MANN B,RYDER N,et al.Language models are few-shot learners[J].Advances in Neural Information Proces-sing Systems,2020,33:1877-1901. [16]LEWIS P,PEREZ E,PIKTUS A,et al.Retrieval-augmentedgeneration for knowledge-intensive nlp tasks[J].Advances in Neural Information Processing Systems,2020,33:9459-9474. [17]RAFFEL C,SHAZEER N,ROBERTS A,et al.Exploring the limits of transfer learning with a unified text-to-text transformer[J].Journal of Machine Learning Research,2020,21(140):1-67. [18]OUYANG L,WU J,JIANG X,et al.Training language models to follow instructions with human feedback[J].Advances in Neural Information Processing Systems,2022,35:27730-27744. [19]WOHLIN C.Guidelines for snowballing in systematic literature studies and a replication in software engineering[C]//Procee-dings of the 18th International Conference on Evaluation and Assessment in Software Engineering.2014:1-10. [20]HUTH M,RYAN M.Logic in Computer Science:Modelling and reasoning about systems[M].Cambridge University Press,2004. [21]WOODCOCK J,LARSEN P G,BICARREGUI J,et al.Formal methods:Practice and experience[J].ACM Computing Surveys,2009,41(4):1-36. [22]SIMPSON M S,BARUA R K.MemSafe:ensuring the spatial and temporal memory safety of C at runtime[J].Software:Practice and Experience,2013,43(1):93-128. [23]LEINO K R M.Dafny:An automatic program verifier for functional correctness[C]//International Conference on Logic for Programming Artificial Intelligence and Reasoning.Berlin:Springer,2010:348-370. [24]FILLIÂTRE J C,PASKEVICH A.Why3-where programsmeet provers[C]//European Symposium on Programming.Berlin:Springer,2013:125-128. [25]HUET G,KAHN G,PAULIN-MOHRING C.The coq proofassistant a tutorial[J].Rapport Technique,1997,178:113. [26]NIPKOW T,WENZEL l M,PAULSON L C.Isabelle/HOL:a proof assistant for higher-order logic[M].Berlin:Springer,2002. [27]DE MOURA L,KONG S,AVIGAD J,et al.The Lean theorem prover(system description)[C]//International Conference on Automated Deduction.Cham:Springer,2015:378-388. [28]TURING A M.On computable numbers,with an application to the Entscheidungs problcm[C]//Proceedings of the London Mathematical Society.1937:230-265. [29]HALES T,ADAMS M,BAUER G,et al.A formal proof of the Kepler conjecture[J].arXiv:1501.02155,2015. [30]LIN X,CAO Q,HUANG Y,et al.FVEL:Interactive formalverification environment with large language models via theorem proving[J].Advances in Neural Information Processing Systems,2024,37:54932-54946. [31]DONG K,MAHANKALI A,MA T.Formal theorem proving by rewarding llms to decompose proofs hierarchically[J].arXiv:2411.01829,2024. [32]THOMPSON K,SAAVEDRA N,CARROTT P,et al.Rango:Adaptive retrieval-augmented proving for automated software verification[J].arXiv:2412.14063,2024. [33]KOZYREV A,SOLOVEV G,KHRAMOV N,et al.CoqPilot,a plugin for LLM-based generation of proofs[C]//Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering.2024:2382-2385. [34]XIN H,GUO D,SHAO Z,et al.Deepseek-prover:Advancingtheorem proving in llms through large-scale synthetic data[J].arXiv:2405.14333,2024. [35]XIN H,REN Z Z,SONG J,et al.Deepseek-prover-v1.5:Harnessing proof assistant feedback for reinforcement learning and monte-carlo tree search[J].arXiv:2408.08152,2024. [36]REN Z Z,SHAO Z,SONG J,et al.Deepseek-prover-v2:Advancing formal mathematical reasoning via reinforcement learning for subgoal decomposition[J].arXiv:2504.21801,2025. [37]WANG H,UNSAL M,LIN X,et al.Kimina-prover preview:Towards large formal reasoning models with reinforcement learning[J].arXiv:2504.11354,2025. [38]XIN R,XI C,YANG J,et al.Bfs-prover:Scalable best-first tree search for llm-based automatic theorem proving[J].arXiv:2502.03438,2025. [39]YANG K,SWOPE A,GU A,et al.Leandojo:Theorem proving with retrieval-augmented language models[J].Advances in Neural Information Processing Systems,2023,36:21573-21612. [40]YING H,ZHANG S,LI L,et al.Internlm-math:Open mathlarge language models toward verifiable reasoning[J].arXiv:2402.06332,2024. [41]WANG R,ZHANG J,JIA Y,et al.Theoremllama:Transfor-ming general-purpose llms into lean4 experts[J].arXiv:2407.03203,2024. [42]LIN H,SUN Z,WELLECK S,et al.Lean-star:Learning to interleave thinking and proving[J].arXiv:2407.10040,2024. [43]WU Z,WANG J,LIN D,et al.Lean-github:Compiling github lean repositories for a versatile lean prover[J].arXiv:2407.17227,2024. [44]GLOECKLE F,LIMPERG J,SYNNAEVE G,et al.ABEL:Sample efficient online reinforcement learning for neural theorem proving[C]//The 4th Workshop on Mathematical Reaso-ning and AI at NeurIPS’24.2024. [45]WU Z,HUANG S,ZHOU Z,et al.Internlm2.5-stepprover:Advancing automated theorem proving via expert iteration on large-scale lean problems[J].arXiv:2410.15700,2024. [46]WU S,LU S,GONG Y,et al.Alchemy:Amplifying theorem-proving capability through symbolic mutation[J].arXiv:2410.15748,2024. [47]LAMONT S,WALDER C,DEZFOULI A,et al.3D-Prover:Diversity Driven Theorem Proving With Determinantal Point Processes[J].arXiv:2410.11133,2024. [48]LI Y,DU D,SONG L,et al.Hunyuanprover:A scalable datasynthesis framework and guided tree search for automated theorem proving[J].arXiv:2412.20735,2024. [49]DONG K,MA T.Stp:Self-play llm theorem provers with iterative conjecturing and proving[J].arXiv:2502.00212,2025. [50]LIN Y,TANG S,LYU B,et al.Goedel-prover:A frontier model for open-source automated theorem proving[J].arXiv:2502.07640,2025. [51]ZHANG J,WANG Q,JI X,et al.Leanabell-prover:Posttraining scaling in formal reasoning[J].arXiv:2504.06122,2025. [52]SHAN H,XIONG B,QI N,et al.FrameProver:Leveraging Formal Frameworks in Building Proofs for Enhanced Inference[C]//International Conference on Intelligent Computing.Singapore:Springer,2025:188-200. [53]JI X,LIU Y,WANG Q,et al.Leanabell-prover-v2:Verifier-integrated reasoning for formal theorem proving via reinforcement learning[J].arXiv:2507.08649,2025. [54]LIN Y,TANG S,LYU B,et al.Goedel-prover-v2:Scaling formal theorem proving with scaffolded data synthesis and self-correction[J].arXiv:2508.03613,2025. [55]XIN R,ZHENG Z,NIE Y,et al.Scaling up Multi-Turn Off-Po-licy RL and Multi-Agent Tree Search for LLM Step-Provers[J].arXiv:2509.06493,2025. [56]LU M,DELAWARE B,ZHANG T.Proof automation withlarge language models[C]//Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering.2024:1509-1520. [57]QIN J X,DU A,ZHANG D F,et al.Can Large Language Mo-dels Verify System Software? A Case Study Using FSCQ as a Benchmark[C]//Workshop on Hot Topics in Operating Systems(HOTOS’25).New York:ACM,2025. [58]LIU X P,LIU H Z,YI X D,et al.LLM-Enhanced Theorem Proving with Term Explanation and Tactic Parameter Repair[C]//15th Asia-Pacific Symposium on Internetware(Internetware’24).New York:ACM,2024. [59]THAKUR A,LEE J,TSOUKALAS G,et al.CLEVER:A Curated Benchmark for Formally Verified Code Generation[J].arXiv:2505.13938,2025. [60]THAKUR A,WEN Y,CHAUDHURI S.A language-agent approach to formal theorem-proving[EB/OL].https://openreview.net/pdf?id=XCMbagV0No. [61]LIU H,SUN J,LI Z,et al.ProofAug:Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis[C]//Forty-second International Conference on Machine Learning.2025. [62]KUMARAPPAN A,TIWARI M,SONG P,et al.LeanAgent:Lifelong Learning for Formal Theorem Proving[J].arXiv:2410.06209,2024. [63]CAI Y F,HOU Z,SANAN D,et al.Automated Program Refinement:Guide and Verify Code Large Language Model with Refinement Calculus[C]//Proceedings of the ACM on Programming Languages.2025:2057-2089. [64]TAHAT A,HARDIN D,PETZ A,et al.Proof repair utilizing large language models:a case study on the Copland remote attestation proofbase[C]//International Conference on Bridging the Gap between AI and Reality.Cham:Springer,2024:145-166. [65]ANIVA L,SUN C,MIRANDA B,et al.Pantograph:A machine-to-machine interaction interface for advanced theorem proving,high level reasoning,and data extraction in lean 4[C]//International Conference on Tools and Algorithms for the Construction and Analysis of Systems.Cham:Springer,2025:104-123. [66]HAN S,HUR T,HUR Y,et al.Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques[C]//Intelligent Computing-Proceedings of the Computing Conference.Cham:Springer,2025:205-222. [67]DOAN L,NGUYEN T V.AI-Assisted Autoformalization ofCombinatorics Problems in Proof Assistants[C]//2025 IEEE/ACM 47th International Conference on Software Engineering:New Ideas and Emerging Results(ICSE-NIER).IEEE,2025:1-5. [68]KRIPNER M,ŠUSTR M,STRAKA M.LeanTree:Accelerating White-Box Proof Search with Factorized States in Lean 4[J].arXiv:2507.14722,2025. [69]BABA K,LIU C,KURITA S,et al.Prover Agent:An Agent-based Framework for Formal Mathematical Proofs[J].arXiv:2506.19923,2025. [70]CAO C,LI M,DAI J,et al.Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving[J].arXiv:2506.17104,2025. [71]ASHER J.LeanExplore:A search engine for Lean 4 declarations[J].arXiv:2506.11085,2025. [72]PATEL M,BHATTACHARYYA R,LU T,et al.LeanTutor:A Formally-Verified AI Tutor for Mathematical Proofs[J].arXiv:2506.08321,2025. [73]KIRTANIA S.Activation Steering in Neural Theorem Provers[J].arXiv:2502.15507,2025. [74]YANG T,YAN M,ZHAO H,et al.LemmaHead:RAG Assisted Proof Generation Using Large Language Models[J].arXiv:2501.15797,2025. [75]BUALI M,HOEHNDORF R.Towards Automated FunctionalEquation Proving:A Benchmark Dataset and A Domain-Specific In-Context Agent[J].arXiv:2407.14521,2024. [76]SONG P,YANG K,ANANDKUMAR A.Lean copilot:Large language models as copilots for theorem proving in lean[J].ar-Xiv:2404.12534,2024. [77]KASIBATLA S R,AGARWAL A,BRUN Y,et al.Cobble-stone:Iterative Automation for Formal Verification[J].arXiv:2410.19940,2024. [78]ZHOU J P,STAATS C,LI W,et al.Don’t Trust:Verify-Grounding LLM Quantitative Reasoning with Autoformalization[J].arXiv:2403.18120,2024. [79]GADGIL S,TADIPATRI A R,AGRAWAL A,et al.Towards automating formalisation of theorem statements using large language models[C]//36th Conference on Neural Information Processing Systems(NeurIPS 2022) Workshop on MATH-AI.2022. [80]MIRANDA B,ZHOU Z,NIE A,et al.VeriBench:End-to-EndFormal Verification Benchmark for AI Code Generation in Lean 4[C]//2nd AI for Math Workshop@ ICML 2025.2025. [81]TSOUKALAS G,LEE J,JENNINGS J,et al.Putnambench:A multilingual competition-mathematics benchmark for formal theorem-proving[C]//AI for Math Workshop@ ICML 2024.2024. [82]CAO C,SONG L,LI Z,et al.Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models[J].arXiv:2506.11487,2025. [83]ZHOU Y,ZHAO J,ZHANG Y,et al.Solving formal math problems by decomposition and iterative reflection[J].arXiv:2507.15225,2025. [84]CHEN B,BABIKIAN A A,FENG S,et al.LLM-based Satisfiability Checking of String Requirements by Consistent Data and Checker Generation[J].arXiv:2506.16639,2025. [85]PAN L,ALBALAK A,WANG X,et al.Logic-LM:Empowering Large Language Models with Symbolic Solvers for Faithful Lo-gical Reasoning[J].arXiv:2305.12295,2023. [86]YE X,CHEN Q,DILLIG I,et al.Satlm:Satisfiability-aided language models using declarative prompting[J].Advances in Neural Information Processing Systems,2023,36:45548-45580. [87]KARIA R,DOBHAL D,BRAMBLETT D,et al.Can LLMs translate SATisfactorily? Assessing LLMs in Generating and Interpreting Formal Specifications[J/OL].https://pulkitverma.net/assets/pdf/kdbvs_aia24/kdbvs_aia24.pdf. [88]SZEIDER S.MCP-Solver:Integrating Language Models withConstraint Programming Systems[J].arXiv:2501.00539,2025. [89]HAZRA R,VENTURATO G,MARTIRES P Z D,et al.Canlarge language models reason? A characterization via 3-SAT[J].arXiv:2408.07215,2024. [90]MARINO R.Fast Analysis of the OpenAI O1-Preview Model in Solving Random K-SAT Problem:Does the LLM Solve the Problem Itself or Call an External SAT Solver?[J].arXiv:2409.11232,2024. [91]ALFEZEA Y,AL-YOUSEFI A,AL-JARRAH M,et al.Analyzing the Impact of Fine-Tuning a Large Language Model on the 3-SAT Problem[C]//2025 1st International Conference on Computational Intelligence Approaches and Applications(ICCIAA).2025:1-8. [92]SCHIDLER A,SZEIDER S.Extracting problem structure with LLMs for optimized SAT local search[J].arXiv:2501.14630,2025. [93]CLARKE E,BIERE A,RAIMI R,et al.Bounded model chec-king using satisfiability solving[J].Formal methods in system design,2001,19(1):7-34. [94]JANßEN C,RICHTER C,WEHRHEIM H.Can ChatGPT support software verification?[C]//International Conference on Fundamental Approaches to Software Engineering.Cham:Springer,2024:266-279. [95]KAMATH A,MOHAMMED N,SENTHILNATHAN A,et al.Leveraging LLMs for Program Verification[C]//2024 Formal Methods in Computer-Aided Design(FMCAD).2024:107-118. [96]HEIZMANN M,HOENICKE J,PODELSKI A.Refinement of trace abstraction[C]//International Static Analysis Symposium.Berlin:Springer,2009:69-85. [97]WEN C,CAO J,SU J,et al.Enchanting program specification synthesis by large language models using static analysis and program verification[C]//International Conference on Computer Aided Verification.Cham:Springer,2024:302-328. [98]LIU C,WU X,FENG Y,et al.Towards general loop invariant generation:A benchmark of programs with memory manipulation[J].Advances in Neural Information Processing Systems,2024,37:129120-129145. [99]MA L,LIU S,LI Y,et al.Specgen:Automated generation offormal program specifications via large language models[J].arXiv:2401.08807,2024. [100]LIU R,LI G,CHEN M,et al.Enhancing Automated Loop Invariant Generation for Complex Programs with Large Language Models[J].arXiv:2412.10483,2024. [101]ZHONG S,ZHU J,TIAN Y,et al.RAG-Verus:Repository-Level Program Verification with LLMs using Retrieval Augmented Generation[J].arXiv:2502.05344,2025. [102]YANG C,LI X,MISU M R H,et al.AutoVerus:Automatedproof generation for Rust code[J].arXiv:2409.13082,2024. [103]CHAKRABORTY M,PIRKELBAUER P,YI Q.FormalSpec-Cpp:A Dataset of C++ Formal Specifications created using LLMs[C]//2025 IEEE/ACM 22nd International Conference on Mining Software Repositories(MSR).IEEE,2025:758-762. [104]CAO J,LU Y,LI M,et al.From Informal to Formal--Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs[J].arXiv:2501.16207,2025. [105]AGGARWAL P,PARNO B,WELLECK S.AlphaVerus:Bootstrapping formally verified code generation through self-improving translation and treefinement[J].arXiv:2412.06176,2024. [106]LI Y,PARSERT J,POLGREEN E.Guiding enumerative program synthesis with large language models[C]//International Conference on Computer Aided Verification.Cham:Springer,2024:280-301. [107]MISU M R H,LOPES C V,MA I,et al.Towards ai-assisted synthesis of verified dafny methods[C]//Proceedings of the ACM on Software Engineering.2024:812-835. [108]LOUGHRIDGE C,SUN Q,AHRENBACH S,et al.Dafny-Bench:A benchmark for formal software verification[J].arXiv:2406.08467,2024. [109]SUN C,AGASHE V,CHAKRABORTY S,et al.ClassInvGen:Class Invariant Synthesis using Large Language Models[J].arXiv:2502.18917,2025. [110]MUGNIER E,GONZALEZ E A,POLIKARPOVA N,et al.Laurel:Unblocking Automated Verification with Large Language Models[C]//Proceedings of the ACM on Programming Languages.2025:1519-1545. [111]JHA M,WAN J X,ZHANG H,et al.PREFACE - A Reinforcement Learning Framework for Code Verification via LLM Prompt Repair[C]//Great Lakes Symposium on VLSI 2025(GLSVLSI’25).New York:ACM,2025. [112]CAO W,WU G,XU T,et al.Clause2Inv:A Generate-Combine-Check Framework for Loop Invariant Inference[C]//Procee-dings of the ACM on Software Engineering.2025:1009-1030. [113]TANEJA J,LAIRD A,YAN C,et al.Llm-vectorizer:Llm-based verified loop vectorizer[C]//Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization.2025:137-149. [114]RUBIO-MEDRANO C E,KOTAK A,WANG W,et al.Pairing human and artificial intelligence:Enforcing access control policies with llms and formal specifications[C]//Proceedings of the 29th ACM Symposium on Access Control Models and Technologies.2024:105-116. [115]SILVA Á F,MENDES A,FERREIRA J F.Leveraging largelanguage models to boost Dafny’s developers productivity[C]//Proceedings of the 2024 IEEE/ACM 12th International Confe-rence on Formal Methods in Software Engineering(FormaliSE).2024:138-142. [116]LI Y C,ZETZSCHE S,SOMAYYAJULA S.Dafny as Verification-Aware Intermediate Language for Code Generation[C]//Proceedings of Dafny Workshop 2025.Denver:Dafny Workshop Organizing Committee,2025:1-8. [117]YAN C,CHE F,HUANG X,et al.Re:Form--Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs:A Preliminary Study on Dafny[J].arXiv:2507.16331,2025. [118]WU V,MENDES A,ABREU A.Specification-Guided Repair of Arithmetic Errors in Dafny Programs using LLMs[J].arXiv:2507.03659,2025. [119]DENG X,ZHONG S,VENERIS A,et al.VerifyThisBench:Generating Code,Specifications,and Proofs All at Once[J].arXiv:2505.19271,2025. [120]LI S,JIANG J,ZHAO T,et al.OSVBench:Benchmarking LLMs on Specification Generation Tasks for Operating System Verification[J].arXiv:2504.20964,2025. [121]POESIA G,LOUGHRIDGE C,AMIN N.dafny-annotator:Ai-assisted verification of dafny programs[J].arXiv:2411.15143,2024. [122]REGO M,FAN W,HU X,et al.Evaluating the Ability of GPT-4o to Generate Verifiable Specifications in VeriFast[C]//2025 IEEE/ACM Second International Conference on AI Foundation Models and Software Engineering(Forge).IEEE,2025:246-251. [123]MUKHERJEE P,DELAWARE B.Towards automated verification of llm-synthesized c programs[J].arXiv:2410.14835,2024. [124]SUN C,SHENG Y,PADON O,et al.Clover:Clo sed-loop ver ifiable code generation[C]//International Symposium on AI Verification.Cham:Springer,2024:134-155. [125]YAO J,ZHOU Z,CHEN W,et al.Leveraging large language models for automated proof synthesis in rust[J].arXiv:2311.03739,2023. [126]KING D,KOUTAVAS V,KOVÁCS L.Llm-Based Generation of Weakest Preconditions and Precise Array Invariants[C]//2025 IEEE/ACM 13th International Conference on Formal Methods in Software Engineering(FormaliSE).IEEE,2025:1-5. [127]SEVENHUIJSEN M,ETEMADI K,NYBERG M.VeCoGen:Automating Generation of Formally Verified C Code with Large Language Models[C]//2025 IEEE/ACM 13th International Conference on Formal Methods in Software Engineering(FormaliSE).IEEE,2025:101-112. [128]LAHIRI S K.Evaluating llm-driven user-intent formalization for verification-aware languages[C]//Conference on Formal Methods in Computer-Aided Design-FMCAD 2024.2024:142. [129]BECKERT B,KLAMROTH J,PFEIFER W,et al.Towardscombining the cognitive abilities of large language models with the rigor of deductive progam verification[C]//International Symposium on Leveraging Applications of Formal Methods.Cham:Springer,2024:242-257. [130]PASCOAL F J,TRIGO E,ABREU R.Automatic Generation of Loop Invariants in Dafny with Large Language Models[C]//International Conference on Fundamentals of Software Enginee-ring.Cham:Springer,2025:138-154. [131]HUANG D,DHARIWAL P,SONG D,et al.Gamepad:A lear-ning environment for theorem proving[J].arXiv:1806.00608,2018. [132]YANG K,DENG J.Learning to prove theorems via interacting with proof assistants[C]//International Conference on Machine Learning.PMLR,2019:6984-6994. [133]SANCHEZ-STERN A,ALHESSI Y,SAUL L,et al.Generating correctness proofs with neural networks[C]//Proceedings of the 4th ACM SIGPLAN International Workshop on Machine Learning and Programming Languages.2020:1-10. [134]CHEN M,TWOREK J,JUN H,et al.Evaluating large language models trained on code[J].arXiv:2107.03374,2021. [135]REICHEL T,HENDERSON R,TOUCHET A,et al.Proof repair infrastructure for supervised models:Building a large proof repair dataset[C]//14th International Conference on Interactive Theorem Proving(ITP 2023).Schloss Dagstuhl-Leibniz-Zentrum für Informatik,2023:1-26:20. [136]RUTE J,OLŠÁK M,BLAAUWBROEK L,et al.Graph2Tac:Learning hierarchical representations of math concepts in theorem proving[EB/OL].https://openreview.net/pdf?id=XyB4VvF01X. [137]FLORATH A.Enhancing formal theorem proving:a compre-hensive dataset for training AI models on Coq code[J].arXiv:2403.12627,2024. [138]PROOFBASE R A,TAHAT A,HARDIN D,et al.Proof Repair Utilizing Large Language Models:A Case Study on the Copland[C]//Bridging the Gap Between AI and Reality:Second International Conference.Springer,2024:145. [139]LI W,YU L,WU Y,et al.Isarstep:a benchmark for high-level mathematical reasoning[J].arXiv:2006.09265,2020. [140]FIRST E,RABE M N,RINGER T,et al.Baldur:Whole-proof generation and repair with large language models[C]//Procee-dings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering.2023:1229-1241. [141]JIANG A Q,LI W,HAN J M,et al.Lisa:Language models of isabelle proofs[C]//6th Conference on Artificial Intelligence and Theorem Proving.2021:378-392. [142]ZHENG K,HAN J M,POLU S.Minif2f:a cross-system benchmark for formal olympiad-level mathematics[J].arXiv:2109.00110,2021. [143]MIKUŁA M,TWORKOWSKI S,ANTONIAK S,et al.Mag-nushammer:A transformer-based approach to premise selection[J].arXiv:2303.04488,2023. [144]ZHANG L,LU S,DUAN N.Selene:Pioneering automated proof in software verification[J].arXiv:2401.07663,2024. [145]HAN J M,RUTE J,WU Y,et al.Proof artifact co-training for theorem proving with language models[J].arXiv:2102.06203,2021. [146]XIONG J,SHEN J,YUAN Y,et al.Trigo:Benchmarking formal mathematical proof reduction for generative language mo-dels[J].arXiv:2310.10180,2023. [147]AZERBAYEV Z,PIOTROWSKI B,SCHOELKOPF H,et al.Proofnet:Autoformalizing and formally proving undergraduate-level mathematics[J].arXiv:2302.12433,2023. [148]LIU C,SHEN J,XIN H,et al.Fimo:A challenge formal dataset for automated theorem proving[J].arXiv:2309.04295,2023. [149]NESTEROV V,KAPUSHEV Y,BURTSEV M.Lean4trace:Data augmentation for neural theorem proving in Lean[C]//AI for Math Workshop@ ICML 2024.2024. [150]YING H,WU Z,GENG Y,et al.Lean workbook:A large-scale lean problem set formalized from natural language math problems[J].Advances in Neural Information Processing Systems,2024,37:105848-105863. [151]LU J,WAN Y,LIU Z,et al.Process-driven autoformalization in lean 4[J].arXiv:2406.01940,2024. [152]HU J,ZHU T,WELLECK S.miniCTX:Neural theorem proving with(long-) contexts[J].arXiv:2408.03350,2024. [153]ONDA N,KASAURA K,ORIIKE Y,et al.LeanConjecturer:Automatic Generation of Mathematical Conjectures for Theorem Proving[J].arXiv:2506.22005,2025. [154]LIU J,LIN X,BAYER J,et al.CombiBench:BenchmarkingLLM capability for combinatorial mathematics[J].arXiv:2505.03171,2025. [155]ZHAO H,GENG Y,TANG S,et al.Ineq-Comp:Benchmarking human-intuitive compositional reasoning in automated theorem proving on inequalities[J].arXiv:2505.12680,2025. [156]SUN J,TANG Y,LI A,et al.Enumerate-Conjecture-Prove:Formally Solving Answer-Construction Problems in Math Competitions[J].arXiv:2505.18492,2025. [157]XIN H,LI L,JIN X,et al.APE-Bench I:Towards File-level Automated Proof Engineering of Formal Math Libraries[J].arXiv:2504.19110,2025. |
| [1] | CHEN Hongxiu, ZENG Xia, LIU Zhiming, ZHAO Hengjun. Automatic Theorem Proving Based on Pre-trained Language Models and Unification [J]. Computer Science, 2026, 53(4): 40-47. |
| [2] | LIU Suyi, LIU Qi, GAO Weibo. Agent4Stu:Efficient LLM-based Student Answer Behavior Simulation Agent [J]. Computer Science, 2026, 53(4): 347-355. |
| [3] | XU Cheng, LIU Yuxuan, WANG Xin, ZHANG Cheng, YAO Dengfeng, YUAN Jiazheng. Review of Speech Disorder Assessment Methods Driven by Large Language Models [J]. Computer Science, 2026, 53(3): 307-320. |
| [4] | LI Wenli, FENG Xiaonian, QIAN Tieyun. Few-shot Continuous Toxicity Detection Based on Large Language Model Augmentation [J]. Computer Science, 2026, 53(3): 321-330. |
| [5] | LIU Lilong, LIU Guoming, QI Baoyuan, DENG Xueshan, XUE Dizhan, QIAN Shengsheng. Efficient Inference Techniques of Large Models in Real-world Applications:A Comprehensive Survey [J]. Computer Science, 2026, 53(1): 12-28. |
| [6] | SHAO Xinyi, ZHU Jingwei, ZHANG Liang. LLM-based Business Process Adaptation Method to Respond Long-tailed Changes [J]. Computer Science, 2026, 53(1): 29-38. |
| [7] | LI Maolin, LIN Jiajie, YANG Zhenguo. Confidence-guided Prompt Learning for Multimodal Aspect-level Sentiment Analysis [J]. Computer Science, 2025, 52(7): 241-247. |
| [8] | CHEN Jinyin, XI Changkun, ZHENG Haibin, GAO Ming, ZHANG Tianxin. Survey of Security Research on Multimodal Large Language Models [J]. Computer Science, 2025, 52(7): 315-341. |
| [9] | LI Jiangxu, CHEN Zemao, ZHANG Liqiang. Lightweight Authentication and Key Agreement Protocol for Cloud-assisted Smart Home Communication [J]. Computer Science, 2025, 52(7): 342-352. |
| [10] | LI Bo, MO Xian. Application of Large Language Models in Recommendation System [J]. Computer Science, 2025, 52(6A): 240400097-7. |
| [11] | HU Caishun. Study on Named Entity Recognition Algorithms in Audit Domain Based on Large LanguageModels [J]. Computer Science, 2025, 52(6A): 240700190-4. |
| [12] | GAO Hongkui, MA Ruixiang, BAO Qihao, XIA Shaojie, QU Chongxiao. Research on Hybrid Retrieval-augmented Dual-tower Model [J]. Computer Science, 2025, 52(6): 324-329. |
| [13] | WANG Zhiyi, HU Jun, XU Heng. Transition and Verification Method from RSML-e to Lustre Model for Flight Mode Transition [J]. Computer Science, 2025, 52(12): 48-59. |
| [14] | LI Hao, YANG Yumeng, ZHAO Boyang, ZHENG Puqi, LIN Hongfei. Adverse Drug Reaction Relationship Extraction Based on Chain of Thought Enhancement UnderHigh and Low Resources [J]. Computer Science, 2025, 52(12): 224-230. |
| [15] | HUANG Haixin, XU Chenglong, FU Yao. Research on Structured Pruning Algorithm Based on Information Fusion [J]. Computer Science, 2025, 52(11A): 241000041-6. |
|
||