Computer Science (计算机科学), 2026, Vol. 53, Issue (4): 337-346. doi: 10.11896/jsjkx.251000136

• Artificial Intelligence •

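Category-Theoretic Modeling of Natural Language Semantic Representation: A Systematic Review and Analysis of Compositional Mechanisms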

李奕丹1, 崔建英1, 熊明辉2   

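  1. 1 Institute of Logic and Cognition, Department of Philosophy, Sun Yat-sen University, Guangzhou 510275, China
    2 Digital Rule of Law Laboratory, Zhejiang University, Hangzhou 310008, China
  • Received: 2025-10-28 Revised: 2026-01-26 Published: 2026-04-15 Online: 2026-04-08
  • Corresponding author: CUI Jianying (cuijiany@mail.sysu.edu.cn)
  • About the author: (liyd66@mail2.sysu.edu.cn)
  • Supported by:
    Major Program of the National Social Science Foundation of China (19ZDA042)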

Category-Theoretic Semantic Representation: Systematic Review and Compositional Mechanism Analysis

LI Yidan1, CUI Jianying1, XIONG Minghui2   

  1. 1 Institute of Logic and Cognition, Sun Yat-sen University, Guangzhou 510275, China
    2 Digital Rule of Law Laboratory, Zhejiang University, Hangzhou 310008, China
  • Received: 2025-10-28 Revised: 2026-01-26 Published: 2026-04-15 Online: 2026-04-08
  • About author: LI Yidan, born in 1994, Ph.D. candidate. Her main research interests include formal semantics, category theory, artificial intelligence logic and natural language processing.
    CUI Jianying, born in 1975, Ph.D., associate professor, Ph.D. supervisor. Her main research interests include formal argumentation theory, artificial intelligence logic and natural language processing.
  • Supported by:
    Major Program of National Social Science Foundation of China (19ZDA042).

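Abstract: Semantic representation is a core challenge in natural language processing (NLP). Current paradigms for semantic representation fall into two classes: symbolic approaches centered on logical forms, and connectionist approaches based on distributed representations. Although the latter have achieved notable success in engineering applications, they increasingly expose theoretical limitations, referred to as the “compositionality crisis”, in capturing the compositional structure of language, supporting structured reasoning, and achieving interpretable and generalizable semantic modeling. Among existing approaches, category-theoretic compositional distributional semantic models, with their rigorous algebraic structure and type-driven compositional paradigm, offer a highly promising mathematical route to unifying symbolic syntactic structure with distributional semantic content. Accordingly, from the mathematical perspective of category theory and along the main line “category (theoretical framework) - composition (core mechanism) - quantum (computational paradigm)”, this paper systematically reviews and assesses category-theoretic paradigms for natural language semantic representation and their evolution. Unlike existing surveys organized by model or task, it focuses on the semantic composition mechanism itself. It first classifies and compares sentence-level semantic representation models from a compositional viewpoint, analyzes the representative routes taken by distributed semantic methods in compositional modeling together with their inherent limitations, and then traces the internal logic and research trends of their development toward compositional distributional semantics. On this basis, it details the categorical compositional semantic framework that uses string diagrams as its calculus, and uses representative models (such as DisCoCat and DisCoCirc) to illustrate the formal characteristics of such frameworks and their directions of extension in the quantum computing setting, providing a unified theoretical perspective for understanding and evaluating how symbolic, connectionist, and quantum computing methods can be integrated in natural language processing.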

Key words: Category theory, String diagrams, Compositional semantics, Quantum computing, Interpretability

Abstract: Semantic representation is a central challenge in natural language processing (NLP). Existing approaches can be broadly categorized into two paradigms: symbolic and connectionist methods. Although the latter have achieved remarkable practical success, they suffer from theoretical limitations in compositional modeling and semantic interpretability, commonly referred to as the “compositionality crisis”. Among existing methods, categorical compositional distributional semantics provides a principled mathematical framework for unifying symbolic syntactic structure with distributed semantics via type-driven composition. From a categorical perspective, this paper surveys category-theoretic approaches to semantic representation along the conceptual line of “category theory - composition - quantum computation”. Unlike surveys organized by models or tasks, it focuses on semantic composition mechanisms: it compares sentence-level models from a compositional viewpoint, analyzes the limitations of distributed approaches, and outlines the theoretical shift toward compositional distributional semantics. Building on this, it presents string diagram-based frameworks such as DisCoCat and DisCoCirc, clarifies their formal properties and quantum extensions, and offers a unified view of symbolic, connectionist, and quantum semantics.

Key words: Category theory, String diagrams, Compositional semantics, Quantum computing, Interpretability

CLC Number:

  • B819