Computer Science ›› 2026, Vol. 53 ›› Issue (5): 119-128. doi: 10.11896/jsjkx.250600019
李敏波1, 王少华2, 吴大臻1
LI Minbo1, WANG Shaohua2, WU Dazhen1
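Abstract: Defense-industry research institutes manage data under security classification, by tier and by partition, which produces data resource silos and makes intelligent retrieval and knowledge reuse of data resources difficult. To address these problems, this paper proposes a governance scheme for multi-source heterogeneous enterprise data resources and data assets. An enterprise dataspace property graph model maps the associations between data resources onto graph nodes, and a data resource knowledge graph that incorporates the BOM tree structure is constructed, covering the hierarchical relationships, attribute information, and associations of R&D process, manufacturing, and quality inspection data. Specifically, a novel RAG framework, HireRAG, is proposed. It builds a community-based hierarchical index over the knowledge graph on top of C-HNSW, in which the lower layers retain fine-grained knowledge units and the higher-layer communities provide global summaries, so that retrieval at different levels can be handled. A graph-augmented clustering algorithm is also proposed so that C-HNSW better captures the semantic information in the knowledge graph. Experiments show that, compared with several existing state-of-the-art RAG frameworks, HireRAG is better suited to BOM-associated data in the enterprise dataspace, achieving the highest accuracy while retaining an advantage in retrieval efficiency. The data asset management system ensures that data assets are recognized on the balance sheet in a compliant manner throughout the whole process.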
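The two-level retrieval idea in the abstract (high-layer communities carrying global summaries, low-layer fine-grained knowledge units) can be illustrated with a minimal Python sketch. It is not the paper's C-HNSW index: an exhaustive cosine scan stands in for the approximate nearest-neighbor search, the centroid of member vectors stands in for a summary embedding, and the names `Unit`, `Community`, and `retrieve`, as well as the toy vectors and labels, are hypothetical.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Unit:
    """Low-layer entry: one fine-grained knowledge unit (hypothetical schema)."""
    name: str
    vector: list


@dataclass
class Community:
    """High-layer entry: a community with a global summary of its members."""
    summary: str
    members: list = field(default_factory=list)

    @property
    def centroid(self):
        # Assumption: the mean of member vectors stands in for the summary embedding.
        dim = len(self.members[0].vector)
        return [sum(u.vector[i] for u in self.members) / len(self.members)
                for i in range(dim)]


def retrieve(communities, query, top_k=2):
    """Pick the closest community first, then rank its units against the query."""
    best = max(communities, key=lambda c: cosine(c.centroid, query))
    ranked = sorted(best.members,
                    key=lambda u: cosine(u.vector, query),
                    reverse=True)
    return best.summary, [u.name for u in ranked[:top_k]]


# Toy data: two communities over BOM-related documents (made-up vectors).
communities = [
    Community("Process documents for the gearbox assembly", [
        Unit("gearbox machining route", [0.9, 0.1, 0.0]),
        Unit("gearbox heat-treatment spec", [0.8, 0.3, 0.1]),
    ]),
    Community("Quality inspection records for the housing", [
        Unit("housing dimensional report", [0.1, 0.9, 0.2]),
        Unit("housing leak-test record", [0.0, 0.8, 0.5]),
    ]),
]

summary, units = retrieve(communities, [0.05, 0.85, 0.3], top_k=1)
print(summary, units)
```

A global question could be answered from the returned community summary alone, while a detailed question would descend to the ranked units; a real index would replace both linear scans with a graph-based approximate search.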