Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240400074-7. doi: 10.11896/jsjkx.240400074

• Big Data & Data Science •

Research on Fusion Optimization Method of Heterogeneous Data Dictionary in Grass-roots Social Grid Governance

WANG Qing, YANG Wanzhe, ZHANG Cong   

  1. College of Information Science and Engineering, Northeastern University, Shenyang 110000, China
  • Online:2025-06-16 Published:2025-06-12
  • About author: WANG Qing, born in 1969, Ph.D., associate professor. His main research interests include modeling and optimization, manufacturing and service planning and scheduling, logistics and supply chain resource planning, e-commerce business optimization, and intelligent optimization algorithms.
  • Supported by:
    National Key Research and Development Program of China(2021YFC3300300).

Abstract: A data dictionary (DD) is an important part of database system design: it is a collection of data lists that describes the attributes, composition, and structure of the data in a database. When developing general-purpose information systems, designers and developers often face the problem of integrating and optimizing existing heterogeneous data dictionaries. Because industry data standards are lacking or business scope is limited, these dictionaries differ significantly in data representation, data composition, and structure design, yet their data content overlaps enough to be merged. Maintaining a fused data dictionary by hand takes a great deal of time and resources. Set in the business context of grass-roots social grid governance, this paper addresses the pain points of heterogeneous data dictionary fusion that arise when developing digital applications to promote grass-roots social governance, and studies optimization methods and related technologies for that fusion. The fusion methods and techniques designed here, including semantic deduplication and disambiguation, keyword extraction, similarity calculation, and table structure fusion, take into account both the completeness of data information and the integrity of data structure. Experimental verification on data dictionary fusion for grass-roots social grid governance business shows that fusion efficiency and effect improve significantly compared with the traditional data dictionary fusion method.

Key words: Data dictionary, Database design, Edit distance, Similarity calculation, Grass-roots social grid governance
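
The abstract and key words name edit distance and similarity calculation as ingredients of the fusion method but do not give the formulas on this page. As a rough illustration only, the sketch below shows how a normalized edit-distance similarity could be used to pair field names from two heterogeneous data dictionaries. The function names, the normalization by the longer string, the greedy best-match strategy, and the 0.6 threshold are all assumptions made for this example, not the authors' algorithm.

```python
from typing import Dict, Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; normalizing by the longer string is an assumption."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def match_fields(left: Iterable[str], right: Iterable[str],
                 threshold: float = 0.6) -> Dict[str, str]:
    """Greedily pair each left field with its most similar right field.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    candidates = list(right)
    matches: Dict[str, str] = {}
    for name in left:
        best = max(candidates, key=lambda c: similarity(name, c), default=None)
        if best is not None and similarity(name, best) >= threshold:
            matches[name] = best
    return matches
```

For instance, `match_fields(["resident_name"], ["residentname", "grid_id"])` pairs `resident_name` with `residentname`, since the two differ by a single deleted character. A character-level measure like this only catches spelling variants; the semantic deduplication and keyword extraction steps mentioned in the abstract would be needed to relate fields whose names differ in wording.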

CLC Number: 

  • TP392