计算机科学 ›› 2022, Vol. 49 ›› Issue (8): 97-107.doi: 10.11896/jsjkx.210700202

• 数据库&大数据&数据科学* 上一篇    下一篇

基于顶点粒k步搜索和粗糙集的强连通分量挖掘算法

程富豪1, 徐泰华1, 陈建军1, 宋晶晶1,2, 杨习贝1   

  1. 1 江苏科技大学计算机学院 江苏 镇江 212000
    2 数据科学与智能应用福建省高校重点实验室 福建 漳州 363000
  • 收稿日期:2021-07-20 修回日期:2021-10-23 发布日期:2022-08-02
  • 通讯作者: 徐泰华(xth19890410@163.com)
  • 作者简介:(cfh_vip@163.com)
  • 基金资助:
    国家自然科学基金(62006099,62076111,61906078);江苏省高等学校自然科学基金(20KJB520010);镇江市重点研发计划——社会发展(SH2018005)

Strongly Connected Components Mining Algorithm Based on k-step Search of Vertex Granule and Rough Set Theory

CHENG Fu-hao1, XU Tai-hua1, CHEN Jian-jun1, SONG Jing-jing1,2, YANG Xi-bei1   

  1. 1 School of Computer,Jiangsu University of Science and Technology,Zhenjiang,Jiangsu 212000,China
    2 Key Laboratory of Data Science and Intelligent Application,Fujian Province University,Zhangzhou,Fujian 363000,China
  • Received:2021-07-20 Revised:2021-10-23 Published:2022-08-02
  • About author:CHENG Fu-hao,born in 1994, postgraduate.His main research interests include granular computing and graph theory.
    XU Tai-hua,born in 1989,Ph.D,lecture,master supervisor,is a member of China Computer Federation.His main research interests include intelligent information processing,granular computing and graph theory.
  • Supported by:
    National Natural Science Foundation of China(62006099,62076111,61906078),Natural Science Foundation of Jiangsu Provincial Colleges and Universities(20KJB520010) and Key Research and Development Plan of Zhenjiang City-Social Development(SH2018005).

摘要: 强连通分量挖掘是图论中的经典问题之一,如何设计更高效率的串行强连通分量挖掘算法具有现实需求。GRSCC算法利用k步上近似和kR相关集这两个粗糙集算子所构成的SUB-RSCC函数,可实现简单有向图中的强连通分量挖掘,而SUB-RSCC函数的调用次数决定了挖掘效率。根据挖掘强连通分量时顶点间存在的相关性,GRSCC算法引入了粒化策略,减少了SUB-RSCC函数的调用次数,提高了挖掘效率。在GRSCC算法的基础上,分析发现了顶点间的另外两种强连通分量相关性,由此设计了一种新的顶点粒化策略,进而提出了一种顶点粒k步搜索方法,可更大程度地减少SUB-RSCC函数的调用次数。最后,提出了一种基于顶点粒k步搜索和粗糙集的强连通分量挖掘算法KGRSCC。实验结果表明,相比RSCC算法、GRSCC算法和Tarjan算法,KGRSCC算法具有更好的性能。

关键词: 粗糙集, 顶点粒k步搜索, 粒化策略, 强连通分量, 图论

Abstract: Strong connected components (SCCs) mining is one of the classic problems in graph theory.It has practical requirements to design a serial SCCs mining algorithm with high efficiency.GRSCC algorithm can use SUB-RSCC function to discover SCCs of simple digraphs.SUB-RSCC function is formed by two operators of rough set theory (RST),k-step upper approximation set and k-step R-related,which are the main contributors to time consumption.Then the invocation times of SUB-RSCC decide the efficiency of GRSCC algorithm.Based on the SCCs correlations among vertices,GRSCC algorithm introduces granulation strategy to reduce the invocation times of SUB-RSCC function,then improve the mining efficiency.Two new SCCs correlations are found by analysis of SCCs in the framework of RST,then a new vertex granulation strategy is designed to granulate the vertex set of target digraphs.In order to reduce the invocation times of SUB-RSCC function to a greater extent,a method called k-step search of vertex granule is proposed.Finally,combining with GRSCC algorithm,an algorithm called KGRSCC for mining SCCs based on k-step search of vertex granule and RST is proposed.Experimental results show that,compared with RSCC,GRSCC and Tarjan algorithms,the proposed KGRSCC algorithm has better performance.

Key words: k-step search of vertex granule, Granulation strategy, Graph theory, Rough set, Strongly connected components

中图分类号: 

  • TP181
[1]PEARCE D J,KELLY P H,HANKIN C.Efficient field-sensitive pointer analysis of C[J].ACM Transactions on Programming Languages and Systems,2007,30(1):4-14.
[2]BURKE M.An interval-based approach to exhaustive and incremental interprocedural data-flow analysis[J].ACM Transactions on Programming Languages and Systems,1990,12(3):341-395.
[3]YANG H G,SHEN D R,KOU Y,et al.Strongly connectedcomponents based efficient computation of page rank[J].Frontiers of Computer Science,2018,12(6):1208-1219.
[4]ADAMIC L A.The Small World Web [C]//Proceedings of International Conference on Theory and Practice of Digital Lib-raries.Springer,1999:443-452.
[5]IOANNIDIS Y,RAMAKRISHNAN R,WINGER L.Transitive closure algorithms based on graph traversal[J].ACM Transactions on Database Systems,1993,18(3):512-576.
[6]NUUTILA E,SOISALON-SOININEN E.On finding thestrongly connected components in a directed graph[J].Information Processing Letters,1994,49(1):9-14.
[7]PEARCE D J.A space-efficient algorithm for finding strongly connected components[J].Information Processing Letters,2016,116(1):47-52.
[8]TARJAN R.Depth-First Search and Linear Graph Algorithms[J].SIAM Journal on Computing,1972,1(2):146-160.
[9]SHARIR M.A strong-connectivity algorithm and its applica-tions in data flow analysis[J].Computers& Mathematics with Applications,1981,7(1):67-72.
[10]GABOW H N.Path-based depth-first search for strong and biconnected components[J].Information Processing Letters,2000,74(3):107-114.
[11]GAZIT H,MILLER G L.An improved parallel algorithm that computes the BFS numbering of a directed graph[J].Information Processing Letters,1988,28(2):61-65.
[12]FLEISCHER L K,HENDRICKSON B,PINAR A.OnIdenti-fying Strongly Connected Components in Parallel[J].Lecture Notes in Computer Science,2000,1800(4):505-511.
[13]MCLENDON III W,HENDRICKSON B,PLIMPTON S J,et al.Finding strongly connected components in distributed graphs[J].Journal of Parallel and Distributed Computing,2005,65(8):901-910.
[14]BARNAT J,BAUCH P,BRIM L,et al.Computing StronglyConnected Components in Parallel on CUDA [C]//Proceedings of International Parallel and Distributed Processing Sympo-sium.IEEE,2011:544-555.
[15]LOWE G.Concurrent depth-first search algorithms based onTarjan’s Algorithm[J].International Journal on Software Tools for Technology Transfer,2016,18(2):129-147.
[16]PAWLAK Z.Rough sets:Theoretical aspects of reasoning about data[M].Kluwer Academic Publishers,1992.
[17]YAO Y Y,ZHANG X Y.Class-specific attribute reducts inrough set theory[J].Information Sciences,2017,418-419:601-618.
[18]QIAN Y H,LIANG X Y,WANG Q,et al.Local rough set:A solution to rough data analysis in big data[J].International Journal of Approximate Reasoning,2018,97:38-63.
[19]WANG G Y,MA X A,YU H.Monotonic uncertainty measures for attribute reduction in probabilistic rough set model[J].International Journal of Approximate Reasoning,2015,59:41-67.
[20]YANG X B,ZHANG M,DOU H L,et al.Neighborhood systems-based rough sets in incomplete information system[J].Knowledge-Based Systems,2011,24(6):858-867.
[21]YANG X B,YAO Y Y.Ensemble selector for attribute reduction[J].Applied Soft Computing,2018,70:1-11.
[22]YANG X B,LIANG S C,YU H L,et al.Pseudo-label neighborhood rough set:measures and attribute reductions[J].International Journal of Approximate Reasoning,2019,105:112-129.
[23]SONG J J,TSANG E,CHEN D G,et al.Minimal decision cost reduct in fuzzy decision-theoretic rough set model[J].Know-ledge-Based Systems,2017,126:104-112.
[24]LIU K Y,YANG X B,YU H L,et al.Rough set based semi-supervised feature selection via ensemble selector[J].Knowledge-Based Systems,2019,165:282-296.
[25]XU T H,WANG G Y.Finding strongly connected components of simple digraphs based on generalized rough sets theory[J].Knowledge Based Systems,2018,149:88-98.
[26]XU T H,WANG G Y,YANG J.Finding strongly connectedcomponents of simple digraphs based on granulation strategy[J].International Journal of Approximate Reasoning,2020,118:64-78.
[27]BANG-JENSEN J,GUTIN G Z.Digraphs Theory,Algorithmsand Applications,Second Edition[M].Berlin:Springer Science &Business Media,2009.
[28]YAO Y Y.Two views of the theory of rough sets in finite universes[J].International Journal of Approximate Reasoning,1996,15(4):291-317.
[29]JARVINEN J.Lattice theory for rough sets[J].Transactions on Rough Sets,2007,6:400-498.
[30]FAN T F.Rough set analysis of relational structures[J].Information Sciences,2013,221:230-244.
[31]CHEN J K,LI J J,LIN Y J.Computing connected components of simple undirected graphs based on generalized rough sets[J].Knowledge-Based Systems,2013,37:80-85.
[32]DAVIS T A,HU Y F.The university of Florida sparse matrix collection[J].ACM Transactions on Mathematical Software,2011,38(1):1-25.
[1] 许思雨, 秦克云.
基于剩余格的模糊粗糙集的拓扑性质
Topological Properties of Fuzzy Rough Sets Based on Residuated Lattices
计算机科学, 2022, 49(6A): 140-143. https://doi.org/10.11896/jsjkx.210200123
[2] 方连花, 林玉梅, 吴伟志.
随机多尺度序决策系统的最优尺度选择
Optimal Scale Selection in Random Multi-scale Ordered Decision Systems
计算机科学, 2022, 49(6): 172-179. https://doi.org/10.11896/jsjkx.220200067
[3] 陈于思, 艾志华, 张清华.
基于三角不等式判定和局部策略的高效邻域覆盖模型
Efficient Neighborhood Covering Model Based on Triangle Inequality Checkand Local Strategy
计算机科学, 2022, 49(5): 152-158. https://doi.org/10.11896/jsjkx.210300302
[4] 孙林, 黄苗苗, 徐久成.
基于邻域粗糙集和Relief的弱标记特征选择方法
Weak Label Feature Selection Method Based on Neighborhood Rough Sets and Relief
计算机科学, 2022, 49(4): 152-160. https://doi.org/10.11896/jsjkx.210300094
[5] 王子茵, 李磊军, 米据生, 李美争, 解滨.
基于误分代价的变精度模糊粗糙集属性约简
Attribute Reduction of Variable Precision Fuzzy Rough Set Based on Misclassification Cost
计算机科学, 2022, 49(4): 161-167. https://doi.org/10.11896/jsjkx.210500211
[6] 王志成, 高灿, 邢金明.
一种基于正域的三支近似约简
Three-way Approximate Reduction Based on Positive Region
计算机科学, 2022, 49(4): 168-173. https://doi.org/10.11896/jsjkx.210500067
[7] 薛占熬, 侯昊东, 孙冰心, 姚守倩.
带标记的不完备双论域模糊概率粗糙集中近似集动态更新方法
Label-based Approach for Dynamic Updating Approximations in Incomplete Fuzzy Probabilistic Rough Sets over Two Universes
计算机科学, 2022, 49(3): 255-262. https://doi.org/10.11896/jsjkx.201200042
[8] 李艳, 范斌, 郭劼, 林梓源, 赵曌.
基于k-原型聚类和粗糙集的属性约简方法
Attribute Reduction Method Based on k-prototypes Clustering and Rough Sets
计算机科学, 2021, 48(6A): 342-348. https://doi.org/10.11896/jsjkx.201000053
[9] 薛占熬, 孙冰心, 侯昊东, 荆萌萌.
基于多粒度粗糙直觉犹豫模糊集的最优粒度选择方法
Optimal Granulation Selection Method Based on Multi-granulation Rough Intuitionistic Hesitant Fuzzy Sets
计算机科学, 2021, 48(10): 98-106. https://doi.org/10.11896/jsjkx.200800074
[10] 薛占熬, 张敏, 赵丽平, 李永祥.
集对优势关系下多粒度决策粗糙集的可变三支决策模型
Variable Three-way Decision Model of Multi-granulation Decision Rough Sets Under Set-pair Dominance Relation
计算机科学, 2021, 48(1): 157-166. https://doi.org/10.11896/jsjkx.191200175
[11] 桑彬彬, 杨留中, 陈红梅, 王生武.
优势关系粗糙集增量属性约简算法
Incremental Attribute Reduction Algorithm in Dominance-based Rough Set
计算机科学, 2020, 47(8): 137-143. https://doi.org/10.11896/jsjkx.190700188
[12] 陈玉金, 徐吉辉, 史佳辉, 刘宇.
基于直觉犹豫模糊集的三支决策模型及其应用
Three-way Decision Models Based on Intuitionistic Hesitant Fuzzy Sets and Its Applications
计算机科学, 2020, 47(8): 144-150. https://doi.org/10.11896/jsjkx.190800041
[13] 周俊丽, 管延勇, 徐法升, 王洪凯.
覆盖近似空间中的核及其性质
Core in Covering Approximation Space and Its Properties
计算机科学, 2020, 47(6A): 526-529. https://doi.org/10.11896/JsJkx.190600003
[14] 高琳, 段国林, 姚涛.
基于图论的组织互操作性建模与评估研究
Research on Organizational Interoperability Modeling and Evaluation Based on Graph Theory
计算机科学, 2020, 47(6A): 572-576. https://doi.org/10.11896/JsJkx.190900114
[15] 张琴, 陈红梅, 封云飞.
一种基于粗糙集和密度峰值的重叠社区发现方法
Overlapping Community Detection Method Based on Rough Sets and Density Peaks
计算机科学, 2020, 47(5): 72-78. https://doi.org/10.11896/jsjkx.190400160
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!