Computer Science (计算机科学), 2021, Vol. 48, Issue 7: 281-291. DOI: 10.11896/jsjkx.201100106
HU Yan-mei1, YANG Bo2, DUO Bin1
Abstract: Logistic regression is a widely used classification model, but high-dimensional classification tasks have become increasingly common in practice, posing a serious challenge to such models. An effective way to meet this challenge is to regularize the model. Many existing regularized logistic regression methods directly adopt the L1-norm penalty as the regularization term, without considering the complex relationships among features. Other work designs penalties based on group information over the features, but assumes that this group information is given in advance. This paper instead mines the latent patterns in feature data from a network perspective and, on that basis, proposes a regularized logistic regression model built on network structure. First, the feature data are described in network form and a feature network is constructed. Second, the feature network is observed and analyzed from the viewpoint of network science, and a penalty function is designed accordingly. Then, taking this penalty as the regularization term, a network-structured Lasso logistic regression is proposed. Finally, the solution procedure of the model is derived by combining Nesterov's accelerated proximal gradient method with the Moreau-Yosida regularization technique. Experimental results on real-world datasets show that the proposed network-structured Lasso logistic regression performs strongly, suggesting that observing and analyzing feature data from a network perspective is a promising direction for research on regularized models.
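The solution procedure sketched in the abstract (accelerated proximal gradient descent applied to a penalized logistic loss) can be illustrated with a simplified stand-in. The sketch below uses a plain L1 (Lasso) penalty, whose proximal operator is elementwise soft-thresholding, rather than the paper's network-structured penalty, which is not reproduced here; the function names, step size, and regularization strength are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm: elementwise soft-thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def logistic_grad(w, X, y):
    # Gradient of the average logistic loss; labels y are in {0, 1}.
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

def accel_prox_lasso_logreg(X, y, lam=0.1, step=0.5, n_iter=1000):
    # Nesterov-accelerated proximal gradient (FISTA-style) for
    # L1-regularized logistic regression: a gradient step on the
    # smooth loss followed by the penalty's proximal operator.
    w = np.zeros(X.shape[1])
    z = w.copy()          # extrapolated point
    t = 1.0               # momentum parameter
    for _ in range(n_iter):
        w_next = soft_threshold(z - step * logistic_grad(z, X, y),
                                step * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w
```

For a structured penalty without a closed-form proximal operator, the paper pairs this scheme with Moreau-Yosida regularization, which smooths the penalty so that the proximal step remains tractable; in the L1 case above the soft-thresholding step is exact.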