Computer Science ›› 2021, Vol. 48 ›› Issue (7): 281-291. doi: 10.11896/jsjkx.201100106

• Artificial Intelligence •

  • Corresponding author: HU Yan-mei (huyanmei@cdut.edu.cn)

Logistic Regression with Regularization Based on Network Structure

HU Yan-mei1, YANG Bo2, DUO Bin1   

  1. College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu 610059, China
    2. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
  • Received: 2020-11-13  Revised: 2021-03-24  Online: 2021-07-15  Published: 2021-07-02
  • About author: HU Yan-mei, born in 1984, Ph.D, associate professor, master supervisor, is a member of China Computer Federation. Her main research interests include data mining, social and information network analysis, machine learning and evolutionary computation.
  • Supported by:
    National Natural Science Foundation of China (61802034, 61977013), National Key Research and Development Program of China (2019YFC1509602) and Sichuan Science and Technology Program (2021YFG0333).



Abstract: Logistic regression is a widely used classification model. However, as high-dimensional data classification tasks become more and more frequent in practical applications, the model faces a great challenge. Regularization is an effective approach to this challenge. Many existing regularized logistic regression models directly use the L1-norm penalty as the regularization term, without considering the complex relationships among features. Other work designs regularization penalty terms based on group information of features, but assumes that the group information is given in advance. This paper explores the patterns hidden in feature data from the perspective of networks and, on that basis, proposes a regularized logistic regression model based on network structure. Firstly, a feature network is constructed by describing the feature data in the form of a network. Secondly, the feature network is observed and analyzed from the perspective of network science, and a penalty function is designed based on these observations. Thirdly, taking this penalty function as the regularization term, a logistic regression model with a network-structured Lasso penalty is proposed. Lastly, the solution of the model is derived by combining Nesterov's accelerated proximal gradient method with the Moreau-Yosida regularization method. Experiments on real datasets show that the proposed regularized logistic regression performs excellently, which demonstrates that observing and analyzing feature data from the network perspective is a promising direction for studying regularized models.
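The solution scheme summarized above (a Lasso-type penalty optimized with Nesterov's accelerated proximal gradient method) can be illustrated with a minimal sketch for the plain L1 penalty. This is not the paper's method: the network-structured penalty would replace the soft-thresholding proximal operator below with one derived via Moreau-Yosida regularization, and all function names and parameters here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def soft_threshold(w, t):
    # Proximal operator of t * ||w||_1 (the plain Lasso penalty).
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fista_logistic_lasso(X, y, lam=0.1, n_iter=200):
    """L1-regularized logistic regression solved with Nesterov's
    accelerated proximal gradient method (FISTA); y takes values in {0, 1}."""
    n, d = X.shape
    # The gradient of the averaged logistic loss is Lipschitz with
    # constant at most ||X||_2^2 / (4n); use its inverse as step size.
    step = 4.0 * n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    v = w.copy()   # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ v) - y) / n
        w_new = soft_threshold(v - step * grad, step * lam)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        v = w_new + ((t - 1.0) / t_new) * (w_new - w)
        w, t = w_new, t_new
    return w
```

Swapping `soft_threshold` for the proximal operator of a structured penalty (e.g. group or fused terms) keeps the outer FISTA loop unchanged, which is why proximal methods pair naturally with Moreau-Yosida-based penalties.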

Key words: Feature selection, Logistic regression, Network structure, Proximal gradient method, Regularized penalty term

CLC number: TP181