Computer Science, 2021, Vol. 48, Issue (7): 281-291. doi: 10.11896/jsjkx.201100106

• Artificial Intelligence •

Logistic Regression with Regularization Based on Network Structure

HU Yan-mei1, YANG Bo2, DUO Bin1   

  1 College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu 610059, China
    2 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
  • Received: 2020-11-13  Revised: 2021-03-24  Online: 2021-07-15  Published: 2021-07-02
  • About author: HU Yan-mei, born in 1984, Ph.D, associate professor, master supervisor, is a member of China Computer Federation. Her main research interests include data mining, social and information network analysis, machine learning and evolutionary computation.
  • Supported by:
    Natural Science Foundation of China (61802034, 61977013), National Key Research and Development Program of China (2019YFC1509602) and Sichuan Science and Technology Program (2021YFG0333).

Abstract: Logistic regression is widely used as a classification model. However, as high-dimensional data classification tasks become increasingly common in practical applications, classification models face great challenges, and regularization is an effective way to meet them. Many existing regularized logistic regression models directly use the L1-norm penalty as the regularization term without considering the complex relationships among features. Other regularization penalties are designed on the basis of group information of features, but they assume that the group information is available as prior knowledge. This paper explores the patterns hidden in feature data from a network perspective and proposes a regularized logistic regression model based on network structure. Firstly, it constructs a feature network by describing the feature data in the form of a network. Secondly, it observes and analyzes the feature network from the perspective of network science and designs a penalty function based on these observations. Thirdly, it proposes a logistic regression model with a network-structured Lasso by taking the penalty function as the regularization term. Lastly, it derives the solution of the model by combining Nesterov's accelerated proximal gradient method with the Moreau-Yosida regularization method. Experiments on real datasets show that the proposed regularized logistic regression performs excellently, demonstrating that observing and analyzing feature data from the perspective of networks is a promising way to study regularized models.

Key words: Feature selection, Logistic regression, Network structure, Proximal gradient method, Regularized penalty term

CLC Number: TP181