Computer Science, 2015, Vol. 42, Issue (9): 226-229. doi: 10.11896/j.issn.1002-137X.2015.09.043

• Artificial Intelligence •

Optimization for Smoothing Parameter in Process of Data Fitting

WANG Li, WANG Wen-jian and JIANG Gao-xia

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61273291, 71031006) and the Shanxi Province research funding for returned overseas scholars

Abstract: Converting discrete observations into functions is the foundation of functional data analysis (FDA) and the key step that distinguishes it from other analysis methods. Data fitting, the main approach to this conversion, can usually be formulated as an optimization problem that combines a loss function with a regularization term, in which the smoothing parameter balances the fitting loss against the risk of overfitting. Among the methods for selecting the smoothing parameter, generalized cross-validation (GCV) is a general and effective choice. However, GCV is evaluated at discrete parameter values, so obtaining an accurate smoothing parameter still requires a large amount of computation. To address this problem, two solution strategies, fitting optimization and finite difference, are proposed to find the optimal smoothing parameter more efficiently, and their precision and efficiency are compared. Experimental results on simulated and real data show that both strategies are considerably more efficient than the commonly used grid method while achieving almost the same precision. The finite difference strategy is slightly more precise than the fitting optimization strategy, whereas the fitting optimization strategy is more efficient.

Key words: Smoothing parameter, Generalized cross-validation, Finite difference solution strategy, Fitting optimization solution strategy
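
To make the cost of grid-based GCV selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes a discrete second-difference roughness penalty (a Whittaker-type smoother) in place of a spline basis expansion, synthetic noisy sine data, and the standard criterion GCV(λ) = n · SSE / (n − df(λ))², where df(λ) is the trace of the hat matrix. All function names are hypothetical. The parabola refinement at the end is only one simple way to cut the number of GCV evaluations; it is not claimed to be the fitting optimization or finite difference strategy described in the abstract.

```python
import numpy as np


def roughness_penalty(n):
    """Return P = D.T @ D, where D takes second differences of n values."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D.T @ D


def gcv(y, lam, P):
    """GCV(lam) = n * SSE / (n - df)^2 for the smoother S = (I + lam * P)^-1."""
    n = len(y)
    S = np.linalg.inv(np.eye(n) + lam * P)  # hat matrix: fitted values are S @ y
    resid = y - S @ y
    df = np.trace(S)                        # effective degrees of freedom
    return n * float(resid @ resid) / (n - df) ** 2


def gcv_on_grid(y, P, log_lams):
    """Evaluate GCV at lam = 10**g for every g in log_lams (one solve each)."""
    return np.array([gcv(y, 10.0 ** g, P) for g in log_lams])


def parabola_refine(log_lams, scores):
    """Vertex of the parabola through the best grid point and its two neighbours.

    Assumption for illustration only: GCV is roughly quadratic in log10(lam)
    near its minimum. Falls back to the grid point at the boundary.
    """
    k = int(np.argmin(scores))
    if k == 0 or k == len(scores) - 1:
        return float(log_lams[k])
    a, b, _ = np.polyfit(log_lams[k - 1:k + 2], scores[k - 1:k + 2], 2)
    return float(-b / (2.0 * a)) if a > 0 else float(log_lams[k])


# Synthetic data (assumption): a noisy sine curve sampled at 100 points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
y = np.sin(2.0 * np.pi * t) + 0.3 * rng.standard_normal(t.size)
P = roughness_penalty(t.size)

# Baseline: a fine grid over log10(lam) needs 81 GCV evaluations.
fine = np.linspace(-2.0, 6.0, 81)
fine_scores = gcv_on_grid(y, P, fine)
best_fine = float(fine[int(np.argmin(fine_scores))])

# Cheaper alternative: 9 coarse evaluations plus a local parabola fit.
coarse = np.linspace(-2.0, 6.0, 9)
coarse_scores = gcv_on_grid(y, P, coarse)
best_coarse = float(coarse[int(np.argmin(coarse_scores))])
refined = parabola_refine(coarse, coarse_scores)

print("fine grid   log10(lambda):", best_fine)
print("coarse grid log10(lambda):", best_coarse)
print("refined     log10(lambda):", refined)
```

Each GCV evaluation here requires inverting an n × n matrix, which is why a dense grid becomes expensive and why strategies that need fewer evaluations are attractive.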

