Computer Science, 2024, Vol. 51, Issue (3): 128-134. doi: 10.11896/jsjkx.221200055
郑伟楠1, 於志勇1,2, 黄昉菀1,2
ZHENG Weinan1, YU Zhiyong1,2, HUANG Fangwan1,2
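Abstract: With the development of the Internet of Things, numerous sensors collect large volumes of time series with rich data correlations, providing strong data support for a wide range of data mining applications. However, objective or subjective causes (such as equipment failure or sparse sensing) often leave the collected data with varying degrees of missing values. Although many methods have been proposed to address this problem, they either consider data correlations incompletely or incur excessive computational cost. Moreover, existing methods focus only on imputing the missing values and do not take downstream applications into account. To address these shortcomings, a dual-channel echo state network that handles both imputation and prediction is designed. The two channels share an input layer, but each has its own reservoir and output layer. The main difference between them is that the output layers of the left and right channels represent the target values or pre-imputed values at the time step before and after that of the input layer, respectively. Finally, the estimates from the two channels are fused, making full use of the data correlations from both before and after the missing time step to further improve performance. Experimental results under two missing patterns (random missing and segment missing) at different missing rates show that the proposed model outperforms the currently popular methods in both imputation accuracy and prediction accuracy.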
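The architecture described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names (`make_reservoir`, `run_reservoir`, `fit_readout`, `dual_channel_esn`), the hyperparameters, the linear-interpolation pre-imputation, the ridge-regression readout, and the fusion by simple averaging are all assumptions made for illustration. The sketch only shows the general idea that two reservoirs read the same input, one trained to output the value one step earlier and the other the value one step later, and that their estimates for a missing time step are then combined.

```python
import numpy as np

def make_reservoir(n_res, rng, spectral_radius=0.9, input_scale=0.5):
    """Fixed random input and recurrent weights for one channel (assumed settings)."""
    w_in = rng.uniform(-input_scale, input_scale, size=n_res)
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with a 1-D input sequence and collect its states."""
    states = np.zeros((len(inputs), len(w_in)))
    h = np.zeros(len(w_in))
    for t, u in enumerate(inputs):
        h = np.tanh(w_in * u + w @ h)
        states[t] = h
    return states

def fit_readout(states, targets, ridge=1e-6):
    """Ridge-regression readout so that states @ w_out approximates targets."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

def dual_channel_esn(series, observed, n_res=50, washout=20, seed=0):
    """Return (completed series, one-step-ahead forecast) for a 1-D series."""
    rng = np.random.default_rng(seed)
    t_idx = np.arange(len(series))
    # Pre-imputation (assumption): linear interpolation over observed points.
    filled = np.interp(t_idx, t_idx[observed], series[observed])
    # Shared input, separate reservoirs for the two channels.
    s_left = run_reservoir(*make_reservoir(n_res, rng), filled)
    s_right = run_reservoir(*make_reservoir(n_res, rng), filled)
    # Left channel: input at time t, target at time t-1.
    w_left = fit_readout(s_left[washout:], filled[washout - 1:-1])
    # Right channel: input at time t, target at time t+1.
    w_right = fit_readout(s_right[washout:-1], filled[washout + 1:])
    backward = s_left @ w_left    # backward[t] estimates the value at t-1
    forward = s_right @ w_right   # forward[t] estimates the value at t+1
    # Fusion (assumption): average the two estimates of each interior step t,
    # i.e. forward[t-1] (data before t) and backward[t+1] (data after t).
    fused = 0.5 * (forward[:-2] + backward[2:])
    completed = filled.copy()
    interior = np.flatnonzero(~observed[1:-1]) + 1
    completed[interior] = fused[interior - 1]
    return completed, forward[-1]

# Toy usage: a sine wave with roughly 10% of the points missing at random.
t = np.arange(300)
truth = np.sin(2 * np.pi * t / 50)
observed = np.random.default_rng(1).random(300) > 0.1
series = np.where(observed, truth, np.nan)
completed, forecast = dual_channel_esn(series, observed)
```

In this sketch the right channel doubles as the predictor: its output at the last time step serves as a one-step-ahead forecast, which is one simple way to serve imputation and prediction with the same network.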