Computer Science (计算机科学), 2021, Vol. 48, Issue (6A): 213-219. doi: 10.11896/jsjkx.201100193

• Big Data & Data Science •

Anomaly Detection Based on Spatial-temporal Trajectory Data

GUO Yi-shan, LIU Man-dan

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Online: 2021-06-10 Published: 2021-06-17
  • Corresponding author: LIU Man-dan (liumandan@ecust.edu.cn)
  • Author email: nous_to_get_her@163.com
  • About author: GUO Yi-shan, born in 1997, postgraduate. Her main research interests include data mining in networks and intelligent optimization algorithms.
    LIU Man-dan, born in 1973, Ph.D, professor, Ph.D supervisor. Her main research interests include control and optimization, and the application of intelligent methods, such as neural networks and evolutionary computing, in process control.

Abstract: With the spread of smart devices and advances in wireless communication technology, wireless networks record large volumes of spatial-temporal trajectory data as users go online to meet their various needs. Anomaly detection on spatial-temporal trajectory data has become a new research focus in data mining. To better attend to students' healthy development and promote campus informatization, this paper takes real campus internet usage data as an example and proposes a spectral clustering algorithm based on the combination of multi-scale threshold and density (MSTD-SC). First, an affinity distance function based on the shortest time distance-shortest time distance subsequences (STD-STDSS) is used to construct the initial similarity matrix. Then a covariance scale threshold and a spatial scale threshold are introduced to apply 0-1 processing (binarization) to the similarity matrix, yielding more accurate sample similarities. Next, eigenvalue decomposition of the similarity matrix gives a new eigenvector space. Finally, DBSCAN clustering is applied in that space, which avoids the K-means requirement of manually specifying the number of clusters. Evaluated with the silhouette coefficient against several other algorithms, MSTD-SC shows better clustering performance. When applied to anomaly detection for individual users, the resulting list of abnormal users is verified to be valid and credible.

Key words: Anomaly detection, Campus wireless network, Similarity, Spatial-temporal trajectory data, Spectral clustering
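
The pipeline outlined in the abstract (binarized similarity matrix, eigenvalue decomposition, DBSCAN, silhouette evaluation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' MSTD-SC implementation: the STD-STDSS affinity distance is replaced by a hypothetical precomputed pairwise distance matrix, the two scale thresholds by a single stand-in `threshold`, the symmetric normalized Laplacian is an assumed choice of spectral embedding, and treating DBSCAN noise points as candidate anomalies is likewise an assumption. All function and parameter names are illustrative.

```python
import numpy as np

def binarize_affinity(dist, threshold):
    """0-1 processing: 1 where pairwise distance <= threshold (stand-in for the paper's two thresholds)."""
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)  # no self-loops
    return w

def spectral_embedding(w, k):
    """Row-normalized eigenvectors for the k smallest eigenvalues of the normalized Laplacian."""
    deg = w.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    lap = np.eye(len(w)) - inv_sqrt[:, None] * w * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    emb = vecs[:, :k]
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.where(norms > 1e-8, norms, 1.0)  # leave near-zero rows untouched

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns integer labels, with -1 marking noise."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = [np.flatnonzero(d[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbors[i]) < min_pts:
            continue  # not a core point; stays noise unless reached as a border point
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # core or border point joins the cluster
            if not visited[j]:
                visited[j] = True
                if len(neighbors[j]) >= min_pts:
                    queue.extend(neighbors[j])  # expand only through core points
        cluster += 1
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over clustered points (noise label -1 is ignored)."""
    keep = labels >= 0
    pts, lab = points[keep], labels[keep]
    clusters = set(lab.tolist())
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    scores = []
    for i in range(len(pts)):
        same = lab == lab[i]
        if same.sum() < 2:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = d[i, same].sum() / (same.sum() - 1)  # mean distance within own cluster
        b = min(d[i, lab == c].mean() for c in clusters if c != lab[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Toy data: 1-D positions stand in for trajectories; the last one is far from all others.
x = np.array([0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 20.0])
dist = np.abs(x[:, None] - x[None, :])
emb = spectral_embedding(binarize_affinity(dist, threshold=1.0), k=2)
labels = dbscan(emb, eps=0.3, min_pts=3)
score = silhouette(emb, labels)
print(labels, score)  # the isolated item is labeled -1 (noise)
```

In this sketch the number of clusters is never passed in: DBSCAN derives it from `eps` and `min_pts`, which mirrors the abstract's stated motivation for preferring DBSCAN over K-means in the final step.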

CLC Number: 

  • TP393