Computer Science, 2013, Vol. 40, Issue (9): 254-256.

• Artificial Intelligence •

Bias Sampling Data Stream Based on Sliding Window Density Clustering Algorithm Research

HU Zhi-dong, REN Yong-gong and YANG Xue

  1. School of Computer and Information Technology, Liaoning Normal University, Dalian 116029
  • Online: 2018-11-16  Published: 2018-11-16
  • Supported by:
    the Liaoning Provincial Planning Project Fund (2012232001) and the Natural Science Foundation of Liaoning Province (201202119)

Abstract: In managing the trajectory data streams of moving objects in mobile computing, sampling is the most commonly used technique. Traditional uniform sampling, however, tends to drop key changes in the data, which causes information loss. To address this problem, we propose a biased sampling algorithm for data streams based on probability density clustering. Working under the sliding window model, the algorithm makes full use of the distribution characteristics of the trajectory data stream itself and applies the idea of biased sampling to overcome the data loss of uniform sampling. It first uses a clustering technique based on data existence density to divide the sliding window into strong clusters, weak clusters, and excessive clusters, then assigns a different sampling rate to each kind of cluster and performs biased sampling to obtain the final summary of the data stream. Experiments on real data sets show that the algorithm preserves sampling quality well and processes data relatively quickly.

Key words: Trajectory data stream, Sliding window, Density clustering, Bias sampling

CLC number: TP301  Document code: A
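
The two steps named in the abstract (label regions of the sliding window by density, then sample each kind of region at its own rate) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the grid-cell density estimate, the function names `classify_cells` and `biased_sample`, the thresholds `strong_min` and `weak_max`, the reading of "excessive" as "neither strong nor weak", and the sampling rates are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict, deque


def classify_cells(points, cell_size, strong_min, weak_max):
    """Bucket 2-D points into grid cells and label each cell by point count.

    Hypothetical thresholds: a cell with at least `strong_min` points is
    "strong", one with at most `weak_max` points is "weak", and anything
    in between is "excessive".
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    labels = {}
    for key, members in cells.items():
        if len(members) >= strong_min:
            labels[key] = "strong"
        elif len(members) <= weak_max:
            labels[key] = "weak"
        else:
            labels[key] = "excessive"
    return dict(cells), labels


def biased_sample(points, rates, cell_size=1.0, strong_min=5, weak_max=1, rng=None):
    """Keep each point with the probability assigned to its cell's label."""
    rng = rng or random.Random()
    cells, labels = classify_cells(points, cell_size, strong_min, weak_max)
    kept = []
    for key, members in cells.items():
        rate = rates[labels[key]]
        kept.extend(p for p in members if rng.random() < rate)
    return kept


if __name__ == "__main__":
    # A sliding window that holds only the most recent 9 points of the stream.
    window = deque(maxlen=9)
    stream = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.3), (0.4, 0.4), (0.5, 0.5),
              (3.5, 3.5), (7.1, 7.1), (7.2, 7.2), (7.3, 7.3)]
    for point in stream:
        window.append(point)
    # Illustrative rates only: thin out dense cells, keep sparse ones.
    rates = {"strong": 0.2, "weak": 1.0, "excessive": 0.5}
    print(biased_sample(window, rates, rng=random.Random(0)))
```

The rates in the usage example favor sparse cells, on the assumption that isolated points are the "key changes" uniform sampling tends to lose. The abstract does not state which kind of cluster receives the higher rate, so the mapping is left as a parameter.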

