Computer Science, 2019, Vol. 46, Issue (6): 55-63. doi: 10.11896/j.issn.1002-137X.2019.06.007

• Big Data & Data Science •

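Study on Processing Technology for Complex Event Management Based on Multivariate Time Series Data

LI Zhi-guo1, ZHONG Jiang2, ZHONG Lu-man1

  1. School of Management, Chongqing Technology and Business University, Chongqing 400067, China
  2. College of Computer Science, Chongqing University, Chongqing 400044, China
  • Received: 2018-10-08 Published: 2019-06-24
  • Corresponding author: ZHONG Lu-man (born 1974), female, master's degree, teaching assistant. Her main research interests include regional economic big data and complex event processing. E-mail: lunazhong@ctbu.edu.cn
  • About the authors: LI Zhi-guo (born 1977), male, Ph.D., senior engineer. His main research interests include data science, strategic management, precision investment promotion, and blockchain. E-mail: lizhiguo@ctbu.edu.cn; ZHONG Jiang (born 1974), male, professor, Ph.D. supervisor. His main research interests include data mining, trusted computer systems, and service computing.
  • Supported by:
    National Key Research and Development Program of China (2017YFB1402400), National High Technology Research and Development Program of China (863 Program) (2015AA015308), and Fundamental Research Funds for the Central Universities (106112014CDJZR188801).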

Abstract: As data volumes keep growing, fusing data from different business systems to mine its potential value is becoming increasingly worthwhile. Complex event processing (CEP) abstracts business data as event sequences and uses a complex event description method to express potentially valuable composite data as a specific event matching structure. A complex event detection engine then detects the event sequences that satisfy the matching structure in large event streams and outputs the data fusion results. However, traditional complex event descriptions only apply when the input event stream consists of a single atomic event type, the predicate constraints are simple attribute-value comparisons or aggregation operations, and the constraints between events are simple temporal-order constraints. As a result, traditional detection methods cannot serve application fields such as medicine and finance, which require more precise timing and richer event predicate constraints. This paper therefore designs a TCN-based quantitative timing constraint representation model that supports multivariate event input, together with a predicate constraint representation model based on time-interval feature constraints, and proposes a parallel complex event detection algorithm (PARALLEL-TCSEQ-DETECTION) that makes complex event detection more efficient. An analysis of 200 million records from 2045 stocks demonstrates the feasibility and efficiency of the proposed complex event processing technique.

Key words: CEP, Event detection model, Parallel, TCN, Timing feature

CLC number:

  • TP391
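
The abstract describes matching event sequences under quantitative timing constraints and per-event predicates over interval features, with detection run in parallel. The Python sketch below illustrates that general idea only; it is not the paper's PARALLEL-TCSEQ-DETECTION algorithm or its TCN model. The names `Event`, `Step`, `detect`, and `detect_parallel` are hypothetical, the gap is assumed to be measured from the end of one matched event to the start of the next, and partitioning streams by key (for example, one stream per stock) is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    etype: str    # event type, e.g. "drop" or "volume_spike" (hypothetical names)
    start: float  # start of the interval the event covers
    end: float    # end of the interval the event covers
    value: float  # a feature computed over the interval (hypothetical)


@dataclass(frozen=True)
class Step:
    etype: str                         # required event type for this step
    predicate: Callable[[Event], bool] # constraint on the interval feature
    min_gap: float                     # lower bound on start - previous end
    max_gap: float                     # upper bound on start - previous end


def detect(events: Sequence[Event], pattern: Sequence[Step]) -> List[Tuple[Event, ...]]:
    """Return every event tuple that satisfies the pattern step by step.

    The gap bounds of the first step are ignored because it has no predecessor.
    """
    ordered = sorted(events, key=lambda e: e.start)
    matches: List[Tuple[Event, ...]] = []

    def extend(partial: List[Event], step_idx: int) -> None:
        if step_idx == len(pattern):
            matches.append(tuple(partial))
            return
        step = pattern[step_idx]
        for ev in ordered:
            if ev.etype != step.etype or not step.predicate(ev):
                continue
            if partial:
                gap = ev.start - partial[-1].end
                if not (step.min_gap <= gap <= step.max_gap):
                    continue
            extend(partial + [ev], step_idx + 1)

    extend([], 0)
    return matches


def detect_parallel(
    streams: Dict[str, Sequence[Event]],
    pattern: Sequence[Step],
    max_workers: int = 4,
) -> Dict[str, List[Tuple[Event, ...]]]:
    """Run detect() independently on each keyed stream using a thread pool."""
    keys = list(streams)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda k: detect(streams[k], pattern), keys)
        return dict(zip(keys, results))
```

A pattern such as "a drop of at least 3% followed within 10 time units by a volume spike" would be written as two `Step` objects, the second with `min_gap=0.0` and `max_gap=10.0`. In CPython, threads mainly show the partitioning structure here; a process pool or a distributed engine would be needed for real CPU-bound speedups.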