Computer Science, 2016, Vol. 43, Issue (9): 120-123. doi: 10.11896/j.issn.1002-137X.2016.09.023

• 3rd CCF Big Data Academic Conference, 2015 •

Efficient and Effective Clustering Algorithm for Asynchronous Time Series of Clinical Laboratory Indicators

CHEN De-hua, HAN Xue-shi, LE Jia-jin, ZHU Li-feng

  1. School of Computer Science and Technology, Donghua University, Shanghai 200051 (CHEN De-hua, HAN Xue-shi, LE Jia-jin); Computer Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025 (ZHU Li-feng)
  • Published: 2018-12-01   Online: 2018-12-01

Abstract: Clustering time series of clinical laboratory indicators reveals groups of patients whose indicators follow similar trends, which is highly valuable for precision medicine. Because different patients are tested a different number of times and at different time points, the asynchronous time series are first preprocessed so that their dimensions and time points are synchronized. On this basis, the DBScan algorithm is improved by introducing a user-defined parameter, the noise-point share NoisePro, yielding LabTS-CLU, a density-based clustering algorithm for asynchronous time series of clinical laboratory indicators. Finally, experiments on a glycated hemoglobin time series dataset covering more than 100,000 diabetic patients over nearly 10 years at a Grade A tertiary hospital demonstrate the effectiveness of the proposed algorithm.

Key words: Clinical laboratory indicators, Asynchronous time series, Density clustering
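
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the LabTS-CLU algorithm itself, since the abstract does not give its details. Everything below is an assumption made for illustration: linear interpolation onto a shared time grid stands in for the synchronization step, and NoisePro is treated as an upper bound on the fraction of points that a plain DBSCAN run may label as noise, with the radius `eps` enlarged until that bound holds. The function names, default values, and sample readings are hypothetical.

```python
from bisect import bisect_left
from math import sqrt


def resample(times, values, grid):
    """Linearly interpolate one series onto a shared grid (assumed sync step).

    `times` must be strictly increasing. Grid points outside the observed
    range take the nearest observed value.
    """
    out = []
    for t in grid:
        i = bisect_left(times, t)
        if i == 0:
            out.append(values[0])
        elif i == len(times):
            out.append(values[-1])
        else:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


def euclid(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    n = len(points)
    # Each neighbourhood includes the point itself.
    neigh = [[j for j in range(n) if euclid(points[i], points[j]) <= eps]
             for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neigh[i]) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neigh[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neigh[j]) >= min_pts:
                queue.extend(neigh[j])  # core point, keep expanding
    return labels


def cluster_with_noise_share(points, noise_pro, min_pts=3, eps=0.5,
                             step=1.25, max_iter=50):
    """Assumed use of NoisePro: grow eps until noise fraction <= noise_pro."""
    labels = dbscan(points, eps, min_pts)
    for _ in range(max_iter):
        if labels.count(-1) / len(labels) <= noise_pro:
            break
        eps *= step
        labels = dbscan(points, eps, min_pts)
    return labels, eps


if __name__ == "__main__":
    # Hypothetical HbA1c readings: (months since first test, value in %).
    patients = {
        "p1": ([0, 3, 12], [7.0, 7.1, 7.4]),
        "p2": ([0, 6, 12], [7.1, 7.2, 7.5]),
        "p3": ([1, 12], [7.0, 7.3]),
        "p4": ([0, 12], [9.5, 6.0]),
    }
    grid = [0, 6, 12]
    synced = [resample(t, v, grid) for t, v in patients.values()]
    labels, eps = cluster_with_noise_share(synced, noise_pro=0.25)
    print(dict(zip(patients, labels)), eps)
```

Bounding the noise share rather than fixing `eps` directly lets the user state how many outlying patients are acceptable, which is one plausible reading of a "noise-point share" parameter; the published algorithm may use NoisePro differently.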

