Computer Science ›› 2022, Vol. 49 ›› Issue (11): 170-178. doi: 10.11896/jsjkx.211000040

• Computer Graphics & Multimedia •

Driver Distraction Detection Based on Multi-scale Feature Fusion Network

ZHANG Yu-xin1,2, CHEN Yi-qiang2

  1 Global Energy Interconnection Development and Cooperation Organization, Beijing 100031, China
  2 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100094, China
  • Received: 2021-10-08 Revised: 2022-03-15 Online: 2022-11-15 Published: 2022-11-03
  • Corresponding author: CHEN Yi-qiang (yqchen@ict.ac.cn)
  • About author: ZHANG Yu-xin (yuxin-zhang@geidco.org), born in 1993, Ph.D., is a member of China Computer Federation. Her main research interests include anomaly detection, deep learning and activity recognition.
    CHEN Yi-qiang, born in 1973, Ph.D., professor, is a senior member of IEEE and a fellow of China Computer Federation. His main research interests include human-computer interaction, pervasive computing and wearable computing.
  • Supported by:
    National Key R&D Program of China (2020YFC2007104).

Abstract: Road traffic accidents have increased year by year, and driver inattention is one of their major causes. This work uses multi-source data to detect whether a driver is distracted. Because each data source carries information about the others, multi-source data are strongly correlated. Processing all sources identically, or simply concatenating multi-source features, therefore yields highly entangled features and cannot guarantee that the mining task is effective. In addition, distracted driving can be affected by many factors, and common supervised methods may misclassify when the type of driver distraction is absent from the set of known categories. To address these challenges, we propose a driver distraction detection method based on multi-scale feature fusion, the Multi-Scale Feature Fusion Network (MSFFN). First, multiple embedding subnetworks learn low-dimensional representations from the multi-source data. Then, a multi-scale feature fusion method aggregates these representations from the perspective of spatial-temporal correlation, reducing the entanglement among multi-source features. Finally, an encoder-decoder model based on convolutional long short-term memory (ConvLSTM) performs unsupervised detection. In the training stage, the model is trained only on normal driving instances to determine a one-class classification boundary for normal data. In the detection stage, the reconstruction error of the model is computed and used as the score of each test sample, which allows fine-grained detection decisions. Experimental results on a public driver behavior dataset show that the proposed method outperforms existing methods.

Key words: Driver distraction, Unsupervised learning, Multi-source, Multi-scale fusion, Encoder-decoder

CLC Number: TP183
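
The training and detection stages summarized in the abstract (fit on normal driving only, then score each test sample by its reconstruction error) can be illustrated with a minimal sketch. This is not the MSFFN model: a rank-k linear autoencoder (PCA) stands in for the ConvLSTM encoder-decoder, the data are synthetic, and the function names, feature dimensions and the 99th-percentile threshold are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_autoencoder(normal, k):
    """Fit a rank-k linear encoder/decoder (PCA) on normal samples only."""
    mean = normal.mean(axis=0)
    # Rows of vt are the principal directions of the centered normal data.
    _, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_score(x, mean, components):
    """Mean squared reconstruction error per sample; higher means more anomalous."""
    centered = x - mean
    recon = centered @ components.T @ components  # encode, then decode
    return np.mean((centered - recon) ** 2, axis=1)

rng = np.random.default_rng(0)
# Hypothetical data: normal driving lies near a 2-D subspace of an 8-D feature space.
basis = rng.normal(size=(2, 8))
normal_train = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 8))
normal_test = rng.normal(size=(100, 2)) @ basis + 0.05 * rng.normal(size=(100, 8))
distracted_test = 2.0 * rng.normal(size=(100, 8))  # samples off the normal subspace

# Training stage: only normal instances are used.
mean, components = fit_linear_autoencoder(normal_train, k=2)
# One-class boundary: a percentile of the training scores (99 is an assumption).
threshold = np.percentile(reconstruction_score(normal_train, mean, components), 99)

# Detection stage: the reconstruction error is the score of each test sample.
normal_scores = reconstruction_score(normal_test, mean, components)
distracted_scores = reconstruction_score(distracted_test, mean, components)
flagged = distracted_scores > threshold
```

Because the score is continuous, the threshold can be moved to trade missed detections against false alarms, which is one way to read the "fine-grained detection decisions" mentioned in the abstract.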