Computer Science, 2022, Vol. 49, Issue (11): 170-178. doi: 10.11896/jsjkx.211000040

• Computer Graphics & Multimedia •

Driver Distraction Detection Based on Multi-scale Feature Fusion Network

ZHANG Yu-xin1,2, CHEN Yi-qiang2   

  1 Global Energy Interconnection Development and Cooperation Organization, Beijing 100031, China
  2 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100094, China
  • Received: 2021-10-08  Revised: 2022-03-15  Online: 2022-11-15  Published: 2022-11-03
  • About author: ZHANG Yu-xin, born in 1993, Ph.D., is a member of the China Computer Federation. Her main research interests include anomaly detection, deep learning and activity recognition.
    CHEN Yi-qiang, born in 1973, Ph.D., professor, is a senior member of IEEE and a fellow of the China Computer Federation. His main research interests include human-computer interaction, pervasive computing and wearable computing.
  • Supported by:
    National Key R&D Program of China (2020YFC2007104).

Abstract: Road traffic accidents have increased year by year, and driver inattention is one of their major causes. In this paper, we use multi-source data to detect driver distraction. However, the correlations among multi-source data produce high-dimensional, entangled features. Existing methods either process data from different sources in the same way or simply concatenate the multi-source features, which makes it hard to capture the key features within this high-dimensional entanglement. In addition, distracted driving can be affected by many factors, so supervised methods may misclassify a distraction whose type is not among the known categories. We therefore propose a multi-scale feature fusion network to tackle these challenges. It first learns low-dimensional representations from multi-source data through multiple embedding subnetworks, and then applies a multi-scale feature fusion method that aggregates these representations from the perspective of spatial-temporal correlation, thereby reducing feature entanglement. Finally, a ConvLSTM encoder-decoder model is used to detect driver distraction. Experimental results on a public loaded drive dataset show that the proposed method outperforms existing methods.
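
The three stages named in the abstract (per-source embedding, multi-scale fusion, and encoder-decoder detection) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the data are synthetic, fixed random projections stand in for the learned embedding subnetworks, average pooling at a few temporal scales stands in for the fusion method, and a linear PCA encoder-decoder replaces the ConvLSTM model. All names and constants (WINDOW, SOURCE_DIMS, EMBED_DIM, SCALES, make_window, featurize, LinearAutoencoder) are hypothetical. The idea it shows is the unsupervised one: fit on normal driving only and flag windows whose reconstruction error is unusually high.

```python
import numpy as np

WINDOW = 8             # time steps per window (hypothetical)
SOURCE_DIMS = (3, 2)   # feature count per data source (hypothetical)
EMBED_DIM = 4          # size of each low-dimensional embedding (hypothetical)
SCALES = (1, 2, 4)     # temporal pooling scales; each must divide WINDOW


def make_window(rng, distracted):
    """Synthetic stand-in for one multi-source window (not the paper's dataset)."""
    t = np.arange(WINDOW)[:, None]
    phase = rng.uniform(0.0, 2.0 * np.pi)
    noise = 1.0 if distracted else 0.05
    return [np.sin(0.5 * t + phase + np.arange(d))
            + noise * rng.standard_normal((WINDOW, d))
            for d in SOURCE_DIMS]


def featurize(window, embedders):
    """Embed each source, then fuse the embeddings at several temporal scales."""
    # Fixed random projections stand in for learned embedding subnetworks.
    embedded = [np.tanh(x @ w) for x, w in zip(window, embedders)]
    stacked = np.concatenate(embedded, axis=1)  # (time, sources * EMBED_DIM)
    parts = []
    for s in SCALES:
        # Average-pool over non-overlapping blocks of s time steps.
        pooled = stacked.reshape(-1, s, stacked.shape[1]).mean(axis=1)
        parts.append(pooled.ravel())
    return np.concatenate(parts)


class LinearAutoencoder:
    """PCA encoder-decoder, a linear stand-in for the ConvLSTM model."""

    def fit(self, features, n_components):
        self.mean_ = features.mean(axis=0)
        _, _, vt = np.linalg.svd(features - self.mean_, full_matrices=False)
        self.components_ = vt[:n_components]
        return self

    def reconstruction_error(self, features):
        centered = features - self.mean_
        decoded = centered @ self.components_.T @ self.components_
        return np.mean((centered - decoded) ** 2, axis=1)


rng = np.random.default_rng(0)
embedders = [rng.standard_normal((d, EMBED_DIM)) / np.sqrt(d) for d in SOURCE_DIMS]


def batch(n, distracted):
    """Build n fused feature vectors from freshly generated windows."""
    return np.stack([featurize(make_window(rng, distracted), embedders)
                     for _ in range(n)])


# Fit on normal driving only, so no distraction labels are needed.
train = batch(200, distracted=False)
model = LinearAutoencoder().fit(train, n_components=5)
threshold = np.percentile(model.reconstruction_error(train), 99)

normal_errors = model.reconstruction_error(batch(100, distracted=False))
distracted_errors = model.reconstruction_error(batch(100, distracted=True))
print("flagged normal:    ", np.mean(normal_errors > threshold))
print("flagged distracted:", np.mean(distracted_errors > threshold))
```

Because the threshold comes from the error distribution of normal windows alone, a distraction type never seen before can still be flagged, which is the motivation the abstract gives for avoiding a supervised classifier. The 99th-percentile cut-off is an arbitrary choice for this sketch.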

Key words: Driver distraction, Unsupervised learning, Multi-source, Multi-scale fusion, Encoder-decoder

CLC Number: TP183