Computer Science ›› 2025, Vol. 52 ›› Issue (11): 364-372. doi: 10.11896/jsjkx.250300047

• Information Security •

Research Status and Challenges of Eavesdropping Attacks and Defenses Targeting Voice Assistants

HUANG Wenbin1,2, REN Ju3, CAO Hangcheng4, JIANG Hongbo5, XIONG Lizhi1, CHEN Xianyi1, FU Zhangjie1   

  1 School of Computer Science, School of Cyber Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
  2 The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou 310027, China
  3 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  4 Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
  5 College of Computer Science and Electronics Engineering, Hunan University, Changsha 410082, China
  • Received: 2025-03-10 Revised: 2025-04-28 Online: 2025-11-15 Published: 2025-11-06
  • About author: HUANG Wenbin, born in 1995, associate professor. His main research interests include smart sensing security, mobile system security, artificial intelligence and its application security.
    XIONG Lizhi, born in 1988, professor. His main research interests include artificial intelligence and security, multimedia information security, adversarial attack and defense.
  • Supported by:
    Natural Science Foundation of Jiangsu Province, China (BK20240694), National Natural Science Foundation of China (62502218), Open Research Fund of the State Key Laboratory of Blockchain and Data Security, Zhejiang University (A2530), Startup Foundation for Introducing Talent of NUIST (2024r045) and Jiangsu Provincial Science and Technology Major Project (BG2024042).

Abstract: Voice assistants serve as convenient interfaces for human-computer voice interaction and are widely used in settings such as homes, sports, and vehicles. They play a pivotal role in the intelligent advancement of industries such as healthcare, finance, and education. However, their widespread adoption and convenience have also raised significant concerns about eavesdropping on user conversations, which leads to the disclosure of user privacy. Existing literature focuses primarily on voice spoofing attacks and defenses and on adversarial sample attacks and defenses, leaving a notable gap in the analysis and synthesis of voice eavesdropping attacks and defenses. To address this gap, this work examines the mechanisms of eavesdropping attacks on voice assistants and reviews existing research in this domain. Firstly, it reviews and analyzes the various eavesdropping attack methods and categorizes them by implementation strategy. It explores the means of attack, targets, necessary technology and permissions, and concealment techniques employed, aiming to provide a comprehensive understanding of the potential threats faced by voice assistants. Secondly, recent research on defending against voice assistant eavesdropping attacks is systematically reviewed. By classifying and summarizing the different defense technologies, together with their application scenarios and detection effectiveness, the paper highlights the shortcomings and challenges of existing defense mechanisms, offering insights for enhancing the security of voice assistants. Lastly, this study analyzes the primary research challenges in eavesdropping attacks and defenses and discusses potential future research directions, aiming to guide ongoing research toward strengthening the resilience of voice assistant systems against eavesdropping threats.

Key words: Voice assistant security, Voice eavesdropping attacks, Voice eavesdropping defense

CLC Number: 

  • TP393