Computer Science ›› 2025, Vol. 52 ›› Issue (11): 364-372. doi: 10.11896/jsjkx.250300047

• Information Security •

Research Status and Challenges of Eavesdropping Attacks and Defenses Targeting Voice Assistants

HUANG Wenbin1,2, REN Ju3, CAO Hangcheng4, JIANG Hongbo5, XIONG Lizhi1, CHEN Xianyi1, FU Zhangjie1

    1 School of Computer Science, School of Cyber Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
    2 The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou 310027, China
    3 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
    4 Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
    5 College of Computer Science and Electronics Engineering, Hunan University, Changsha 410082, China
  • Received: 2025-03-10 Revised: 2025-04-28 Online: 2025-11-15 Published: 2025-11-06
  • Corresponding author: XIONG Lizhi (lzxiong16@163.com)
  • About author: HUANG Wenbin (wenbinhuang@nuist.edu.cn), born in 1995, associate professor. His main research interests include smart sensing security, mobile system security, and artificial intelligence and its application security.
    XIONG Lizhi, born in 1988, professor. His main research interests include artificial intelligence and security, multimedia information security, and adversarial attack and defense.
  • Supported by:
    Natural Science Foundation of Jiangsu Province, China (BK20240694), National Natural Science Foundation of China (62502218), Open Research Fund of the State Key Laboratory of Blockchain and Data Security, Zhejiang University (A2530), Startup Foundation for Introducing Talent of NUIST (2024r045), and Jiangsu Provincial Science and Technology Major Project (BG2024042).

Abstract: Voice assistants serve as convenient interfaces for human-computer voice interaction and are widely used in settings such as homes, sports, and vehicles, supporting the intelligent upgrading of industries such as healthcare, finance, and education. However, their convenience and widespread adoption have also raised serious concerns about eavesdropping on user conversations, which leads to the disclosure of user privacy. Existing surveys on voice assistants focus primarily on voice spoofing attacks and defenses and on adversarial example attacks and defenses, while the summary and analysis of voice eavesdropping attacks and defenses remain incomplete. To address this gap, this work studies eavesdropping attacks and defenses targeting voice assistants and reviews existing research in detail. Firstly, it reviews and analyzes existing eavesdropping attack methods, categorizes them by how the attack is implemented, and discusses their means of attack, targets, required technology and permissions, and concealment, aiming to provide a comprehensive understanding of the potential threats faced by voice assistants. Secondly, it systematically reviews recent research on defending against voice assistant eavesdropping attacks. By classifying and summarizing the different defense technologies and analyzing their application scenarios and detection effectiveness, the paper identifies the shortcomings and challenges of existing defense methods, offering a useful reference for improving the security of voice assistants. Lastly, it analyzes the main research challenges in eavesdropping attacks and defenses and discusses potential future research directions. By identifying these challenges and proposing future avenues of exploration, the paper aims to guide ongoing research toward strengthening the resilience of voice assistant systems against eavesdropping threats.

Key words: Voice assistant security, Voice eavesdropping attacks, Voice eavesdropping defense

CLC Number:

  • TP393