Computer Science, 2024, Vol. 51, Issue (12): 223-233. doi: 10.11896/jsjkx.240400077
MA Qimin (马琦珉)1, LI Xiangmin (李向民)1,2, ZHOU Yaqian (周雅倩)1
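Abstract: Mobile application accessibility refers to the degree to which a mobile application is designed and implemented so that any user can easily access and use it. Among the vast number of applications on the Chinese mobile application market, very few support accessibility features. This conflicts with the aspirations of the large and steadily growing populations of elderly and visually impaired users, who want to share in the benefits of the digital era and bridge the digital divide. Large language models (LLMs) have shown great potential for achieving human-level intelligence and, when guided by prompt engineering, can perform simple logical reasoning and decision making. In addition, shortening the interaction path is one of the most intuitive ways to enhance mobile application accessibility. Motivated by these observations, this paper proposes an LLM-based method for enhancing mobile application accessibility that combines accessibility services with large language models in a novel way, balancing security, automation, and intelligence. An accessibility assistance tool, AccessLink, is implemented; it perceives and operates the graphical user interface of mobile applications in a non-intrusive manner and with user authorization. On this basis, an automated dataset construction method is realized, and experiments on the constructed dataset with the large models GPT-3.5, GPT-4.0, Tongyi Qianwen, and Baichuan demonstrate the effectiveness of the proposed method.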
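The general idea of letting a language model choose the next GUI action can be illustrated with a minimal sketch. This is not AccessLink's implementation, which the abstract does not detail. The names `UiElement`, `build_prompt`, `parse_choice`, and `next_action` are hypothetical, the prompt wording is an assumption, and the model call is passed in as a plain function so that no particular LLM API is assumed.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class UiElement:
    """Hypothetical flattened view of one on-screen widget."""
    index: int
    text: str
    clickable: bool


def build_prompt(goal: str, elements: List[UiElement]) -> str:
    """Serialize the user's goal and the clickable widgets into one prompt."""
    lines = [f"[{e.index}] {e.text}" for e in elements if e.clickable]
    return (
        f"Goal: {goal}\n"
        "Clickable elements on the current screen:\n"
        + "\n".join(lines)
        + "\nReply with the index of the element to tap."
    )


def parse_choice(reply: str, elements: List[UiElement]) -> Optional[int]:
    """Return the first integer in the reply if it names a clickable element."""
    match = re.search(r"-?\d+", reply)
    if match is None:
        return None
    choice = int(match.group())
    valid = {e.index for e in elements if e.clickable}
    # Reject indices the model invented or that are not clickable.
    return choice if choice in valid else None


def next_action(
    goal: str,
    elements: List[UiElement],
    ask_model: Callable[[str], str],
) -> Optional[int]:
    """Ask the (injected) model which element to tap for the given goal."""
    return parse_choice(ask_model(build_prompt(goal, elements)), elements)


# Example screen and a stub standing in for a real LLM call (assumption).
screen = [
    UiElement(0, "Home", True),
    UiElement(1, "Banner", False),
    UiElement(2, "Scan to pay", True),
]
chosen = next_action("Open the payment scanner", screen, lambda prompt: "2")
```

Validating the reply against the set of clickable indices keeps a malformed or hallucinated answer from being turned into a tap, which matters when the action is executed on a real device on the user's behalf.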