Automated Testing Method for Canvas Elements Integrating Enhanced DOM and Multimodal Large Models

doi:10.11896/jsjkx.250900004

Abstract

As a core component of modern Web applications, HTML5 Canvas is widely used for dynamic interface rendering, data visualization, and similar tasks. However, because Canvas elements lack a DOM structure, existing Web testing tools struggle to test them effectively. To address this problem, this paper proposes an automated testing method for Canvas elements based on large models, combining the strengths of computer vision and large models. The YOLO object detection algorithm is used to extract the category and geometric attributes of the elements inside the Canvas interface; the color, associated text, and hierarchical relationships of these elements are then extracted and inferred to construct an enhanced DOM structure. Prompt strategies are designed to guide large models to make full use of the Canvas image and the enhanced DOM information to generate high-coverage test cases. Experiments show that the proposed method significantly outperforms existing methods (such as VisionTasker), improving element coverage and interaction coverage by 10.53% and 16.85%, respectively. Moreover, generating test cases from the enhanced DOM structure alone reaches 99.18% of the element coverage and 98.22% of the interaction coverage of the full method while consuming fewer resources. Finally, this paper compares the performance of different large language models on this task, verifying the generality and effectiveness of the proposed method.
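The abstract does not say how the hierarchical relationships are inferred, so the following is only a minimal sketch of one plausible approach: treating bounding-box containment as the parent-child relation. The `Node` class, its field names, and the functions `build_enhanced_dom` and `to_dict` are hypothetical and are not taken from the paper; the detections are hand-written stand-ins for YOLO output.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in canvas pixels


@dataclass
class Node:
    # Hypothetical enhanced-DOM node; field names are illustrative only.
    category: str
    box: Box
    color: Optional[str] = None
    text: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def area(box: Box) -> int:
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)


def contains(outer: Box, inner: Box) -> bool:
    # True when `inner` lies entirely within `outer` (edges may touch).
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def build_enhanced_dom(nodes: List[Node], canvas_size: Tuple[int, int]) -> Node:
    # Assumption: an element's parent is the smallest element that encloses it.
    root = Node("canvas", (0, 0, canvas_size[0], canvas_size[1]))
    placed = [root]
    # Larger boxes first, so every possible parent is placed before its children.
    for node in sorted(nodes, key=lambda n: area(n.box), reverse=True):
        enclosing = [p for p in placed if contains(p.box, node.box)]
        parent = min(enclosing, key=lambda p: area(p.box)) if enclosing else root
        parent.children.append(node)
        placed.append(node)
    return root


def to_dict(node: Node) -> dict:
    # Serialize the tree so it can be embedded in a prompt as JSON.
    out = {"category": node.category, "box": list(node.box)}
    if node.color:
        out["color"] = node.color
    if node.text:
        out["text"] = node.text
    if node.children:
        out["children"] = [to_dict(child) for child in node.children]
    return out


# Hand-written detections standing in for detector output.
detections = [
    Node("button", (20, 20, 120, 60), color="#1e90ff", text="Start"),
    Node("panel", (10, 10, 300, 200), color="#ffffff"),
    Node("icon", (30, 30, 50, 50)),
]
dom = build_enhanced_dom(detections, (640, 480))
print(json.dumps(to_dict(dom), indent=2))
```

In this example the panel becomes a child of the canvas root, the button a child of the panel, and the icon a child of the button. A serialized tree of this kind is the sort of structured text that could accompany the Canvas image in a prompt; overlapping but non-nested elements would need a different rule than strict containment.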

Key words: Canvas testing, LLM, Enhanced DOM, Coverage

CLC Number: TP311

ZHANG Weifeng, WANG Xiangwei, XU Lei. Automated Testing Method for Canvas Elements Based on Large Language Models [J]. Computer Science, 2026, 53(6): 416-426.

