Computer Science, 2022, Vol. 49, Issue (12): 353-361. doi: 10.11896/jsjkx.211000059

• Information Security •

Reverse Location of Software Online Upgrade Function Based on Semantic Guidance

LYU Xiao-shao, SHU Hui, KANG Fei, HUANG Yu-yao   

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450001, China
  • Received: 2021-10-10 Revised: 2022-04-23 Published: 2022-12-14
  • About author: LYU Xiao-shao, born in 1989, postgraduate. His main research interests include cyber security and reverse engineering. SHU Hui, born in 1974, Ph.D, professor, Ph.D supervisor. His main research interests include cyber security and reverse engineering.
  • Supported by:
    National Key R&D Program of China (2019QY1305).

Abstract: Hijacking the online upgrade process of software is one of the most common methods of network attack. Program analysis is an important means of evaluating the security of software upgrades quickly and automatically. Rapidly locating the upgrade function in software through reverse analysis is a key prerequisite for static analysis and for improving the efficiency of dynamic analysis. Traditional reverse localization relies on manual experience: the analyst follows the cross-reference chains of semantic information such as strings and API functions, which is inefficient and cannot be automated. To solve this problem, this paper proposes a method for locating software upgrade functions based on semantic analysis and reverse analysis. First, an upgrade semantic classification model based on natural language processing is built for the common semantic information (strings, function names, API functions, etc.) found in binary programs. Second, a reverse analysis tool extracts the semantic information of the software, and the classification model identifies the upgrade-related semantic information. Finally, an algorithm is defined to find the key nodes of the upgrade function in the function call relationship graph. This paper designs and implements a software online upgrade positioning system and applies it to 153 commonly used software programs, successfully locating the upgrade function in 126 of them. Based on the positioning analysis, the upgrade security of some of the software is preliminarily evaluated, and one CNNVD vulnerability and five CNVD vulnerabilities are found.
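The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the keyword matcher stands in for the trained NLP classification model, the call graph and semantic references are hand-written instead of extracted by a reverse analysis tool, and the ranking rule (cover the most upgrade-related functions with the smallest reachable subgraph) is only one plausible way to pick a key node. All names below are hypothetical.

```python
from typing import Dict, List, Set

# Assumption: a keyword list stands in for the trained classification model.
UPGRADE_KEYWORDS = ("update", "upgrade")


def is_upgrade_related(text: str) -> bool:
    """Step 1 stand-in: flag a string or API name as upgrade-related."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in UPGRADE_KEYWORDS)


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Return start plus every function reachable from it via calls."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


def rank_upgrade_candidates(
    graph: Dict[str, List[str]], semantic_refs: Dict[str, List[str]]
) -> List[str]:
    """Steps 2 and 3: flag functions, then rank candidate key nodes."""
    # Functions that reference at least one upgrade-related item.
    flagged = {
        fn
        for fn, refs in semantic_refs.items()
        if any(is_upgrade_related(ref) for ref in refs)
    }
    nodes = set(graph) | {callee for callees in graph.values() for callee in callees}
    scored = []
    for fn in nodes:
        reach = reachable(graph, fn)
        coverage = len(reach & flagged)
        if coverage:
            # Prefer high coverage, then the smallest (most specific) subgraph.
            scored.append((-coverage, len(reach), fn))
    return [fn for _, _, fn in sorted(scored)]


# Hypothetical data, as a reverse analysis tool might export it.
CALL_GRAPH = {
    "main": ["init_ui", "check_update"],
    "init_ui": ["load_config"],
    "check_update": ["download_package", "verify_signature"],
    "download_package": [],
    "verify_signature": [],
    "load_config": [],
}
SEMANTIC_REFS = {
    "download_package": ["http://example.com/update.xml", "URLDownloadToFileW"],
    "verify_signature": ["upgrade package signature invalid"],
    "load_config": ["config.ini"],
    "init_ui": ["MainWindow"],
}

if __name__ == "__main__":
    print(rank_upgrade_candidates(CALL_GRAPH, SEMANTIC_REFS))
```

In this toy graph both `main` and `check_update` reach the two flagged functions, but `check_update` does so through a smaller subgraph, so it ranks first as the likely entry point of the upgrade logic.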

Key words: Software online update, Semantic information, Text classification model, Binary program reverse analysis, Function positioning

CLC Number: 

  • TP393