Computer Science (计算机科学), 2020, Vol. 47, Issue (12): 50-55. doi: 10.11896/jsjkx.200700145
Special topic: Software Engineering and Requirements Engineering for Complex Systems
田宇立, 李宁
TIAN Yu-li, LI Ning
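Abstract: Analyzing system usage patterns and failures from the perspective of how a software system is actually used helps software providers capture user requirements more accurately, evaluate system quality, guide system operation, and improve system maintenance plans. Cloud computing systems, which integrate massive computing resources and offer users configurable computing solutions through network access, have attracted broad attention from both academia and industry. A deep understanding of the usage workload and software failure characteristics of cloud computing systems plays an important role in improving their resource utilization efficiency and service reliability. This paper studies system usage patterns and system failures in the cloud computing environment. It analyzes in depth the real execution logs of the Google cluster cloud computing system, describes and summarizes the system in terms of its usage patterns and failure characteristics, reveals quality problems in the system, and lays a foundation for improving the quality of cloud computing systems.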