Computer Science (计算机科学), 2016, Vol. 43, Issue (Z6): 480-484. doi: 10.11896/j.issn.1002-137X.2016.6A.113
GUO Jian-hua (郭建华), YANG Hong-bin (杨洪斌), CHEN Sheng-bo (陈圣波)
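Abstract: Distributed computing on video data differs substantially from distributed computing on text data. Video data is unstructured, and two videos of the same size can take different amounts of time to process when their content differs. For simple structured data, the default HDFS load balancer is enough to balance the load. Video files, however, are subject to hotspot access and vary in complexity, so the default HDFS data distribution mechanism cannot balance the computing load well. This paper therefore proposes an HDFS-based redistribution algorithm for massive video data. First, it records each video file's access count and the time that past video-analysis jobs spent accessing the file. Next, it quantifies these measurements and combines them with weights to obtain the file's load degree. Finally, it swaps high-load videos with low-load videos between nodes until the load on every node is balanced. Experimental results show that the proposed data redistribution algorithm reduces the processing time for massive video data.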
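The three steps in the abstract (record, weight into a load degree, swap until balanced) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption for illustration: the `Video` record, normalizing each measurement by its maximum, the equal weights `w_access` and `w_time`, the `tolerance` stopping threshold, and the greedy choice of which pair to swap. The sketch only produces a swap plan in memory; it does not call any HDFS API.

```python
from dataclasses import dataclass


@dataclass
class Video:
    """Hypothetical per-file record; field names are illustrative."""
    name: str
    node: str                 # node currently holding the file
    access_count: int         # recorded number of accesses
    analysis_seconds: float   # recorded time spent by past analysis jobs


def load_degrees(videos, w_access=0.5, w_time=0.5):
    """Normalize both measurements to [0, 1] and combine them with weights.

    Max-normalization and the 0.5/0.5 weights are assumptions, not the
    paper's stated quantification scheme.
    """
    max_a = max(v.access_count for v in videos) or 1
    max_t = max(v.analysis_seconds for v in videos) or 1.0
    return {
        v.name: w_access * v.access_count / max_a
        + w_time * v.analysis_seconds / max_t
        for v in videos
    }


def node_loads(videos, load):
    """Sum the load degrees of the files held by each node."""
    totals = {}
    for v in videos:
        totals[v.node] = totals.get(v.node, 0.0) + load[v.name]
    return totals


def rebalance(videos, tolerance=0.1, max_swaps=100):
    """Greedily swap a heavy file on the hottest node with a lighter file
    on the coldest node until the gap between them is within `tolerance`
    or no swap narrows it. Returns the list of (heavy, light) swaps made.
    """
    load = load_degrees(videos)
    swaps = []
    for _ in range(max_swaps):
        totals = node_loads(videos, load)
        hot = max(totals, key=totals.get)
        cold = min(totals, key=totals.get)
        gap = totals[hot] - totals[cold]
        if gap <= tolerance:
            break
        best = None
        for a in (v for v in videos if v.node == hot):
            for b in (v for v in videos if v.node == cold):
                d = load[a.name] - load[b.name]
                new_gap = abs(gap - 2 * d)  # gap between the two nodes after swapping
                if d > 0 and new_gap < gap and (best is None or new_gap < best[0]):
                    best = (new_gap, a, b)
        if best is None:
            break  # no swap improves the balance
        _, a, b = best
        a.node, b.node = b.node, a.node
        swaps.append((a.name, b.name))
    return swaps
```

Swapping files rather than moving them keeps the number of files per node unchanged, which is one plausible reason to prefer replacement over one-way migration; the paper itself may motivate the choice differently.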