Computer Science (计算机科学), 2015, Vol. 42, Issue (6): 243-246. doi: 10.11896/j.issn.1002-137X.2015.06.051
GAO Yang, YAN Jian-feng and LIU Xiao-sheng (高阳, 严建峰, 刘晓升)
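Abstract: Parallel latent Dirichlet allocation (LDA) topic models spend a large amount of time on both computation and communication. As a result, model training takes too long and the approach cannot be widely applied. This paper proposes a naive parallel LDA algorithm with separate improvements for computation and for communication. On the computation side, it reduces the granularity of text training by introducing a word influence factor and setting a threshold. On the communication side, it shortens communication time by lowering the communication frequency. Experimental results show that, with the accuracy loss held to 1%, the optimized parallel LDA increases training speed by 36% and effectively improves the parallel speedup.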
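The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not define the word influence factor or the synchronization schedule, so everything below is an assumption made for illustration: the function names, the `influence` scores, the `threshold` and `sync_every` parameters, and the delta-based merge (borrowed from the commonly used approximate distributed LDA scheme) are hypothetical and are not the authors' implementation.

```python
from typing import List

Counts = List[List[int]]  # topic-by-word count matrix


def merge_counts(global_counts: Counts, local_counts: List[Counts]) -> Counts:
    """Merge worker copies by adding each worker's delta to the shared counts.

    Assumption: every worker started from `global_counts` and updated its own copy.
    """
    merged = [row[:] for row in global_counts]
    for local in local_counts:
        for k, row in enumerate(local):
            for w, value in enumerate(row):
                merged[k][w] += value - global_counts[k][w]
    return merged


def is_sync_step(iteration: int, sync_every: int) -> bool:
    """Communicate only on every `sync_every`-th iteration (1-based)."""
    return iteration % sync_every == 0


def count_syncs(n_iterations: int, sync_every: int) -> int:
    """Number of communication rounds over a whole training run."""
    return sum(is_sync_step(i, sync_every) for i in range(1, n_iterations + 1))


def active_words(influence: List[float], threshold: float) -> List[int]:
    """Hypothetical filter: keep word ids whose influence score reaches the threshold."""
    return [w for w, score in enumerate(influence) if score >= threshold]
```

Under these assumptions, raising `sync_every` from 1 to 5 cuts the number of communication rounds in a 100-iteration run from 100 to 20, while `active_words` limits each sweep to the words judged influential enough to be worth resampling.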