计算机科学 ›› 2015, Vol. 42 ›› Issue (4): 244-248.doi: 10.11896/j.issn.1002-137X.2015.04.050
王海平,戴 玮,郭 丹
WANG Hai-ping, DAI Wei and GUO Dan
摘要: 近年来,随着生物信息学、信息检索等领域的发展,串模式匹配问题被不断扩展。其中,具有代表性的是在模式中引入可变长度的通配符而形成带有通配符的模式匹配(PMWL)。该问题定义的灵活性给用户提供了方便,却也造成了求解上的困难。因此,如何在多项式时间内得到更好的匹配解成为研究的焦点。提出了一种启发式的小兵算法。小兵算法通过将PMWL问题转化为路径搜索问题,并借鉴动态剪枝思想,在算法搜索的过程中动态地将不可能的匹配位置剪枝,从而提高解的质量。实验在真实DNA序列上进行,并人工生成了196个模式。结果表明,相比于目前最有效的SAIL算法,小兵算法在绝大多数的尾部有重复字符的模式中可以获得更好的匹配解。
[1] Navarro G,Raffinot M.Flexible pattern matching in strings:Practical on-line search algorithms for texts and biological sequences[M].Cambridge,UK:Cambridge University Press,2002 [2] Han J,Cheng H,Xin D,et al.Frequent pattern mining:CurrentStatus and Future Directions[J].Data Mining and Knowledge Discovery,2007,15(1):55-86 [3] Fischer M J,Paterson M S.String matching and other products[R].Complexity of computation,1974:113-125 [4] Cole J R,Chai B,Marsh T L,et al.The ribosomal database project(RDP-11):Sequences and tools for high-throughput rRNA analysis[J].Nucleic Acids Research,2005,33 (1):294-296 [5] 徐小双,冯玉才,王锋,等.路径分区编码优化小枝查询[J].计算机科学,2010,37(3):182-204 [6] Xie F,Wu X D,Hu X G,et al.Sequential Pattern Mining with Wildcards[C]∥22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI).2010:241-247 [7] Guo D,Hu X G,Xie F,et al.Pattern matching with wildcards and gap-length constraints based on a centrality-degree graph[J].Applied Intelligence,2013,39:57-74 [8] Bille P,Gortz I,Vildhoj H,et al.String matching with variable length gaps[J].Theoretical Computer Science,2012,443:25-34 [9] Chen G,Wu X D,Zhu X Q,et al.Efficient String Matching with Wildcards and Length Constraints[J].Knowledge and Information Systems,2006,10(4):399-419 [10] Min F,Wu X D,Lu Z.Pattern matching with independent wildcard gaps[C]∥Eighth IEEE International Conference on Dependable,Autonomic and Secure Computing (DASC-2009).2009:194-199 [11] Zhang M,Kao B,Cheung D,et al.Mining periodic patterns with gap requirement from sequences[C]∥Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data.2005:622-633 [12] Zhu X Q,Wu X D.Mining complex patterns across sequences with gap requirements[C]∥Proceedings of 2007 International Joint Conference on Artificial Intelligence.2007:2934-2940 [13] Guo D,Hong X L,Hu X G,et al.A Bit-Parallel Algorithm for Sequential Pattern Matching with Wildcards[J].Cybernetics and Systems,2011,42(6):382-401 [14] 王海平,胡学钢,谢飞,等.模式特征对带有通配符和长度约束的模式匹配问题的影响[J].模式识别与人工智能,2012,25(6):1013-1021 [15] 运正佳,李铁男,杨晓春.支持带有通配符的字符串匹配算法[J].计算机科学与探索,2010,4(11):984-995 [16] Muthukrishnan S,Palem K.Non-standard stringology:algo-rithms and complexity[C]∥Proceedings of the Twenty-sixth Annual ACM Symposium on Theory of Computing.1994:770-779 [17] Manber U,Baeza-Yates R.An algorithm for string matchingwith a sequence of don’t cares[J].Information Processing Letters,1991,37(3):133-136 [18] Zhu X Q,Wu X D.Discovering relational patterns across multiple databases[C]∥Proceedings of IEEE 23rd International Conference on Data Engineering.2007:726-735 [19] Zhang M,Kao B,Cheung D W,et al.Mining periodic patterns with gap requirement from sequences[C]∥Proceedings of ACM SIGMOD.2005:623-633 |
No related articles found! |
|