Computer Science ›› 2014, Vol. 41 ›› Issue (9): 269-273.doi: 10.11896/j.issn.1002-137X.2014.09.051

Previous Articles     Next Articles

Characteristic Analysis of Pattern Matching with Wildcards Problem and its Solution Space

XIANG Tai-ning,GUO Dan,WANG Hai-ping and HU Xue-gang   

  • Online:2018-11-14 Published:2018-11-14

Abstract: With the developments in bioinformatics and information retrieval,pattern matching with wildcards and length constraints has attracted wild attention.This problem extends the exact pattern matching,so it brings more flexibility as well as complexity.Meanwhile,the time complexity of non-linear algorithms is greatly increased.The solution space makes great difference to the efficiency of matching algorithms,but the characteristics of the problem solution space is still lack of systematic research.Therefore,the problem solution space was presented,and the divisibility of the space was analyzed.Afterwards,a solution space partitioning algorithm,named SPLIT,was proposed,and its time complexity was also illustrated.In our experiments,three pattern matching algorithms were used as baselines running on realDNA data sets with 5109 sets of patterns.Our empirical results verify that SPLIT can effectively reduce the time consumption of non-linear matching algorithms without influencing the matching solution.

Key words: Solution space,Partitioning,Pattern matching,Wildcard

[1] Chen Gong,Wu Xin-dong,Zhu Xing-quan,et al.Efficient string matching with wildcards and length constraints[J].Knowledge and Information Systems,2006,0(4):399-419
[2] Cole J R,Chai B,Marsh T L,et al.The ribosomal database project (RDP-11):Sequences and tools for high-throughput rRNA analysis[J].Nucleic Acids Research,2005,3(1):294-296
[3] Xie Fei,Wu Xin-dong,Hu Xue-gang,et al.Sequential PatternMining with Wildcards[C]∥22nd IEEE International Confe-rence on Tools with Artificial Intelligence (ICTAI).Arras,France,2010:241-247
[4] He Yu,Wu Xin-dong,Zhu Xing-quan,et al.Mining Frequent Patterns with Wildcards from Biological Sequences[C]∥IEEE International Conference on Information Reuse and Integration.Las Vegas,IL,2007:329-334
[5] Califf M E,Mooney R J.Bottom-up relational learning of pattern matching rules for information extraction[J].Journal of Machine Learning Research,2003,4(6):177-210
[6] Manber U,Baeza-Yates R.An algorithm for string matchingwith a sequence of don’t cares[J].Information Processing Letters,1991,7(3):133-136
[7] Fischer M J,Paterson M S.String matching and other products[R].Massachusetts Institute of Technology Cambridge Project MAC, 1974:113-125
[8] Zhang Ming-hua,Kao B,Cheung DW,et al.Mining periodic patterns with gap requirement from sequences[C]∥Proc.of ACM SIGMOD.Baltimore Maryland,2005:623-633
[9] Bille P,Grtz I L,Vildhj H,et al.String matching with variable length gaps[C]∥Proc.of 17th SPIRE.2010:385-394
[10] Muthukrishnan S,Krishna P.Non-stadard stringology:algo-rithms and complexity [C]∥Proc.of the twenty-sixth annual ACM symposium on Theory of computing.New York,NY,USA,1994
[11] Guo Dan,Hong Xiao-li,Hu Xue-gang,et al.A Bit-Parallel Algorithm for Sequential Pattern Matching with Wildcards[J].Cybernetics and Systems,2011,2(6):382-401
[12] 武优西,吴信东,江贺,等.一种求解MPMGOOC问题的启发式算法[J].计算机学报,2011,4(8):1452-1462
[13] 侯宝剑,谢飞,胡学钢,等.基于后缀树的带有通配符的模式匹配研究[J].计算机科学,2012,9(12):177-180
[14] 张雁,焦方正,卢昕玮,等.采用染色划分改进的RLS 算法及性能分析[J].软件学报,2011,2(10):2305-2316
[15] Zhang Li-hua,Xu Wen-li,Chang Cheng.Genetic algorithm for affine point pattern matching[J].Pattern Recognition Letters,2003,4:9-19
[16] Sagot M F,Viari A.A Double Combinatorial Approach to Dis-covering Patterns in Biological Sequence[C]∥Proc.of the 7th Symposium on Combinatorial Pattern Matching.Springer,1996:186-208
[17] Min Fan,Wu Xin-dong,Lu Zhen-yu.Pattern matching with independent wildcard gaps[C]∥Eighth IEEE International Conference on Dependable,Autonomic and Secure Computing(DASC-2009).Chengdu,China,2009:194-199
[18] National center for biotechnology information website.http://www.ncbi.nlm.nih.gov/

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!