Computer Science, 2017, Vol. 44, Issue (Z11): 55-60. doi: 10.11896/j.issn.1002-137X.2017.11A.010


Research on Fuzzy Matching Duplicate Checking Algorithm Based on Matrix Model of Word Segmentation

LI Cheng-long, YANG Dong-ju and HAN Yan-bo   

Online: 2018-12-01  Published: 2018-12-01

Abstract: To address the need for duplicate checking of Chinese text, we use the results of word segmentation to convert the target text and the sample text into a matrix model of word segmentation, then scan and analyze the matrix to obtain the duplicate-checking result. On this basis we developed a duplicate checking algorithm and demonstrated the usefulness of the method with practical examples.

Key words: Similarity, Matrix model of word segmentation, Fuzzy matching, Duplicate checking algorithm
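
The abstract does not spell out how the matrix is built or scanned, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes both texts are already segmented into word lists, builds a 0/1 matrix whose cell [i][j] marks whether target word i equals sample word j, and scans the diagonals for runs of consecutive matches. All function names and the `min_len` threshold are hypothetical, and the sketch matches words exactly; it does not model the fuzzy tolerance the paper refers to.

```python
from typing import List, Tuple


def build_match_matrix(target: List[str], sample: List[str]) -> List[List[int]]:
    """Assumed matrix model: cell [i][j] is 1 when target word i equals sample word j."""
    return [[1 if t == s else 0 for s in sample] for t in target]


def diagonal_runs(matrix: List[List[int]], min_len: int = 2) -> List[Tuple[int, int, int]]:
    """Scan diagonals and return (target_start, sample_start, length) for each
    maximal run of consecutive 1s that is at least min_len words long."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    runs = []
    for i in range(rows):
        for j in range(cols):
            if matrix[i][j] != 1:
                continue
            # Skip cells that continue a run started further up the diagonal.
            if i > 0 and j > 0 and matrix[i - 1][j - 1] == 1:
                continue
            length = 0
            while (i + length < rows and j + length < cols
                   and matrix[i + length][j + length] == 1):
                length += 1
            if length >= min_len:
                runs.append((i, j, length))
    return runs


def duplicate_ratio(target: List[str], sample: List[str], min_len: int = 2) -> float:
    """Fraction of target words covered by at least one matching run (hypothetical metric)."""
    if not target:
        return 0.0
    covered = set()
    for start, _, length in diagonal_runs(build_match_matrix(target, sample), min_len):
        covered.update(range(start, start + length))
    return len(covered) / len(target)


if __name__ == "__main__":
    # Placeholder tokens standing in for the output of a word segmenter.
    target_words = ["a", "b", "c", "d", "e"]
    sample_words = ["x", "b", "c", "d", "y"]
    print(diagonal_runs(build_match_matrix(target_words, sample_words)))
    print(duplicate_ratio(target_words, sample_words))
```

In this reading, a long diagonal run corresponds to a passage of the target text that appears word for word in the sample text, and the coverage ratio serves as a simple similarity score. A fuzzy variant might tolerate short gaps within a run, but how the paper handles that is not stated in the abstract.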

