Computer Science, 2022, Vol. 49, Issue (5): 113-119. doi: 10.11896/jsjkx.210700131

Special Issue: Big Data & Data Science

• Database & Big Data & Data Science •

Line-Segment Clustering Algorithm for Chemical Structure

ZHU Zhe-qing1,3, GENG Hai-jun1,3, QIAN Yu-hua1,2,3   

  1 School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  2 Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
  3 Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China
  • Received:2021-07-13 Revised:2021-12-10 Online:2022-05-15 Published:2022-05-06
  • About author: ZHU Zhe-qing, born in 1982, postgraduate. His main research interests include machine learning and data mining.
    QIAN Yu-hua, born in 1976, Ph.D., professor, is a member of China Computer Federation. His main research interests include pattern recognition, feature selection, rough set theory, granular computing and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China(61672332),Key R & D Program of Shanxi Province (201903D421003),Science and Technology Achievements Transformation and Cultivation Project of Shanxi Provincial Education Department(2020CG001),Applied Basic Research Plan of Shanxi Province(20210302123444) and China University Industry-University-Research Innovation Fund(2021FNA02009).

Abstract: Chemical bond recognition is an important sub-task of chemical structure recognition. The single, double and triple bonds of a chemical structure are all composed of line segments, and line segment detection with the Hough transform readily produces redundant and interfering segments. To address this, a clustering algorithm is proposed that clusters the line segments detected in chemical bonds by the Hough transform and dynamically merges the redundant segments during clustering. Specifically, based on an analysis of the spatial relationships between line segments, a relative similarity measure and an interval similarity measure between line segments are defined, and a clustering method based on line segment merging is built on these two measures. Experimental results show that the proposed similarity measures comprehensively describe the similarity between line segments. The algorithm obtains good clustering results and accurately restores the true positions of the line segments in the chemical bonds, which makes it an effective preprocessing method for chemical structure images.
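The merge-based clustering idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the definitions of the relative and interval similarity measures, so the code below substitutes simple stand-ins: an angle test plus a perpendicular offset (a rough proxy for relative similarity) and a gap along the segment direction (a rough proxy for interval similarity). All function names, parameters and thresholds here are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations

# A segment is (x1, y1, x2, y2); segments are assumed to have non-zero length.
Segment = tuple[float, float, float, float]


def _length(s: Segment) -> float:
    return math.hypot(s[2] - s[0], s[3] - s[1])


def _direction(s: Segment) -> tuple[float, float]:
    n = _length(s)
    return (s[2] - s[0]) / n, (s[3] - s[1]) / n


def angle_between(a: Segment, b: Segment) -> float:
    """Smallest angle between the two carrier lines, in [0, pi/2]."""
    ux, uy = _direction(a)
    vx, vy = _direction(b)
    return math.acos(min(1.0, abs(ux * vx + uy * vy)))


def offset_and_gap(a: Segment, b: Segment) -> tuple[float, float]:
    """Stand-in measures, with `a` as the reference segment.

    offset: largest perpendicular distance of b's endpoints from a's line.
    gap: empty distance between a and b's projection along a's direction
         (0 when the projections overlap or touch).
    """
    ux, uy = _direction(a)
    along, perp = [], []
    for px, py in ((b[0], b[1]), (b[2], b[3])):
        rx, ry = px - a[0], py - a[1]
        along.append(rx * ux + ry * uy)
        perp.append(abs(rx * uy - ry * ux))
    gap = max(0.0, min(along) - _length(a), -max(along))
    return max(perp), gap


def merge(a: Segment, b: Segment) -> Segment:
    """Project b onto a's line and return the segment spanning both."""
    ux, uy = _direction(a)
    pts = ((a[0], a[1]), (a[2], a[3]), (b[0], b[1]), (b[2], b[3]))
    ts = [(px - a[0]) * ux + (py - a[1]) * uy for px, py in pts]
    lo, hi = min(ts), max(ts)
    return (a[0] + lo * ux, a[1] + lo * uy, a[0] + hi * ux, a[1] + hi * uy)


def merge_segments(segments, max_angle=math.radians(5),
                   max_offset=3.0, max_gap=5.0):
    """Greedily merge pairs that pass all three tests until none remain."""
    segs = list(segments)
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(segs)), 2):
            # Use the longer segment of the pair as the reference.
            a, b = sorted((segs[i], segs[j]), key=_length, reverse=True)
            if angle_between(a, b) > max_angle:
                continue
            offset, gap = offset_and_gap(a, b)
            if offset <= max_offset and gap <= max_gap:
                segs[i] = merge(a, b)
                del segs[j]
                changed = True
                break  # indices are stale after a merge; rescan
    return segs
```

In this sketch, `max_offset` would need to stay below the spacing between the parallel strokes of a double or triple bond, otherwise those strokes would be collapsed into a single segment. The input could come from any Hough-based detector that returns endpoint coordinates.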

Key words: Chemical bond, Chemical structure recognition, Clustering of line segments, Hough transform

CLC Number: TP391