Computer Science (计算机科学), 2026, Vol. 53, Issue (4): 318-325. doi: 10.11896/jsjkx.250600124
于灵鑫, 陈艺博, 曲浩君, 厉广伟, 李金屏
YU Lingxin, CHEN Yibo, QU Haojun, LI Guangwei, LI Jinping
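Abstract: Mechanical devices used in industrial sorting are usually designed for one application scenario and one product, so their generality and intelligence tend to be poor when many kinds of items are stacked in disorder. Current point cloud matching grasping techniques based on 3D structured-light cameras improve flexible production capability to some extent, but they are constrained by high hardware cost and by inherent drawbacks such as limited feature description ability, high computational complexity, and sensitivity to occlusion, which makes it hard for them to meet the needs of grasping disordered, mixed items. In recent years, deep learning grasping techniques represented by GraspNet have developed rapidly and achieve pose estimation with a binocular camera, but they still suffer from a suboptimal target selection strategy, a limited pose scoring mechanism, and large pose localization deviation. To address these challenges, an improved three-stage grasping algorithm is proposed. In the first stage, to address the suboptimal target selection strategy, YOLOv10 object detection is fused with the SAM segmentation model and combined with an optimized target selection algorithm that chooses unoccluded, nearby targets, which effectively resolves target selection in stacked, occluded scenes. In the second stage, the GraspNet pose estimation framework is improved: a pose screening mechanism based on point cloud surface normal vectors is introduced to build a more reasonable scoring mechanism and thereby obtain high-precision grasp poses. In the third stage, a pose fine-tuning strategy is designed: a layered control architecture of "hover alignment, then vertical grasp" is adopted to eliminate, as far as possible, the error accumulated during execution, which effectively addresses large pose localization deviation and inaccurate actual grasping. Experimental results show that the method significantly improves grasping efficiency, operational reliability, and cross-scene generalization in complex scenes. Because a binocular camera replaces the 3D structured-light camera, system cost is also significantly reduced, providing a cost-effective solution for industrial automation.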
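The three stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names, the `occluded` and `depth_m` fields, the fallback when every candidate is occluded, the cosine-based rescoring rule, the 30 degree threshold, and the 0.10 m hover height are all assumptions made for illustration. Detection, segmentation, and GraspNet inference are assumed to have already produced the inputs.

```python
import numpy as np

def select_target(candidates):
    """Stage 1 (assumed rule): prefer unoccluded candidates, then the nearest one.

    Each candidate is a dict with hypothetical keys 'id', 'occluded', 'depth_m'.
    """
    visible = [c for c in candidates if not c["occluded"]]
    pool = visible or candidates  # assumption: fall back if everything is occluded
    return min(pool, key=lambda c: c["depth_m"])

def rescore_grasps(scores, approaches, normals, max_angle_deg=30.0):
    """Stage 2 (assumed rule): weight each grasp score by normal alignment.

    approaches: (N, 3) gripper approach directions, pointing toward the object.
    normals:    (N, 3) outward surface normals at the grasp points.
    Grasps whose approach deviates from the anti-normal by more than
    max_angle_deg get a score of zero.
    """
    a = approaches / np.linalg.norm(approaches, axis=1, keepdims=True)
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    alignment = -np.sum(a * n, axis=1)  # 1.0 when approach opposes the normal
    keep = alignment >= np.cos(np.deg2rad(max_angle_deg))
    return np.where(keep, scores * alignment, 0.0)

def hover_then_descend(grasp_xyz, hover_height=0.10):
    """Stage 3 (assumed form): a hover waypoint directly above the grasp point,
    followed by the grasp point itself, in a base frame whose z axis points up."""
    x, y, z = grasp_xyz
    return [(x, y, z + hover_height), (x, y, z)]

if __name__ == "__main__":
    target = select_target([
        {"id": "a", "occluded": True, "depth_m": 0.30},
        {"id": "b", "occluded": False, "depth_m": 0.50},
        {"id": "c", "occluded": False, "depth_m": 0.40},
    ])
    print(target["id"])
    print(rescore_grasps(
        np.array([0.9, 0.8]),
        np.array([[0.0, 0.0, -1.0], [1.0, 0.0, 0.0]]),
        np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]),
    ))
    print(hover_then_descend((0.4, 0.1, 0.0)))
```

Splitting the motion into a hover waypoint and a straight descent means any residual lateral error can be corrected at the hover pose before the gripper approaches the object, which is one plausible reading of the "hover alignment, then vertical grasp" strategy.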