Computer Science ›› 2020, Vol. 47 ›› Issue (6A): 172-175. doi: 10.11896/jsjkx.190500154

• Computer Graphics & Multimedia •

Semi-supervised Surgical Video Workflow Recognition Based on Convolution Neural Network

QI Bao-lian1, 3, ZHONG Kun-hua1, 2, 3 and CHEN Yu-wen1, 2, 3

  1 Chengdu Computing Institute of the Chinese Academy of Sciences, Chengdu 610041, China
  2 Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China
  3 University of Chinese Academy of Sciences, Beijing 100049, China
  • Published: 2020-07-07
  • Corresponding author: CHEN Yu-wen (chenyuwen@cigit.ac.cn)
  • About author: QI Bao-lian (qibaolian17@mails.ucas.ac.cn), postgraduate. Her main research interests include video analysis, surgical workflow recognition, and artificial intelligence for healthcare.
    CHEN Yu-wen, doctoral student. His main research interests include automated reasoning and programming, computer vision, and artificial intelligence for healthcare.
  • Supported by:
    This work was supported by the National Key Research & Development Plan of China (2018YFC0116704) and the Chongqing Technology Innovation and Application Development Project (cstc2019jscx-msxmX0237).

Abstract: Real-time, robust automatic recognition of open-surgery video workflows will be a core component of future AI-assisted operating rooms. Combined with other artificial intelligence (AI) technologies, it can help medical staff complete many routine intraoperative activities automatically and intelligently. However, workflow recognition based on AI and computer vision must learn from large amounts of data, and training such methods requires a large volume of labeled surgical video. In the medical field, labeling surgical video requires expert knowledge, so collecting enough labeled data is difficult and time-consuming. This paper therefore takes laparoscopic cholecystectomy videos as its research object. A convolutional autoencoder, trained with a semi-supervised learning method, extracts spatial features from the video, and temporal features are extracted from pairs of frames drawn from the context of the same video. Structuring the unstructured surgical video data in this way builds a bridge between low-level video features and high-level surgical workflow semantics, so that the surgical workflow can be recognized intelligently at low cost and the progress of the operation can be determined efficiently. Experiments on a public dataset show that the proposed model reaches a Jaccard coefficient of 71.3% and an accuracy of 86.6%.
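The two reported metrics can be illustrated with a minimal sketch. It assumes one integer phase label per video frame and a Jaccard coefficient averaged over the phases that occur in the ground truth; the abstract does not state the exact averaging scheme, so that choice, the function names `frame_accuracy` and `mean_jaccard`, and the toy labels are all hypothetical.

```python
from typing import Sequence


def _check(y_true: Sequence[int], y_pred: Sequence[int]) -> None:
    # Both label sequences must describe the same, non-empty set of frames.
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and equally long")


def frame_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of frames whose predicted phase equals the annotated phase."""
    _check(y_true, y_pred)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_jaccard(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Per-phase Jaccard index |A ∩ B| / |A ∪ B|, averaged over phases.

    Assumption: only phases present in the ground truth are averaged.
    """
    _check(y_true, y_pred)
    scores = []
    for phase in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        intersection = sum(t == phase and p == phase for t, p in pairs)
        union = sum(t == phase or p == phase for t, p in pairs)
        scores.append(intersection / union)  # union >= 1 for these phases
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example: 8 frames, 3 surgical phases (labels are made up).
    truth = [0, 0, 0, 1, 1, 2, 2, 2]
    preds = [0, 0, 1, 1, 1, 2, 2, 0]
    print(frame_accuracy(truth, preds))
    print(mean_jaccard(truth, preds))
```

Accuracy counts every frame equally, whereas the per-phase Jaccard index penalizes errors in short phases more heavily, which may be one reason the two reported figures differ.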

Key words: CNN, Semi-supervised, Surgical workflow

CLC Number: TP181