Application value of machine learning algorithms for gauze detection in laparoscopic pancreatic surgery
Abstract
Objective: To investigate the application value of machine learning algorithms for gauze detection in laparoscopic pancreatic surgery.
Methods: A retrospective, descriptive study was conducted. Eighty complete laparoscopic pancreatic surgery videos recorded at Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, between July 2017 and July 2020 were collected and randomly divided, using a random number method, into a training set of 61 videos and a test set of 19 videos in a ratio of 3:1. The training set was used to train the neural network model, and the test set was used to evaluate gauze detection under scenes of different difficulty. Two physicians screened the video clips containing gauze and classified them into easy, normal, and hard scenes according to detection difficulty. Annotators marked the ground-truth box (the minimum enclosing rectangle of the gauze) frame by frame. All images were normalized and preprocessed before being fed to the neural network for training. For each image, the network outputs a predicted gauze bounding box; a prediction whose intersection over union (IoU) with the ground-truth box exceeded 0.5 was counted as correct. Observation indicators: (1) video annotation and classification; (2) test results of the neural network on the test set. Count data were expressed as absolute numbers or percentages.
Results: (1) Video annotation and classification: a total of 26 893 frames from the 80 videos were annotated, with 61 videos (22 564 frames) in the training set and 19 videos (4 329 frames) in the test set. In the training set, 19 videos (5 791 frames) were classified as easy, 38 videos (15 771 frames) as normal, and 4 videos (1 002 frames) as hard. In the test set, 4 videos (1 684 frames) were classified as easy, 6 videos (1 016 frames) as normal, and 9 videos (1 629 frames) as hard. (2) Test results of the neural network on the test set: the overall sensitivity and precision of gauze detection were 78.471% (3 397/4 329) and 69.811% (3 397/4 866), respectively; 94.478% (1 591/1 684) and 83.168% (1 591/1 913) on the easy subset; 80.413% (817/1 016) and 70.859% (817/1 153) on the normal subset; and 60.712% (989/1 629) and 54.944% (989/1 800) on the hard subset. The model ran on surgical video in real time at ≥15 frames per second. The overall false negative rate and false positive rate were 21.529% (932/4 329) and 30.189% (1 469/4 866), respectively. False negatives were mainly caused by blurred images, a small exposed gauze area, and heavy blood soaking; false positives were mainly caused by misidentification of connective tissue and specular reflection of body fluids.
Conclusion: Deep learning-based gauze detection is feasible in laparoscopic pancreatic surgery and can help medical staff identify gauze.
Keywords: Pancreas diseases; Pancreatic surgery; Laparoscopic surgery; Deep learning; Gauze detection
Laparoscopic pancreatic surgery and other procedures with heavy blood loss and complex anatomical planes often require various hemostatic materials, among which gauze is widely used. In laparoscopic surgery, however, gauze is often hard to identify because it is soaked with blood or only a small area is exposed, which prolongs operation time and may even lead to retained gauze. Clinicians have tried weaving conspicuous blue threads or radiopaque wires into gauze, with unsatisfactory results. Computer vision, and deep learning in particular, has developed rapidly in medical imaging [1-3]. Deep learning has been applied to laparoscopic surgery videos, mainly for object detection tasks such as surgical instrument recognition, surgical phase prediction, and anatomical structure recognition, with performance not inferior to traditional computer vision techniques [1,4-20]. Neural networks are particularly good at detecting whether a target structure is present, reaching a precision of 85%-92% for surgical instrument detection [4-8]. Their ability to localize targets, however, remains limited: Choi et al [21] reported a precision of 72.26% for real-time surgical instrument detection, and Tokuyasu et al [19] reported real-time detection precision of only 31.4%-32.0% for structures related to the common bile duct and 7.4% for the cystic duct.
Because of its distinctive mesh-like texture, gauze is easier to distinguish from normal tissue and should in theory be easier to detect, but related reports are scarce. An existing study reported a sensitivity of 97% and a precision of 100% for deep learning-based gauze detection ex vivo [22]. This study retrospectively analyzed 80 complete laparoscopic pancreatic surgery videos recorded at Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, from July 2017 to July 2020, to investigate the application value of deep learning for gauze detection in laparoscopic pancreatic surgery.
Materials and Methods
I. General data
A retrospective, descriptive study was conducted. Eighty laparoscopic pancreatic surgery videos were collected. All patients signed informed consent for the collection of surgical videos.
II. Inclusion and exclusion criteria
Inclusion criteria: (1) complete surgical video with no missing recording; (2) laparoscopic pancreatic surgery, including distal pancreatectomy, pancreaticoduodenectomy, local pancreatic resection, total pancreatectomy, etc.; (3) gauze present in the video for ≥5 minutes.
Exclusion criteria: (1) incomplete surgical video; (2) laparoscopic non-pancreatic surgery, such as distal gastrectomy for gastric cancer or splenectomy; (3) history of other abdominal or pelvic surgery; (4) conversion to open surgery.
III. Research methods
1. Video annotation and classification of gauze detection difficulty: two physicians screened the video clips containing gauze, and the number of images was reduced by randomly sampling 1 frame from every 4 frames (see the sampling sketch below). Under the guidance of senior pancreatic surgeons, professional annotators used in-house software (Universal Calibration Client.exe, version 20190329) to mark, frame by frame, the minimum enclosing rectangle of the gauze as the ground-truth box (Figure 1). Based on the exposed area of the gauze, the amount of blood soaking, and annotator feedback, the clips were classified into easy, normal, and hard scenes (Figure 2). Easy: clean gauze with a large exposed area. Normal: moderately blood-soaked gauze with a medium exposed area. Hard: small exposed area, often at the edge of the field of view, heavily blood-soaked, possibly with a blurred lens.
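The sampling rule can be sketched as follows. This is a hypothetical reconstruction assuming OpenCV; the study used its own annotation software, and the exact implementation is not described beyond "one random frame per four".

```python
# Hypothetical sketch of the "1 random frame per 4 frames" sampling, assuming
# OpenCV; it illustrates only the sampling rule, not the study's tooling.
import random
import cv2

def sample_frames(video_path, group_size=4):
    """Yield (frame_index, image) for one randomly chosen frame per group of 4."""
    cap = cv2.VideoCapture(video_path)
    group, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        group.append((idx, frame))
        idx += 1
        if len(group) == group_size:
            yield random.choice(group)   # keep 1 of every 4 frames
            group = []
    if group:                            # trailing partial group, if any
        yield random.choice(group)
    cap.release()
```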
Figure 2. Scenes of different gauze detection difficulty in surgical videos. 2A: easy difficulty, the gauze is white and clean, located in the center of the field of view, with a large exposed area (←); 2B: normal difficulty, moderately blood-soaked gauze with a medium exposed area, often partially covered by tissue (←); 2C: hard difficulty, heavily blood-soaked gauze that is difficult to distinguish from blood clots (←); 2D: hard difficulty, blurred field of view (←)

2. Construction of the neural network model: the YOLOv3 framework was adopted for gauze detection, with darknet19 as the backbone of the convolutional neural network. Because the training dataset was relatively small, darknet19 was loaded with pretrained weights to speed up learning, while the weights of the feature pyramid part were randomly initialized. An MSE loss was used for the coordinates and confidence and a softmax loss for the class; the two were weighted and summed for training, with stochastic gradient descent as the optimizer (see the loss sketch below). The architecture of the convolutional neural network is shown in Figure 3. Training was performed on an NVIDIA Tesla V100 32 GB GPU.
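A minimal PyTorch sketch of the weighted loss described above follows; the loss weights `w_coord`, `w_conf`, and `w_cls` are illustrative assumptions, as the paper does not report its weighting.

```python
# Minimal sketch of the training loss: MSE for box-corner coordinates and
# confidence, softmax cross-entropy for the class; weights are assumptions.
import torch
import torch.nn.functional as F

def detection_loss(pred_coords, gt_coords,      # (N, 8) corner coordinates
                   pred_conf, gt_conf,          # (N,) objectness confidence
                   pred_cls_logits, gt_cls,     # (N, num_classes), (N,)
                   w_coord=5.0, w_conf=1.0, w_cls=1.0):
    loss_coord = F.mse_loss(pred_coords, gt_coords)
    loss_conf = F.mse_loss(pred_conf, gt_conf)
    loss_cls = F.cross_entropy(pred_cls_logits, gt_cls)   # softmax loss
    return w_coord * loss_coord + w_conf * loss_conf + w_cls * loss_cls

# Training used stochastic gradient descent, e.g.:
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```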
Figure 3. Architecture of the convolutional neural network. 3A: overall flow. After uniform preprocessing, the input image is fed as three RGB channels into darknet19 plus a feature pyramid network (FPN); after convolution, pooling, downsampling, the FPN, 2 upsampling steps, and 2 channel-wise concatenations (CONCAT), the network outputs the 4 corner points of the predicted enclosing rectangle of the gauze. 3B: detailed architecture. The network has 5 stages, each consisting of convolution, batch normalization, ReLU activation, and pooling layers; after each stage the feature map is downsampled by a factor of 2. The feature pyramid consists of 3 branches, FPN-3, FPN-4, and FPN-5, where x2 denotes 2x upsampling and "+" denotes channel concatenation. Note: CNN, convolutional neural network; POOL, pooling; CONV, convolution; UPSAMPLE, upsampling; stride, step size; FPN, feature pyramid network.

3. Neural network input and output: the video clips were randomly divided 3:1 into a training set of 61 videos and a test set of 19 videos. The training set was used to train the neural network model; the test set was used to evaluate gauze detection under scenes of different difficulty. Images were uniformly resized to 960×540 in RGB format, pixel values were normalized to [0, 1], and preprocessing including geometric, color, noise, brightness, and contrast augmentation as well as random flipping was applied to improve training stability and convergence speed (a preprocessing sketch follows below).
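The preprocessing can be sketched as follows, assuming OpenCV and NumPy; the augmentation magnitudes are illustrative assumptions, and in a real pipeline the ground-truth box coordinates would be transformed together with the image.

```python
# Sketch of the preprocessing described above; augmentation parameters are
# illustrative, and box coordinates are omitted for brevity.
import random
import cv2
import numpy as np

def preprocess(image_bgr, train=True):
    img = cv2.resize(image_bgr, (960, 540))              # unify size (W x H)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)           # RGB format
    img = img.astype(np.float32) / 255.0                 # normalize to [0, 1]
    if train:
        if random.random() < 0.5:
            img = img[:, ::-1, :].copy()                 # random horizontal flip
        img = img * random.uniform(0.8, 1.2)             # brightness jitter
        img = (img - 0.5) * random.uniform(0.9, 1.1) + 0.5   # contrast jitter
        img += np.random.normal(0.0, 0.01, img.shape).astype(np.float32)  # noise
        img = np.clip(img, 0.0, 1.0)
    return img
```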
The last layer of the network outputs a feature map of size N×C×H×W, where N is the batch size, C the number of channels, H the feature map height, and W the feature map width. C is the concatenation of anchor groups; each anchor group concatenates the coordinates (8 values for the 4 consecutive corner points of the target's enclosing rectangle), a confidence (the probability of being a target), and a class (the category of the target; this study has only one class, gauze). From this feature map, the network outputs the coordinates of the 4 corner points of the predicted minimum enclosing rectangle of the gauze, (x1, y1, x2, y2, x3, y3, x4, y4), i.e., the prediction box; a decoding sketch is given below.
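A simplified decoding of this head might look like the sketch below. It assumes NumPy, three anchors per cell, and a confidence threshold of 0.5, and it omits the sigmoid/offset transforms a full YOLO decoder applies; all of these are assumptions for illustration.

```python
# Simplified decoding of the N x C x H x W head described above, with
# C = num_anchors * (8 coords + 1 conf + 1 cls) and one gauze class.
import numpy as np

def decode(feature_map, num_anchors=3, conf_thresh=0.5):
    n, c, h, w = feature_map.shape
    per_anchor = c // num_anchors
    fm = feature_map.reshape(n, num_anchors, per_anchor, h, w)
    detections = []
    for i in range(n):
        for a in range(num_anchors):
            conf = fm[i, a, 8]                       # objectness map, (H, W)
            for y, x in zip(*np.where(conf > conf_thresh)):
                corners = fm[i, a, :8, y, x]         # (x1, y1, ..., x4, y4)
                detections.append((i, corners, float(conf[y, x])))
    return detections
```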
4. Judgment of training results: the intersection over union (IoU) of the prediction box and the ground-truth box is defined as

IoU = area(P ∩ GT) / area(P ∪ GT)

where P is the prediction box and GT is the ground-truth box. A prediction with IoU > 0.5 was considered correct and defined as a true positive (TP); IoU ≤ 0.5 was defined as a false negative (FN). A prediction box output where no ground-truth box was annotated was defined as a false positive (FP). Sensitivity is the proportion of positive samples that are predicted correctly, and precision is the proportion of predicted positives that are true positives:

Sensitivity = TP / (TP + FN) × 100%
Precision = TP / (TP + FP) × 100%

The false negative rate is the proportion of positive samples not predicted correctly, and the false positive rate is the proportion of predicted positives that are not true positives:

False negative rate = FN / (TP + FN) × 100% = 1 − sensitivity
False positive rate = FP / (TP + FP) × 100% = 1 − precision
IV. Observation indicators
(1) Video annotation and classification: annotation of the training and test set videos and classification of gauze detection difficulty. (2) Test results of the neural network on the test set: sensitivity, precision, false negative rate, and false positive rate of gauze detection by the darknet19 plus feature pyramid network on test subsets of different difficulty.
V. Statistical analysis
Analyses were performed with SPSS 26.0. Count data were expressed as absolute numbers or percentages.
Results
I. Video annotation and classification
A total of 26 893 frames from the 80 videos were annotated, with 61 video clips (22 564 frames) in the training set and 19 video clips (4 329 frames) in the test set. The classification of gauze detection difficulty is shown in Table 1.
Table 1. Annotation of laparoscopic pancreatic surgery videos and classification of gauze detection difficulty

Difficulty | Training set | Test set
Easy | 19 videos, 5 791 frames | 4 videos, 1 684 frames
Normal | 38 videos, 15 771 frames | 6 videos, 1 016 frames
Hard | 4 videos, 1 002 frames | 9 videos, 1 629 frames
Total | 61 videos, 22 564 frames | 19 videos, 4 329 frames

II. Test results of the neural network on the test set
The sensitivity and precision of gauze detection on the test set are shown in Table 2, and examples of successful gauze localization under scenes of different difficulty are shown in Figure 4. The model ran on surgical video in real time at ≥15 frames per second (a demonstration video is available via the QR code). The false negative and false positive rates are shown in Table 3. False negatives were mainly caused by blurred images, a small exposed gauze area, and heavy blood soaking; false positives were mainly caused by misidentification of connective tissue and specular reflection of body fluids (Figure 5).
Table 2. Sensitivity and precision of gauze detection on the test set by the neural network

Difficulty | Images (frames) | Sensitivity (%) | Precision (%)
Easy | 1 684 | 94.478 (1 591/1 684) | 83.168 (1 591/1 913)
Normal | 1 016 | 80.413 (817/1 016) | 70.859 (817/1 153)
Hard | 1 629 | 60.712 (989/1 629) | 54.944 (989/1 800)
Overall | 4 329 | 78.471 (3 397/4 329) | 69.811 (3 397/4 866)

Figure 4. Examples of successful gauze localization by the neural network in laparoscopic pancreatic surgery under scenes of different difficulty. 4A: prediction box overlapping the ground-truth box in an easy scene; 4B: prediction box overlapping the ground-truth box in a normal scene; 4C: prediction box overlapping the ground-truth box in a hard scene. Note: green box, prediction box; blue box, ground-truth box; IoU > 0.5 counts as a successful prediction.

Table 3. False negative rate and false positive rate of gauze detection on the test set by the neural network

Difficulty | Images (frames) | False negative rate (%) | False positive rate (%)
Easy | 1 684 | 5.522 (93/1 684) | 16.832 (322/1 913)
Normal | 1 016 | 19.587 (199/1 016) | 29.141 (336/1 153)
Hard | 1 629 | 39.288 (640/1 629) | 45.056 (811/1 800)
Overall | 4 329 | 21.529 (932/4 329) | 30.189 (1 469/4 866)

Figure 5. False positive and false negative examples of gauze detection in surgical videos. 5A: false positive (yellow box), the cut surface of retracted pancreatic tissue shows a gauze-like texture; 5B: false positive (yellow box), irregular reflections on fat tissue resemble gauze texture; 5C: false negative (←), the gauze lies at the edge of the screen with too small an exposed area; 5D: false negative (➝), the gauze is so blood-soaked that it cannot be distinguished from tissue; 5E: false negative (blue box), blur caused by excessive camera movement.

Discussion
The neural network model in this study uses the YOLOv3 framework, which enables end-to-end object detection, and the feature pyramid structure introduced with YOLOv3 helps extract the multi-scale structural features of gauze [23-24]. The backbone, darknet19, has 19 convolutional layers and 5 pooling layers; it classifies images well with a simple structure, matching VGG-16 in accuracy while requiring only about one fifth of the computation [23-24]. In addition, gauze in the collected videos varies in size; small gauze in particular carries little pixel information, so information is easily lost during downsampling. By introducing the feature pyramid structure, our team fuses feature maps of different scales, combining low-resolution feature maps (strong semantic information) with high-resolution ones (rich spatial information) at little extra computational cost, which stabilizes detection of gauze of different sizes; a minimal sketch of this fusion is given below. The videos were also preprocessed with random flipping, geometric augmentation, noise augmentation, and so on, to improve model stability in complex environments.
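A minimal PyTorch sketch of this upsample-and-concatenate fusion follows; the channel sizes and the 1×1 fusion convolutions are illustrative assumptions, not the exact architecture.

```python
# Sketch of upsample-and-concatenate feature fusion across three scales,
# matching the "two x2 upsamplings, two channel concatenations" description.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFPN(nn.Module):
    def __init__(self, c3=128, c4=256, c5=512, mid=128):
        super().__init__()
        self.r5 = nn.Conv2d(c5, mid, kernel_size=1)
        self.r4 = nn.Conv2d(c4 + mid, mid, kernel_size=1)   # fuse after concat
        self.r3 = nn.Conv2d(c3 + mid, mid, kernel_size=1)

    def forward(self, f3, f4, f5):
        # f3: high resolution (rich spatial detail); f5: low resolution
        # (strong semantics).
        p5 = self.r5(f5)
        p4 = self.r4(torch.cat([f4, F.interpolate(p5, scale_factor=2.0)], dim=1))
        p3 = self.r3(torch.cat([f3, F.interpolate(p4, scale_factor=2.0)], dim=1))
        return p3, p4, p5   # multi-scale maps for gauze of different sizes
```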
The overall gauze detection performance of the model on laparoscopic pancreatic surgery videos was acceptable, with sensitivity and precision of 80.413% and 70.859% in normal scenes and 60.712% and 54.944% in hard scenes. Reported precision of neural networks for localizing surgical instruments ranges from 38% to 86% [5,9-14]. Compared with dissecting instruments, gauze varies more in shape and color and is harder to detect, so the network's overall performance on gauze in these videos is very good. This study used 80 videos totaling 26 893 frames, a relatively large dataset, but 51 newly collected pancreatic surgery videos have not yet been annotated or included; further improvement is expected.
Our results show that the precision of the model in hard scenes was only 54.944%. We attribute this to the small number of hard training samples (only 1 002 frames); the amount of hard training data should be increased. The false negative rate and false positive rate of gauze detection in laparoscopic pancreatic surgery videos were 21.529% and 30.189%, respectively. The main cause of false negatives was that the gauze in a given frame was heavily blood-soaked or submerged in blood, making its texture hard to detect; this can be addressed with more training data. Other causes included motion blur and too small an exposed area. Because the laparoscope has a single light source and connective tissue and blood are highly reflective, false positives occur in single-frame detection, but their practical impact is small when consecutive frames are analyzed (a persistence-filter sketch is given below). Tissues with fine mesh-like textures of their own, such as the cut surfaces of the pancreas and of fat, are easily misidentified as gauze, which could be a direction for future improvement. Our team trial-ran several video clips in real time at ≥15 frames per second; false negatives and false positives occurred only in hard scenes, and the gauze could still be localized accurately. We therefore believe real-time detection is achievable.
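One simple way to exploit consecutive frames is a hypothetical persistence filter that reports gauze only when it is detected in at least k of the last n frames; the paper does not describe such a filter, and k and n are assumptions.

```python
# Hypothetical persistence filter to suppress single-frame false positives.
from collections import deque

class TemporalFilter:
    def __init__(self, n=5, k=3):
        self.history = deque(maxlen=n)   # 1 = gauze detected in that frame
        self.k = k

    def update(self, detected: bool) -> bool:
        self.history.append(1 if detected else 0)
        return sum(self.history) >= self.k   # report only persistent detections

# Usage: filt = TemporalFilter(); alarm = filt.update(frame_has_gauze)
```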
Limitations of this study: all videos came from a single medical center, so the surgical workflow may be uniform and incidental cues that help identify gauze may exist; surgical videos from other centers should be added to improve the robustness of training. In addition, the precision and sensitivity of the model still have room for improvement, and more complex and difficult training material should be added to enhance its detection ability.
In conclusion, deep learning-based gauze detection is feasible in laparoscopic pancreatic surgery and can help medical staff identify gauze.
All authors declare no conflicts of interest.
Hua SR, Wang ZH, Wang J, et al. Application value of deep learning for gauze detection in laparoscopic pancreatic surgery[J]. Chin J Dig Surg, 2021, 20(12): 1324-1330. DOI: 10.3760/cma.j.cn115610-20211208-00641.
[1] Anteby R, Horesh N, Soffer S, et al. Deep learning visual analysis in laparoscopic surgery: a systematic review and diagnostic test accuracy meta-analysis[J]. Surg Endosc, 2021, 35(4): 1521-1533. DOI: 10.1007/s00464-020-08168-1.
[2] Hosny A, Parmar C, Coroller TP, et al. Deep learning for lung cancer prognostication: a retrospective multi-cohort radiomics study[J]. PLoS Med, 2018, 15(11): e1002711. DOI: 10.1371/journal.pmed.1002711.
[3] Singh SP, Wang L, Gupta S, et al. 3D deep learning on medical images: a review[J]. Sensors (Basel), 2020, 20(18): 5097. DOI: 10.3390/s20185097.
[4] Kletz S, Schoeffmann K, Husslein H. Learning the representation of instrument images in laparoscopy videos[J]. Healthc Technol Lett, 2019, 6(6): 197-203. DOI: 10.1049/htl.2019.0077.
[5] Nwoye CI, Mutter D, Marescaux J, et al. Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos[J]. Int J Comput Assist Radiol Surg, 2019, 14(6): 1059-1067. DOI: 10.1007/s11548-019-01958-6.
[6] Twinanda AP, Shehata S, Mutter D, et al. EndoNet: a deep architecture for recognition tasks on laparoscopic videos[J]. IEEE Trans Med Imaging, 2017, 36(1): 86-97. DOI: 10.1109/TMI.2016.2593957.
[7] Mishra K, Sathish R, Sheet D, et al. Learning latent temporal connectionism of deep residual visual abstractions for identifying surgical tools in laparoscopy procedures[C]//IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2017.
[8] Leibetseder A, Petscharnig S, Primus MJ, et al. LapGyn4: a dataset for 4 automatic content analysis problems in the domain of laparoscopic gynecology[C]//Proceedings of the 9th ACM Multimedia Systems Conference. 2018: 357-362. DOI: 10.1145/3204949.3208127.
[9] Zhang B, Wang S, Dong L, et al. Surgical tools detection based on modulated anchoring network in laparoscopic videos[J]. IEEE Access, 2020, 8: 23748-23758. DOI: 10.1109/access.2020.2969885.
[10] Madad Zadeh S, Francois T, Calvet L, et al. SurgAI: deep learning for computerized laparoscopic image understanding in gynaecology[J]. Surg Endosc, 2020, 34(12): 5377-5383. DOI: 10.1007/s00464-019-07330-8.
[11] Yamazaki Y, Kanaji S, Matsuda T, et al. Automated surgical instrument detection from laparoscopic gastrectomy video images using an open source convolutional neural network platform[J]. J Am Coll Surg, 2020, 230(5): 725-732.e1. DOI: 10.1016/j.jamcollsurg.2020.01.037.
[12] Wang S, Raju A, Huang J, et al. Deep learning based multi-label classification for surgical tool presence detection in laparoscopic videos[C]//IEEE International Symposium on Biomedical Imaging. 2017.
[13] Sahu M, Mukhopadhyay A, Szengel A, et al. Addressing multi-label imbalance problem of surgical tool detection using CNN[J]. Int J Comput Assist Radiol Surg, 2017, 12(6): 1013-1020. DOI: 10.1007/s11548-017-1565-x.
[14] Jin A, Yeung S, Jopling J, et al. Tool detection and operative skill assessment in surgical videos using region-based convolutional neural networks[C]//IEEE Winter Conference on Applications of Computer Vision. 2018.
[15] Kitaguchi D, Takeshita N, Matsuzaki H, et al. Real-time automatic surgical phase recognition in laparoscopic sigmoidectomy using the convolutional neural network-based deep learning approach[J]. Surg Endosc, 2020, 34(11): 4924-4931. DOI: 10.1007/s00464-019-07281-0.
[16] Hashimoto DA, Rosman G, Witkowski ER, et al. Computer vision analysis of intraoperative video: automated recognition of operative steps in laparoscopic sleeve gastrectomy[J]. Ann Surg, 2019, 270(3): 414-421. DOI: 10.1097/SLA.0000000000003460.
[17] Chittajallu DR, Dong B, Tunison P, et al. XAI-CBIR: explainable AI system for content based retrieval of video frames from minimally invasive surgery videos[C]//2019 IEEE 16th International Symposium on Biomedical Imaging. 2019: 66-69. DOI: 10.1109/ISBI.2019.8759428.
[18] Chen Y, Tang P, Zhong K, et al. Semi-supervised surgical workflow recognition based on convolution neural network[J]. Basic Clin Pharmacol Toxicol, 2019, 124(suppl 3): 52.
[19] Tokuyasu T, Iwashita Y, Matsunobu Y, et al. Development of an artificial intelligence system using deep learning to indicate anatomical landmarks during laparoscopic cholecystectomy[J]. Surg Endosc, 2021, 35(4): 1651-1658. DOI: 10.1007/s00464-020-07548-x.
[20] Harangi B, Hajdu A, Lampe R, et al. Recognizing ureter and uterine artery in endoscopic images using a convolutional neural network[C]//IEEE International Symposium on Computer-Based Medical Systems. 2017.
[21] Choi B, Jo K, Choi S, et al. Surgical-tools detection based on convolutional neural network in laparoscopic robot-assisted surgery[J]. Annu Int Conf IEEE Eng Med Biol Soc, 2017, 2017: 1756-1759. DOI: 10.1109/EMBC.2017.8037183.
[22] de la Fuente López E, Muñoz GÁ, Santos Del Blanco L, et al. Automatic gauze tracking in laparoscopic surgery using image texture analysis[J]. Comput Methods Programs Biomed, 2020, 190: 105378. DOI: 10.1016/j.cmpb.2020.105378.
[23] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]//30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017). 2017: 6517-6525. DOI: 10.1109/Cvpr.2017.690.
[24] Lin TY, Dollar P, Girshick R, et al. Feature pyramid networks for object detection[C]//30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017). 2017: 936-944. DOI: 10.1109/Cvpr.2017.106.