University of Twente Student Theses


Artificial intelligence for tool and action detection in laparoscopic fundoplication surgery videos

Croonen, Elke (2023) Artificial intelligence for tool and action detection in laparoscopic fundoplication surgery videos.

Abstract: Electrosurgical devices are widely used in modern surgical interventions, providing precise tissue cutting, coagulation, and sealing. However, ensuring optimal hemostasis quality while minimizing tissue damage remains a significant challenge. Excessive energy use can lead to adverse effects such as thermal injury, prolonged healing, and increased postoperative complications. It is therefore important to gain more insight into how electrosurgical tools are used during surgery. To address this, a research project at the Meander Medical Centre aims to develop an automatic, objective assessment of energy usage during surgery. In this thesis, two methods were developed: one for tool detection and one for tool activation detection. The first method is a YOLOv7 network for surgical tool detection in 65 laparoscopic fundoplication videos. It was trained with a semi-supervised approach using active learning techniques, in which 36% of the 13,518 frames were manually annotated. The network achieves a precision of 0.90, a recall of 0.81, an mAP@0.5 of 0.81, and an mAP@0.5:0.95 of 0.58. This study emphasizes the value of a semi-supervised, active learning approach for making data labeling more efficient. Future research should consider incorporating temporal information and extending the network's training to more diverse surgical procedures and tools. The second method determines the number and duration of activations of an electrosurgical tool, the Enseal, in laparoscopic fundoplication surgery videos. Video and audio data were acquired from 10 laparoscopic fundoplication surgeries to train an action recognition network that detects activation of the Enseal tool. The audio recordings of the Gen11 (the energy generator of the Enseal) are processed to generate ground truth labels for Enseal tool activations. The network architecture comprises an I3D feature extractor and an MSTCN++.
The MSTCN++ generates frame-wise labels from the feature maps and processed audio, enabling action detection. The network achieves a frame-wise accuracy of 90.74, a segmental edit distance of 86.86, a segmental F1@0.1 of 73.99, an F1@0.25 of 70.17, and an F1@0.5 of 54.90. These results are promising given the limited dataset of only 10 videos. However, there is room for improvement, including increasing the dataset size, enhancing the feature extractor, incorporating data augmentation techniques, and exploring additional spatial information. These changes are expected to improve the accuracy and robustness of the developed methodology. This master thesis shows the potential of deep learning networks for detecting energy devices and actions during laparoscopic surgery. The proposed methods hold promise for creating an automatic, objective assessment of energy usage, improving surgical outcomes, and contributing to surgeons' self-improvement and benchmarking. Future research should focus on refining the current methods and on combining them with active bleeding detection to evaluate the effectiveness of tool activations. Additionally, exploring alternative data acquisition methods, such as energy monitoring devices, could provide valuable insight into the amount of energy used and its correlation with tool activation and effectiveness.
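The frame-wise accuracy and segmental F1@k scores reported in the abstract are commonly computed by comparing predicted and ground-truth label sequences, with F1@k matching segments by temporal intersection over union (IoU). The following is a minimal sketch of that idea, not the evaluation code used in the thesis. It assumes binary frame labels (1 = Enseal active, 0 = inactive) and, as a simplifying choice, scores only the activation segments. All function names are illustrative.

```python
def segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    runs = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((labels[start], start, i))
            start = i
    return runs


def frame_accuracy(pred, truth):
    """Percentage of frames whose predicted label equals the ground truth."""
    correct = sum(p == t for p, t in zip(pred, truth))
    return 100.0 * correct / len(truth)


def segmental_f1(pred, truth, iou_threshold, background=0):
    """F1 score (in percent) over segments matched by temporal IoU.

    Assumption of this sketch: segments labelled `background` are ignored.
    """
    pred_segs = [s for s in segments(pred) if s[0] != background]
    true_segs = [s for s in segments(truth) if s[0] != background]
    matched = [False] * len(true_segs)
    tp = fp = 0
    for label, ps, pe in pred_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (tl, ts, te) in enumerate(true_segs):
            if tl != label:
                continue
            inter = max(0, min(pe, te) - max(ps, ts))
            union = max(pe, te) - min(ps, ts)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        # A prediction is a true positive only if it overlaps enough with a
        # ground-truth segment that has not been claimed by another prediction.
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[best_idx]:
            matched[best_idx] = True
            tp += 1
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy example: two true activations, one predicted activation.
    truth = [0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0]
    pred = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    print(f"accuracy: {frame_accuracy(pred, truth):.2f}")
    print(f"F1@0.5:   {segmental_f1(pred, truth, 0.5):.2f}")
    print(f"F1@0.75:  {segmental_f1(pred, truth, 0.75):.2f}")
```

In the toy example the single predicted activation overlaps the first true activation with an IoU of 0.6, so it counts as a match at a threshold of 0.5 but not at 0.75. This illustrates why F1@k typically drops as k increases, as in the reported scores.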
Item Type: Essay (Master)
Faculty: TNW: Science and Technology
Subject: 44 medicine, 54 computer science
Programme: Technical Medicine MSc (60033)

