1. Zhou R, Wang D, Zhang H, Zhu Y, Zhang L, Chen T, Liao W, Ye Z. Vision techniques for anatomical structures in laparoscopic surgery: a comprehensive review. Front Surg 2025; 12:1557153. PMID: 40297644; PMCID: PMC12034692; DOI: 10.3389/fsurg.2025.1557153.
Abstract
Laparoscopic surgery is the method of choice for numerous surgical procedures, yet it still faces many challenges. Computer vision plays a vital role in addressing these challenges and has become a research hotspot, especially in the classification, segmentation, and target detection of abdominal anatomical structures. This study presents a comprehensive review of the last decade of research in this area. First, a categorized overview of the core subtasks is presented with respect to their relevance and applicability to real-world medical scenarios. Second, the datasets used for experimental validation are statistically analyzed. Subsequently, the technical approaches and trends in classification, segmentation, and target detection tasks are explored in detail, highlighting their advantages, limitations, and practical implications. Additionally, evaluation methods for the three types of tasks are discussed. Finally, gaps in current research are identified, and the considerable potential for further development in this area is emphasized.
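As a concrete illustration of the evaluation methods the review discusses for its segmentation and detection tasks, the following minimal Python sketch computes two metrics commonly reported in this literature: the Dice coefficient for binary masks and the Intersection over Union (IoU) for bounding boxes. The array shapes and example boxes are invented for illustration and are not taken from the review.

```python
import numpy as np

def dice_coefficient(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|A∩B| / (|A| + |B|) for two binary masks."""
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    return (2.0 * intersection + eps) / (pred_mask.sum() + gt_mask.sum() + eps)

def box_iou(box_a, box_b) -> float:
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-7)

if __name__ == "__main__":
    pred = np.zeros((64, 64), dtype=bool); pred[10:40, 10:40] = True   # toy prediction
    gt = np.zeros((64, 64), dtype=bool); gt[15:45, 15:45] = True       # toy ground truth
    print(f"Dice: {dice_coefficient(pred, gt):.3f}")
    print(f"IoU:  {box_iou((10, 10, 40, 40), (15, 15, 45, 45)):.3f}")
```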
Affiliation(s)
- Ru Zhou
- Department of General Surgery, RuiJin Hospital LuWan Branch, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Dan Wang
- Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Zhejiang, Hangzhou, China
- Hanwei Zhang
- Institute of Intelligent Software, Guangzhou, Guangdong, China
- Ying Zhu
- Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Zhejiang, Hangzhou, China
- Lijun Zhang
- Institute of Software Chinese Academy of Sciences, Beijing, China
- Tianxiang Chen
- School of Cyber Space and Technology, University of Science and Technology of China, Hefei, China
- Wenqiang Liao
- Department of General Surgery, RuiJin Hospital LuWan Branch, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Zi Ye
- Institute of Intelligent Software, Guangzhou, Guangdong, China
2. Qayyum A, Ali H, Caputo M, Vohra H, Akinosho T, Abioye S, Berrou I, Capik P, Qadir J, Bilal M. Robust multi-label surgical tool classification in noisy endoscopic videos. Sci Rep 2025; 15:5520. PMID: 39952951; PMCID: PMC11828880; DOI: 10.1038/s41598-024-82351-5.
Abstract
Over the past few years, surgical data science has attracted substantial interest from the machine learning (ML) community. Various studies have demonstrated the efficacy of emerging ML techniques in analysing surgical data, particularly recordings of procedures, for digitising clinical and non-clinical functions like preoperative planning, context-aware decision-making, and operating skill assessment. However, this field is still in its infancy and lacks representative, well-annotated datasets for training robust models in intermediate ML tasks. Existing datasets also suffer from inaccurate labels, hindering the development of reliable models. In this paper, we propose a systematic methodology for developing robust models for surgical tool classification using noisy endoscopic videos. Our methodology introduces two key innovations: (1) an intelligent active learning strategy for minimal dataset identification and label correction by human experts through collective intelligence; and (2) an ensembling strategy for a student-teacher model-based self-training framework to achieve robust classification of 14 surgical tools in a semi-supervised fashion. Furthermore, we employ strategies such as weighted data loaders and label smoothing to enable the models to learn difficult samples and address class imbalance issues. The proposed methodology achieves an average F1-score of 85.88% for the ensemble model-based self-training with class weights, and 80.88% without class weights, for noisy tool labels. Our method also significantly outperforms existing approaches, further demonstrating its effectiveness.
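As a hedged illustration of two of the training strategies named above, per-class weighting and label smoothing for multi-label tool classification, the following PyTorch sketch shows one possible formulation. The backbone, the smoothing factor, and the weight values are assumptions for illustration and are not the authors' implementation; only the tool count (14) comes from the abstract.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_TOOLS = 14
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_TOOLS)   # one logit per tool (multi-label)

# Per-class positive weights to counter class imbalance (values are placeholders).
pos_weight = torch.ones(NUM_TOOLS) * 2.0
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

def smooth_labels(targets: torch.Tensor, eps: float = 0.05) -> torch.Tensor:
    """Move hard 0/1 multi-hot labels slightly toward 0.5 to soften noisy annotations."""
    return targets * (1.0 - eps) + 0.5 * eps

# One illustrative training step on random data.
images = torch.randn(4, 3, 224, 224)
targets = torch.randint(0, 2, (4, NUM_TOOLS)).float()
logits = model(images)
loss = criterion(logits, smooth_labels(targets))
loss.backward()
print(f"loss: {loss.item():.4f}")
```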
Affiliation(s)
- Adnan Qayyum
- Information Technology University of the Punjab, Lahore, Pakistan
- Hassan Ali
- Information Technology University of the Punjab, Lahore, Pakistan
- UNSW, Sydney, Australia
- Massimo Caputo
- NHS Bristol Heart Institute, University of Bristol, Bristol, UK
- Hunaid Vohra
- NHS Bristol Heart Institute, University of Bristol, Bristol, UK
- Paweł Capik
- University of the West of England, Bristol, UK
- Junaid Qadir
- College of Engineering, Qatar University, Doha, Qatar
- Muhammad Bilal
- University of the West of England, Bristol, UK
- Birmingham City University, Birmingham, UK
3. Magro M, Covallero N, Gambaro E, Ruffaldi E, De Momi E. A dual-instrument Kalman-based tracker to enhance robustness of microsurgical tools tracking. Int J Comput Assist Radiol Surg 2024; 19:2351-2362. PMID: 39133431; DOI: 10.1007/s11548-024-03246-4.
Abstract
PURPOSE The integration of a surgical robotic instrument tracking module within optical microscopes holds the potential to advance microsurgery practice, as it facilitates automated camera movements and thereby augments the surgeon's capability in executing surgical procedures. METHODS In the present work, an innovative detection backbone based on a spatial attention module is implemented to enhance the detection accuracy of small objects within the image. Additionally, we introduce a robust data association technique, capable of re-tracking surgical instruments, based mainly on knowledge of the dual-instrument robotic system, the Intersection over Union metric, and a Kalman filter. RESULTS The effectiveness of this pipeline was evaluated by testing on a dataset comprising ten manually annotated videos of anastomosis procedures involving either animal or phantom vessels, exploiting the Symani® Surgical System, a dedicated robotic platform designed for microsurgery. The multiple object tracking precision (MOTP) and the multiple object tracking accuracy (MOTA) are used to evaluate the performance of the proposed approach, and a new metric is computed to demonstrate the efficacy in stabilizing the tracking result along the video frames. An average MOTP of 74±0.06% and a MOTA of 99±0.03% over the test videos were found. CONCLUSION These results confirm the potential of the proposed approach in enhancing precision and reliability in microsurgical instrument tracking. Thus, the integration of attention mechanisms and a tailored data association module could form a solid basis for automating the motion of optical microscopes.
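To make the two ingredients named above concrete, here is a minimal NumPy sketch of a per-instrument constant-velocity Kalman filter combined with IoU-based association of detections to tracks. The state layout, noise values, and toy boxes are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def box_iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / (union + 1e-7)

class KalmanBoxTracker:
    """Tracks one instrument's box centre with constant-velocity dynamics."""
    def __init__(self, box):
        cx, cy = (box[0]+box[2])/2, (box[1]+box[3])/2
        self.x = np.array([cx, cy, 0.0, 0.0])                       # state: cx, cy, vx, vy
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1,0,1,0],[0,1,0,1],[0,0,1,0],[0,0,0,1]], float)
        self.H = np.array([[1,0,0,0],[0,1,0,0]], float)
        self.Q, self.R = np.eye(4)*0.01, np.eye(2)*1.0
        self.w, self.h = box[2]-box[0], box[3]-box[1]

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.box()

    def update(self, box):
        z = np.array([(box[0]+box[2])/2, (box[1]+box[3])/2])
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        self.w, self.h = box[2]-box[0], box[3]-box[1]

    def box(self):
        cx, cy = self.x[0], self.x[1]
        return (cx-self.w/2, cy-self.h/2, cx+self.w/2, cy+self.h/2)

# Greedy IoU association for the two-instrument case.
tracks = [KalmanBoxTracker((10, 10, 50, 50)), KalmanBoxTracker((100, 100, 140, 140))]
detections = [(104, 102, 142, 141), (12, 13, 52, 51)]
for t in tracks:
    pred = t.predict()
    best = max(detections, key=lambda d: box_iou(pred, d))
    if box_iou(pred, best) > 0.3:                                    # re-associate if overlap is sufficient
        t.update(best)
print([tuple(round(v, 1) for v in t.box()) for t in tracks])
```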
Affiliation(s)
- Mattia Magro
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy.
- Medical Microinstruments, Inc., Wilmington, USA.
- Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
4. Ardila CM, González-Arroyave D. Precision at scale: Machine learning revolutionizing laparoscopic surgery. World J Clin Oncol 2024; 15:1256-1263. PMID: 39473862; PMCID: PMC11514504; DOI: 10.5306/wjco.v15.i10.1256.
Abstract
A recent study published in the World Journal of Clinical Cases found that minimally invasive laparoscopic surgery under general anesthesia demonstrates superior efficacy and safety compared to traditional open surgery for early ovarian cancer patients. This editorial discusses the integration of machine learning in laparoscopic surgery, emphasizing its transformative potential in improving patient outcomes and surgical precision. Machine learning algorithms analyze extensive datasets to optimize procedural techniques, enhance decision-making, and personalize treatment plans. Advanced imaging modalities like augmented reality and real-time tissue classification, alongside robotic surgical systems and virtual reality simulations driven by machine learning, enhance imaging and training techniques, offering surgeons clearer visualization and more precise tissue manipulation. Despite promising advancements, challenges such as data privacy, algorithm bias, and regulatory hurdles must be addressed for the responsible deployment of machine learning technologies. Interdisciplinary collaboration and ongoing technological innovation promise further enhancement of laparoscopic surgery, fostering a future where personalized medicine and precision surgery redefine patient care.
Affiliation(s)
- Carlos M Ardila
- Biomedical Stomatology Research Group, Universidad de Antioquia U de A, Medellín 0057, Colombia
5. Hermens F. Automatic object detection for behavioural research using YOLOv8. Behav Res Methods 2024; 56:7307-7330. PMID: 38750389; PMCID: PMC11362367; DOI: 10.3758/s13428-024-02420-5.
Abstract
Observational studies of human behaviour often require the annotation of objects in video recordings. Automatic object detection has been facilitated strongly by the development of YOLO ('you only look once') and particularly by YOLOv8 from Ultralytics, which is easy to use. The present study examines the conditions required for accurate object detection with YOLOv8. The results show almost perfect object detection even when the model was trained on a small dataset (100 to 350 images). The detector, however, does not extrapolate well to the same object in other backgrounds. By training the detector on images from a variety of backgrounds, excellent object detection can be restored. YOLOv8 could be a game changer for behavioural research that requires object annotation in video recordings.
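The ease of use highlighted above largely comes from the Ultralytics Python API. The following hedged sketch shows the typical fine-tune-and-predict workflow on a small custom dataset; the dataset YAML, epoch count, and image path are placeholders rather than values from the study.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                       # small pretrained checkpoint
model.train(data="custom_objects.yaml",          # dataset config listing train/val paths and class names
            epochs=100, imgsz=640)
metrics = model.val()                            # mAP and related metrics on the validation split
results = model("example_frame.jpg")             # inference on a single video frame
for r in results:
    print(r.boxes.xyxy, r.boxes.cls, r.boxes.conf)
```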
Affiliation(s)
- Frouke Hermens
- Open University of the Netherlands, Heerlen, The Netherlands.
6. Benavides D, Cisnal A, Fontúrbel C, de la Fuente E, Fraile JC. Real-Time Tool Localization for Laparoscopic Surgery Using Convolutional Neural Network. Sensors (Basel) 2024; 24:4191. PMID: 39000974; PMCID: PMC11243864; DOI: 10.3390/s24134191.
Abstract
Partially automated robotic systems, such as camera holders, represent a pivotal step towards enhancing efficiency and precision in surgical procedures. Therefore, this paper introduces an approach for real-time tool localization in laparoscopic surgery using convolutional neural networks. The proposed model, based on two Hourglass modules in series, can localize up to two surgical tools simultaneously. This study utilized three datasets: the ITAP dataset, alongside two publicly available datasets, namely Atlas Dione and EndoVis Challenge. Three variations of the Hourglass-based model were proposed, with the best model achieving high accuracy (92.86%) and frame rates (27.64 FPS), suitable for integration into robotic systems. An evaluation on an independent test set yielded slightly lower accuracy, indicating limited generalizability. The model was further analyzed using the Grad-CAM technique to gain insights into its functionality. Overall, this work presents a promising solution for automating aspects of laparoscopic surgery, potentially enhancing surgical efficiency by reducing the need for manual endoscope manipulation.
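Hourglass-style localizers typically regress one heatmap per tool and read the tool position off the heatmap peak. The following minimal PyTorch sketch of that decoding step is an assumption-based illustration of this common practice, not the paper's exact post-processing; the heatmap shape and simulated peaks are invented.

```python
import torch

def decode_heatmaps(heatmaps: torch.Tensor):
    """heatmaps: (num_tools, H, W) -> list of (x, y, confidence), one per tool."""
    num_tools, h, w = heatmaps.shape
    flat = heatmaps.view(num_tools, -1)
    conf, idx = flat.max(dim=1)               # peak value and flat index per channel
    ys, xs = idx // w, idx % w                # back to 2-D coordinates
    return [(int(x), int(y), float(c)) for x, y, c in zip(xs, ys, conf)]

heatmaps = torch.zeros(2, 64, 64)
heatmaps[0, 20, 30] = 0.9                     # simulated peak for tool 1
heatmaps[1, 45, 10] = 0.7                     # simulated peak for tool 2
print(decode_heatmaps(heatmaps))
```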
Affiliation(s)
- Ana Cisnal
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Escuela de Ingenierías Industriales, Universidad de Valladolid, Paseo Prado de la Magdalena 3-5, 47011 Valladolid, Spain; (D.B.); (C.F.); (E.d.l.F.); (J.C.F.)
7. Zhu Y, Du L, Fu PY, Geng ZH, Zhang DF, Chen WF, Li QL, Zhou PH. An Automated Video Analysis System for Retrospective Assessment and Real-Time Monitoring of Endoscopic Procedures (with Video). Bioengineering (Basel) 2024; 11:445. PMID: 38790312; PMCID: PMC11118061; DOI: 10.3390/bioengineering11050445.
Abstract
BACKGROUND AND AIMS Accurate recognition of endoscopic instruments facilitates quantitative evaluation and quality control of endoscopic procedures. However, no relevant research has been reported. In this study, we aimed to develop a computer-assisted system, EndoAdd, for automated endoscopic surgical video analysis based on our dataset of endoscopic instrument images. METHODS Large training and validation datasets containing 45,143 images of 10 different endoscopic instruments and a test dataset of 18,375 images collected from several medical centers were used in this research. Annotated image frames were used to train the state-of-the-art object detection model, YOLO-v5, to identify the instruments. Based on the frame-level prediction results, we further developed a hidden Markov model to perform video analysis and generate heatmaps to summarize the videos. RESULTS EndoAdd achieved high accuracy (>97%) on the test dataset for all 10 endoscopic instrument types. The mean average accuracy, precision, recall, and F1-score were 99.1%, 92.0%, 88.8%, and 89.3%, respectively. The area under the curve values exceeded 0.94 for all instrument types. Heatmaps of endoscopic procedures were generated for both retrospective and real-time analyses. CONCLUSIONS We successfully developed an automated endoscopic video analysis system, EndoAdd, which supports retrospective assessment and real-time monitoring. It can be used for data analysis and quality control of endoscopic procedures in clinical practice.
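As a hedged illustration of the second stage described above, smoothing noisy frame-level instrument predictions with a hidden Markov model, the following NumPy sketch runs Viterbi decoding over a short frame sequence. The three-state example, transition matrix, and emission probabilities are invented for illustration and are not EndoAdd's parameters.

```python
import numpy as np

def viterbi(obs, trans, emit, prior):
    """obs: observed frame labels; returns the most likely hidden state sequence."""
    n_states, T = trans.shape[0], len(obs)
    logp = np.full((T, n_states), -np.inf)
    back = np.zeros((T, n_states), dtype=int)
    logp[0] = np.log(prior) + np.log(emit[:, obs[0]])
    for t in range(1, T):
        for s in range(n_states):
            scores = logp[t-1] + np.log(trans[:, s])
            back[t, s] = np.argmax(scores)
            logp[t, s] = scores[back[t, s]] + np.log(emit[s, obs[t]])
    path = [int(np.argmax(logp[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Three hidden "instrument in use" states, sticky transitions, mostly correct detector.
trans = np.array([[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]])
emit = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
prior = np.array([1/3, 1/3, 1/3])
noisy_frames = [0, 0, 1, 0, 0, 2, 2, 2, 1, 2]          # raw per-frame predictions
print(viterbi(noisy_frames, trans, emit, prior))        # smoothed state sequence
```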
Affiliation(s)
- Yan Zhu
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
- Ling Du
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
- Pei-Yao Fu
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
- Zi-Han Geng
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
- Dan-Feng Zhang
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
- Wei-Feng Chen
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
- Quan-Lin Li
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
- Ping-Hong Zhou
- Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China; (Y.Z.); (L.D.); (P.-Y.F.); (Z.-H.G.); (D.-F.Z.); (W.-F.C.)
- Shanghai Collaborative Innovation Center of Endoscopy, Shanghai 200032, China
8. Loza G, Valdastri P, Ali S. Real-time surgical tool detection with multi-scale positional encoding and contrastive learning. Healthc Technol Lett 2024; 11:48-58. PMID: 38638504; PMCID: PMC11022231; DOI: 10.1049/htl2.12060.
Abstract
Real-time detection of surgical tools in laparoscopic data plays a vital role in understanding surgical procedures, evaluating the performance of trainees, facilitating learning, and ultimately supporting the autonomy of robotic systems. Existing detection methods for surgical data still need to achieve both high processing speed and high prediction accuracy. Most methods rely on anchors or region proposals, limiting their adaptability to variations in tool appearance and leading to sub-optimal detection results. Moreover, the use of non-anchor-based detectors to alleviate this problem has been only partially explored, without remarkable results. Here, an anchor-free architecture based on a transformer that allows real-time tool detection is introduced. The proposal is to utilize multi-scale features within the feature extraction layer and at the transformer-based detection architecture through positional encoding that can refine and capture context-aware and structural information of different-sized tools. Furthermore, a supervised contrastive loss is introduced to optimize representations of object embeddings, resulting in improved feed-forward network performance for classifying localized bounding boxes. The strategy demonstrates superiority over state-of-the-art (SOTA) methods. Compared to the most accurate existing SOTA (DSSS) method, the approach improves mAP by nearly 4% and reduces the inference time by 113%. It also showed a 7% higher mAP than the baseline model.
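The supervised contrastive loss mentioned above pulls embeddings of same-class objects together and pushes other classes apart. The following PyTorch sketch is one standard formulation of such a loss applied to object embeddings; the temperature, embedding size, and random inputs are placeholders and do not reproduce the authors' exact loss.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings: torch.Tensor, labels: torch.Tensor, tau: float = 0.07):
    """embeddings: (N, D) object/box embeddings; labels: (N,) tool classes."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / tau                                        # cosine similarity / temperature
    mask_self = torch.eye(len(labels), device=z.device)
    mask_pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float() - mask_self
    logits = sim - 1e9 * mask_self                               # exclude self-pairs from the softmax
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    denom = mask_pos.sum(dim=1).clamp(min=1.0)
    loss = -(mask_pos * log_prob).sum(dim=1) / denom             # average over positives per anchor
    return loss.mean()

emb = torch.randn(8, 128, requires_grad=True)
lbl = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
print(supervised_contrastive_loss(emb, lbl))
```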
Affiliation(s)
- Gerardo Loza
- School of Computing, Faculty of Engineering and Physical Sciences, University of Leeds, West Yorkshire, UK
- Pietro Valdastri
- School of Electronic and Electrical Engineering, Faculty of Engineering and Physical Sciences, University of Leeds, West Yorkshire, UK
- Sharib Ali
- School of Computing, Faculty of Engineering and Physical Sciences, University of Leeds, West Yorkshire, UK
9. Lin Z, Lei C, Yang L. Modern Image-Guided Surgery: A Narrative Review of Medical Image Processing and Visualization. Sensors (Basel) 2023; 23:9872. PMID: 38139718; PMCID: PMC10748263; DOI: 10.3390/s23249872.
Abstract
Medical image analysis forms the basis of image-guided surgery (IGS) and many of its fundamental tasks. Driven by the growing number of medical imaging modalities, the research community of medical imaging has developed methods and achieved functionality breakthroughs. However, with the overwhelming pool of information in the literature, it has become increasingly challenging for researchers to extract context-relevant information for specific applications, especially when many widely used methods exist in a variety of versions optimized for their respective application domains. Further equipped with sophisticated three-dimensional (3D) medical image visualization and digital reality technology, medical experts could enhance their performance capabilities in IGS severalfold. The goal of this narrative review is to organize the key components of IGS in the aspects of medical image processing and visualization with a new perspective and insights. The literature search was conducted using mainstream academic search engines with a combination of keywords relevant to the field up until mid-2022. This survey systematically summarizes the basic, mainstream, and state-of-the-art medical image processing methods as well as how visualization technologies like augmented/mixed/virtual reality (AR/MR/VR) are enhancing performance in IGS. Further, we hope that this survey will shed some light on the future of IGS in the face of challenges and opportunities for the research directions of medical image processing and visualization.
Affiliation(s)
- Zhefan Lin
- School of Mechanical Engineering, Zhejiang University, Hangzhou 310030, China;
- ZJU-UIUC Institute, International Campus, Zhejiang University, Haining 314400, China;
- Chen Lei
- ZJU-UIUC Institute, International Campus, Zhejiang University, Haining 314400, China;
- Liangjing Yang
- School of Mechanical Engineering, Zhejiang University, Hangzhou 310030, China;
- ZJU-UIUC Institute, International Campus, Zhejiang University, Haining 314400, China;
10. Ping L, Wang Z, Yao J, Gao J, Yang S, Li J, Shi J, Wu W, Hua S, Wang H. Application and evaluation of surgical tool and tool tip recognition based on Convolutional Neural Network in multiple endoscopic surgical scenarios. Surg Endosc 2023; 37:7376-7384. PMID: 37580576; DOI: 10.1007/s00464-023-10323-3.
Abstract
BACKGROUND In recent years, computer-assisted intervention and robot-assisted surgery have received increasing attention, and the demand for real-time identification and tracking of surgical tools and tool tips is constantly growing. A series of studies focusing on surgical tool tracking and identification have been performed. However, the dataset sizes, sensitivity/precision, and response times of these studies were limited. In this work, we developed and utilized an automated method based on a Convolutional Neural Network (CNN) and the You Only Look Once (YOLO) v3 algorithm to locate and identify surgical tools and tool tips across five different surgical scenarios. MATERIALS AND METHODS An object detection algorithm was applied to identify and locate the surgical tools and tool tips. DarkNet-19 was used as the backbone network, and YOLOv3 was modified and applied for the detection. We included a series of 181 endoscopy videos covering five different surgical scenarios: pancreatic surgery, thyroid surgery, colon surgery, gastric surgery, and external scenes. A total of 25,333 images containing 94,463 targets were collected. Training and test sets were divided in a proportion of 2.5:1. The datasets were openly stored in the Kaggle database. RESULTS Under an Intersection over Union threshold of 0.5, the overall sensitivity and precision rates of the model were 93.02% and 89.61% for tool recognition and 87.05% and 83.57% for tool tip recognition, respectively. The model demonstrated the highest tool and tool tip recognition sensitivity and precision under external scenes. Among the four internal surgical scenes, the network performed better in pancreatic and colon surgeries and worse in gastric and thyroid surgeries. CONCLUSION We developed a surgical tool and tool tip recognition model based on a CNN and YOLOv3. Validation of our model demonstrated satisfactory precision, accuracy, and robustness across different surgical scenes.
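The sensitivity and precision figures above are computed by matching predicted boxes to ground truth at an IoU threshold of 0.5. The following Python sketch illustrates one common way to do that matching greedily by confidence; the toy boxes are invented and the matching rule is an assumption, not necessarily the authors' exact evaluation script.

```python
def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union else 0.0

def precision_recall(preds, gts, thr=0.5):
    """preds: (x1, y1, x2, y2, confidence); gts: (x1, y1, x2, y2)."""
    matched, tp = set(), 0
    for p in sorted(preds, key=lambda x: -x[4]):          # highest confidence first
        best_j, best_iou = -1, thr
        for j, g in enumerate(gts):
            if j not in matched and iou(p[:4], g) >= best_iou:
                best_j, best_iou = j, iou(p[:4], g)
        if best_j >= 0:
            matched.add(best_j); tp += 1
    fp, fn = len(preds) - tp, len(gts) - tp
    return tp / (tp + fp + 1e-9), tp / (tp + fn + 1e-9)   # precision, recall (sensitivity)

preds = [(10, 10, 50, 50, 0.9), (60, 60, 90, 90, 0.8), (200, 200, 220, 220, 0.4)]
gts = [(12, 11, 52, 49), (61, 58, 92, 91)]
print("precision, recall:", precision_recall(preds, gts))
```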
Affiliation(s)
- Lu Ping
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Zhihong Wang
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jingjing Yao
- Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Junyi Gao
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Sen Yang
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jiayi Li
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jile Shi
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Wenming Wu
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Surong Hua
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Huizhen Wang
- Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
11. Memida S, Miura S. Identification of surgical forceps using YOLACT++ in different lighted environments. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-4. PMID: 38083778; DOI: 10.1109/embc40787.2023.10341025.
Abstract
Forceps tracking in laparoscopic surgery contributes to improved surgical outcomes. We identified forceps using YOLACT++ for fast and accurate segmentation. Differences in the illumination of the environment can affect image recognition accuracy in deep learning. Therefore, we examined the speed and accuracy of YOLACT++ forceps identification in differently illuminated environments. We expected this experiment to help us understand the optimal lighting environments for YOLACT++ and further improve the performance of the forceps identification model. The greatest accuracy was obtained under a light-shielded environment with light shining only on the suture area. Although a laparotomy with a clear view of the surgical site is easier for the physician to operate in, we concluded that the YOLACT++ forceps identification model can be used more effectively in the laparoscopic surgical environment. Clinical Relevance: This study contributes to analyzing the causes of surgical errors in laparoscopic surgery.
12. Reddy M, Jonna P, Perala S, Rao M, Vazhiyal V. Automated Microsurgical Tool Categorization using a Surface-Based EMG System. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-5. PMID: 38083685; DOI: 10.1109/embc40787.2023.10340320.
Abstract
The selection of micro-surgical tools and their usage form a critical component of today's surgical outcomes. The success or failure of a surgery depends on how effectively the surgeon applies the tools to the operating patient. Popular surgical tools, including forceps, clamp applicators, micro-scissors, and needle holders, are typically used in operating procedures, and characterising this set of tools in conjunction with the surgeon's handling remains a topic of continued interest. There have been several computer vision-based approaches to segment and detect the tools used in surgery, in which the microscopic recordings are evaluated; however, the surgeon's haptic feedback and the subtle variations involved are not evidently reported. Video-based detection systems also suffer from the problem of masking or shadowing critical information during use. Hence, efforts to extract localised information during the surgery are very valuable in addition to the video-acquired outcome. A surface-based electromyography (sEMG) system offers electro-muscular signals at the skin that characterise the actions performed. This work evaluates the adoption of sEMG signals for the detection of micro-surgical tools. A pilot study of handling micro-surgical tools and classifying them using two-channel sEMG signals with a machine learning (ML) model is performed. A two-channel sEMG acquisition system connected to the electrodes was designed and used to build a staged and supervised dataset of operated surgical tools. Five hand-crafted features from each individual channel were employed to design an accurate model. An accuracy of 97.43% was achieved when running the ANN model on sEMG signals to classify five surgical tools based on their press-and-release action. The use of sEMG signals for tool detection is a step towards tool characterisation and the assessment of surgeons' skills. Clinical relevance: A two-channel sEMG signal acquisition system to detect micro-surgical tool usage was developed. Mapping the surgical outcome to the tool handling characteristics will be a valuable addition to the existing video-based assessment of surgery. The subtle tool handling characteristics of the surgeon are effective in improving the overall outcome of the surgery, and it is possible to extract this information only with the proposed approach.
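As a hedged sketch of the pipeline described above, hand-crafted features from two sEMG channels feeding a small neural network classifier, the following Python example uses five common time-domain sEMG features and scikit-learn's MLP. The specific features, the synthetic signals, and the network size are assumptions; the abstract does not specify which features or architecture were used.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def channel_features(sig: np.ndarray) -> list:
    """Five common time-domain sEMG features for one channel."""
    diff = np.diff(sig)
    return [
        np.mean(np.abs(sig)),                    # mean absolute value
        np.sqrt(np.mean(sig ** 2)),              # root mean square
        np.sum(np.abs(diff)),                    # waveform length
        np.sum(np.diff(np.sign(sig)) != 0),      # zero crossings
        np.var(sig),                             # variance
    ]

def window_features(two_channel_window: np.ndarray) -> np.ndarray:
    return np.concatenate([channel_features(ch) for ch in two_channel_window])

# Synthetic stand-in dataset: 5 tool classes, 40 windows each, 2 channels x 200 samples.
rng = np.random.default_rng(0)
X, y = [], []
for tool in range(5):
    for _ in range(40):
        window = rng.normal(scale=0.2 + 0.1 * tool, size=(2, 200))
        X.append(window_features(window)); y.append(tool)

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(np.array(X), np.array(y))
print("training accuracy:", clf.score(np.array(X), np.array(y)))
```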
13. Eckhoff JA, Ban Y, Rosman G, Müller DT, Hashimoto DA, Witkowski E, Babic B, Rus D, Bruns C, Fuchs HF, Meireles O. TEsoNet: knowledge transfer in surgical phase recognition from laparoscopic sleeve gastrectomy to the laparoscopic part of Ivor-Lewis esophagectomy. Surg Endosc 2023; 37:4040-4053. PMID: 36932188; PMCID: PMC10156818; DOI: 10.1007/s00464-023-09971-2.
Abstract
BACKGROUND Surgical phase recognition using computer vision presents an essential requirement for artificial intelligence-assisted analysis of surgical workflow. Its performance is heavily dependent on large amounts of annotated video data, which remain a limited resource, especially concerning highly specialized procedures. Knowledge transfer from common to more complex procedures can promote data efficiency. Phase recognition models trained on large, readily available datasets may be extrapolated and transferred to smaller datasets of different procedures to improve generalizability. The conditions under which transfer learning is appropriate and feasible remain to be established. METHODS We defined ten operative phases for the laparoscopic part of Ivor-Lewis Esophagectomy through expert consensus. A dataset of 40 videos was annotated accordingly. The knowledge transfer capability of an established model architecture for phase recognition (CNN + LSTM) was adapted to generate a "Transferal Esophagectomy Network" (TEsoNet) for co-training and transfer learning from laparoscopic Sleeve Gastrectomy to the laparoscopic part of Ivor-Lewis Esophagectomy, exploring different training set compositions and training weights. RESULTS The explored model architecture is capable of accurate phase detection in complex procedures, such as Esophagectomy, even with low quantities of training data. Knowledge transfer between two upper gastrointestinal procedures is feasible and achieves reasonable accuracy with respect to operative phases with high procedural overlap. CONCLUSION Robust phase recognition models can achieve reasonable yet phase-specific accuracy through transfer learning and co-training between two related procedures, even when exposed to small amounts of training data of the target procedure. Further exploration is required to determine appropriate data amounts, key characteristics of the training procedure and temporal annotation methods required for successful transferal phase recognition. Transfer learning across different procedures addressing small datasets may increase data efficiency. Finally, to enable the surgical application of AI for intraoperative risk mitigation, coverage of rare, specialized procedures needs to be explored.
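The CNN + LSTM architecture named above pairs a frame-level feature extractor with a recurrent layer over the clip; in a transfer setting the pretrained backbone can be kept and the phase head re-initialised for the target procedure. The following PyTorch sketch illustrates that generic pattern under stated assumptions (ResNet-18 features, one LSTM layer, ten target phases); it is not TEsoNet itself.

```python
import torch
import torch.nn as nn
from torchvision import models

class PhaseRecognizer(nn.Module):
    def __init__(self, num_phases: int, hidden: int = 256):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()                # 512-d frame features
        self.cnn = backbone
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_phases)

    def forward(self, clip: torch.Tensor):         # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out)                      # per-frame phase logits

# Transfer setup: reuse CNN/LSTM weights from the source procedure, retrain the head.
model = PhaseRecognizer(num_phases=10)             # ten target-procedure phases, per the abstract
for p in model.cnn.parameters():
    p.requires_grad = False                        # optionally freeze the pretrained backbone
logits = model(torch.randn(2, 8, 3, 224, 224))
print(logits.shape)                                # torch.Size([2, 8, 10])
```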
Affiliation(s)
- J A Eckhoff
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA.
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany.
- Y Ban
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- G Rosman
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- D T Müller
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- D A Hashimoto
- Department of Surgery, University Hospitals Cleveland Medical Center, Cleveland, OH, 44106, USA
- Department of Surgery, Case Western Reserve School of Medicine, Cleveland, OH, 44106, USA
- E Witkowski
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- B Babic
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- D Rus
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- C Bruns
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- H F Fuchs
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- O Meireles
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
14. Su B, Zhang Q, Gong Y, Xiu W, Gao Y, Xu L, Li H, Wang Z, Yu S, Hu YD, Yao W, Wang J, Li C, Tang J, Gao L. Deep learning-based classification and segmentation for scalpels. Int J Comput Assist Radiol Surg 2023; 18:855-864. PMID: 36602643; DOI: 10.1007/s11548-022-02825-7.
Abstract
PURPOSE Scalpels are typical cutting tools used in surgery, and the surgical tray is one of the locations where scalpels are present during surgery. However, there is no known method for the classification and segmentation of multiple types of scalpels. This paper presents a dataset of multiple types of scalpels, together with a classification and segmentation method that can be applied as a first step for validating scalpel segmentation; further applications can include distinguishing scalpels from other tools in different clinical scenarios. METHODS The proposed scalpel dataset contains 6400 images with labeled information for 10 types of scalpels, and a classification and segmentation model for multiple types of scalpels is obtained by training on the dataset with Mask R-CNN. The article concludes with an analysis and evaluation of the network performance, verifying the feasibility of the work. RESULTS A multi-type scalpel dataset was established, and classification and segmentation models for multiple scalpel types were obtained by training Mask R-CNN. The average accuracy and average recall reached 94.19% and 96.61%, respectively, in the classification task and 93.30% and 95.14%, respectively, in the segmentation task. CONCLUSION The first scalpel dataset covering multiple types of scalpels is created, and the classification and segmentation of multiple types of scalpels are realized for the first time. This study achieves the classification and segmentation of scalpels in a surgical tray scene, providing a potential solution for scalpel recognition, localization, and tracking.
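A common way to adapt Mask R-CNN to a custom label set such as the 10 scalpel types above is to swap the box and mask heads for the new number of classes. The following torchvision sketch shows that standard fine-tuning recipe; the dataset loading is omitted and the dummy target is illustrative, so this is a generic recipe under stated assumptions rather than the authors' training code.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 11                                    # 10 scalpel types + background
model = maskrcnn_resnet50_fpn(weights="DEFAULT")    # downloads COCO-pretrained weights

# Replace the box and mask heads for the new label set.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)

# One illustrative training step on a dummy image/target pair.
model.train()
images = [torch.rand(3, 480, 640)]
masks = torch.zeros(1, 480, 640, dtype=torch.uint8)
masks[0, 60:120, 50:200] = 1                        # toy instance mask inside the toy box
targets = [{
    "boxes": torch.tensor([[50.0, 60.0, 200.0, 120.0]]),
    "labels": torch.tensor([3]),
    "masks": masks,
}]
loss_dict = model(images, targets)                  # returns classification/box/mask losses
print({k: round(v.item(), 3) for k, v in loss_dict.items()})
```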
Affiliation(s)
- Baiquan Su
- Medical Robotics Laboratory, School of Automation, Beijing University of Posts and Telecommunications, Beijing, China
- Qingqian Zhang
- Medical Robotics Laboratory, School of Automation, Beijing University of Posts and Telecommunications, Beijing, China
- Yi Gong
- Medical Robotics Laboratory, School of Automation, Beijing University of Posts and Telecommunications, Beijing, China
- Wei Xiu
- Chinese Institute of Electronics, Beijing, China
- Yang Gao
- Chinese Institute of Electronics, Beijing, China
- Lixin Xu
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Han Li
- Medical Robotics Laboratory, School of Automation, Beijing University of Posts and Telecommunications, Beijing, China
- Zehao Wang
- Medical Robotics Laboratory, School of Automation, Beijing University of Posts and Telecommunications, Beijing, China
- Shi Yu
- Medical Robotics Laboratory, School of Automation, Beijing University of Posts and Telecommunications, Beijing, China
- Yida David Hu
- Brigham and Women's Hospital, Harvard Medical School, Boston, USA
- Wei Yao
- Gastroenterology Department, Peking University Third Hospital, Beijing, China
- Junchen Wang
- School of Mechanical Engineering and Automation, Beihang University, Beijing, China
- Changsheng Li
- School of Mechatronical Engineering, Beijing Institute of Technology, Beijing, China
- Jie Tang
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Li Gao
- Department of Periodontology, National Stomatological Center, Peking University School and Hospital of Stomatology, Beijing, China
- National Clinical Research Center for Oral Diseases, Beijing, China
- National Engineering Research Center of Oral Biomaterials and Digital Medical Devices, Beijing, China
- Beijing Key Laboratory of Digital Stomatology, Beijing, China
15. Reiter W. Domain generalization improves end-to-end object detection for real-time surgical tool detection. Int J Comput Assist Radiol Surg 2022; 18:939-944. PMID: 36581742; DOI: 10.1007/s11548-022-02823-9.
Abstract
PURPOSE Computer assistance for endoscopic surgery depends on knowledge about the contents of an endoscopic scene. An important step in analysing the video contents is real-time surgical tool detection. Most methods for tool detection nevertheless depend on multi-step algorithms building upon prior knowledge like anchor boxes or non-maximum suppression, which ultimately decrease performance. A real-world difficulty encountered by learning-based methods is limited datasets. Training a neural network on data matching a specific distribution (e.g. from a single hospital or showing a specific type of surgery) can result in a lack of generalization. METHODS In this paper, we propose the application of a transformer-based architecture for end-to-end tool detection. This architecture promises state-of-the-art accuracy while decreasing complexity, resulting in improved run-time performance. To address the lack of cross-domain generalization due to limited datasets, we enhance the architecture with a latent feature space via variational encoding to capture common intra-domain information. This feature space models the linear dependencies between domains by constraining their rank. RESULTS The trained neural networks show a distinct improvement on out-of-domain data, indicating better generalization to unseen domains. Inference with the end-to-end architecture can be performed at up to 138 frames per second (FPS), achieving a speedup compared to older approaches. CONCLUSIONS Experimental results on three representative datasets demonstrate the performance of the method. We also show that our approach leads to better domain generalization.
16. Surgical Tool Datasets for Machine Learning Research: A Survey. Int J Comput Vis 2022. DOI: 10.1007/s11263-022-01640-6.
Abstract
This paper is a comprehensive survey of datasets for surgical tool detection and related surgical data science and machine learning techniques and algorithms. The survey offers a high level perspective of current research in this area, analyses the taxonomy of approaches adopted by researchers using surgical tool datasets, and addresses key areas of research, such as the datasets used, evaluation metrics applied and deep learning techniques utilised. Our presentation and taxonomy provides a framework that facilitates greater understanding of current work, and highlights the challenges and opportunities for further innovative and useful research.
17. Deepika P, Udupa K, Beniwal M, Uppar AM, V V, Rao M. Automated Microsurgical Tool Segmentation and Characterization in Intra-Operative Neurosurgical Videos. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:2110-2114. PMID: 36086279; DOI: 10.1109/embc48229.2022.9871838.
Abstract
Checklist-based routine evaluation of surgical skills in any medical school demands quality time and effort from the supervising expert and is highly influenced by assessor bias. Alternatively, automated video-based surgical skill assessment is a simple and viable method to analyse surgical dexterity offline without the need for the acute presence of an expert surgeon throughout the surgery. In this paper, a novel approach and results for the automated segmentation of microsurgical instruments from a real-world neurosurgical video dataset are presented. The proposed tool segmentation model showcased a mean average precision of 96.7% in detecting and localizing five surgical instruments from the real-world neurosurgical videos. Accurate detection and characterization of motion features of microsurgical tools from the novel annotated neurosurgical video dataset form the key step towards automated surgical skill evaluation. Clinical Relevance: Tool segmentation, localization, and characterization in neurosurgical video have several applications, including assessing surgeons' skills, training novice surgeons, understanding critical operating procedures post surgery, characterizing any critical anatomical response to the tool that leads to the success or failure of the surgery, and building models for conducting autonomous robotic surgery. Semantic segmentation and characterization of microsurgical tools form the basis of modern neurosurgery.
18. Sun X, Zou Y, Wang S, Su H, Guan B. A parallel network utilizing local features and global representations for segmentation of surgical instruments. Int J Comput Assist Radiol Surg 2022; 17:1903-1913. PMID: 35680692; DOI: 10.1007/s11548-022-02687-z.
Abstract
PURPOSE Automatic image segmentation of surgical instruments is a fundamental task in robot-assisted minimally invasive surgery, which greatly improves the context awareness of surgeons during the operation. A novel method based on Mask R-CNN is proposed in this paper to realize accurate instance segmentation of surgical instruments. METHODS A novel feature extraction backbone is built, which could extract both local features through the convolutional neural network branch and global representations through the Swin-Transformer branch. Moreover, skip fusions are applied in the backbone to fuse both features and improve the generalization ability of the network. RESULTS The proposed method is evaluated on the dataset of MICCAI 2017 EndoVis Challenge with three segmentation tasks and shows state-of-the-art performance with an mIoU of 0.5873 in type segmentation and 0.7408 in part segmentation. Furthermore, the results of ablation studies prove that the proposed novel backbone contributes to at least 17% improvement in mIoU. CONCLUSION The promising results demonstrate that our method can effectively extract global representations as well as local features in the segmentation of surgical instruments and improve the accuracy of segmentation. With the proposed novel backbone, the network can segment the contours of surgical instruments' end tips more precisely. This method can provide more accurate data for localization and pose estimation of surgical instruments, and make a further contribution to the automation of robot-assisted minimally invasive surgery.
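The parallel-branch idea above, local features from a convolutional branch and global representations from a transformer branch fused before the head, can be illustrated with the following small PyTorch toy. The channel sizes, patch size, depth, and fusion by concatenation are invented for illustration; this is not the paper's Swin-based backbone.

```python
import torch
import torch.nn as nn

class ParallelBackbone(nn.Module):
    def __init__(self, channels: int = 64, patch: int = 8):
        super().__init__()
        self.conv_branch = nn.Sequential(                      # local-feature branch
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.patch_embed = nn.Conv2d(3, channels, kernel_size=patch, stride=patch)
        encoder_layer = nn.TransformerEncoderLayer(d_model=channels, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)  # global branch
        self.fuse = nn.Conv2d(2 * channels, channels, 1)        # skip-style fusion of both branches

    def forward(self, x):
        local = self.conv_branch(x)                                 # (B, C, H, W)
        tokens = self.patch_embed(x)                                # (B, C, H/p, W/p)
        b, c, hp, wp = tokens.shape
        glob = self.transformer(tokens.flatten(2).transpose(1, 2))  # (B, N, C)
        glob = glob.transpose(1, 2).view(b, c, hp, wp)
        glob = nn.functional.interpolate(glob, size=local.shape[-2:], mode="bilinear",
                                         align_corners=False)
        return self.fuse(torch.cat([local, glob], dim=1))

feat = ParallelBackbone()(torch.randn(1, 3, 128, 128))
print(feat.shape)                                                   # torch.Size([1, 64, 128, 128])
```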
Affiliation(s)
- Xinan Sun
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, 135 Yaguan Road, Tianjin, 300350, China
- School of Mechanical Engineering, Tianjin University, 135 Yaguan Road, Jinnan District, Tianjin, 300350, China
- Yuelin Zou
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, 135 Yaguan Road, Tianjin, 300350, China
- School of Mechanical Engineering, Tianjin University, 135 Yaguan Road, Jinnan District, Tianjin, 300350, China
- Shuxin Wang
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, 135 Yaguan Road, Tianjin, 300350, China
- School of Mechanical Engineering, Tianjin University, 135 Yaguan Road, Jinnan District, Tianjin, 300350, China
- He Su
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, 135 Yaguan Road, Tianjin, 300350, China
- School of Mechanical Engineering, Tianjin University, 135 Yaguan Road, Jinnan District, Tianjin, 300350, China
- Bo Guan
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, 135 Yaguan Road, Tianjin, 300350, China
- School of Mechanical Engineering, Tianjin University, 135 Yaguan Road, Jinnan District, Tianjin, 300350, China
19. Ramesh A, Beniwal M, Uppar AM, V V, Rao M. Microsurgical Tool Detection and Characterization in Intra-operative Neurosurgical Videos. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:2676-2681. PMID: 34891803; DOI: 10.1109/embc46164.2021.9630274.
Abstract
Brain surgery is complex and has evolved as a separate surgical specialty. Surgical procedures on the brain are performed using dedicated micro-instruments which are designed specifically for the requirements of operating with finesse in a confined space. The usage of these microsurgical tools in an operating environment defines the surgical skill of a surgeon. Video recordings of micro-surgical procedures are a rich source of information to develop automated surgical assessment tools that can offer continuous feedback for surgeons to improve their skills, effectively increase the outcome of the surgery, and make a positive impact on their patients. This work presents a novel deep learning system based on the Yolov5 algorithm to automatically detect, localize and characterize microsurgical tools from recorded intra-operative neurosurgical videos. The tool detection achieves a high 93.2% mean average precision. The detected tools are then characterized by their on-off time, motion trajectory and usage time. Tool characterization from neurosurgical videos offers useful insight into the surgical methods employed by a surgeon and can aid in their improvement. Additionally, a new dataset of annotated neurosurgical videos is used to develop the robust model and is made available for the research community.Clinical relevance- Tool detection and characterization in neurosurgery has several online and offline applications including skill assessment and outcome of the surgery. The development of automated tool characterization systems for intra-operative neurosurgery is expected to not only improve the surgical skills of the surgeon, but also leverage in training the neurosurgical workforce. Additionally, dedicated neurosurgical video based datasets will, in general, aid the research community to explore more automation in this field.
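Once a detector such as the one above produces per-frame tool boxes, the characterization step (on-off time, trajectory, usage time) reduces to simple bookkeeping over box centres. The following Python sketch illustrates that idea on a made-up detections dictionary; the frame rate, boxes, and the specific statistics are assumptions, not the paper's characterization pipeline.

```python
import math

fps = 25
# frame index -> (x1, y1, x2, y2) for one detected tool; gaps mean the tool was not visible.
detections = {0: (100, 120, 140, 160), 1: (104, 122, 144, 162), 2: (110, 125, 150, 165),
              10: (300, 200, 340, 240), 11: (305, 204, 345, 244)}

def centre(box):
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)

frames = sorted(detections)
# Path length summed only over consecutive frames, so visibility gaps do not add distance.
path_length = sum(math.dist(centre(detections[f]), centre(detections[f + 1]))
                  for f in frames if f + 1 in detections)
usage_time = len(frames) / fps                     # seconds the tool was on screen
print(f"on-screen time: {usage_time:.2f} s, path length: {path_length:.1f} px")
```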
20. Using deep learning to identify the recurrent laryngeal nerve during thyroidectomy. Sci Rep 2021; 11:14306. PMID: 34253767; PMCID: PMC8275665; DOI: 10.1038/s41598-021-93202-y.
Abstract
Surgeons must visually distinguish soft-tissues, such as nerves, from surrounding anatomy to prevent complications and optimize patient outcomes. An accurate nerve segmentation and analysis tool could provide useful insight for surgical decision-making. Here, we present an end-to-end, automatic deep learning computer vision algorithm to segment and measure nerves. Unlike traditional medical imaging, our unconstrained setup with accessible handheld digital cameras, along with the unstructured open surgery scene, makes this task uniquely challenging. We investigate one common procedure, thyroidectomy, during which surgeons must avoid damaging the recurrent laryngeal nerve (RLN), which is responsible for human speech. We evaluate our segmentation algorithm on a diverse dataset across varied and challenging settings of operating room image capture, and show strong segmentation performance in the optimal image capture condition. This work lays the foundation for future research in real-time tissue discrimination and integration of accessible, intelligent tools into open surgery to provide actionable insights.
21. Accurate instance segmentation of surgical instruments in robotic surgery: model refinement and cross-dataset evaluation. Int J Comput Assist Radiol Surg 2021; 16:1607-1614. PMID: 34173182; DOI: 10.1007/s11548-021-02438-6.
Abstract
PURPOSE Automatic segmentation of surgical instruments in robot-assisted minimally invasive surgery plays a fundamental role in improving context awareness. In this work, we present an instance segmentation model based on refined Mask R-CNN for accurately segmenting the instruments as well as identifying their types. METHODS We re-formulate the instrument segmentation task as an instance segmentation task. Then we optimize the Mask R-CNN with anchor optimization and improved Region Proposal Network for instrument segmentation. Moreover, we perform cross-dataset evaluation with different sampling strategies. RESULTS We evaluate our model on a public dataset of the MICCAI 2017 Endoscopic Vision Challenge with two segmentation tasks, and both achieve new state-of-the-art performance. Besides, cross-dataset training improved the performance on both segmentation tasks compared with those tested on the public dataset. CONCLUSION Results demonstrate the effectiveness of the proposed instance segmentation network for surgical instruments segmentation. Cross-dataset evaluation shows our instance segmentation model presents certain cross-dataset generalization capability, and cross-dataset training can significantly improve the segmentation performance. Our empirical study also provides guidance on how to allocate the annotation cost for surgeons while labelling a new dataset in practice.
22. Bamba Y, Ogawa S, Itabashi M, Shindo H, Kameoka S, Okamoto T, Yamamoto M. Object and anatomical feature recognition in surgical video images based on a convolutional neural network. Int J Comput Assist Radiol Surg 2021; 16:2045-2054. PMID: 34169465; PMCID: PMC8224261; DOI: 10.1007/s11548-021-02434-w.
Abstract
Purpose Artificial intelligence-enabled techniques can process large amounts of surgical data and may be utilized for clinical decision support to recognize or forecast adverse events in an actual intraoperative scenario. To develop an image-guided navigation technology that will help in surgical education, we explored the performance of a convolutional neural network (CNN)-based computer vision system in detecting intraoperative objects. Methods The surgical videos used for annotation were recorded during surgeries conducted in the Department of Surgery of Tokyo Women’s Medical University from 2019 to 2020. Abdominal endoscopic images were cut out from manually captured surgical videos. An open-source programming framework for CNN was used to design a model that could recognize and segment objects in real time through IBM Visual Insights. The model was used to detect the GI tract, blood, vessels, uterus, forceps, ports, gauze and clips in the surgical images. Results The accuracy, precision and recall of the model were 83%, 80% and 92%, respectively. The mean average precision (mAP), the calculated mean of the precision for each object, was 91%. Among surgical tools, the highest recall and precision of 96.3% and 97.9%, respectively, were achieved for forceps. Among the anatomical structures, the highest recall and precision of 92.9% and 91.3%, respectively, were achieved for the GI tract. Conclusion The proposed model could detect objects in operative images with high accuracy, highlighting the possibility of using AI-based object recognition techniques for intraoperative navigation. Real-time object recognition will play a major role in navigation surgery and surgical education. Supplementary Information The online version contains supplementary material available at 10.1007/s11548-021-02434-w.
Collapse
Affiliation(s)
- Yoshiko Bamba
- Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1, Kawadacho Shinjuku-ku, Tokyo, 162-8666, Japan.
| | - Shimpei Ogawa
- Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1, Kawadacho Shinjuku-ku, Tokyo, 162-8666, Japan
| | - Michio Itabashi
- Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1, Kawadacho Shinjuku-ku, Tokyo, 162-8666, Japan
| | | | | | - Takahiro Okamoto
- Department of Breast Endocrinology Surgery, Tokyo Women's Medical University, Tokyo, Japan
| | - Masakazu Yamamoto
- Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1, Kawadacho Shinjuku-ku, Tokyo, 162-8666, Japan
| |
Collapse
|
23
|
Yang C, Zhao Z, Hu S. Image-based laparoscopic tool detection and tracking using convolutional neural networks: a review of the literature. Comput Assist Surg (Abingdon) 2020; 25:15-28. [PMID: 32886540 DOI: 10.1080/24699322.2020.1801842] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022] Open
Abstract
Intraoperative detection and tracking of minimally invasive instruments is a prerequisite for computer- and robotic-assisted surgery. Since additional hardware, such as tracking systems or robot encoders, is cumbersome and lacks accuracy, surgical vision is evolving as a promising technique to detect and track instruments using only endoscopic images. This paper reviews the literature on image-based laparoscopic tool detection and tracking using convolutional neural networks (CNNs) and consists of four primary parts: (1) fundamentals of CNNs; (2) public datasets; (3) CNN-based methods for the detection and tracking of laparoscopic instruments; and (4) discussion and conclusion. To help researchers quickly understand the various existing CNN-based algorithms, basic information and quantitative performance estimates are analyzed and compared from the perspective of 'partial CNN approaches' and 'full CNN approaches'. Moreover, we highlight the challenges in research on CNN-based detection algorithms and outline possible directions for future development.
Collapse
Affiliation(s)
- Congmin Yang
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Zijian Zhao
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Sanyuan Hu
- Department of General Surgery, First Affiliated Hospital of Shandong First Medical University, Jinan, China
| |
Collapse
|
24
|
Tanzi L, Piazzolla P, Vezzetti E. Intraoperative surgery room management: A deep learning perspective. Int J Med Robot 2020; 16:1-12. [PMID: 32510857 DOI: 10.1002/rcs.2136] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2020] [Revised: 04/21/2020] [Accepted: 06/03/2020] [Indexed: 12/22/2022]
Abstract
PURPOSE The current study aimed to systematically review the literature addressing the use of deep learning (DL) methods in intraoperative surgery applications, focusing on data collection, the objectives of these tools and, more technically, the DL-based paradigms utilized. METHODS A literature search of standard databases was performed: using specific keywords, we identified a total of 996 papers. Among them, we selected 52 for detailed analysis, focusing on articles published after January 2015. RESULTS The preliminary results of implementing DL in clinical settings are encouraging. Almost all surgical sub-fields have seen the advent of artificial intelligence (AI) applications, and in the majority of cases the results outperformed previous techniques. From these results, a conceptualization of an intelligent operating room (IOR) is also presented. CONCLUSION This evaluation outlined how AI and, in particular, DL are transforming the surgical field, with numerous applications such as context detection and room management. This process is evolving year by year toward the realization of an IOR equipped with technologies well suited to substantially improving the surgical workflow.
Collapse
|
25
|
Evaluation of Surgical Skills during Robotic Surgery by Deep Learning-Based Multiple Surgical Instrument Tracking in Training and Actual Operations. J Clin Med 2020; 9:jcm9061964. [PMID: 32585953 PMCID: PMC7355689 DOI: 10.3390/jcm9061964] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2020] [Revised: 06/13/2020] [Accepted: 06/15/2020] [Indexed: 12/17/2022] Open
Abstract
As the number of robotic surgery procedures has increased, so has the importance of evaluating surgical skills in these techniques. It is difficult, however, to automatically and quantitatively evaluate surgical skills during robotic surgery, as these skills are primarily associated with the movement of surgical instruments. This study proposes a deep learning-based surgical instrument tracking algorithm to evaluate surgeons’ skills in performing procedures by robotic surgery. The method addresses two main challenges: occlusion and maintaining the identity of the surgical instruments. In addition, surgical skill prediction models were developed using motion metrics calculated from the motion of the instruments. The tracking method was applied to 54 video segments and evaluated by root mean squared error (RMSE), area under the curve (AUC), and Pearson correlation analysis. The RMSE was 3.52 mm; the AUCs at thresholds of 1 mm, 2 mm, and 5 mm were 0.70, 0.78, and 0.86, respectively; and Pearson’s correlation coefficients were 0.90 on the x-axis and 0.87 on the y-axis. The surgical skill prediction models showed an accuracy of 83% with the Objective Structured Assessment of Technical Skill (OSATS) and the Global Evaluative Assessment of Robotic Surgery (GEARS). The proposed method was able to track instruments during robotic surgery, suggesting that the current method of surgical skill assessment by surgeons can be replaced by the proposed automatic and quantitative evaluation method.
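To make the reported tracking-accuracy measures concrete, the sketch below computes RMSE, the fraction of frames with error within a distance threshold, and per-axis Pearson correlations from simulated tip positions; the data and the exact AUC definition here are assumptions for illustration, not the authors' protocol.

```python
# Illustrative tracking-accuracy metrics on simulated 2D instrument-tip positions.
import numpy as np

rng = np.random.default_rng(0)
gt = rng.uniform(0, 100, size=(54, 2))            # ground-truth tip positions (mm)
pred = gt + rng.normal(0, 2.5, size=gt.shape)     # simulated tracker output

errors = np.linalg.norm(pred - gt, axis=1)        # per-frame Euclidean error
rmse = float(np.sqrt(np.mean(errors ** 2)))

for thr in (1.0, 2.0, 5.0):
    within = float(np.mean(errors <= thr))
    print(f"fraction of frames within {thr:.0f} mm: {within:.2f}")
print(f"RMSE: {rmse:.2f} mm")

# Pearson correlation between predicted and ground-truth coordinates, per axis
r_x = np.corrcoef(pred[:, 0], gt[:, 0])[0, 1]
r_y = np.corrcoef(pred[:, 1], gt[:, 1])[0, 1]
print(f"Pearson r: x={r_x:.2f}, y={r_y:.2f}")
```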
Collapse
|
26
|
Madad Zadeh S, Francois T, Calvet L, Chauvet P, Canis M, Bartoli A, Bourdel N. SurgAI: deep learning for computerized laparoscopic image understanding in gynaecology. Surg Endosc 2020; 34:5377-5383. [PMID: 31996995 DOI: 10.1007/s00464-019-07330-8] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2019] [Accepted: 12/24/2019] [Indexed: 11/25/2022]
Abstract
BACKGROUND In laparoscopy, the digital camera offers surgeons the opportunity to receive support from image-guided surgery systems. Such systems require image understanding, the ability for a computer to understand what the laparoscope sees. Image understanding has recently progressed owing to the emergence of artificial intelligence and especially deep learning techniques. However, the state of the art of deep learning in gynaecology only offers image-based detection, reporting the presence or absence of an anatomical structure without locating it. A solution to the localisation problem is given by semantic segmentation, which provides both detection and the pixel-level location of a structure in an image. State-of-the-art results in semantic segmentation are achieved by deep learning, whose use requires a large amount of annotated data. We propose the first dataset dedicated to this task and the first evaluation of deep learning-based semantic segmentation in gynaecology. METHODS We used the deep learning method called Mask R-CNN. Our dataset has 461 laparoscopic images manually annotated with three classes: uterus, ovaries and surgical tools. We split our dataset into 361 images for training Mask R-CNN and 100 images for evaluating its performance. RESULTS The segmentation accuracy is reported as the percentage of overlap between the regions segmented by Mask R-CNN and the manually annotated ones. The accuracy is 84.5%, 29.6% and 54.5% for uterus, ovaries and surgical tools, respectively. Automatic detection of these structures was then inferred from the semantic segmentation results, leading to state-of-the-art detection performance except for the ovaries. Specifically, the detection accuracy is 97%, 24% and 86% for uterus, ovaries and surgical tools, respectively. CONCLUSION Our preliminary results are very promising, given the relatively small size of our initial dataset. The creation of an international surgical database seems essential.
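The overlap measure reported above is commonly computed as intersection over union between predicted and manually annotated masks; the sketch below shows that computation on toy arrays, under the assumption that "percentage of overlap" refers to IoU (the paper's exact definition may differ).

```python
# Minimal sketch of a per-class overlap measure between a predicted binary mask
# and a manual annotation; toy masks only.
import numpy as np

def overlap_percentage(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection over union between two boolean masks, as a percentage."""
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return 100.0 * intersection / union if union else 0.0

pred = np.zeros((256, 256), dtype=bool); pred[60:180, 60:180] = True
gt = np.zeros((256, 256), dtype=bool);   gt[70:190, 70:190] = True
print(f"uterus overlap: {overlap_percentage(pred, gt):.1f}%")
```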
Collapse
Affiliation(s)
- Sabrina Madad Zadeh
- Department of Gynaecological Surgery, CHU Clermont-Ferrand, 1 Place Lucie et Raymond Aubrac, 63000, Clermont-Ferrand, France
- EnCoV, Institut Pascal, CNRS, Université Clermont Auvergne, Clermont-Ferrand, France
| | - Tom Francois
- EnCoV, Institut Pascal, CNRS, Université Clermont Auvergne, Clermont-Ferrand, France
| | - Lilian Calvet
- EnCoV, Institut Pascal, CNRS, Université Clermont Auvergne, Clermont-Ferrand, France
| | - Pauline Chauvet
- Department of Gynaecological Surgery, CHU Clermont-Ferrand, 1 Place Lucie et Raymond Aubrac, 63000, Clermont-Ferrand, France
- EnCoV, Institut Pascal, CNRS, Université Clermont Auvergne, Clermont-Ferrand, France
| | - Michel Canis
- Department of Gynaecological Surgery, CHU Clermont-Ferrand, 1 Place Lucie et Raymond Aubrac, 63000, Clermont-Ferrand, France
- EnCoV, Institut Pascal, CNRS, Université Clermont Auvergne, Clermont-Ferrand, France
| | - Adrien Bartoli
- EnCoV, Institut Pascal, CNRS, Université Clermont Auvergne, Clermont-Ferrand, France
| | - Nicolas Bourdel
- Department of Gynaecological Surgery, CHU Clermont-Ferrand, 1 Place Lucie et Raymond Aubrac, 63000, Clermont-Ferrand, France.
- EnCoV, Institut Pascal, CNRS, Université Clermont Auvergne, Clermont-Ferrand, France.
| |
Collapse
|
27
|
Ni ZL, Bian GB, Xie XL, Hou ZG, Zhou XH, Zhou YJ. RASNet: Segmentation for Tracking Surgical Instruments in Surgical Videos Using Refined Attention Segmentation Network. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2019:5735-5738. [PMID: 31947155 DOI: 10.1109/embc.2019.8856495] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Segmentation for tracking surgical instruments plays an important role in robot-assisted surgery, as it captures accurate spatial information for tracking. In this paper, a novel network, the Refined Attention Segmentation Network, is proposed to simultaneously segment surgical instruments and identify their categories. A U-shaped architecture, popular in segmentation, is used. Unlike previous work, an attention module is adopted to help the network focus on key regions, which improves segmentation accuracy. To address the class imbalance problem, a weighted sum of the cross-entropy loss and the logarithm of the Jaccard index is used as the loss function. Furthermore, transfer learning is adopted: the encoder is pre-trained on ImageNet. The dataset from the MICCAI EndoVis Challenge 2017 is used for evaluation, on which our network achieves state-of-the-art performance of 94.65% mean Dice and 90.33% mean IoU.
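A loss of the kind described above can be sketched in PyTorch as a weighted combination of cross-entropy and the negative log of a soft Jaccard index; the weighting alpha and the soft-Jaccard formulation below are assumptions, not necessarily the exact form used in RASNet.

```python
# Hedged sketch of a cross-entropy + log-Jaccard loss for multi-class segmentation.
import torch
import torch.nn.functional as F

def ce_log_jaccard_loss(logits: torch.Tensor, target: torch.Tensor,
                        alpha: float = 0.5, eps: float = 1e-7) -> torch.Tensor:
    """logits: (N, C, H, W); target: (N, H, W) with integer class indices."""
    ce = F.cross_entropy(logits, target)

    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes=logits.shape[1]).permute(0, 3, 1, 2).float()
    intersection = (probs * one_hot).sum(dim=(0, 2, 3))
    union = (probs + one_hot - probs * one_hot).sum(dim=(0, 2, 3))
    jaccard = ((intersection + eps) / (union + eps)).mean()   # soft per-class IoU, averaged

    return (1 - alpha) * ce - alpha * torch.log(jaccard)

# Toy usage
logits = torch.randn(2, 8, 64, 64, requires_grad=True)
target = torch.randint(0, 8, (2, 64, 64))
loss = ce_log_jaccard_loss(logits, target)
loss.backward()
```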
Collapse
|
28
|
Graph Convolutional Nets for Tool Presence Detection in Surgical Videos. Lecture Notes in Computer Science 2019. [DOI: 10.1007/978-3-030-20351-1_36] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
29
|
|
30
|
A Kalman-Filter-Based Common Algorithm Approach for Object Detection in Surgery Scene to Assist Surgeon's Situation Awareness in Robot-Assisted Laparoscopic Surgery. J Healthc Eng 2018; 2018:8079713. [PMID: 29854366 PMCID: PMC5954863 DOI: 10.1155/2018/8079713] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/10/2017] [Revised: 02/10/2018] [Accepted: 04/03/2018] [Indexed: 02/07/2023]
Abstract
Although the use of surgical robots is rapidly expanding across various medical treatments, safety concerns remain in robot-assisted surgery because the limited view through a laparoscope can compromise situation awareness and lead to surgical errors requiring rapid emergency conversion to open surgery. To support the surgeon's situation awareness and preventive emergency response, this study proposes situation information guidance through a vision-based common algorithm architecture for automatic detection and tracking of intraoperative hemorrhage and surgical instruments. The proposed common architecture comprises localization of the object of interest using texture features and morphological information, followed by Kalman-filter-based tracking for robustness and reduced error. The average recall and precision of instrument detection in four prostate surgery videos were 96% and 86%, and the accuracy of hemorrhage detection in two prostate surgery videos was 98%. Results demonstrate the robustness of the automatic intraoperative object detection and tracking, which can be used to enhance the surgeon's preventive state recognition during robot-assisted surgery.
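For readers unfamiliar with Kalman-filter tracking of per-frame detections, the sketch below shows a generic constant-velocity filter for an instrument-tip position; the state model, noise settings, and missed-detection handling are illustrative assumptions, not the paper's parameters.

```python
# Illustrative constant-velocity Kalman filter for smoothing per-frame 2D detections.
import numpy as np

class KalmanTracker2D:
    def __init__(self, x0: float, y0: float, dt: float = 1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])          # state: [x, y, vx, vy]
        self.P = np.eye(4) * 10.0                      # state covariance
        self.F = np.array([[1, 0, dt, 0],              # constant-velocity transition
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],               # only position is observed
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * 0.01                      # process noise (assumed)
        self.R = np.eye(2) * 4.0                       # measurement noise (assumed)

    def predict(self) -> np.ndarray:
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z: np.ndarray) -> np.ndarray:
        y = z - self.H @ self.x                        # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

tracker = KalmanTracker2D(120.0, 80.0)
for detection in ([122, 81], [125, 83], None, [131, 88]):  # None = missed detection
    tracker.predict()                                       # predict through the gap
    if detection is not None:
        tracker.update(np.array(detection, dtype=float))
```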
Collapse
|