Minireviews
©Author(s) (or their employer(s)) 2026. No commercial re-use. See Permissions. Published by Baishideng Publishing Group Inc.
Artif Intell Gastrointest Endosc. Mar 8, 2026; 7(1): 117988
Published online Mar 8, 2026. doi: 10.37126/aige.v7.i1.117988
Multimodal artificial intelligence in capsule endoscopy: Integrating video and sensor data for advanced gastrointestinal diagnostics
Rishi Chowdhary, Param Darpan Sheth, Insiya Mohammed Rampurawala, Chitresh Kapadia, Chirag Vohra, Rahul Chowdhary, Kirti Arora, Varna Taranikanti, Ashita Rukmini Vuthaluru, Omesh Goyal, Manjeet Kumar Goyal
Rishi Chowdhary, Department of Medicine, MetroHealth Medical Center, Cleveland, OH 44109, United States
Param Darpan Sheth, Insiya Mohammed Rampurawala, Department of Internal Medicine, J.S.S Medical College, JSS Academy of Higher Education and Research, Mysuru 570015, Karnātaka, India
Chitresh Kapadia, Department of Internal Medicine, Government Medical College, Miraj 416410, Mahārāshtra, India
Chirag Vohra, Department of Medicine, All India Institute of Medical Sciences, Jodhpur 342005, Rājasthān, India
Rahul Chowdhary, Kirti Arora, Manjeet Kumar Goyal, Department of Internal Medicine, Cleveland Clinic Akron General Hospital, Akron, OH 44307, United States
Varna Taranikanti, Department of Foundational Medical Studies, Oakland University William Beaumont School of Medicine Rochester, Rochester, MI 48309, United States
Ashita Rukmini Vuthaluru, Department of Anesthesiology, All India Institute of Medical Sciences, New Delhi 110029, Delhi, India
Omesh Goyal, Department of Gastroenterology, Dayanand Medical College and Hospital, Tagore Nagar, Ludhiana 141001, Punjab, India
Author contributions: Chowdhary Ri, Sheth PD, Rampurawala IM, Chowdhary Ra, and Goyal MK performed the conceptualization of the study; Chowdhary Ri, Goyal O, and Kapadia C developed the methodology and design; Chowdhary Ra, Rampurawala IM, Vohra C, Chowdhary Ri, Taranikanti V, and Arora K conducted the literature review and data curation; Rampurawala IM and Goyal MK carried out the visualization and figure preparation; Chowdhary Ra, Sheth PD, Vuthaluru AR, and Rampurawala IM wrote the original draft; all authors contributed to the review and editing of the subsequent versions of the manuscript; Goyal MK provided supervision and validation of the study; all authors read and approved the final version of the manuscript.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Corresponding author: Manjeet Kumar Goyal, DM, DNB, MD, Department of Internal Medicine, Cleveland Clinic Akron General Hospital, 1 Akron General Avenue, Akron, OH 44308, United States. manjeetgoyal@gmail.com
Received: December 22, 2025
Revised: January 8, 2026
Accepted: January 22, 2026
Published online: March 8, 2026
Processing time: 73 Days and 0.6 Hours
Core Tip: Capsule endoscopy (CE) generates thousands of images per study, and manual interpretation combined with imprecise lesion localization creates diagnostic and workflow challenges. Multimodal artificial intelligence, which combines visual data with sensor inputs such as inertial measurement units, magnetic trackers, and physiological monitors, has significantly improved lesion detection, localization, and reading efficiency. Advanced architectures have been reported to achieve sub-millimeter localization accuracy and > 95% diagnostic precision. These developments represent a paradigm shift in CE, transforming it from a passive imaging tool into an intelligent, context-aware diagnostic platform with the potential to enhance accuracy, reduce reading time, and standardize interpretation across clinicians.