Minireviews
Copyright ©The Author(s) 2025.
World J Gastrointest Surg. Nov 27, 2025; 17(11): 112058
Published online Nov 27, 2025. doi: 10.4240/wjgs.v17.i11.112058
Table 1 Types of machine learning
| Type of machine learning | Methods and data used | Applications |
| --- | --- | --- |
| Supervised learning | Labeled data | Predicts a known outcome, e.g., SVM |
| Unsupervised learning | Unlabeled data | Identifies hidden patterns in the data, e.g., K-means clustering |
| Semi-supervised learning | Hybrid: less labeled and more unlabeled data | Useful when obtaining labeled data is expensive or time consuming |
| Reinforcement learning | Learning through interaction and feedback (based on the evolutionary concept of human behavior of reward and punishment) | Useful in robotics and natural language processing |
| Ensemble learning | Multiple base models combined to develop a more accurate predictive model | Enhances accuracy and reduces overfitting |
| Deep learning | Based on artificial neural networks inspired by the neural networks of the human brain | Useful in handling unlabeled data, e.g., image recognition, natural language processing |
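
The ensemble learning row in Table 1 can be illustrated with a minimal majority-vote sketch. The three base models below (`rule_a`, `rule_b`, `rule_c`) and their thresholds are hypothetical toy rules invented for this illustration; they are not taken from the article and carry no clinical meaning.

```python
from collections import Counter

# Three deliberately simple, hypothetical base models. Each maps a pair of
# features (x0, x1) to a class label, 0 or 1.
def rule_a(x):
    return int(x[0] > 0.5)

def rule_b(x):
    return int(x[1] > 0.5)

def rule_c(x):
    return int(x[0] + x[1] > 1.0)

def majority_vote(models, x):
    """Return the label predicted by most of the base models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

ensemble = [rule_a, rule_b, rule_c]

# rule_a and rule_c vote for class 1 here and outvote rule_b.
print(majority_vote(ensemble, (0.9, 0.2)))
```

Practical ensembles such as random forests or boosting combine trained models rather than fixed rules, but the combining step follows the same idea: individual errors are less likely to dominate the final prediction.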
Table 2 Comparison between supervised and unsupervised machine learning
| Characteristics | Unsupervised | Supervised |
| --- | --- | --- |
| Definition | The machine tries to find hidden patterns in the data by itself, without human intervention | The machine identifies patterns in new data based on labeled input data, with human intervention |
| Input data | Unlabeled | Labeled |
| When to use | You do not know what you are looking for in the data | You know what you are looking for in the data |
| Typical tasks | Clustering and association problems | Classification and regression problems |
| Accuracy of results | May provide less accurate results | Provides more accurate results |
| Common algorithms | K-means clustering, hierarchical clustering, principal component analysis | Support vector machine, decision tree, random forest |
| Examples of use | Anomaly detection, customer segmentation, preparing data for supervised learning | Spam filters, price prediction, image identification |
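
The contrast in Table 2 can be made concrete by running both approaches on the same toy numbers. The following is a rough sketch under stated assumptions: the values and the `"low"`/`"high"` labels are invented, `kmeans_1d` is a bare-bones one-dimensional k-means, and `fit_centroids`/`predict` form a simple nearest-centroid classifier standing in for the supervised algorithms listed in the table.

```python
import random

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Unsupervised: group unlabeled numbers into k clusters."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # start from k of the data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center; keep the old one if a cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

def fit_centroids(values, labels):
    """Supervised: learn one mean per known label."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    return {y: sum(vs) / len(vs) for y, vs in groups.items()}

def predict(centroids, v):
    """Assign a new value to the label with the closest mean."""
    return min(centroids, key=lambda y: abs(v - centroids[y]))

# Invented toy data: two well-separated groups of numbers.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]

centers = kmeans_1d(values)             # uses the values only
model = fit_centroids(values, labels)   # uses values and labels
print(centers, predict(model, 4.5))
```

The unsupervised routine recovers two group centers without being told that groups exist, whereas the supervised routine can name the group of a new value because the labels were supplied during training.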
Table 3 Commonly used machine learning model in liver transplant
| Model | Characteristics and uses |
| --- | --- |
| Random forest | Can predict post-transplant outcomes |
| Decision tree | Useful for classification and regression tasks |
| Logistic regression | Useful in predicting binary outcomes such as survival or graft failure |
| Support vector machines | Useful in classification, regression, and outlier detection. Can predict patient outcomes from medical records and diagnose diseases |
| K-nearest neighbors | Classifies data points based on their proximity to other data points, assuming they share similar traits. Can predict patient outcomes by comparing new data to historical cases |
| Artificial neural networks | Based on the human brain; consist of multiple layers of neurons. Useful in image and speech recognition and in predictive models |
| Gradient boosting machines (GBM) | Ensemble learning method. Useful in regression and classification problems. Can handle large-scale data |
| XGBoost | Advanced form of GBM. Less prone to overfitting than GBM |
| AdaBoost | Ensemble learning method. Useful in image classification, sentiment analysis, and fraud detection |
| RuleFit | Ensemble learning method. Combines decision trees and linear models to form interpretable rules that reveal data patterns. Useful for understanding decisions, especially in regulatory contexts |
| TabTransformer | Useful for handling tabular data |
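
As an illustration of the K-nearest neighbors entry in Table 3, the sketch below classifies a new case by a vote among its closest historical cases. Everything in it is an assumption made for illustration: the two numeric features, the `"good"`/`"poor"` labels, and the choice of `k=3` are invented and do not represent real transplant data.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Pair every historical point with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Invented "historical cases": two made-up, already scaled features each.
history = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
           (6.0, 6.5), (7.0, 6.0), (6.5, 7.0)]
outcomes = ["good", "good", "good", "poor", "poor", "poor"]

print(knn_predict(history, outcomes, (1.8, 1.6)))
```

Because the method relies on distances, features measured on different scales would normally be standardized first; the toy features here are assumed to be on a common scale already.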
Table 4 Limitation of artificial intelligence and its potential solutions
| Limitations of AI | Potential solutions |
| --- | --- |
| Data and model drift: (a) Data availability and data quality; (b) Complexity of medical databases; (c) Missing data points; (d) Inconsistent reporting of data class; (e) Imbalance in training data; and (f) Perpetuation of socioeconomic factors that affect outcomes | (a) Curated datasheets; (b) Adaptation and development of new models; (c) Multimodal AI models; (d) Inclusion of diverse populations in data; and (e) Ranking of AI models based on fairness metrics |
| Ethical concerns: (a) Data privacy and security; (b) Equitable access; and (c) Integration with clinical practice | (a) Transparency of AI models; (b) Address biases; (c) Ensure data privacy and security; and (d) Establish accountability with regular audit and monitoring |
| Spectrum bias and overfitting | (a) Diversified and representative training data; (b) Identifying and mitigating bias; (c) Cross-validation; and (d) Ensemble learning |
| Hallucinations due to insufficient or dirty training data | (a) High-quality training data; (b) Fine-tuning of AI models; (c) Inclusion of fact-checking mechanisms; and (d) Retrieval-augmented generation |
| Generalizability (difficulty in achieving a similar level of accuracy across different geographies or populations) | (a) Use of curated training datasets; and (b) Population-based validation of AI models |
| Interpretability (due to the black-box design of AI models) | (a) Shapley analysis; (b) Explainable AI models; (c) Saliency maps; and (d) Surrogate models |
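
Cross-validation, listed in Table 4 as one way to mitigate overfitting, can be sketched as a splitting routine. The helper name `k_fold_indices` is hypothetical, and the sketch assumes the samples have already been shuffled (and stratified, if class imbalance is a concern) before the folds are cut.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Every sample is held out exactly once across the k folds.
for train_idx, test_idx in k_fold_indices(10, 3):
    print(len(train_idx), len(test_idx))
```

A model would be fitted on each `train_idx` and scored on the matching `test_idx`; a large gap between training and held-out performance is a common sign of overfitting.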