Copyright
©The Author(s) 2025.
World J Radiol. Nov 28, 2025; 17(11): 114754
Published online Nov 28, 2025. doi: 10.4329/wjr.v17.i11.114754
Table 1 Methodological comparison of standard large language models and large concept models
| Feature | LLMs | LCMs |
| --- | --- | --- |
| Level of abstraction | Token-level prediction (word/subword) | Concept-level prediction (sentence/idea) |
| Input representation | Processes individual tokens, language-specific | Uses sentence embeddings, language-agnostic |
| Reasoning and planning | Focuses on local predictions, lacks structured reasoning | Explicitly models hierarchical reasoning and structured planning |
| Zero-shot generalization | Requires fine-tuning for new tasks/languages | Strong zero-shot learning across languages and modalities |
| Architectural modularity | Monolithic transformer, hard to modify | Modular design, allows easy extension and updates |
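
To ground the first two rows of Table 1, the minimal Python sketch below contrasts the unit of prediction in each paradigm: an LLM advances generation by one token at a time, whereas an LCM advances by one sentence-level concept embedding at a time. This is an illustrative sketch only; llm_next_token, embed_sentence, and lcm_next_concept are toy stand-ins invented for this example, not the APIs of any published model (Meta's LCM, for instance, operates over a fixed sentence-embedding space such as SONAR rather than the hash-based pseudo-embedding used here).

```python
import numpy as np

# Toy illustration of the abstraction gap between LLMs and LCMs.
# Both "models" below are deterministic stand-ins, not real networks:
# the goal is only to show *what unit* each paradigm predicts.

VOCAB = ["the", "lung", "nodule", "is", "benign", "."]

def llm_next_token(token_ids: list) -> int:
    """Token-level prediction (LLM-style): given a sequence of token
    ids, return the id of the next token. Here the 'model' simply
    cycles through the vocabulary."""
    return (token_ids[-1] + 1) % len(VOCAB)

def embed_sentence(sentence: str, dim: int = 8) -> np.ndarray:
    """Stand-in for a language-agnostic sentence encoder. Hashes the
    sentence to a pseudo-embedding and normalizes it to unit length."""
    rng = np.random.default_rng(abs(hash(sentence)) % 2**32)
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def lcm_next_concept(history: list) -> np.ndarray:
    """Concept-level prediction (LCM-style): given embeddings of prior
    sentences, predict the embedding of the *next whole sentence*.
    The toy 'model' averages the history and renormalizes."""
    v = np.mean(history, axis=0)
    return v / np.linalg.norm(v)

# LLM: one generation step advances by a single token.
tokens = [0, 1, 2]                      # "the lung nodule"
tokens.append(llm_next_token(tokens))   # appends exactly one token

# LCM: one generation step advances by a whole sentence/idea in
# embedding space, independent of the surface language; a separate
# decoder would turn the predicted vector back into text.
report = ["A 6 mm nodule is seen in the right upper lobe.",
          "No prior imaging is available for comparison."]
history = [embed_sentence(s) for s in report]
next_concept = lcm_next_concept(history)

print("LLM step produced token:", VOCAB[tokens[-1]])
print("LCM step produced a concept vector of shape:", next_concept.shape)
```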
Table 2 Key limitations of artificial intelligence in radiology: From machine learning to large language models
| Category | Description | Ref. |
| --- | --- | --- |
| Data requirements | AI models (ML, DL, LLMs) require vast amounts of high-quality, annotated data, which is scarce in the medical domain. Privacy concerns and the cost of data acquisition and annotation are significant barriers | Nadkarni and Merchant[19], 2022; Hager et al[52], 2024 |
| Variability and bias | Differences in imaging protocols, scanner types, and patient demographics can reduce model robustness. Training on biased datasets can perpetuate and even amplify clinical disparities | Marcus et al[79], 2023; Chen et al[80], 2021; Guo et al[81], 2024 |
| Incorrectness and hallucinations | LLMs, in particular, may produce outputs that are factually inaccurate or fabricated. This is a critical issue in high-stakes clinical scenarios where accuracy is paramount | Guo et al[81], 2024; Olabiyi et al[17], 2025; Floridi et al[88], 2018; Bajaj et al[6], 2024 |
| Limited contextual understanding | LLMs operate at a token level and struggle with long-range dependencies, abstract reasoning, and integrating non-linguistic data. This leads to outputs that are often superficial and lack the depth of a human physician’s diagnostic reasoning | Hendrycks et al[86], 2021; Marcus et al[79], 2023; Polonski et al[96], 2018; Bender et al[87], 2021; Najjar et al[18], 2023; Jiang et al[57], 2023; Nam et al[31], 2025; Grandison[64], 2025; Vaswani et al[30], 2017 |
| Lack of interpretability and trust | Many AI models function as “black boxes”, providing no insight into their decision-making process. This opacity undermines trust among clinicians and poses a significant hurdle to clinical adoption and patient safety | Hendrycks et al[86], 2021; Bender et al[87], 2021 |
| Performance in rare diseases | AI performance often deteriorates in rare conditions, atypical presentations, and underrepresented demographics. These “edge cases” demand nuanced reasoning that goes beyond narrow pattern recognition | Busch et al[43], 2025; Ouyang et al[84], 2025; Atil et al[85], 2025 |
| Obsolescence and adaptability | The rapid pace of innovation in radiology means models must continually adapt to new techniques and protocols. Without frequent retraining and validation, AI systems can become obsolete, or their performance can degrade | Najjar et al[18], 2023 |
| Clinical integration and medicolegal liability | Embedding AI into clinical workflows requires extensive infrastructure and training. If an AI error leads to patient harm, the question of legal liability between developers, institutions, and clinicians remains a significant, unresolved barrier | Nadkarni and Merchant[19], 2022; Hager et al[52], 2024; Soni et al[55], 2025 |
| Computational and environmental costs | Training and running large-scale models like LLMs are resource-intensive, expensive, and have a high carbon footprint, which presents a sustainability challenge | Faiz et al[90], 2024; The Institution of Engineering and Technology[92], 2024; Ren et al[93], 2024 |
| Delusions of progress | The overestimation of current AI capabilities, often due to misleading metrics, can lead to a “delusion of progress” that results in flawed decision-making and misplaced trust in high-stakes clinical scenarios | Topol[1], 2019 |
- Citation: Merchant SA, Merchant N, Varghese SL, Shaikh MJS. Large language models and large concept models in radiology: Present challenges, future directions, and critical perspectives. World J Radiol 2025; 17(11): 114754
- URL: https://www.wjgnet.com/1949-8470/full/v17/i11/114754.htm
- DOI: https://dx.doi.org/10.4329/wjr.v17.i11.114754
