Merchant SA, Merchant N, Varghese SL, Shaikh MJS. Large language models and large concept models in radiology: Present challenges, future directions, and critical perspectives. World J Radiol 2025; 17(11): 114754 [DOI: 10.4329/wjr.v17.i11.114754]
Research Domain of This Article
Radiology, Nuclear Medicine & Medical Imaging
Article-Type of This Article
Review
Journal Information of This Article
Publication Name
World Journal of Radiology
ISSN
1949-8470
Publisher of This Article
Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA
Author contributions: Merchant SA was responsible for the conceptualization of the critical perspective, defining the scope and structure, conducting the primary literature searches and synthesis, and drafting the initial and final versions of the manuscript; Merchant N, Varghese SL, and Shaikh MJS performed supplementary literature searches, provided critical intellectual content throughout the analysis, and critically reviewed and revised the manuscript for scientific accuracy; and all authors have read and approved the final manuscript.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/
Corresponding author: Suleman A Merchant, MD, Former Dean, Professor and Head, Department of Radiology, LTM Medical College and LTM General Hospital, Sion, Mumbai 400022, Maharashtra, India. suleman.a.merchant@gmail.com
Received: September 27, 2025 Revised: October 7, 2025 Accepted: November 3, 2025 Published online: November 28, 2025 Processing time: 61 Days and 6.7 Hours
Abstract
Large language models (LLMs) have emerged as transformative tools in radiology artificial intelligence (AI), offering significant capabilities in areas such as image report generation, clinical decision support, and workflow optimization. The first part of this manuscript presents a comprehensive overview of the current state of LLM applications in radiology, including their historical evolution, technical foundations, and practical uses. Despite notable advances, inherent architectural constraints, such as token-level sequential processing, limit their capacity for deep abstract reasoning and holistic contextual understanding, which are critical for fine-grained diagnostic interpretation. We provide a critical perspective on current LLMs and discuss key challenges, including model reliability, bias, and explainability, highlighting the pressing need for novel approaches to advance radiology AI. Large concept models (LCMs) represent a nascent and promising paradigm in radiology AI, designed to transcend the limitations of token-level processing by utilizing higher-order conceptual representations and multimodal data integration. The second part of this manuscript introduces the foundational principles and theoretical framework of LCMs, highlighting their potential to facilitate enhanced semantic reasoning, long-range context synthesis, and improved clinical decision-making. Critically, the core of this section is the proposal of a novel theoretical framework for LCMs, formalized and extended from our group’s foundational concept-based models - the world’s earliest articulation of this paradigm for medical AI. This conceptual shift has since been externally validated and propelled by the recent publication of the LCM architectural proposal by Meta AI, providing a large-scale engineering blueprint for the future development of this technology. We also outline future research directions and the transformative implications of this emerging AI paradigm for radiologic practice, aiming to provide a blueprint for advancing toward human-like conceptual understanding in AI. While challenges persist, we are at the very beginning of a new era, and it is not unreasonable to hope that future advancements will overcome these hurdles, pushing the boundaries of AI in radiology far beyond even the most state-of-the-art models of today.
Core Tip: This review examines the current capabilities, applications, and limitations of large language models (LLMs) in radiology artificial intelligence (AI). LLMs have transformed radiology AI, improving textual analysis, workflow automation, and clinical decision support, yet they face challenges including limited reasoning depth, token-level processing, hallucinations, and barriers to clinical adoption. We discuss the transformative role of LLMs in radiology, their architectural foundations, and their clinical utility, and then explore the new paradigm of large concept models, which offer conceptual reasoning and multimodal integration to enhance clinical accuracy and reliability. Ethical, regulatory, and explainability considerations for AI tools in healthcare are also discussed, providing a balanced, forward-looking view of AI’s role in radiology that covers both current innovations and anticipated advances through large concept models.
Citation: Merchant SA, Merchant N, Varghese SL, Shaikh MJS. Large language models and large concept models in radiology: Present challenges, future directions, and critical perspectives. World J Radiol 2025; 17(11): 114754
Artificial intelligence (AI) has profoundly influenced radiology practice, particularly through the development of large language models (LLMs) that support textual analysis, report generation, and clinical workflow automation. The impact is felt across the board: For radiology and other clinicians, mainly via rapid and accurate image interpretation; for health systems, by major workflow improvements and their immense potential for reducing medical errors; and for patients, by enabling them to better process their own medical data for health advocacy[1]. However, LLMs’ token-based architecture inherently limits their reasoning depth and the conceptual robustness necessary for complex diagnostic tasks. By reviewing both achievements and limitations, we lay down the foundational understanding essential for appreciating next-generation AI paradigms in radiology.
Part 2
Building upon the exploration of LLMs in part 1, part 2 introduces large concept models (LCMs) as a conceptual framework for a new paradigm in radiology AI. Specifically, the core contribution of part 2 is the proposal of a novel theoretical framework for LCMs, directly extended and formalized from our foundational concept-based models first articulated in 2022. This approach represents the world’s earliest articulation of concept-based models for medical AI, emphasizing a holistic approach and the integration of clinical expertise and multimodal data to enable robust AI reasoning beyond traditional token-level approaches.
While this foundational work has since been validated and extended by other research groups, including Meta AI (who have adopted the term “LCMs” and demonstrated implementations grounded in explicit semantic representations and cross-modal reasoning), our manuscript synthesizes these external advances with novel theoretical extensions rooted in our original conceptual framework. By integrating these strands, part 2 offers a unified, forward-looking framework for LCMs, situating our early vision within the evolving scientific landscape and outlining pathways toward clinical translation. Our contribution is therefore both foundational and progressive, setting scholarly expectations and underscoring the unique role our work plays in advancing AI’s conceptual evolution in radiology. It is important to note, however, that these are early beginnings, with significant development still required before LCMs become integrated into everyday clinical use.
The advent of LLMs, epitomized by OpenAI’s ChatGPT, marked a transformative leap in natural language processing (NLP). However, despite their impressive capabilities, LLMs face inherent architectural limitations, especially in specialized domains such as radiology. Their token-level sequential processing constrains their capacity for robust abstract reasoning and deep conceptual understanding. By predicting one word or sub-word at a time based on preceding tokens, LLMs struggle to capture long-range dependencies and the complex semantic relationships essential for nuanced diagnostic interpretation. For radiology stakeholders, it is crucial to recognize that this token-level processing is fundamentally insufficient for the intricate complexities of diagnostic reasoning, where contextual understanding and abstract inference are paramount. The LCM paradigm promises to address these persistent challenges in diagnostic accuracy, interpretability, and contextual synthesis, signaling a critical advance toward more intelligent and clinically meaningful AI systems in radiology.
PART 1: CURRENT CAPABILITIES AND LIMITATIONS OF LLMS IN RADIOLOGY AI
The capabilities of LLMs in radiology
This is a rapidly expanding field, with new research emerging daily. Results already indicate that LLMs will soon be applied to every aspect of radiology practice[2]. They have proven to be valuable in answering radiology-related questions, providing clarifications regarding procedures, and offering general information about different types of imaging modalities[3]. LLMs play a significant role in patient triage and workflow optimization, helping in the automated determination of imaging studies and protocols based on radiology request forms[4] and the generation of radiology reports[5]. Algorithms like ChatGPT can improve patient outcomes, increase the efficiency of radiology interpretation, and aid in the overall workflow of radiologists[6,7]. Adams et al[7] reported that Generative Pre-trained Transformer (GPT)-4 was an effective tool for post hoc structured reporting in radiology, with its autonomous ability to select the most appropriate structuring template making it a highly scalable and easily implementable solution that required minimal database restructuring effort. Furthermore, LLMs have been shown to be versatile tools for establishing an accurate differential diagnosis, with the potential to improve clinicians’ diagnostic reasoning and accuracy in challenging cases and to empower physicians and widen patients’ access to specialist-level expertise[8]. ChatRadio-Valuer, for example, surpassed state-of-the-art models in disease diagnosis from radiology reports and alleviated the annotation workload of experts[9]. This part of the manuscript explores the evolution and current landscape of LLMs in radiology AI, emphasizing their capabilities and practical applications. Part 2 will cover LCMs.
Historical context and evolution of AI to LCMs: To appreciate the significance of LLMs and LCMs, it is important to trace the historical evolution of AI, from foundational principles established decades ago to the emergence of LCMs in late 2024. This journey represents a continuum of transformative milestones, with many advances still ahead.
Early foundations-symbolic AI: The birth of AI is often marked by the Dartmouth Conference in 1956, where the term “AI” was coined and the initial aspirations for creating thinking machines were discussed. This early era was dominated by symbolic AI, which focused on representing knowledge through symbols, rules, and logical structures. Expert systems, designed to mimic the decision-making of human experts using predefined rules, were a prominent application of this approach[10,11]. The conference brought together researchers from various disciplines with the specific intent of exploring the possibility of creating machines that could exhibit intelligent behavior, and it laid the foundation for the field: Many of the early AI programs were started in its wake, it established key areas of research that continue to be explored today, and it fostered collaborations among researchers who would go on to make significant contributions to AI. The Dartmouth Conference thus marked the beginning of AI as a distinct field of research and set the stage for decades of development and innovation. While the conference was groundbreaking, the early optimism about achieving human-level AI quickly proved to be overly ambitious[12], and the field experienced periods of “AI winters” when funding and interest waned[13]. Nevertheless, the conference’s legacy remains strong, and its impact on the development of modern AI is undeniable[14]. Turing’s question “Can machines think?”, along with his further fundamental contributions to AI, inspired the Dartmouth participants to look forward to building the first machines that could pass the Turing test[15]; the goal of creating machines that could demonstrate intelligent behavior emanated directly from Turing’s inquiries[16]. Early AI relied on rule-based systems, which struggled with real-world complexities, particularly in handling uncertainty and learning from data. The manual creation of these rules was time-consuming, brittle, and difficult to scale, ultimately limiting the capabilities of symbolic AI in addressing complex, real-world problems. This inherent limitation paved the way for the rise of data-driven approaches[11,17].
Machine learning, neural networks, and deep learning: Machine learning (ML) involves developing software that learns autonomously from data, identifying patterns to make predictions or decisions without explicit programming[18]. Its main objective is to enable systems to adapt without human intervention. ML methods are broadly divided into supervised learning, which predicts or classifies new data from labeled examples, and unsupervised learning, which discovers hidden patterns without pre-defined outputs[19] (a minimal illustration of this split follows below). By the 1980s, ML marked a shift from symbolic AI. Decision trees[20], support vector machines[21], and Bayesian networks, which demonstrated success in various tasks and offered a more adaptive, data-driven approach, broadened the AI landscape in healthcare. These algorithms enabled data-driven predictions and classification, offering a more flexible alternative to rigid rule-based systems[11,17]. ML soon radically improved radiology, enabling automated pattern recognition across modalities including X-ray, computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography, and even across radiology reports, supporting diagnosis, reducing errors, and extracting complex, high-level information even from unlabeled datasets[18,22]. It also fueled advances in computer-aided diagnosis, which integrated radiology, pathology, and genomics data to improve workflows and productivity[23].
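To make the supervised/unsupervised distinction concrete, the following minimal Python sketch uses synthetic two-feature data (not real imaging features): A nearest-centroid classifier is learned from labeled examples, while a simple 2-means clustering discovers the same two groups without any labels. This is an illustration of the two paradigms, not a clinical model.

```python
import numpy as np

rng = np.random.default_rng(4)
benign = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2))
malignant = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(20, 2))
X = np.vstack([benign, malignant])
y = np.array([0] * 20 + [1] * 20)              # labels available: supervised setting

# Supervised: learn a nearest-centroid classifier from the labeled examples.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
def classify(x):
    return int(np.argmin(((centroids - x) ** 2).sum(axis=1)))

# Unsupervised: 2-means clustering recovers the same structure without labels.
centers = np.stack([X[0], X[-1]])              # one seed point from each region
for _ in range(10):
    assign = np.array([np.argmin(((centers - x) ** 2).sum(axis=1)) for x in X])
    centers = np.stack([X[assign == c].mean(axis=0) for c in (0, 1)])

print(classify(np.array([2.8, 3.1])))          # -> 1 (the "malignant" cluster)
print(np.unique(assign, return_counts=True))   # cluster sizes found without labels
```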
Neural networks: Although basic neural networks date back several decades, they gained momentum in language modeling after 2003, when Bengio et al[24] developed the first feed-forward neural network language model that predicted words in sequence, overcoming earlier statistical limitations and laying the foundations for modern NLP. While early neural nets showed promise in pattern recognition, they struggled with sequential data and lacked the scale for complex tasks[17]. Recurrent neural networks (RNNs) later improved contextual information retention, with long short-term memory networks[25] and gated recurrent units becoming dominant sequence-processing architectures[26]. These networks powered breakthroughs in image recognition and NLP, fueling an AI renaissance[27]. However, RNNs faced two key challenges: (1) Retaining information across long sequences; and (2) Limited parallelization during training, which constrained scalability. These shortcomings highlighted the need for deeper, more efficient architectures.
Deep learning: Deep learning (DL), a subset of ML, uses multilayered (“deep”) artificial neural networks to model complex data patterns[18,19] and proved particularly effective for tasks where manual feature extraction was impractical, especially in large imaging datasets.
Significant limitations remain: DL and ML both require large, diverse, and well-annotated datasets, which are often difficult to obtain due to privacy restrictions, institutional variability, and the high cost of expert labeling. Models are also vulnerable to bias and poor generalizability, as performance often drops when applied to underrepresented demographics, new imaging protocols, or to data from different equipment. Furthermore, their interpretability remains limited: Deep neural networks function as “black boxes,” making clinical trust and regulatory approval difficult[28]. Equally pressing are integration and sustainability issues. Clinical adoption requires seamless incorporation into workflows, ongoing retraining to match rapidly evolving imaging techniques, and careful attention to ethical, regulatory, and workforce considerations[19]. Overfitting and narrow task-specific performance further constrain real-world utility, underscoring the gap between research prototypes and practical deployment. In summary, ML laid the groundwork for automated pattern recognition in radiology, neural networks expanded contextual understanding, and DL harnessed depth and scale for tackling complex imaging tasks. Together, these approaches have transformed radiology while still grappling with data scarcity, bias, interpretability, and integration challenges.
The attention mechanism breakthrough and the Transformers
Bahdanau et al[29] introduced the attention mechanism in 2014, addressing a challenge faced by traditional sequence-to-sequence architectures, which were known to struggle with longer input sentences. This was a significant breakthrough, particularly for neural machine translation. Previous methods had to compress the entire input into a fixed-length vector and therefore struggled with long sentences; the attention mechanism instead allowed the model to dynamically “pay attention” to the relevant parts of the input sequence when generating each word of the output, improving translation quality immensely. This innovation directly addressed a key limitation of earlier neural network architectures in NLP.
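For readers who want to see the mechanism itself, the following minimal sketch computes Bahdanau-style additive attention with toy NumPy arrays. The matrices W1, W2 and vector v stand in for learned parameters, and the dimensions are purely illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(decoder_state, encoder_states, W1, W2, v):
    # Score each encoder hidden state h_i against the decoder state s:
    # e_i = v^T tanh(W1 s + W2 h_i), then normalize into attention weights.
    scores = np.array([v @ np.tanh(W1 @ decoder_state + W2 @ h)
                       for h in encoder_states])
    weights = softmax(scores)                        # where to "pay attention"
    context = (weights[:, None] * encoder_states).sum(axis=0)
    return context, weights

rng = np.random.default_rng(0)
d = 4                                                # toy hidden size
encoder_states = rng.normal(size=(6, d))             # 6 input positions
decoder_state = rng.normal(size=d)
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
v = rng.normal(size=d)
context, weights = additive_attention(decoder_state, encoder_states, W1, W2, v)
print(weights.round(3), context.round(3))            # weights sum to 1
```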
Transformers: Subsequently, eight Google researchers wrote the historic “Transformers paper”, arguably the most consequential technology breakthrough in recent history[27]. The Transformer was a revolutionary neural network architecture that relied entirely on the attention mechanism, eliminating the need for RNNs or convolutional neural networks, which were previously the dominant architectures for sequence-to-sequence tasks. The difficulty of handling long sequences effectively before the “Transformer era” was a core motivation for its development, as highlighted in the “Attention is all you need” paper[30]. The Transformer architecture provided a more effective way to process sequential data, paving the way for significant advancements in NLP. Although the Transformer used “attention”, it went well beyond the original attention mechanism: Bahdanau et al[29] introduced the concept of attention, while the Google team built an entire architecture around it. The Transformer used self-attention, which lets the model relate every part of the input to every other part of the same input, within an encoder-decoder system that greatly enhances its capability. This allowed the model to: (1) Directly access and weigh the importance of different parts of the input sequence when processing each position, addressing the long-range dependency problem; and (2) Perform parallel processing. Unlike RNNs, the Transformer architecture allows for significant parallelization during training, making it possible to train much larger models on massive datasets and unlocking unprecedented levels of language understanding and generation. This scalability and improved handling of context were crucial steps towards the development of LLMs and their subsequent evolutions, including the multi-modal LLMs (MLLMs) used in medical imaging[31]. Although Bahdanau et al’s work[29] was a crucial stepping stone that paved the way for the Transformer architecture, the Google team took that idea and built a much more powerful and versatile system. They started with neural networks and made them into something else: A digital system so powerful that its output can feel like the product of an alien intelligence. Called Transformers, this architecture is the not-so-secret sauce behind all those mind-boggling AI products, including ChatGPT, Gemini, and the omnipresent, mesmerizing graphic generators one interacts with of late[27]. The Transformer innovation was a turning point, enabling more advanced NLP models and providing the impetus for the emergence of sophisticated language models like Llama 2 (developed by Meta AI), GPT-4 (developed by OpenAI), and a host of others[32], including the recently released Llama 4, with a mind-boggling 2-trillion-parameter model[33]. All of these, underpinned by extensive training data, have elevated NLP to a level of understanding and text generation that closely approximates human-like language. The success of the Transformer architecture demonstrated the power of scaling and novel architectural designs in overcoming previous limitations in NLP. Earlier, the dominant sequence transduction models were based on recurrent or convolutional neural networks in an encoder-decoder configuration; Vaswani et al[30] proposed a novel, simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely.
Their model entirely replaced recurrent and convolutional networks with self-attention mechanisms, enabling much greater parallelization and efficiency. Their single model with 165 million parameters achieved 27.5 Bilingual Evaluation Understudy (BLEU) on English-to-German translation, improving over the then-best ensemble result by over 1 BLEU. On English-to-French translation, it outperformed the previous single-model state of the art by 0.7 BLEU, achieving a BLEU score of 41.1[30]. This significantly outperformed previous state-of-the-art models and gave an immense boost to AI. Google, however, was slow to utilize this concept invented by eight people from its own team, whereas a then start-up called OpenAI was quick to pounce upon the Transformer-based concept[27]; that is how OpenAI and Sam Altman made their names. It is striking how clearly the authors anticipated the broad applicability of Transformer models beyond text: Their vision of extending the architecture to images, audio, and video has come to fruition in a massive way, and their statement “We are excited about the future of attention-based models”[30] now reads like a profound understatement. It is fascinating that a seemingly simple title, inspired by a famous Beatles song (all you need is love), ended up attached to such a revolutionary piece of work, one that has reshaped the entire landscape of AI.
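The core Transformer operation can likewise be illustrated in a few lines. The sketch below computes scaled dot-product self-attention for a toy sequence; real Transformers add multi-head projections, positional encodings, residual connections, and feed-forward layers, all omitted here. Note that every position attends to every other position in a single matrix product, which is precisely the parallelism discussed above.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv               # queries, keys, values
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # all positions vs all positions
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)             # row-wise softmax
    return w @ V                                   # one matrix product: parallel

rng = np.random.default_rng(1)
seq_len, d_model = 5, 8
X = rng.normal(size=(seq_len, d_model))            # one embedding per token
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)         # (5, 8): contextualized outputs
```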
Impact on LLMs: The Transformer architecture was the foundation on which LLMs like GPT[34], Bidirectional Encoder Representations from Transformers (BERT)[35], and Google’s Bard[36] were built, enabling advancements in tasks such as machine translation, question answering (QA), and multimodal generative AI. For example, Google’s Bard sought to combine the breadth of the world’s knowledge with the power, intelligence, and creativity of Google’s LLMs, drawing information from the web to provide fresh, high-quality responses[36]. Pichai explained how AI could deepen our understanding of information and turn it into useful knowledge more efficiently, making it easier for people to get to the heart of what they were looking for. The Transformer’s ability to process long sequences and its parallelization capabilities enabled the training of models with an unprecedented number of parameters, leading to the emergence of LLMs with remarkable language capabilities[17,30]. This represented a significant leap in AI’s ability to understand and generate human-like text; yet these models still primarily operated on the statistical relationships between words (tokens). Bard (now known as Gemini) was initially powered by a family of LLMs developed by Google AI called Language Model for Dialogue Applications (LaMDA)[37]; LaMDA itself was built upon the Transformer architecture.
Human-like language approximation: The capabilities demonstrated by models like GPT-4, Gemini, and Llama 2 in terms of fluency, coherence, and contextual understanding are evident in their performance on various NLP benchmarks and in real-world applications. While “human-like” is a complex and debated term, the advancements in these models represent a significant leap in text generation quality. These capabilities are often showcased in their respective technical reports and evaluations. However, this impressive linguistic ability often masks a fundamental limitation: LLMs primarily excel at manipulating tokens and identifying statistical patterns in language, rather than possessing a genuine understanding of the underlying concepts and their relationships in the real world. This lack of deep conceptual understanding becomes particularly apparent when faced with tasks requiring abstract reasoning or cross-modal integration, an integral part of current healthcare needs, and limits their ability to truly understand and reason about the concepts presented to them.
Scaling up of LLMs
LLMs like GPT-3 (2020) demonstrated unprecedented capabilities in generating human-like text by scaling up parameters and training on vast datasets, and they soon began driving innovations across a vast spectrum of industries. The earliest LLMs were mainly designed as general-purpose chatbots. The deployment of LLMs within the healthcare sector sparked both enthusiasm and apprehension: These models exhibited the remarkable capability to provide proficient responses to free-text queries, demonstrating a nuanced understanding of professional medical knowledge[38]. While scaling has led to emergent capabilities, such as improved performance on certain reasoning tasks, it has not fundamentally altered the core mechanism of token prediction. This inherent limitation restricts their ability to perform robust abstract reasoning, integrate information across different modalities (like text and images), and truly grasp the conceptual underpinnings of the data they process - the very challenges that LCMs aim to address.
Specialized medical domain models and limitations
Subsequently, research focused on developing specialized models for the medical domain, such as Meditron[39], Med-PaLM[40], GMAI[41], and BioMistral[42], by enriching the training data of LLMs with medical knowledge. However, this approach required significant computational resources that were not commonly available, and it was not applicable to closed-source LLMs, which are often the most powerful[43]. While LLMs had achieved state-of-the-art performance on a wide range of subjects, including medical QA tasks, they still faced challenges with hallucinations and outdated knowledge[44]. Hence, techniques such as the already widely adopted retrieval-augmented generation (RAG)[45] were introduced in the medical domain, allowing information to be retrieved dynamically from medical databases during the generation process and thereby enriching the output with medical knowledge without any need to retrain the model[44] (a minimal sketch of the pattern follows below). The reliance on external knowledge retrieval mechanisms like RAG highlights the LLMs’ inherent limitations in storing and reasoning with vast amounts of information in a conceptually rich and readily accessible manner[45]. It underscores their dependence on surface-level pattern matching rather than a deep understanding of the underlying medical concepts. Despite their proficiency in language, the lack of deep conceptual understanding can lead to errors in reasoning and the generation of plausible but incorrect medical information (hallucinations), a critical concern in high-stakes domains like healthcare, where accuracy and reliability are paramount. This fundamental limitation underscores the need for AI models that go beyond surface-level language processing and possess a more robust understanding of medical concepts and their interrelationships[31].
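The RAG pattern itself is simple to sketch. In the toy example below, embed() is a hypothetical stand-in for a real embedding model, and the “generation” step is reduced to building a grounded prompt; production systems use a vector database and an actual LLM call rather than these placeholders.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical text encoder; a deterministic stand-in for a real
    # embedding model (the vectors here carry no semantic meaning).
    r = np.random.default_rng(abs(hash(text)) % (2 ** 32))
    return r.normal(size=64)

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    def score(d):
        e = embed(d)
        return -np.dot(q, e) / (np.linalg.norm(q) * np.linalg.norm(e))
    return sorted(documents, key=score)[:k]

def rag_prompt(query: str, documents: list[str]) -> str:
    # Ground the model's answer in retrieved context instead of relying on
    # the model's parametric memory; the prompt is then sent to an LLM.
    context = "\n".join(retrieve(query, documents))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

corpus = ["Rifampicin is first-line therapy for tuberculosis but can cause hepatitis.",
          "Gadolinium contrast is contraindicated in severe renal impairment."]
print(rag_prompt("What is a hepatotoxicity risk of TB treatment?", corpus))
```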
Detailed applications of LLMs in clinical workflow
LLMs have been used for varied purposes in the medical domain, including named entity recognition, extraction of clinical information from electronic health records, document classification, summarizing, structuring, or explaining medical texts, relation extraction, natural language inference, streamlining administrative tasks in clinical practice, enhancing medical research, multi-modal applications, quality control, education, and question-answering[7,38,46,47]. Nazih and Peng[38] provide a comprehensive exploration of the current landscape of LLMs in healthcare, addressing their role in transforming medical applications. By leveraging the capabilities of LLMs in healthcare processes, organizations can provide better patient care, research, and data privacy thanks to LLMs’ ability to generate and summarize text-rich data[48]. In an article titled “Best 10 LLMs in Healthcare in 2025”, Dilmegani et al[48] elaborate upon LLMs in healthcare in detail, noting that LLMs have been used to assist in transcribing and summarizing doctor-patient conversations, drafting patient discharge summaries, and generating diagnostic reports, and that AI-powered chatbots can handle tasks like symptom triage, providing health information, and translating medical jargon into patient-friendly language. LLMs can also help bridge communication gaps between healthcare providers and patients, especially those who speak different languages, enhancing understanding and care delivery. For literature review and research support, LLMs can assist in extracting key insights from the vast resources of available medical literature, summarizing recent studies, and collating requisite evidence on clinical conditions or treatments. Such LLMs evolved rapidly on medical question answering (United States Medical Licensing Examination style) accuracy, from a score of 33.3% (GPT-Neo, December 2020) to 86.5% (Medical PaLM 2, March 2023)[48]. While these applications demonstrate the utility of LLMs in automating and enhancing various aspects of healthcare, they often rely on sophisticated pattern recognition and language manipulation rather than a deep, conceptual understanding of the underlying medical knowledge. This highlights the potential for more robust and reliable AI, through models that can truly understand and reason about medical concepts. An AI model is also helping researchers detect disease based on coughs: Google Research introduced health acoustic representations (HeAR), a bio-acoustic foundation model designed to listen to human sounds and flag early signs of disease[49]. HeAR was trained on 300 million pieces of de-identified audio data and used to develop a cough model trained with approximately 100 million cough sounds. This model is now being used by Swaasa to enhance early detection of tuberculosis (TB)[50]. “Every missed case of TB is a tragedy; every late diagnosis, a heartbreak”, Shetty[50] reports Kakarmath, a product manager at Google Research working on HeAR, as saying, adding that “acoustic biomarkers offer the potential to rewrite this narrative”. Having proposed a collaborative global effort to eradicate the scourge of TB, vide our “TB Revisited” plan, amongst others[51], we heartily welcome such accomplishments and feel that they will further the global effort to eradicate TB. This example highlights the potential of AI beyond language and underscores the need for models capable of integrating and reasoning across different data modalities - a hallmark of LCMs.
Radiological perspective[2,19,52-56]: The opportunities to apply LLMs in radiology are diverse, with new research being published every day. Their value in answering radiology-related questions, triage and protocol determination, report generation, structured reporting, and differential diagnosis has been outlined above[2-9]. Beyond these applications, LLMs have also served as prognostic models in medicine, analyzing vast, complex datasets to predict patient outcomes and guide treatment decisions[38,57]. Jiang et al’s[57] approach, for instance, leveraged advances in NLP to train an LLM for medical language and subsequently fine-tuned it across a wide range of clinical and operational predictive tasks. The deployment of these intelligent systems has bolstered decision-making, expedited diagnostic processes, and elevated the quality of patient care. They have been used to grapple with the ever-expanding body of medical knowledge, decipher intricate patient records, and formulate highly tailored treatment plans[38].
Evolution to multimodal models and advanced architectures: More recently, large multimodal models (LMMs) are emerging as powerful tools, particularly in radiology[55,56]. These models, built upon LLMs, including through the use of techniques like context engineering[58] and prompt engineering[55], integrate various imaging types (e.g., CT, MRI, X-ray, endoscopy, digital pathology) alongside textual data such as radiology reports, clinical notes, and structured electronic health record (EHR) data[31]. Their defining feature is the ability to concurrently process and align information across modalities, often mapping them into a shared representational space. This synergy allows for a more comprehensive understanding than unimodal approaches permit[55], enabling them to tackle complex cross-modal tasks such as radiology report generation from images and visual QA (VQA) that incorporates both imaging and clinical context[31,55]. Current applications of LMMs span automatic generation of preliminary radiology reports, visual query answering, and interactive diagnostic support. Examples of this evolution include embeddings for language/image-aligned X-rays, which leveraged a language-aligned image encoder grafted onto a fixed LLM, PaLM 2, to perform a broad range of chest X-ray tasks[59]; CheX-GPT[60] and RadLing[61] are other examples in the chest X-ray domain. Despite these promising capabilities, several significant challenges hinder widespread clinical adoption. For example, LMMs require access to large-scale, high-quality multimodal datasets, which are scarce in the medical domain. Their implementation is further complicated by risks of hallucinated findings, lack of transparency in decision-making processes, and high computational demands, a challenge shared by earlier LLMs as well. Nam et al’s review[31] summarizes the current capabilities and limitations of MLLMs in medicine and outlines key directions for future research; the critical areas they suggest include incorporating region-grounded reasoning to link model outputs to specific image regions, developing robust foundation models pre-trained on large-scale medical datasets, and establishing strategies for the safe and effective integration of LMMs into clinical practice. Recently, it has been reported that LLMs can now learn without labels, enabling self-evolving language models that use unlabeled data[62]. A new algorithm called self logits evolution decoding (SLED) has also been introduced[63]. Standard LLMs often rely solely on the final layer, potentially leading to incorrect but “popular” answers due to missed contextual cues. SLED improves on this by using information from all layers of the LLM, not just the last one: It reuses the final projection matrix of the Transformer architecture on early-exit logits to create probability distributions over the same set of possible tokens that the final layer uses, then takes a weighted average of the distributions from all the layers, giving more importance to some layers than others, thus refining the LLM’s predictions by incorporating information from different stages of its processing (a toy sketch of this layer-averaging idea follows below). However, all these advanced models are still fundamentally built upon LLMs and are therefore vulnerable to similar limitations. Being LLM-based, they still fundamentally process statistical patterns between words rather than abstract concepts, potentially limiting their depth of reasoning and long-range coherence compared to a human physician’s diagnostic abilities.
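As a rough illustration of the layer-averaging idea behind SLED (not the published algorithm, whose weighting scheme is derived differently), the toy sketch below reuses one shared projection matrix to turn several layers’ hidden states into token distributions and blends them with hypothetical weights.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def blended_next_token(hidden_states, W_proj, layer_weights):
    # Reuse the final projection matrix on every layer's hidden state to get
    # per-layer token distributions, then average them with layer weights.
    dists = np.stack([softmax(h @ W_proj) for h in hidden_states])
    w = np.asarray(layer_weights, dtype=float)
    w /= w.sum()
    return w @ dists                                   # blended distribution

rng = np.random.default_rng(2)
n_layers, d_model, vocab = 4, 16, 10
hidden_states = rng.normal(size=(n_layers, d_model))   # early exits + final layer
W_proj = rng.normal(size=(d_model, vocab))             # shared projection matrix
probs = blended_next_token(hidden_states, W_proj, [0.1, 0.2, 0.3, 0.4])
print(int(probs.argmax()), round(float(probs.sum()), 6))   # chosen token id, ~1.0
```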
Limitations of AI
ML, DL, and LLMs: While ML, DL, and LLMs have driven major advances in medical imaging, their clinical adoption remains tempered by important limitations. These technologies require large, diverse, and expertly annotated datasets, which are often scarce in healthcare settings. They are susceptible to variability and bias, raising risks of inequitable outcomes across patient groups and imaging protocols. Their decision-making processes often lack interpretability, making it difficult for clinicians to trust or validate outputs in critical diagnostic contexts. In addition, training and deploying advanced models entails substantial computational and environmental costs, challenging sustainability and scalability. Finally, there are unresolved ethical and regulatory concerns, including questions of liability, data privacy, and the absence of robust frameworks for validating non-deterministic outputs[19,52-55]. Taken together, these constraints illustrate why AI in its current forms - ML, DL, and LLMs - cannot yet serve as fully autonomous clinical decision-makers; such systems remain valuable decision-support adjuncts rather than replacements for clinicians. Although AI has advanced far beyond the rule-based systems of the past, there is still a pressing need for models capable of deeper conceptual reasoning and contextual understanding, thinking more like the human brain does. Later in this manuscript we highlight the continuous efforts undertaken to enhance AI’s capabilities in understanding and reasoning about complex information, including the birth of LCMs[30], which some consider a natural evolutionary step for LLMs[64]. LCMs offer a conceptual leap, moving beyond simple tokenization and enabling models to operate at the level of clinical concepts, sentences, and integrated multi-modal data streams. This architecture allows for the preservation of complex relationships across a patient’s medical record - from imaging and pathology to laboratory results and genomic data - driving more robust abstraction, contextual synthesis, and reliable clinical interpretation.
The concept of concept models
In June 2022, Merchant et al[51] published their maxim: “AI needs real intelligence to guide it!” They proposed a multimodal, holistic approach to AI in healthcare, emphasizing that to maximize accuracy and utility in diagnosis and treatment, AI must incorporate experiential wisdom accumulated over decades of clinical practice. Key teaching and clinical parameters, including prognostic indicators, should complement big data to enhance algorithmic performance[19,51]. This appears to be among the earliest works to advocate for concept-based processing in AI, particularly in the medical domain. The authors stressed the need to move beyond a purely data-driven approach and to integrate human expertise for more reliable outcomes. As Merchant[65] noted - echoing Alan Kay’s quote, “The best way to predict the future is to invent it” - his vision of making “machines think like us, not the other way around”[65] finds a natural extension in the development of LCMs[66].
The birth of LCMs: The first successful implementation of LCMs came from Meta’s FAIR team on December 11, 2024[67,68]. They introduced a concept-based architecture operating on explicit semantic representations that were language- and modality-agnostic. Their “LCMs” treated a concept as equivalent to a sentence, relying on SONAR - an embedding space supporting 200 languages in both text and speech. Using models with 1.6B parameters and training on 1.3T tokens, LCMs demonstrated impressive zero-shot generalization across languages, outperforming same-size LLMs. This marked a major step toward human-like reasoning: While LLMs excel at statistical language modeling, LCMs target structured understanding of concepts and their relationships[67,68]. Although still in their infancy and not yet as widely recognized as LLMs, LCMs are positioned to address the key limitations of LLMs and to enable genuine conceptual reasoning, logical inference, and ethical alignment[66,69].
Understanding LCMs: LLMs process input at the token level, one word or sub-word at a time[70], and this fine-grained approach can cause difficulties in maintaining coherence over long sequences, a major drawback in domains like healthcare (including radiology and allied fields) that demand extended, precise reasoning. LCMs, by contrast, process sentences or concepts, enabling richer semantic reasoning and better handling of long contexts[71]. Because they process fewer units (sentences instead of tokens), they can handle large contexts more efficiently and produce more structured outputs, building “conceptual maps” that mirror how humans think. Their design emphasizes concepts over words and spans modalities and languages[65,66]. Figure 1 shows the conceptual shift from LLMs (sequential, token-based) to LCMs (relational, conceptual). The key distinctions are summarized in Table 1, which outlines the key methodological differences between token-based LLMs and concept-based LCMs. LCMs thus represent a significant advancement over LLMs, and their creative layout mimics human thought processes. For example, when prompted with “The scientist worked late in the …”, an LLM predicts “laboratory” without understanding context. An LCM, however, might generate: “The scientist worked late in the laboratory to finalize results for a groundbreaking study”, reflecting deeper comprehension of intent[66,72] (a conceptual sketch of sentence-level prediction follows below). Figure 2 displays the architectural contrast between LLMs’ language-centric flow and LCMs’ integrative, multimodal concept integration.
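The shift from token-level to concept-level prediction can be caricatured in code. In the sketch below, embed_sentence() is a hypothetical stand-in for a real sentence encoder such as SONAR, and next_concept() is an untrained placeholder for the learned model; the point is only that the unit of prediction is a whole sentence embedding, not a token.

```python
import numpy as np

def embed_sentence(sentence):
    # Hypothetical sentence encoder: one vector per sentence, not per token.
    r = np.random.default_rng(abs(hash(sentence)) % (2 ** 32))
    return r.normal(size=32)

def next_concept(context_embeddings, W):
    # Concept-level analogue of next-token prediction: map the sequence of
    # prior concept vectors to the embedding of the predicted next concept.
    pooled = context_embeddings.mean(axis=0)       # summarize the concept sequence
    return np.tanh(W @ pooled)

report = ["There is a rounded peripheral hyperdensity on non-contrast CT.",
          "The patient has a known malignant melanoma."]
context = np.stack([embed_sentence(s) for s in report])
W = np.random.default_rng(3).normal(size=(32, 32))  # untrained, illustrative only
predicted = next_concept(context, W)                # would be decoded to a sentence
print(predicted.shape)                              # (32,): one concept vector
```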
Figure 1 The conceptual shift from large language models to large concept models.
This diagram illustrates the fundamental difference in processing. Large language models operate via a sequential, token-based analysis, leading to potential context fragmentation and struggles with long-range dependencies. Large concept models operate on a relational, conceptual level, building a deeper relational understanding and maintaining coherence across longer contexts. LLM: Large language model; LCM: Large concept model.
Figure 2 Architectural contrast of data processing in large language models vs large concept models.
This diagram contrasts the linear, text-focused processing of large language models (LLMs) with the integrative, multimodal architecture of large concept models (LCMs). A: LLM - language-centric flow: Text input flows into a centralized language processing core, producing text-based output. This represents LLMs’ sequential, language-driven pipeline; B: LCM - multimodal concept integration: Multiple input types - text, image, audio, structured data - converge into a more complex processing core capable of extracting and reasoning over abstract, cross-modal concepts. The output is richer and reflects a broader understanding that spans diverse modalities. LLM: Large language model; LCM: Large concept model.
Table 1 Methodological comparison of standard large language models and large concept models.

Feature | LLMs | LCMs
Level of abstraction | Token-level prediction (word/sub-word) | Concept-level prediction (sentence/idea)
Input representation | Processes individual tokens, language-specific | Uses sentence embeddings, language-agnostic
Reasoning and planning | Focuses on local predictions, lacks structured reasoning | Explicitly models hierarchical reasoning and structured planning
Zero-shot generalization | Requires fine-tuning for new tasks/languages | Strong zero-shot learning across languages and modalities
Key abilities of LCMs include: (1) Understanding and manipulating abstract concepts: Going beyond statistical patterns in language to grasp the underlying meaning, relationships, and properties of concepts (like “infection”, “immunity”, “drug resistance”, “socioeconomic factors”); (2) Causal reasoning and inference: Understanding cause-and-effect relationships more deeply than the correlation patterns often found by LLMs; and (3) Multi-modal concept integration: Connecting concepts across different data types (text, images, structured data, genomic data). This concept-based approach allows LCMs to capture the underlying meaning and relationships between different pieces of information, enabling more sophisticated AI applications, including in healthcare. For example, building and reasoning over knowledge graphs/ontologies means explicitly modelling the relationships between concepts relevant to a common disease such as TB (e.g., Mycobacterium tuberculosis causes TB, rifampicin treats TB but can cause hepatitis, poverty is a risk factor for exposure); a toy illustration follows below. This capability to model complex relationships is crucial for tasks that require deeper understanding and inference. As Ahmed et al[71] describe, LCMs identify and embed concepts in semantic space, reasoning across diverse expressions. This “conceptual learning” extends beyond pattern recognition, allowing models to infer meaning and apply it in context[73,74]. Because they are trained on both textual and conceptual data, LCMs can develop a deeper understanding of the meaning of the text they generate, and hence implement concepts too[66].
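As a toy illustration of such explicit concept-level structure, the snippet below encodes the TB relationships mentioned above as a micro knowledge graph and traverses it. This is a pedagogical sketch, not a clinical ontology.

```python
# Hypothetical micro-ontology encoding the TB relationships named above;
# a pedagogical sketch, not a clinical knowledge base.
TB_GRAPH = {
    ("Mycobacterium tuberculosis", "causes", "tuberculosis"),
    ("rifampicin", "treats", "tuberculosis"),
    ("rifampicin", "can_cause", "hepatitis"),
    ("poverty", "risk_factor_for", "tuberculosis exposure"),
}

def related(entity, relation=None):
    """Return (relation, object) pairs for an entity, optionally filtered."""
    return [(r, o) for s, r, o in TB_GRAPH
            if s == entity and (relation is None or r == relation)]

# A concept-aware system can chain such edges: the drug that treats TB may
# itself cause a complication worth monitoring.
for rel, obj in related("rifampicin"):
    print(f"rifampicin --{rel}--> {obj}")
```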
Key disadvantages of ML/DL and LLMs in radiology and healthcare
While ML, DL, and LLMs have transformed radiology and medical imaging, they remain constrained by several key limitations that must be acknowledged for responsible clinical adoption[2,19,52-56]. Many limitations of LLMs have already been described in general AI discourse, but their impact in radiology and healthcare is especially acute. Key challenges include restricted reasoning, resource intensity, ethical risks, hallucinations, and misplaced trust in model fluency. Together, these issues highlight why current AI models cannot yet function as autonomous clinical decision-makers (Table 2). In their systematic review of 89 LLM studies across 29 specialties, Busch et al[43] identified widespread shortcomings including lack of medical domain optimization, poor transparency, non-reproducibility, unsafe outputs, and bias. Their taxonomy distinguished between design limitations (e.g., insufficient domain adaptation, restricted data access, opaque validation) and output limitations (e.g., non-comprehensiveness, incorrectness, unsafety, bias). Alarmingly, nearly half the studies reviewed reported that LLMs were not optimized for the medical domain.
Table 2 Key limitations of artificial intelligence in radiology: From machine learning to large language models.

Category | Description
Data requirements | AI models (ML, DL, LLMs) require vast amounts of high-quality, annotated data, which is scarce in the medical domain. Privacy concerns and the cost of data acquisition and annotation are significant barriers
Variability and bias | Differences in imaging protocols, scanner types, and patient demographics can reduce model robustness. Training on biased datasets can perpetuate and even amplify clinical disparities
Incorrectness and hallucinations | LLMs, in particular, may produce outputs that are factually inaccurate or fabricated. This is a critical issue in high-stakes clinical scenarios where accuracy is paramount
Limited reasoning depth | LLMs operate at a token level and struggle with long-range dependencies, abstract reasoning, and integrating non-linguistic data. This leads to outputs that are often superficial and lack the depth of a human physician’s diagnostic reasoning
Interpretability | Many AI models function as “black boxes”, providing no insight into their decision-making process. This opacity undermines trust among clinicians and poses a significant hurdle to clinical adoption and patient safety
Generalizability to edge cases | AI performance often deteriorates in rare conditions, atypical presentations, and underrepresented demographics. These “edge cases” demand nuanced reasoning that goes beyond narrow pattern recognition
Model drift and retraining | The rapid pace of innovation in radiology means models must continually adapt to new techniques and protocols. Without frequent retraining and validation, AI systems can become obsolete, or their performance can degrade
Clinical integration and liability | Requires extensive infrastructure and training. If an AI error leads to patient harm, the question of legal liability between developers, institutions, and clinicians remains a significant, unresolved barrier
Computational and environmental costs | Training and running large-scale models like LLMs are resource-intensive, expensive, and have a high carbon footprint, which presents a sustainability challenge
Misplaced trust | The overestimation of current AI capabilities, often due to misleading metrics, can lead to a “delusion of progress” that results in flawed decision-making and misplaced trust in high-stakes clinical scenarios
Data requirements: ML/DL/LLM models require vast, diverse, well-annotated datasets for effective training and validation. These are difficult to obtain in healthcare due to privacy constraints, institutional heterogeneity, and intensive expert annotation costs.
Variability and bias: These continue to undermine robustness. Differences in protocols, scanners, and patient demographics introduce heterogeneity in image appearance. Validating the information generated by LLMs is essential and requires regular audits and fairness-aware training[75-77]. Providing additional context or paraphrasing a question to an LLM can change the subsequent response[78]; hence, validation needs to rest not only on scientific rigor but also on the contextual understanding of LLMs. For example, a rounded peripheral hyperdensity on a non-contrast CT may reflect a contusion (in the context of trauma), a metastatic lesion (in a patient with a known malignant melanoma), a spontaneous hemorrhage (in an older patient with amyloid angiopathy), or a hemorrhagic venous infarct (in a young female on oral contraceptives)[55]. Understanding the clinical context in such cases is critical. Bias in training data leads to underperformance and inequities. LLMs, which rely on massive internet-scale corpora, are especially prone to propagating cultural, gender, or racial biases[79-81]; their inability to filter such biases properly continues to be a significant drawback.
Incorrectness, hallucinations, and related phenomena: Incorrectness was reported in 32.9% of cases by Busch et al[43]. Many of these errors stemmed from hallucinations - fabricated or inaccurate outputs without grounding in input or reality. Busch et al[43] further categorized these into: (1) Illusion (13.5%): Characterized by the generation of deceptive perceptions or the distortion of information by conflating similar but separate concepts (e.g., suggesting that MRI-type sounds might be experienced during standard nuclear medicine imaging); (2) Delirium (38.2%): Generating significant gaps in vital information, resulting in a fragmented or confused understanding of a subject (e.g., omission of crucial information about caffeine cessation for stress myocardial perfusion scans); (3) Extrapolation (12.4%): Applying general knowledge or patterns to specific situations where they are inapplicable (e.g., advice about injection-site discomfort that is more typical of CT contrast administration); (4) Delusion (15.7%): A fixed, false belief despite contradictory evidence (e.g., inaccurate waiting times for a thyroid scan); and (5) Confabulation (20.2%): Filling in memory or knowledge gaps with plausible but invented information (e.g., “You should drink plenty of fluids to help flush the radioactive material from your body”, for a biliary system-excreted radiopharmaceutical)[43]. These errors, ranging from minor inaccuracies to complete fabrications, raise serious concerns about the reliability of LLMs in medical contexts, where even small mistakes can have significant consequences[43].
Hallucination: Rates are not trivial: 15%-20% in ChatGPT alone[6,82]. Factors such as biased training data, source-reference divergence, and token-by-token prediction increase susceptibility[82]. In medicine, where accuracy is critical, hallucinations raise the risk of misinformation, unsafe recommendations, or discriminatory content[55], and can also lead to privacy violations[82]. A summary of the causes and types of hallucinations and other forms of model incorrectness in LLMs is given in Figure 3. The problem of hallucinations is particularly concerning in critical applications like healthcare, where inaccurate or fabricated information can have severe consequences. This highlights a fundamental limitation of LLMs: Their lack of grounding in reality and true understanding makes them prone to generating outputs that are not only incorrect but also potentially harmful.
Figure 3 Causes and types of large language model hallucinations and other forms of model incorrectness (illusion, delirium, extrapolation, delusion, and confabulation).
LLM: Large language model.
Privacy and safety: Busch et al[43] found that 32% of studies judged LLM outputs unsafe (including misleading), 28.2% contained harmful content, and smaller but notable proportions revealed language bias (2.3%), insurance-related bias, or disparities affecting underserved racial groups and underrepresented procedures (1.1% each). These findings raise concerns that LLMs may perpetuate or even exacerbate healthcare inequities. Compounding these risks, LLMs remain vulnerable to “jailbreak” prompts - crafted queries that disguise harmful intent within narratives or instructions. Such attacks exploit token-based prediction and weak safety filters, leading to inappropriate, biased, or unsafe outputs[83]. In healthcare, this could result in unauthorized advice, misinformation, or privacy violations.
Non-comprehensiveness and reliability: Busch et al[43] reported that nearly 90% of reviewed studies demonstrated non-comprehensiveness. In clinical settings, incomplete results can have severe consequences, ranging from missed diagnoses to insufficient therapy recommendations; an incomplete therapy suggestion, for example, could compromise an entire treatment plan and risk patient harm. This pervasive non-comprehensiveness therefore represents a major obstacle to the safe and effective use of LLMs in healthcare decision-making. Additional reports[84,85] emphasize the non-reproducibility of LLM outputs, which obstructs validation and raises serious challenges for clinical trust and safe integration.
Interpretability and trust: DL and LLMs operate as “black boxes,” recognizing correlations without causal reasoning. Although LLMs show promise for text-heavy integration, coherence across data types is inconsistent[55]. Their opacity undermines clinicians’ trust, particularly when recommendations lack medical interpretability[55,86,87]. Providing identical queries in slightly altered contexts may also produce inconsistent responses, illustrating the fragility of their contextual reasoning[55].
Feedback loops and error propagation: Systematic misinterpretations or clinician overreliance may amplify diagnostic errors. Without active monitoring and retraining, these “model drifts” can compromise patient safety[55].
Ethical risks: Remain unresolved. Prejudices embedded in training datasets can manifest in biased or harmful outputs[17,81,88]. Some of these ethical concerns can be attributed to “hallucinations”[82]; others relate to the potential impact of automation on employment.
Computational resources, environmental costs, and sustainability: Training deep models and large LLMs demands advanced graphics processing units/tensor processing units, massive storage, and high energy consumption[55]. This poses financial and sustainability barriers[69,89], particularly for under-resourced health systems[90,91]. Training GPT-3, for example, consumed energy comparable to millions of hours of video streaming[92]. Although some analyses argue that LLMs may offer efficiency gains compared to human labor[93], the balance between performance and environmental cost remains unresolved. Diffusion models, though powerful, remain computationally prohibitive for many medical applications[94,95].
Clinical integration: Integrating AI into healthcare workflows requires substantial infrastructure, clinician training, and cultural adaptation[19]. Models must also be frequently retrained to reflect new imaging modalities, evolving protocols, and shifting data distributions, creating heavy resource burdens. Overreliance on imperfect systems risks “error propagation”, where clinician trust amplifies systematic model errors[55].
Delusions of progress: Excessive trust in AI can create “delusions of progress”, where fluency is mistaken for comprehension. This misplaced confidence risks adoption of outputs that lack the nuance, clinical grounding, and contextual understanding needed for high-stakes medical decision-making[1,55].
Limited contextual understanding and high-level reasoning: ML and DL systems, as well as LLMs, while powerful, have inherent limitations in contextual comprehension and high-level reasoning. LLMs operate primarily at the token level, processing language as patterns rather than interconnected concepts[55]. This often results in superficial outputs that lack depth and nuance, with potential misinterpretation of metaphorical or context-dependent language[79,86,96]. The “stochastic parrots” critique underscores that fluency does not equal comprehension[87].
Limited high-level reasoning: These AI models struggle with high-level reasoning, particularly in tasks requiring abstraction, multi-step problem solving, or integration of domain-specific knowledge[18,79]. In radiology, this can manifest as misinterpretation of complex imaging patterns, incorrect differential diagnoses, or oversights in multi-modal clinical reasoning when subtle relationships across patient history, imaging, and laboratory data must be synthesized[31,57]. These limitations highlight the need for advanced AI frameworks, such as LCMs, which aim to model concepts and relationships in a manner more akin to human cognition[30,64].
Regulatory, ethical challenges and practical considerations: Include validation, safety, accountability, ethical concerns, data privacy, and algorithmic bias. Radiology is a rapidly evolving field with new imaging techniques and data types emerging frequently. Most models require continuous retraining and updating to keep pace with these advancements, which is resource intensive[55]. Regulatory approval remains limited. The United States Food and Drug Administration (FDA) has so far authorized nearly 1000 AI-enabled medical devices, but none have been based on LLMs. This highlights a regulatory gap between conventional AI systems, which are relatively constrained and validated for specific functions, and LLMs, whose non-deterministic and non-transparent outputs pose unique challenges for clinical validation and safety monitoring. Their susceptibility to variability, error, and non-reproducibility calls for entirely new regulatory frameworks[97].
Model-specific limitations: Include limited scope and overfitting. Many current ML/DL applications in radiology are designed for specific tasks or clinical situations, rather than the broad spectrum of real-world clinical scenarios. DL models can overfit training data, leading to poor performance on unseen datasets if training data are not sufficiently diverse or representative.
Rare diseases and edge cases: AI performance often deteriorates in rare conditions, atypical presentations, and underrepresented demographics. This “long tail” of medicine challenges models trained primarily on common conditions and is particularly relevant in radiology, where edge cases demand nuanced reasoning beyond narrow pattern recognition.
Transparency, reproducibility, and peer review: Lack of transparency in model design, opaque validation, and absence of external reproducibility hinder clinical trust[43,84,85]. Independent peer review and access to external datasets are essential for acceptance.
Dynamic nature of radiology: The field’s rapid pace of innovation poses its own challenge. DL models must continually adapt to new imaging techniques, protocols, and disease presentations, requiring frequent retraining and validation[18]. Without such updates, AI systems risk obsolescence or degraded performance.
Medicolegal liability: Remains unresolved. If an AI-generated recommendation leads to harm, the division of liability among developers, institutions, and clinicians is unclear[55]. This ambiguity represents a significant barrier to clinical deployment of LLM-based systems. Ethical concerns, such as patient data privacy, algorithmic bias, and the potential impact of automation on radiology employment, further complicate adoption.
Moving forward
The advent of LLMs, epitomized by OpenAI’s ChatGPT, marked a transformative leap in NLP. However, despite their impressive capabilities, LLMs face inherent architectural limitations, especially in specialized domains such as radiology[98-100]. Additionally, the development of LLMs has been led largely by computer scientists and business stakeholders, with little direct input from radiologists and physicians[19,51,101]. Without stronger collaboration, the gap between technical performance and clinical applicability will persist. These constraints underscore that LLMs should be treated as adjuncts, not autonomous decision-makers. Radiology, given its complexity and reliance on contextual integration, demonstrates their utility while also exposing their critical shortcomings. Despite the transformative advances through ML and DL in imaging analytics, anomaly detection, and diagnostic support, unresolved challenges remain: Data access, interpretability, bias, workflow integration, sustainability, and especially regulatory and legal oversight. Until these are systematically addressed, AI in radiology will remain powerful but constrained, unable to deliver on its full promise. More recently, multimodal LLMs (MLLMs) are emerging as powerful tools, particularly in radiology[55,56]. These models, built upon LLMs and enhanced by techniques like prompt engineering[55], integrate multiple imaging types (e.g., CT, MRI, X-ray, endoscopy, digital pathology) alongside textual data (radiology reports, clinical notes, and structured EHR data)[31]. Their defining feature is the ability to concurrently process and align information across modalities, mapping them into a shared representational space. This synergy enables more comprehensive understanding than unimodal approaches[55] and allows them to address complex cross-modal tasks such as radiology report generation from images and VQA that incorporates both imaging and clinical context[31,55]. Current applications of MLLMs span automatic generation of preliminary radiology reports, VQA, and interactive diagnostic support. Examples include language/image-aligned X-ray embeddings, in which a language-aligned image encoder was grafted onto a fixed LLM, PaLM 2, to perform a broad range of chest X-ray tasks[59]; CheX-GPT[60] and RadLing[61] are other recent chest X-ray domain models. While these MLLMs show great promise by integrating imaging and clinical data[31], their clinical adoption is limited by data quality, implementation challenges, and insufficient clinician involvement. For example, MLLMs require access to large-scale, high-quality multimodal datasets, which are scarce in the medical domain. Their implementation is further complicated by risks of hallucinated findings, lack of transparency in decision-making processes, and high computational demands - a challenge shared by earlier LLMs as well. Nam et al[31] summarized the current capabilities and limitations of MLLMs in medicine and outlined key directions for future research. They suggested that critical areas include incorporating region-grounded reasoning to link model outputs to specific image regions, developing robust foundation models pre-trained on large-scale medical datasets, and establishing strategies for the safe and effective integration of MLLMs into clinical practice. Recently, it has been reported that LLMs can now learn without labels, enabling self-evolving language models using unlabeled data[62]. A new decoding algorithm called SLED has also been introduced[63].
Standard LLMs often rely solely on the final layer, potentially leading to incorrect but “popular” answers due to missed contextual cues. SLED improves this by using information from all layers of the LLM, not just the last one. It does this by reusing the final projection matrix of the Transformer architecture on early-exit logits to create probability distributions over the same set of possible tokens that the final layer uses. It then takes a weighted average of the distributions from all the layers, giving more importance to some layers than others, thus refining the LLM’s predictions by incorporating information from different stages of its processing. However, all of these advanced models are still fundamentally built upon LLMs and are therefore vulnerable to similar limitations: They still process statistical patterns between words rather than abstract concepts, potentially limiting their depth of reasoning and long-range coherence compared to a human physician’s diagnostic abilities. LLM/MLLM constraints in abstract reasoning and multimodal integration motivate the exploration of new paradigms. Future directions may lie in LCMs, which move beyond token prediction to represent clinical concepts and multimodal integration in a manner more akin to human reasoning[30,64]. By enabling abstraction and contextual synthesis across complex medical data streams, LCMs represent a potential evolutionary step toward more trustworthy, clinically aligned AI. This discussion is continued in part 2, which elaborates upon the conceptual framework of LCMs, an emerging paradigm that aims to overcome these limitations by enabling richer semantic understanding and contextual reasoning. We will explore the evolution of these more advanced models and the challenges that lie ahead.
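Before moving to part 2, the SLED mechanism described above can be made concrete with a minimal sketch, assuming a PyTorch environment; the tensor shapes, the stub inputs, and the hand-set layer weights below are illustrative assumptions, not the published SLED implementation.

```python
import torch
import torch.nn.functional as F

def sled_next_token_distribution(hidden_states, W_proj, layer_weights):
    """Sketch of SLED-style decoding: reuse the final projection matrix on
    each layer's hidden state (early-exit logits) to obtain per-layer
    distributions over the same vocabulary, then average them."""
    dists = []
    for h in hidden_states:                  # one hidden state per layer
        logits = W_proj @ h                  # early-exit logits over the vocabulary
        dists.append(F.softmax(logits, dim=-1))
    dists = torch.stack(dists)               # [n_layers, vocab_size]
    w = layer_weights / layer_weights.sum()  # normalize layer importance
    return (w.unsqueeze(-1) * dists).sum(dim=0)  # weighted-average distribution

# Demo with random tensors standing in for a 12-layer model's states.
hs = [torch.randn(64) for _ in range(12)]
W = torch.randn(1000, 64)                    # final unembedding matrix
weights = torch.linspace(0.1, 1.0, 12)       # trust later layers more (assumption)
p = sled_next_token_distribution(hs, W, weights)
```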
PART 2 - BEYOND LLMS IN RADIOLOGY AI: INTRODUCING LCMS AS THE CONCEPTUAL FRAMEWORK OF A NEW PARADIGM AND FUTURE DIRECTIONS
LCMs
What are LCMs? LCMs represent a transformative shift in AI’s evolution beyond traditional LLMs, which rely on token-level sequential processing. Rooted in explicit semantic representations, LCMs operate at the level of clinical concepts and sentences rather than individual tokens, enabling enhanced semantic reasoning, long-range context synthesis, and integration of diverse, multi-modal data types, such as text, images, and structured clinical information. This richer conceptual framework allows LCMs to unlock insights and perform complex clinical decision-making that remains beyond the capabilities of current LLMs[66,69]. As Stephen Hawking suggested, “We are at the very beginning of time for the human race”. It is thus reasonable to hope for continued progress extending far beyond even today’s state-of-the-art AI models and agents.
Technical details of LCMs
Human cognition operates on concepts, not tokens. LCMs emulate this by leveraging SONAR embeddings, which cover 200 languages for text and speech input in 76 languages[102], to represent meaning across modalities[67,68]. Unlike LLMs, which tokenize input and predict sequential tokens, LCMs manipulate higher-order embeddings, enabling reasoning independent of language boundaries[67,68,70]. Innovations in LCMs include hierarchical architectures for long-form coherence, diffusion-based generation that predicts embeddings instead of tokens, dual-tower models separating context encoding from generation, quant-LCMs for efficiency and scalability, zero-shot generalization across unseen modalities, and SONAR-based decoding that enables simultaneous multilingual outputs while reducing latency[74,103,104]. Figure 4A shows the detailed architecture of LCMs, illustrating language-agnostic, universal, and multimodal concept processing. Figure 4B combines the legends/keys for the visual elements, color coding, and capability categories shown in Figure 4A.
Figure 4 Large concept models.
A: Large concept model detailed architecture illustrating language-agnostic, multimodal data flow with universal concept encoding, hierarchical reasoning, and multilingual output capability. This figure details the end-to-end processing pipeline of a large concept model, contrasting it with traditional language-centric artificial intelligence. The architecture is composed of two main sections: Top (multimodal data flow): Illustrates the flow from a Language-Agnostic Multimodal Input (accepting text, images, audio, video, and sensor data) to a Concept Encoder (like SONAR), which performs universal concept extraction. This encoded concept is then processed in a Concept Embedding Space via multimodal reasoning and diffusion processes. Finally, a Concept Decoder generates a Multimodal Output in any language or modality, demonstrating a true “any-input-to-any-output” capability. Bottom (core capabilities): Summarizes the three foundational pillars that this architecture enables: (1) Language-agnostic, universal language understanding across hundreds of languages; (2) Multimodal integration for unified concept understanding and reasoning across diverse data types; and (3) Universal concepts, enabling high-level abstract, logical, and causal reasoning that is independent of culture or domain; B: Provides the key to the color coding, visual elements, input modality icons, processing components, and capability categories for the large concept model architecture shown in Figure 4A. LCM: Large concept model.
LCMs introduce several innovations that can also be used to advance language modeling[74,103,104]: (1) Hierarchical architecture, which mirrors human reasoning processes, enhances the coherence of long-form content, and allows local edits without disrupting the broader context; (2) Mean squared error regression, a foundational approach for sentence-embedding prediction; (3) Diffusion-based generation, in which models predict the next SONAR embedding based on preceding embeddings (a minimal sketch follows this list), using either a single-tower architecture, wherein one transformer decoder handles both context encoding and denoising (combined context processing and sentence generation), or a dual-tower architecture, which provides dedicated components for each task, i.e., separates context encoding from denoising (separating context understanding from generation); (4) Quant-LCM, for embedding quantization and enhanced robustness, ensuring scalability and efficiency; this also addresses the quadratic complexity of standard Transformers, enabling LCMs to handle long contexts more effectively than token-level processing; (5) Zero-shot generalization: LCMs display strong zero-shot generalization on unseen languages and modalities by leveraging SONAR’s extensive multilingual and multimodal support; and (6) Search and stop criteria: A search algorithm based on the distance to the “document end” concept ensures coherent and complete generation without the need for any fine-tuning. Quantized SONAR models have been considered an upcoming enhancement in the LCM ecosystem[104] and should thereby allow relatively easier upgrading of LLMs to LCMs or hybrid systems.
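As a concrete illustration of items (2) and (3) above, the following minimal sketch, assuming a PyTorch environment and a 1024-dimensional SONAR-style embedding space, trains a next-concept predictor by mean squared error regression; it is a toy stand-in rather than Meta’s published architecture, and the causal masking and diffusion-based denoising of the real models are omitted for brevity.

```python
import torch
import torch.nn as nn

class NextConceptPredictor(nn.Module):
    """Toy next-sentence-embedding predictor: given the sequence of
    preceding concept (sentence) embeddings, regress the next one."""
    def __init__(self, dim=1024, n_layers=4, n_heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(dim, dim)

    def forward(self, concept_seq):          # [batch, n_sentences, dim]
        h = self.backbone(concept_seq)
        return self.head(h[:, -1])           # predicted next concept embedding

# One training step: regress toward the true next sentence embedding.
model = NextConceptPredictor()
ctx = torch.randn(2, 7, 1024)                # 7 prior sentence embeddings
target = torch.randn(2, 1024)                # ground-truth next embedding
loss = nn.functional.mse_loss(model(ctx), target)
loss.backward()
```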
Overcoming long-range dependencies: By encoding concepts instead of tokens, LCMs manage long-range dependencies more effectively, reducing repetition and hallucinations[73]. Their multimodal design integrates text, images, and audio, supporting more holistic reasoning[66]. This approach reduces computational costs and resembles Meta’s Joint Embedding Predictive Architecture (JEPA); though JEPA emphasizes abstract representation learning, LCMs model reasoning processes directly[105].
Representation and reasoning: LLMs represent text via token embeddings, limiting their ability to model complex relationships. With LCMs, the internal representations are closer to the “conceptual maps” that we humans form in our minds. LCMs explicitly encode concepts in high-dimensional space, enabling clustering, analogies, and hierarchical reasoning[73]. An analogy for hierarchical reasoning is a university professor giving a talk on a particular subject. The professor does not script every word of the presentation but notes only the key points from which the rest is derived. Each time the same talk is given, the wording may differ, but the key points remain the same, i.e., the essence of the talk is unchanged despite the change in some or many of the words. Similarly, LCMs manipulate concepts, ensuring consistency across varied expressions. Because LCMs generate output by manipulating whole concepts rather than predicting one token at a time, they maintain consistency over hundreds of words. Sentences are the basic building blocks that represent concepts. Eliot[106] stated that LCMs can devour sentences and adore concepts and outlined the LCM workflow as a six-step process: (1) The user enters a sentence composed of one or more words; (2) The concept encoder examines the sentence to determine what concepts might mathematically and computationally underpin the text-based sentence provided in step (1); (3) These speculated concepts are fed into an LCM for computational processing; (4) The LCM generates responsive concepts, produced as internal outputs - a set of concepts; (5) The concept decoder produces text-based answers for the user; and (6) The user is shown the generated sentence composed as a result of these steps[106]. This ensures that outputs maintain fidelity to the original meaning, while allowing flexible multilingual and multimodal decoding without re-running inference[67,68]. The key distinction is that LLMs predict “what to say next”, whereas LCMs know what the prompt is about and encode the “why and how” behind the answer, grounding outputs in reasoning and relationships[69]. What differentiates an LCM from other works is its power to dive deeper into the meaning and interrelation of concepts, generating a far more accurate response and implementation in the outputs[69]. A minimal sketch of this encode-reason-decode loop follows.
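The sketch below uses hypothetical stubs (concept_encoder, concept_model, concept_decoder); a real system would use a SONAR-style encoder, a trained concept model, and a learned decoder in place of these placeholders.

```python
import numpy as np

def concept_encoder(sentence: str) -> np.ndarray:
    """Map a sentence to a fixed-size concept embedding (stub)."""
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    return rng.standard_normal(1024)

def concept_model(concepts: list) -> np.ndarray:
    """Reason over input concepts and emit a responsive concept (stub:
    the mean; a real LCM would predict a new embedding)."""
    return np.mean(concepts, axis=0)

def concept_decoder(concept: np.ndarray, language: str = "en") -> str:
    """Render a concept as a sentence in any supported language (stub)."""
    return f"[{language} sentence decoded from a {concept.shape[0]}-d concept]"

# Steps 1-6: sentence -> concepts -> LCM -> responsive concept -> sentence.
user_sentence = "Rounded peripheral hyperdensity on non-contrast CT."
answer = concept_decoder(concept_model([concept_encoder(user_sentence)]))
```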
Strategies for conceptual correctness in LCMs
While LLMs excel at processing and generating language from statistical patterns, LCMs aim for structured understanding of concepts and their non-linguistic relationships[107,108]. This difference enables LCMs to address incorrectness in ways beyond the reach of standard LLMs.
Direct manipulation and verification of conceptual graphs/networks: LCMs represent knowledge in structured forms such as conceptual graphs or semantic networks, allowing direct analysis and verification[107,109,110], enabling: (1) Identification of contradictions (e.g., “A is a type of B” vs “A is not a type of B”); (2) Verification of conceptual distance and relevance between linked concepts; (3) Comparison of conceptual graphs with ontologies or knowledge bases; and (4) Propagation of uncertainty scores. These operations are not feasible in LLMs, where knowledge is embedded in inaccessible neural weights[107,110]. A toy contradiction check over such structures is sketched below.
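The sketch stores knowledge as (subject, relation, object) triples and flags explicit contradictions, illustrating point (1); the triple format and the find_contradictions helper are illustrative assumptions, far simpler than a real conceptual graph with ontologies and uncertainty scores.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    relation: str          # e.g., "is_a" or its negation "not_is_a"
    obj: str

def find_contradictions(triples: set) -> list:
    """Flag any triple whose explicit negation is also asserted."""
    clashes = []
    for t in triples:
        negated = Triple(t.subject, "not_" + t.relation, t.obj)
        if negated in triples:
            clashes.append((t, negated))
    return clashes

kb = {
    Triple("ghon_focus", "is_a", "parenchymal_lesion"),
    Triple("ghon_focus", "not_is_a", "parenchymal_lesion"),  # conflict
}
print(find_contradictions(kb))   # -> the contradictory pair
```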
Multi-modal conceptual grounding: While MLLMs align features across modalities, LCMs target modality-agnostic conceptual representations, allowing: (1) Verification of semantic consistency across modalities (e.g., “cat” in an image vs “cat” in text); (2) Confirmation of conceptual implications across modalities (e.g., an X-ray of a broken leg validating textual implications of limited mobility)[107,108]; (3) Simulative verification and plausibility checks, by modeling causal relationships and rejecting physically implausible scenarios, such as objects spontaneously floating upward[111]; and (4) Meta-conceptual reasoning and self-correction, including identification of low conceptual confidence, refinement of internal knowledge, and explanation of reasoning in terms of conceptual transformations rather than linguistic steps[110]. LLMs, by contrast, lack such meta-cognition. A toy cross-modal consistency check is sketched below.
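The sketch assumes image and text encoders that project into one shared concept space, as modality-agnostic representations aim to provide; both encoders and the 0.7 threshold are hypothetical.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def modalities_agree(image_concept: np.ndarray,
                     text_concept: np.ndarray,
                     threshold: float = 0.7) -> bool:
    """Flag a report whose text concept drifts from the image concept."""
    return cosine(image_concept, text_concept) >= threshold

# Demo: a text embedding close to the image embedding passes the check.
img_vec = np.random.default_rng(0).standard_normal(512)
txt_vec = img_vec + 0.1 * np.random.default_rng(1).standard_normal(512)
print(modalities_agree(img_vec, txt_vec))    # True for near-identical concepts
```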
Complementary strategies for correctness: (1) Knowledge verification systems: LCMs must address factual errors, conceptual drift, and temporal inconsistencies by aligning with established knowledge frameworks[112-114]; beyond textual retrieval, retrieval-augmented generation (RAG) integrates structured knowledge, ontologies, and multi-modal data to enrich conceptual grounding[45]; (2) Uncertainty awareness: LCMs should signal ambiguity, avoid false certainty, and request clarification when concepts are unclear[109,113]; (3) Adversarial training: Strengthens resistance against conceptual manipulation, prompt injection, and misleading inputs[114]; (4) Human feedback loops: Enable alignment with human values, intuition, and expert knowledge across diverse contexts[114]; (5) Transparency mechanisms: Facilitate user trust by making reasoning pathways and conceptual links auditable; and (6) Multi-modal verification: Ensures consistent conceptual grounding across text, images, audio, and other modalities[108]. Figure 5 provides an overview of strategies designed to address and prevent all forms of incorrectness in LCMs.
Figure 5 Strategies to address all forms of incorrectness in large concept models.
These strategies work together to address the full spectrum of incorrectness issues, from outright hallucinations to subtle misinterpretations, inappropriate confidence, and contextual misunderstandings. The goal is not merely factual accuracy but a more reliable, trustworthy, and useful artificial intelligence system overall. RAG: Retrieval-augmented generation; RLHF: Reinforcement learning from human feedback.
Integrated approach
A holistic framework integrates all these strategies[114]: (1) Strong pre-training on fundamental concepts; (2) Verification layers using RAG and knowledge frameworks; (3) Explicit handling of uncertainty; (4) Human feedback for continual conceptual refinement; and (5) Transparency to reveal reasoning and knowledge sources. Together, these strategies address factual inaccuracies, conceptual misunderstandings, and flawed reasoning. The goal is not only accuracy but a reliable, trustworthy, and conceptually sound AI system. A minimal sketch of how these layers might compose is given below.
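The following hypothetical pipeline combines retrieval-based verification, explicit uncertainty handling, and transparent evidence reporting; the function names, the 0.6 confidence threshold, and the stub knowledge base are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float
    evidence: list            # retrieved sources, exposed for auditability

def retrieve(query: str) -> list:
    """RAG-style lookup against a curated knowledge base (stub)."""
    return ["guideline: TB cavitation imaging criteria"]

def answer_with_verification(query: str, model_conf: float) -> Answer:
    evidence = retrieve(query)
    if model_conf < 0.6 or not evidence:      # explicit uncertainty handling
        return Answer("Uncertain - please clarify or consult an expert.",
                      model_conf, evidence)
    return Answer(f"Grounded answer for: {query}", model_conf, evidence)

print(answer_with_verification("thick-walled cavity on chest X-ray", 0.85))
```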
Key advantages of LCMs in healthcare
LCMs offer transformative benefits by emphasizing relationships, causal reasoning, and conceptual frameworks. For example, incorporating lamellar effusion into screening algorithms improved pediatric TB detection[51], while recognition of demographic disease shifts informs public health policy. This conceptual synthesis can significantly improve patient outcomes[19,51]. Merchant et al[51] emphasized that “AI needs real intelligence to guide it”, underscoring the value of combining imaging with clinical evidence rather than relying on single modalities. This reflects a broader truth that meaningful intelligence, human or artificial, requires flexibility, humility, and willingness to adapt when established views are disproven[115]. As Merchant[65] observed, “we are drowning in information while starving for wisdom”, pointing to the need for holistic, concept-based AI that embeds decades of experience and prognostic knowledge into its frameworks[19,51]. Unlike LLMs, which generate outputs token by token, LCMs reason with higher-level concepts, mapping symptoms to diagnoses, integrating comorbidities, and combining imaging, genetic, and laboratory data into cohesive assessments[19,51,116]. Cross-modal capacity further allows them to integrate narratives, data streams, and images into cohesive outputs[68,103], a critical advantage in clinical practice. A distinguishing strength of LCMs lies in semantic reasoning and context-aware decision-making[116]. By embedding patient-specific variables such as age, lifestyle, and comorbidities, their outputs are more interpretable than those of “black box” LLM systems, providing explanations clinicians can validate[117,118], thereby building trust and improving adoption.
Efficiency is another advantage: With fewer parameters and lower energy demands, LCMs support “Green AI”[69,119-121]. Meta’s prototypes demonstrated that leaner architectures could deliver superior performance while reducing costs and emissions. Moreover, they can be domain-tailored for radiology, oncology, or cardiology, offering specialized solutions while retaining their broader reasoning abilities[66,122]. Compared with LLMs like GPT or bidirectional encoder representations from Transformers, which excel primarily at natural language tasks, LCMs’ conceptual depth positions them as more suitable for complex, high-stakes healthcare applications where reasoning and interpretability matter most[50,122].
Regulatory outlook: Since LCMs emphasize explainability and transparency, they may face fewer regulatory hurdles. The FDA’s assessment framework prioritizes interpretability, clinician trust, and auditability, criteria that LCMs, with clearer decision pathways, are better positioned to meet compared to opaque LLMs.
AI’s journey in radiology and the dawn of LCMs
AI has long promised to revolutionize healthcare, with radiology, an early adopter, at the forefront of experimentation and deployment[123,124]. From initial excitement to real-world applications, the field has witnessed a decade of evolution in both expectations and implementation[1]. To appreciate the transformative potential of LCMs, it is helpful to briefly trace this journey. Early reviews[125] and later analyses[126] highlight both enthusiasm and anxiety around AI’s disruptive potential. They point to AI’s roles in workflow optimization, error reduction, improved imaging efficiency, and precision medicine, all enabling earlier diagnosis and proactive healthcare. By 2024, radiology remained central to healthcare, not only in preserving lives but also in generating significant economic output[126]. The rising demand for diagnostic services, advances in imaging, and AI integration have fueled rapid global growth in radiology. Industry revenues were projected to surpass 45 billion dollars by 2024 and 51 billion dollars by 2032[127,128], driven largely by chronic illness, aging populations, and expanded healthcare access. Similarly, the AI health market, valued at 6.6 billion dollars in 2021 with a 40% compound annual growth rate, was forecast to grow more than tenfold within five years[129]. These market dynamics underscore radiology’s role as a global driver of healthcare innovation[124]. The integration of AI, and now LCMs, promises to further accelerate diagnostic accuracy, workflow efficiency, and personalized care, unlocking both clinical and economic impact.
The lessons of the past and the crossroads of AI in healthcare
The decline of Kodak serves as a cautionary tale about ignoring innovation[130]. Healthcare faces a similar crossroads: Whether to adopt AI now or wait. Despite rapid uptake of imaging equipment, software adoption, including AI, often lags. Many radiologists remain focused on ML, even as LLMs and LCMs open far greater possibilities[66]. The impending widespread implementation of LCMs casts a long shadow of potential disruption across the healthcare landscape. This powerful new wave of AI promises unprecedented capabilities in diagnosis, treatment planning, drug discovery, and personalized medicine[131]. The journey of AI in radiology, from early excitement to the current transformative potential of LCMs, underscores the critical need for proactive adoption and strategic integration of these advanced models to shape the future of healthcare. Kodak’s fate illustrates the danger: (1) Strategic blunder: Leadership resisted digital photography to protect film profits; (2) Slow adoption: Investments in digital were half-hearted, designed not to cannibalize film; and (3) Inevitable decline: As digital and then smartphone photography surged, Kodak became irrelevant, filing for bankruptcy in 2012[132-134]. For healthcare, the lesson is clear: Protecting the present at the expense of the future risks obsolescence. Innovate or evaporate[135]. As Darwin observed, “It is not the strongest of the species that survives, nor the most intelligent, but the one most adaptable to change”.
The impending revolution - LCMs and the future of healthcare
With LCMs now implementable, the landscape of AI in healthcare could shift dramatically. They promise unprecedented capabilities in diagnosis, treatment planning, drug discovery, and personalized medicine[131]. The transition, however, demands proactive adoption. History is littered with dominant entities that failed to adapt. Economic incentives are strong: Private payers could save 80-110 billion dollars annually within the next five years, and physician groups another 20-60 billion dollars, through AI-driven efficiencies[130]. The development of responsible AI, ensuring transparency, trust, and safety, is essential for deploying advanced technologies[136]. Yet challenges in data modeling (small sample sizes, noisy data) and algorithmic complexity must be addressed to ensure ethical, sustainable implementation. In diagnostic imaging and interventional radiology, AI and robotics are already enhancing workflows, though many models remain experimental[137-145]. LLM-based models are only beginning to appear in this space, and LCMs remain largely conceptual. Having established the foundational role of multimodal, concept-capable LCMs in healthcare, the discussion now shifts to their broader applications. LCMs and emerging hybrid LLM-LCM frameworks are expected to transform not only clinical care, but also research, education, and system-level decision-making. Their capacity to integrate complex multimodal data aligns closely with radiology’s trajectory, where AI has evolved from experimental promise to increasingly sophisticated practice. Positioned at this intersection, LCMs are poised to redefine the application of AI across healthcare domains. The next section outlines these illustrative applications across healthcare, highlighting the breadth of opportunities, before turning to detailed case scenarios.
Illustrative applications across healthcare
By integrating diverse conceptual frameworks, LCMs offer insights beyond conventional LLMs, synthesizing patient symptoms, medical histories, multimodal imaging, and research literature into concept-driven diagnoses[64,69,122]. This capability extends across domains from radiology to interventional radiology, supporting research, clinical care, and system-level decisions. By enabling earlier diagnoses, personalized planning, improved workflows, and collaborative research, LCMs can enhance healthcare quality, accessibility, and outcomes worldwide. The following sections illustrate these wide-ranging applications.
Diagnostic excellence: LCMs integrate multimodal data - including symptoms, structured histories, imaging, biomarkers, and literature - to generate robust diagnostic insights that surpass LLMs[64,69,122]. This multidimensional synthesis supports early disease detection, identification of subtle radiological markers, and contextual understanding of clinical language, reconciling terminological differences across reports[51,66,74].
Key capabilities include: (1) Early disease detection: Detecting faint anomalies or subtle biomarkers by conceptually linking multiple data types, beyond simple keyword matching[146]; (2) Multimodal data integration and synthesis: Integrating clinical, genomic, and literature evidence to prioritize differential diagnoses, including rapid and deep reviews of thousands of research articles to identify contradictions, knowledge gaps, and emerging hypotheses that support clinical and research decisions; (3) Longitudinal data and clinical context awareness: Summarizing prior imaging and tracking disease evolution across records, truly encapsulating the patient’s entire clinical journey; (4) Tracking progression of multiple conditions: Following the evolution of symptoms, findings, and treatments across years of patient records, supporting updated diagnosis in complex cases and follow-up, and maintaining longitudinal coherence where standard LLMs struggle with extensive clinical context[74]; (5) Diagnostic support and decision assistance: Correlating findings with literature to suggest or rank differentials; and (6) Global and multilingual capabilities: Analyzing multilingual clinical data at a conceptual level, enabling cross-border collaboration[67].
Longitudinal and personalized care: Radiology and medicine depend heavily on longitudinal interpretation - a challenge LCMs address by maintaining conceptual coherence across extended records. They track lesion progression, edema resolution, or metastasis emergence even when terminology shifts, effectively acting as a longitudinal “concept radar”[66,74]. Integrated with electronic medical records (EMRs), genomics, and lab data, LCMs support personalized treatment, predict therapeutic responses, flag contraindications, and optimize interventions[19,131,147,148].
Advanced imaging and radiology
Radiology’s multimodal complexity makes it an ideal test case for LCMs.
Applications include: Concept-aware archive search (e.g., within a Picture Archiving and Communication System): Moving beyond keyword-based retrieval, clinicians can perform nuanced conceptual queries such as “progressive bilateral ground-glass opacities” or “sclerotic bone lesions with periosteal reaction”, enabling retrieval of clinically meaningful studies over time[117,124]. For example, this allows radiologists to efficiently track disease progression or treatment response by retrieving and comparing conceptually relevant prior images, reducing diagnostic delays. A toy embedding-based retrieval sketch is given below.
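The sketch assumes prior reports are indexed by concept embeddings and ranked by cosine similarity; the embed stub is a placeholder, so it will not actually capture semantic similarity the way a trained multimodal concept encoder would.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder concept encoder (a real system would use a trained one)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(256)
    return v / np.linalg.norm(v)

archive = {
    "study_001": "progressive bilateral ground-glass opacities",
    "study_002": "sclerotic bone lesion with periosteal reaction",
    "study_003": "normal chest radiograph",
}
index = {sid: embed(desc) for sid, desc in archive.items()}

def search(query: str, top_k: int = 2) -> list:
    """Rank archived studies by concept similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda sid: -float(q @ index[sid]))
    return ranked[:top_k]

print(search("worsening ground glass changes in both lungs"))
```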
Automated and grounded reporting: Drafting radiology reports directly linked to visual findings and conceptual metadata, improving precision and reducing ambiguity. LCMs also assist in populating structured reporting templates (e.g., Breast Imaging Reporting and Data System, Liver Imaging Reporting and Data System) based on conceptual criteria rather than manual input[139,149]. This leads to more standardized reports that enhance communication with referring physicians and streamline clinical decision-making.
Image quality optimization: Enhancing images through noise reduction and contrast adjustment facilitates better tissue characterization and differentiation, which is critical for accurate diagnosis and treatment monitoring[141,149]. Improved image quality can reduce repeat scans, decreasing patient radiation exposure and lowering healthcare costs.
Workflow standardization: Guiding acquisition protocols in real time to improve efficiency, LCMs help ensure acquisition of diagnostically optimal images, accelerating throughput and reducing the need for repeat exams[150].
Research and knowledge discovery: LCMs accelerate scientific discovery and translational medicine by modeling complex relationships between biological, clinical, and imaging data: (1) Literature review acceleration: Uncovering hidden patterns, reconciling conflicting findings and surfacing new hypotheses, through conceptual linkage with medical literature[116,151]; (2) Global knowledge integration: Bridging linguistic/cultural divides in research[64]; (3) Drug development: Predicting molecule efficacy, toxicity, and interactions by integrating molecular, clinical trial, and regulatory datasets[19,152]; (4) Pharmacovigilance: Real-time monitoring of adverse events[149]; (5) Precision medicine: Identifying novel correlations between imaging phenotypes, genomic markers, and clinical outcomes[153]; and (6) Patient engagement and communication: LCMs improve health literacy and shared decision-making by translating complex concepts into accessible language.
Multilingual and concept translation - improved cross-cultural and inter-provider communication: Utilizing a language-agnostic concept space (like SONAR), LCMs serve as universal medical translators and preserve the subtlety of medical nuance beyond rote translation, enabling precise cross-cultural communication and remote healthcare coordination[136]. This in turn enables more accurate communication between healthcare providers, patients, and their relatives, helping patients make informed decisions. This capability could prove to be a major game-changer in nationwide and worldwide ‘remote’ activities and research coordination where multiple languages and dialects are common.
Simplify medical language: LCMs can render complex reports, trial information, and discharge summaries patient-friendly, supporting better understanding, engagement, and health literacy[154].
Improved patient monitoring: Integrating sensor and EMR data to issue alerts, encouraging proactive care[116,151,155]. This patient-centered approach promotes equity in care and empowers vulnerable populations.
Education and training: Concept-driven platforms support clinicians and students.
Case simulation: Engaging learners in reasoning about clinical concept interrelationships, with real-time feedback and adaptive challenges that emphasize conceptual consistency[19,65].
Immersive augmented reality/virtual reality: Providing AI tutor guidance within virtual patient scenarios, reinforcing visual and conceptual learning grounded in the reasoning behind medical facts[19,66,149,156].
Predictive analytics and system-level decisions: LCMs aid system-level foresight by modeling interdependencies across healthcare data.
Disaster preparedness and outbreak modeling: Simulating pandemics or supply chain disruptions[64,130].
Global data integration: Bridging linguistic and cultural barriers for enriched epidemiologic and policy insights.
Strategic market and operational analysis: Supporting strategic planning, mergers, and product development[68].
Real-world and emerging applications
Though nascent, early implementations highlight LCM potential across medicine and beyond. By moving beyond text-centric LLMs to include visual and conceptual data, they advance diagnosis, treatment, and multi-omics research, revealing novel biological patterns and uncovering hidden relationships, e.g., protein folding dynamics[64,74,116,151]. Their ability to process multiple data types simultaneously and discover higher-order correlations broadens applicability into logistics, finance, manufacturing, defense, and environmental sciences. LCMs thus need not be confined to a single purpose but may be fine-tuned for diverse industries and use cases[64]. Additionally, LCMs show promise in the pharmaceutical and biopharmaceutical analysis of drugs, vitamins, and minerals across biological matrices like whole blood, plasma, serum, and urine, providing nuanced insights that accelerate research pipelines. Finally, some researchers, including our lead author, anticipate that LCMs may represent a step toward artificial general intelligence (AGI) by bridging knowledge domains[67,68].
Bridging to radiology - systemic benefits of LCMs
Beyond their broad applications across healthcare, LCMs also offer systemic advantages particularly relevant to radiology. First, their conceptual reasoning strengthens predictive capabilities, enabling more accurate forecasts of disease progression and treatment outcomes[157]. Second, by lowering technical barriers, LCMs democratize AI use, making advanced tools accessible to radiologists without extensive computational expertise. Third, their integration fosters innovation and cross-disciplinary collaboration, accelerating the development of novel AI applications. Taken together, these systemic benefits provide a strong foundation for radiology-specific adoption. These theoretical advantages become most compelling when examined through real-world clinical applications. We have elaborated upon how concept-based inputs could be incorporated into radiology AI algorithms via the case scenarios detailed below (Scenarios 1-3), demonstrating the practical utility of LCMs in clinical diagnostics and workflow optimization.
Scenario 1
Adding key imaging parameters to AI algorithms: Leveraging lamellar effusion in childhood pulmonary TB (CPTB) screening offers several critical advantages[19,51,65]. First, consider the urgent need for enhanced algorithms in CPTB screening: CPTB remains a major global health challenge, particularly in high-burden, resource-limited settings, where accurate diagnosis is hampered by clinical complexity and limited expertise[158]. Despite a significant disease burden, children often lack access to antituberculosis treatment, as most control programs still prioritize sputum smear-positive adults[159].
Effective screening is vital for early diagnosis and timely intervention, yet several barriers persist: (1) Non-specific clinical presentation: CPTB often mimics other childhood illnesses, complicating clinical differentiation[160]; (2) Microbiological diagnostic challenges: Paucibacillary disease and difficulties in obtaining samples reduce the yield of microbiological confirmation[160]; (3) Chest radiograph complexity: Chest X-ray interpretation for pediatric TB is difficult[160]. The classic Ghon complex (Ghon focus, lymphangitis, hilar lymphadenopathy) can be subtle and/or mimic normal anatomical variations or other pathologies, demanding specialized expertise[161]. Figure 6 is a radiographic demonstration of the classic findings of the primary TB complex (Ghon complex) in a pediatric patient. The figure consists of two parts and shows the typical radiographic appearance of the Ghon complex and an annotated version of the chest X-ray clearly demarcating the peripheral Ghon focus and the enlarged, draining lymphatic vessels and hilar lymph nodes that define the complex. Figure 7A is an axial CT image of the chest in a child and demonstrates the complete primary TB complex (Ghon complex). The image clearly shows the parenchymal Ghon focus, prominent hilar lymphadenopathy, and the definitive finding of a continuous chain of connecting lymphatics (lymphangitis) between the focus and the enlarged nodes; (4) Hilar lymphadenopathy: Differentiating pathological lymphadenopathy from prominent pulmonary arteries is particularly difficult in young children[51,162]; (5) Limited radiological expertise: Many high-burden countries face a severe shortage of pediatric chest imaging experts, resulting in delays and diagnostic bottlenecks[163-166]; (6) Recognizing potentially fatal complications: Childhood TB can also be associated with serious complications such as hemophagocytic lymphohistiocytosis, further complicating diagnosis and management; and (7) Potential for algorithmic assistance: Innovative AI tools offer an opportunity to augment radiological interpretation and improve accessibility[167]. Marais et al[160] highlighted foundational concepts in childhood pulmonary TB: (1) Accurate case definitions: Differentiating primary infection from active disease; (2) Risk stratification: Recognizing variable progression risk by age; and (3) Diverse disease pathology: Understanding wide clinical presentations requiring precise classification. Integrating these time-tested principles into modern AI design can merge expert wisdom with large datasets to improve clinical outcomes[19,51,66].
Figure 6 Tuberculosis primary complex - the classic Ghon complex.
A: As seen in a 5-year-old child: (1) Multiple Ghon foci in the right upper lobe; (2) Draining lymphatics heading towards; (3) Right hilar lymphadenopathy (white open arrowheads); and (4) Minimal (lamellar) effusion in the right horizontal fissure with minimal blunting of the right costo-phrenic angle. Additionally, a retrocardiac air bronchogram indicating tuberculosis (TB) bronchopneumonia (circled area) and hepatomegaly are noted. (Image courtesy Dr. Prakash V Vaidya, Child Health Clinic, Mumbai and Sr. Consultant Pediatrician, Fortis Hospital, Mumbai 400080); B: TB primary complex in a 5-year-old child: The classic Ghon complex: (1) Multiple Ghon foci in the right upper lobe; (2) Draining lymphatics [(1) and (2) are the areas marked with a red outline]; (3) Heading towards right hilar lymphadenopathy (large white open arrowheads); and (4) Minimal (lamellar) effusion in the right horizontal fissure (white arrows) with minimal blunting of the right costo-phrenic angle. Additionally, a retrocardiac air bronchogram indicating TB bronchopneumonia (circled area) and hepatomegaly are noted. (Image courtesy Dr. Prakash V Vaidya, Child Health Clinic, Mumbai and Sr. Consultant Pediatrician, Fortis Hospital, Mumbai 400080).
Figure 7 Computed tomography image.
A: Childhood tuberculosis primary Ghon complex computed tomography appearance: Laterally positioned Ghon foci with sub-pleural involvement, with their draining lymphatics heading all the way up to right hilar lymphadenopathy. Additionally, note mediastinal lymphadenopathy (many lymph nodes reveal necrotic areas); B: Computed tomography image: Large thick-walled tuberculosis cavity communicating with the right main bronchus in a multidrug-resistant patient. (Image courtesy Dr. Anagha Joshi, Prof and HOD, Radiology, LTMMC and LTMGH, Mumbai 400022).
The potential impact of leveraging lamellar effusion in algorithms: Integrating lamellar effusion detection into AI-powered algorithms for pediatric chest X-ray analysis offers several critical advantages: (1) Improved accuracy: The high specificity of this sign could significantly enhance the accuracy of TB screening, reducing false positives and negatives; (2) Overcoming expertise gaps: Algorithms trained to identify lamellar effusion can function as reliable first-line screening tools in settings without pediatric radiology specialists; (3) Empowering imaging technicians: Clear guidelines on lamellar effusion visualization, supported by algorithmic assistance, can equip technicians to play a stronger role at the point of care; (4) Enhanced early case detection: By focusing on a distinctive radiological marker, lamellar effusion detection supports prompt diagnosis and treatment initiation; and (5) Feasibility and scalability: AI integration leverages existing X-ray infrastructure, enabling cost-effective expansion in underserved populations.
AI-powered lamellar effusion detection: Translating clinical insights into AI involves designing LCMs trained on expert-annotated pediatric chest X-rays to accurately detect lamellar effusions, distinguishing TB from other thoracic conditions: (1) Concept: LCMs trained on expert-annotated pediatric chest X-rays - including confirmed CPTB cases and non-TB controls - can learn subtle imaging patterns associated with lamellar effusion; and (2) Benefit: Such models would enable highly sensitive and accessible first-line screening. Early identification of lamellar effusion could trigger timely diagnostic workups, reduce missed cases, and enhance confidence in treatment decisions, particularly where microbiological confirmation is not feasible. Importantly, this targeted marker allows non-expert healthcare workers to flag high-probability cases for further expert review, strengthening TB suspicion in otherwise ambiguous presentations, including cases where treatment is initiated on strong clinical and other radiological findings.
Conclusion: Screening for childhood pulmonary TB is fraught with clinical and radiological challenges, compounded by limited expert availability in high-burden regions. Incorporating highly specific signs such as lamellar effusion into AI algorithms, particularly within LCM frameworks, offers a promising approach to enhance the accuracy, accessibility, and scalability of CPTB screening. This integration has the potential to improve case detection, accelerate treatment, and ultimately reduce morbidity, mortality, and transmission in vulnerable pediatric populations[19,51].
Scenario 2
The concept of screening for pulmonary cavities in TB patients (i.e., open TB): Such screening is crucial for isolating infectious individuals and preventing further spread[51]. TB remains a major global health threat[168], and pulmonary cavities remain a central feature of adult pulmonary TB (PTB). Parenchymal cavities occur in 40%-45% of post-primary PTB cases and are strongly associated with reactivation of disease[169-171]. Multiple cavities are seen more frequently in patients with multidrug-resistant (MDR) TB[169-172], and numerous cavities (> 3) were observed only in patients with MDR TB[173]. In resource-limited settings, anti-TB treatment (ATT) often starts on strong clinical and radiological suspicion despite negative microbiological tests, underscoring the critical need for accurate radiological interpretation. Studies demonstrate that thick-walled cavities (> 5 mm) are frequently seen in patients with early active TB and represent necrotizing consolidation in the early stage. The presence of lung cavities is a significant indicator of active, often infectious, disease[51,174]. Patients with cavitary TB often have higher sputum bacterial counts, delayed culture conversion, prolonged therapy, and increased relapse risk[168,175,176]. Kim et al[177] demonstrated that cavity wall thickness is a reliable marker of active disease, with thinner walls (mean 3.4 mm, range 1.4-10.2 mm) indicating chronic inactive disease and thicker walls (> 5 mm, mean 6.6 mm, range 1.3-14.7 mm) reflecting necrotizing, transmissible disease. Figure 7B is a CT image of a large, thick-walled TB cavity communicating with the bronchus in an MDR patient. Imaging also distinguishes active cavities, often with irregular thick walls or air-fluid levels, from healed, fibrotic cavities with smooth, non-enhancing walls[169,178]. Figure 8 compares axial CT images of a TB cavity before and after 4 months of ATT. Patients with cavitary TB had more acid-fast bacillus culture-positive results at 2 months, longer treatment duration, and more treatment relapses/higher recurrence rates than those with non-cavitary TB[168,175]. This is particularly concerning in the context of drug-resistant TB, including MDR-TB and extensively drug-resistant (XDR) TB, given that these strains are increasingly difficult and costly to treat and are associated with poorer patient outcomes[168,176]. It is pertinent to note that most cases of MDR and XDR TB have cavities[176]. TB cavities are dynamic physical and biochemical structures that interface the host response with a unique mycobacterial niche to drive TB-associated morbidity and transmission; advances in non-invasive imaging can provide valuable insights into the drivers of cavitation, which in turn will guide the development of tailored pharmacological interventions to prevent cavitation in individuals with TB[168]. Saeed[176] emphasized three critical aspects of cavitary TB: (1) Immune hyperactivation in response to a high bacillary load causes largely irreversible lung damage.
Once the bacillary growth in a cavity has increased manifold, it triggers a putative architecture-distorting cascade with very little reversibility possible, even post-treatment; (2) Once Mycobacterium tuberculosis achieves a load of over a billion bacilli per gram, the population gradually slows its activity and finally becomes dormant, the root cause of persistence and relapse; and (3) At this load there is a real, though small, threat of emergence of resistant mutants and a substantially bigger threat of secondary resistance in chronic cavities, the “holy grail” of MDR and XDR-TB development[179]. Timely identification of cavitary TB is pivotal for isolation and infection control, enabling tailored treatment and reducing transmission risk[160,168,175].
Figure 8 Axial computed tomography images in soft tissue window and lung window in a 59-year-old female, with sputum culture positive for Mycobacterium tuberculosis.
A-C: Demonstrate a thick-walled cavity in the right upper lobe; D-F: After 4 months of anti-tuberculosis treatment, reduced wall thickness of the right upper lobe cavity is noted (from 12.5 mm to 8.14 mm) and the cavity appears much smoother. (Image courtesy Dr. Anagha Joshi, Prof and HOD, Radiology, LTMMC and LTMGH, Mumbai 400022).
Challenges of clinical and radiological detection: Traditional methods - clinical symptoms, sputum microscopy, or bronchoscopy - are insufficient for reliably detecting cavities[180-182]. Bronchoscopy, though useful in smear-negative cases, is invasive and offers limited scope[180]. Despite their essential role in diagnosing pulmonary TB and detecting cavities, chest X-rays present challenges in interpreting cavitary lesions. These challenges arise from subtle findings, potential obscuration by lung opacities, and the presence of multiple or small cavities[183-185]. Figure 9A is a frontal chest radiograph of a 14-year-old male presenting with cough and fever, demonstrating a thin-walled tuberculous cavity in the right upper lobe. The situation is further compounded in high-prevalence areas by the scarcity of skilled radiologists[186,187]. This diagnostic gap underscores the need for automated, scalable solutions.
Figure 9 Frontal chest radiograph.
A: Frontal chest radiograph of a 14-year-old male presenting with cough and fever. Radiography demonstrates a thin-walled tuberculous cavity in the right upper lobe (white open arrowheads); B: Frontal chest radiograph of a 9-year-old male with vomiting and abdominal pain, subsequently diagnosed with MDR tuberculosis (TB). A miliary pattern is seen throughout both lung fields, along with multiple larger lesions indicating multiple developing cavities (circled area), minimal effusion in the right horizontal fissure (white arrows), and hepatomegaly. Multiple air-fluid levels are seen in the visualized abdomen, signifying intestinal obstruction due to TB adhesions (short black arrows). (Figure 9A and B courtesy Dr. Jairaj Nair, Prof and HOD, Chest Medicine, LTMMC and LTMGH, Mumbai 400022); C: Frontal chest radiograph of a 12-year-old boy with secondary hemophagocytic lymphohistiocytosis as a result of disseminated TB. Numerous miliary nodules are noted extensively in both lungs, with additional patchy airspace opacities in the medial aspect of the right lower lobe (note the air bronchogram). Calcified right hilar lymph nodes are also noted. The patient recovered well after treatment (including anti-TB treatment). (Image courtesy Dr. Prakash V Vaidya, Child Health Clinic, Mumbai and Sr. Consultant Pediatrician, Fortis Hospital, Mumbai 400080).
The need for AI-assisted detection: AI can support traditional techniques, increasing early detection, lowering misdiagnosis, and strengthening international TB control initiatives[188]. The increasing burden of drug-resistant TB underscores the urgency of rapid and accurate identification of infectious individuals[189]. Delayed detection of cavitary disease in these patients can have significant public health implications, contributing to the amplification and dissemination of drug-resistant strains[189]. Figure 9B is a frontal chest radiograph of a child with MDR TB showing a miliary pattern in the lungs and multiple air-fluid levels in the abdomen, signs of TB intestinal obstruction. Thus, there is an urgent need for automated, reliable, widely deployable tools capable of accurate pulmonary cavity detection on chest X-rays in suspected or confirmed TB patients[190]. Such tools offer: (1) Enhanced infection control: Early isolation of infectious patients to prevent spread of drug-susceptible and resistant TB strains; (2) Improved treatment strategies: Tailored therapy based on cavity presence and characteristics; (3) Enhanced workflow: Rapid assessments reduce radiologist backlog in high-volume settings; (4) Wider accessibility: Usable in resource-poor areas with limited expert radiology; and (5) Proactive MDR/XDR TB identification: Timely detection supports targeted interventions.
AI-powered thick-walled cavity identification: Concept: Train LCMs on a vast and diverse global dataset of adult chest X-rays. This dataset will include: (1) Images of patients with microbiologically confirmed pulmonary TB, with detailed annotations of lung cavities, specifically focusing on those characterized by thick walls. Annotations should include location, size, shape, wall thickness, and presence/absence of air-fluid levels. This should also include chest X-rays of patients who started treatment based on clinico-radiological findings (even if initially smear-negative), with documentation of any cavities present; (2) Images of patients with other lung diseases that can present with cystic or cavitary lesions (e.g., lung abscess, fungal infections, malignancy, pneumatoceles), with detailed annotations of these lesions to enable the LCM to differentiate them from TB cavities; and (3) Images of healthy adult lungs and those with other non-cavitary lung findings to establish a baseline and identify potential false positives.
Learning focus: The LCM will be trained to learn the specific visual characteristics of thick-walled lung cavities associated with active TB, across various imaging modalities, patient demographics, and disease stages. It will also learn to differentiate these from other lung lesions based on subtle visual cues and contextual information within the chest X-ray.
Benefit: This initiative aims to develop a universally applicable AI tool capable of automatically detecting and localizing thick-walled lung cavities on standard adult chest X-rays, with high sensitivity and specificity. Such a tool would be a crucial screening mechanism for identifying potentially infectious TB patients, including those with MDR/XDR TB, who may require isolation. Early identification of cavitary lesions can facilitate the timely implementation of appropriate infection control measures, optimize treatment strategies, and help restrict disease spread, especially for drug-resistant forms. This automated screening is particularly valuable in high-burden settings and for rapid initial assessment[19,51].
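To make the training concept above concrete, the minimal sketch below shows how annotated chest X-rays carrying cavity and wall-thickness labels might be organized for supervised training. It is illustrative only: the record layout, file names, and classes such as CavityDataset and CavityNet are hypothetical placeholders, and a true LCM would replace the toy convolutional encoder with concept-level representations.

# Minimal, illustrative sketch only; not a validated clinical pipeline.
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

class CavityDataset(Dataset):
    """Chest X-rays with per-image annotations: cavity presence and wall thickness (mm)."""
    def __init__(self, records):
        # records: list of dicts, e.g.
        # {"path": "cxr_0001.png", "cavity": 1, "wall_thickness_mm": 12.5}
        self.records = records
        self.tf = transforms.Compose([
            transforms.Grayscale(), transforms.Resize((512, 512)), transforms.ToTensor()
        ])

    def __len__(self):
        return len(self.records)

    def __getitem__(self, i):
        r = self.records[i]
        img = self.tf(Image.open(r["path"]))
        return (img,
                torch.tensor(float(r["cavity"])),
                torch.tensor(float(r["wall_thickness_mm"])))

class CavityNet(nn.Module):
    """Toy encoder with two heads: cavity detection and wall-thickness regression."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten()
        )
        self.detect = nn.Linear(32, 1)     # logit: thick-walled cavity present?
        self.thickness = nn.Linear(32, 1)  # regress wall thickness in mm

    def forward(self, x):
        z = self.encoder(x)
        return self.detect(z).squeeze(1), self.thickness(z).squeeze(1)

def train_epoch(model, loader, opt):
    bce, mse = nn.BCEWithLogitsLoss(), nn.MSELoss()
    for img, cavity, thickness in loader:
        logit, thick_pred = model(img)
        # Thickness loss is counted only on cavity-positive images.
        loss = bce(logit, cavity) + mse(thick_pred * cavity, thickness * cavity)
        opt.zero_grad(); loss.backward(); opt.step()

# Example usage (hypothetical records):
# ds = CavityDataset([{"path": "cxr_0001.png", "cavity": 1, "wall_thickness_mm": 12.5}])
# model = CavityNet()
# train_epoch(model, DataLoader(ds, batch_size=8, shuffle=True),
#             torch.optim.Adam(model.parameters(), lr=1e-4))

The dual-head design mirrors the annotation scheme proposed above: one head flags cavity presence for screening, while the other regresses wall thickness, the feature most closely tied to infectivity.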
Scenario 3
Leveraging known (and rare) TB associations to improve outcomes.
HLH: HLH is a severe and often fatal hematological disorder, increasingly recognized in TB, particularly during paradoxical worsening on ATT[191,192]. Active TB accounts for 9%-25% of infection-associated HLH cases across all ages (14 days-83 years; median 40 years). HLH is driven by immune dysregulation with excessive macrophage and lymphocyte activation, systemic inflammation, and tissue destruction, often progressing to organ failure and death. Clinical features include fever, organomegaly, cytopenias, hyperferritinemia, hypertriglyceridemia, hypofibrinogenemia, hemophagocytosis, low NK-cell activity, and elevated soluble interleukin-2 receptor[192]. Ferritin > 10000 µg/L is highly specific for HLH in children[193]. HLH can be familial (infancy/early childhood) or sporadic (all ages), with infections as common triggers. TB-HLH is diagnostically challenging due to overlap with disseminated TB. Awareness is critical: Combined anti-TB and immunomodulatory therapy improves survival to 66%, compared with 56% for anti-TB drugs alone and 0% with immunomodulators alone[192]. Timely recognition of hyperferritinemia and cytopenias can guide diagnosis and save lives[191].
Why TB - HLH matters for AI integration: Untreated TB-HLH carries 100% mortality, underscoring the urgency of early detection. Key clinical dynamics: (1) TB as a trigger for HLH: Disseminated TB, especially miliary forms, can provoke secondary HLH, causing cytokine storm and hemophagocytosis. Active TB is implicated in a significant proportion (9%-25%) of infection-associated HLH cases[192]. Figure 9C is a frontal chest radiograph showing numerous miliary nodules and patchy opacities in a boy with disseminated TB leading to HLH; (2) HLH exacerbating TB: HLH can mimic or exacerbate paradoxical worsening during ATT, making it challenging to distinguish between an expected inflammatory response to effective TB treatment and the life-threatening condition of HLH. The uncontrolled inflammation in HLH can cause severe systemic inflammation, multi-organ failure, and death, independent of the TB infection itself; (3) Diagnostic overlap: Fever, cytopenias, and organomegaly mimic disseminated TB. Immune dysregulation can render tests (tuberculin skin test, interferon gamma release assay) unreliable; and (4) The “vice versa”: While less direct, HLH predisposes to TB, masks TB signs with systemic inflammation, and complicates treatment due to immunosuppression, potentially increasing the risk of TB reactivation or making it harder to control the TB infection. Careful consideration and often a combination of anti-TB and immunomodulatory therapies are required, without which the survival rate drops to an abysmal zero.
The critical importance of timely treatment (100% mortality without it): Without prompt dual therapy (anti-TB drugs and immunomodulatory agents), TB-HLH is almost universally fatal. (1) Untreated HLH: Cytokine storm drives rapid multi-organ failure; (2) TB as the underlying driver: If the underlying TB infection driving the HLH is not addressed with effective anti-TB drugs, the inflammatory trigger persists, further fueling the HLH; and (3) Synergistic lethality: The combination of active, potentially disseminated TB with the systemic devastation of HLH creates a highly lethal scenario.
Enhancing recognition of TB-associated HLH using LCMs: Concept: Trained on multimodal datasets, LCMs could transform detection of TB-HLH. Datasets should include: (1) Clinical data: Longitudinal TB records, paradoxical worsening, HLH features (fever, organomegaly, cytopenias), serial ferritin and inflammatory markers, comorbidities, and treatment regimens. In addition, empirical TB treatment initiated on high clinico-radiological suspicion despite negative microbiological tests, an approach endorsed by guidelines[193] and commonly practiced in developing countries, should be captured in the dataset; (2) Imaging data: Chest X-rays/CTs, particularly miliary TB, annotated for disease evolution and paradoxical reactions. Annotations should focus on the evolution of TB-related findings alongside any imaging features potentially suggestive of HLH (though these may be non-specific); (3) Hematological data: Bone marrow findings, hemophagocytosis, differential counts; (4) Genetic data (where available): Predispositions to familial HLH in pediatric cases; and (5) Literature and expert knowledge: Incorporate relevant medical literature, guidelines, and annotated diagnostic/management patterns. This will help the LCM understand the complex interplay between the two conditions.
Learning objectives: (1) Detect patterns linking TB treatment, paradoxical worsening, and HLH, while distinguishing these from drug-resistant TB; (2) Recognize HLH-specific cues (persistent fever, organomegaly, cytopenias, and laboratory findings such as markedly elevated ferritin, hypertriglyceridemia, hypofibrinogenemia, hemophagocytosis, low natural killer-cell activity, and high levels of soluble interleukin-2 receptor); (3) Stratify HLH risk in TB patients, especially during paradoxical worsening, by integrating clinical, imaging, and hematological data, while considering the temporal dynamics that help differentiate it from drug resistance; (4) Differentiate HLH from drug resistance, emphasizing the role of drug susceptibility testing; (5) Correlate dual-treatment regimens with outcomes in TB-HLH cases, and contrast these with outcomes in drug-resistant TB; and (6) Understand paradoxical worsening as often self-limiting (with continued ATT) vs HLH, which requires urgent intervention. A minimal rule-based sketch of objective (2) follows the benefits below.
Benefits: (1) Raise clinician awareness by flagging HLH patterns in TB patients; (2) Enable earlier HLH diagnosis and treatment initiation, while guiding drug resistance investigations; (3) Improve survival by reducing delays in dual therapy; (4) Support monitoring of paradoxical reactions and highlight when to suspect resistance; and (5) Advance medical knowledge and potential for personalized medicine by identifying novel TB-HLH associations. By embedding HLH recognition into AI-driven systems, LCMs can enable earlier, more accurate detection of this devastating TB complication, transforming outcomes where delayed diagnosis currently means near-certain mortality.
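As a minimal illustration of objective (2) above, the rule-based screen below could serve as a transparent baseline inside a larger LCM pipeline. It is a sketch only: the thresholds are placeholders echoing the features discussed above (only the ferritin cut-off of > 10000 µg/L comes from the cited pediatric data[193]) and are not validated clinical criteria.

# Illustrative rule-based screen; thresholds are placeholders, not validated cut-offs.
from dataclasses import dataclass

@dataclass
class TBPatientLabs:
    febrile: bool                  # persistent fever despite ATT
    organomegaly: bool             # hepato-/splenomegaly
    cytopenic_lineages: int        # number of depressed blood cell lines (0-3)
    ferritin_ug_per_l: float
    triglycerides_high: bool
    fibrinogen_low: bool
    hemophagocytosis_on_marrow: bool

def hlh_alert(p: TBPatientLabs) -> bool:
    """Flag TB patients whose 'paradoxical worsening' may actually be HLH."""
    score = sum([
        p.febrile,
        p.organomegaly,
        p.cytopenic_lineages >= 2,
        p.ferritin_ug_per_l > 10000,            # highly specific in children[193]
        p.triglycerides_high or p.fibrinogen_low,
        p.hemophagocytosis_on_marrow,
    ])
    return score >= 3  # placeholder trigger for urgent hematology review

Such an explicit rule set is deliberately interpretable: a concept model could learn when to escalate beyond it, but clinicians can always audit why an alert fired.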
Additional novel integrations
TB and diabetes mellitus: The bidirectional relationship between TB and diabetes mellitus is well established[194,195]. Diabetes increases susceptibility to active TB and worsens treatment outcomes due to immune suppression, while TB impairs glycemic control. The World Health Organization (WHO) recommends screening TB patients for diabetes mellitus and vice versa, making this a natural fit for algorithmic integration to enable earlier detection and management[195].
TB and papillary thyroid carcinoma: Though far rarer, a possible association has been proposed[196]. While causality and mechanisms remain unproven, LCMs could flag subsets of TB patients for thyroid evaluation where clinically indicated, promoting hypothesis generation and vigilance in complex presentations. Incorporating these and similar comorbidities would enable LCMs to support more holistic TB care, addressing both common complications such as diabetes and rarer, exploratory links like TB-papillary thyroid carcinoma.
Final thought
Whereas LLMs excel at pattern recognition, detecting imaging features such as opacities, nodules, and cavities[197,198], LCMs advance toward conceptual reasoning, modeling causal and mechanistic relationships. By synergizing with existing AI methods rather than replacing them, adding deeper layers of conceptual understanding and reasoning, LCMs could transform TB imaging and management: Moving from recognition of patterns to understanding of disease mechanisms, and ultimately to more timely, personalized, and life-saving interventions.
Federated learning
Enabling collaborative AI while preserving privacy. Despite their potential, LCMs face persistent privacy, security, and regulatory challenges, especially with sensitive medical data, including imaging. Federated learning (FL) addresses these challenges by training models across decentralized institutions or devices holding local datasets, without centralizing patient data[199]. Instead, only encrypted model updates are exchanged, with a central server maintaining the global model and local models retained at each site. Contributions are quality-checked before aggregation to prevent malicious or biased updates. Unlike centralized learning, FL iteratively refines the global model via secure exchanges to deliver solutions suited for clinical environments[19,51,200-202]. Beyond privacy, FL offers systemic benefits: It trains on diverse, heterogeneous datasets, reduces algorithmic bias, enhances generalizability, and accelerates clinical adoption. For example, FL has been successfully applied to train diagnostic models for TB chest X-rays across multiple centers, including resource-limited settings, achieving high accuracy while safeguarding patient privacy[203].
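To ground the workflow just described, the sketch below shows a minimal federated-averaging round in the FedAvg style, assuming a PyTorch model with a single binary output and per-site data loaders. Function names are our own, and a production deployment would add the encryption of updates and the contribution quality checks described above.

# Minimal FedAvg-style sketch; no secure aggregation or update vetting shown.
import copy
import torch

def local_update(global_model, local_loader, epochs=1, lr=1e-3):
    """Each site trains a copy of the global model on data that never leaves it."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for x, y in local_loader:
            opt.zero_grad()
            # Assumes model(x) returns shape (batch, 1); y is a float label tensor.
            loss_fn(model(x).squeeze(1), y).backward()
            opt.step()
    return model.state_dict()

def federated_round(global_model, site_loaders):
    """One FL round: sites return weight updates; the server aggregates, not data."""
    states, sizes = [], []
    for loader in site_loaders:
        states.append(local_update(global_model, loader))
        sizes.append(len(loader.dataset))
    total = sum(sizes)
    # Weight each site's parameters by its local sample count.
    avg = {k: sum(s[k].float() * (n / total) for s, n in zip(states, sizes))
           for k in states[0]}
    global_model.load_state_dict(avg)
    return global_model

Each round exchanges only model weights, never images or records, which is precisely what makes FL attractive for multi-institutional TB imaging work.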
Potential applications in healthcare[19,136]: (1) Privacy-preserving medical image analysis (e.g., FL-based TB chest X-ray models outperforming local models in heterogeneous African datasets); (2) Development of personalized treatment models by leveraging distributed clinical data; (3) Decentralized outbreak prediction using federated disease surveillance; (4) Collaborative drug discovery and real-world evidence generation while maintaining data sovereignty; and (5) Equipment manufacturers can address the Digital Imaging and Communications in Medicine (DICOM) issues faced by radiologists with modern ultrasound machines.
Integrating FL with LCMs can unite multimodal data streams (text, imaging, genomics, and sensors) across institutions, producing more robust, globally relevant models without recurring data-transfer and regulatory hurdles. Together, LCMs and FL form complementary pillars of next-generation radiology AI, enabling unbiased, scalable, and ethically sound solutions for TB, oncology, rare diseases, and beyond.
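As a toy illustration of such multimodal unification, the sketch below fuses imaging, report-text, and sensor embeddings before a single reasoning step. The encoders, dimensions, and class name are hypothetical; a real LCM would map each modality into a shared concept-embedding space learned at scale.

# Hypothetical multimodal fusion stub; linear projections stand in for real encoders.
import torch
import torch.nn as nn

class ConceptFusion(nn.Module):
    def __init__(self, img_dim=512, txt_dim=768, sensor_dim=32, concept_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, concept_dim)        # e.g., CXR encoder output
        self.txt_proj = nn.Linear(txt_dim, concept_dim)        # e.g., report embedding
        self.sensor_proj = nn.Linear(sensor_dim, concept_dim)  # e.g., biosensor features
        self.reasoner = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=concept_dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(concept_dim, 1)  # e.g., TB-complication risk logit

    def forward(self, img_emb, txt_emb, sensor_emb):
        # Stack modality embeddings as a short "concept sequence" and reason over it.
        tokens = torch.stack([
            self.img_proj(img_emb), self.txt_proj(txt_emb), self.sensor_proj(sensor_emb)
        ], dim=1)
        fused = self.reasoner(tokens).mean(dim=1)
        return self.head(fused)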
The hype cycle and challenges of LCMs in radiology AI
As with many emerging technologies, LCMs in radiology AI are likely to face the classic “hype cycle”: Initial enthusiasm followed by a phase of realistic assessment of the significant challenges ahead[51,66,67]. While LCMs offer great promise to go beyond token-based models by enabling conceptual reasoning, multimodal integration, and long-range context synthesis, they remain early-stage technologies confronting formidable hurdles. A major barrier is the immense difficulty of creating large-scale, expertly annotated datasets for concepts, which is substantially more complex than annotating tokens: It requires capturing rich clinical knowledge, context, and multimodal correspondences at scale, a task necessitating intensive radiologist expertise and standardized annotation protocols[19,66]. This challenge was one of the core motivations behind our original concept-based framework[19,51]. While large-scale collaborations are needed to ensure quality LCM development, implementation challenges may be mitigated by the growing adoption of FL, which facilitates multi-institutional collaboration while preserving patient privacy and data security[19,51,199]. This collaborative approach holds promise for overcoming barriers related to data heterogeneity, regulatory compliance, and annotation costs.
The innovative architectures underpinning LCMs, such as hierarchical and dual-tower models, introduce significant computational demands and require advanced infrastructure[67,73]. However, despite their computational intensity, these models are expected to be relatively more environmentally sustainable (“green”) than traditional LLMs, owing to optimization techniques such as embedding quantization and efficient diffusion mechanisms, offering potential energy savings without compromising performance[66,67]. This environmental consideration is critical for scaling LCMs in resource-constrained clinical settings.
The path to clinical adoption hinges on a rigorous, multi-stage validation process involving prospective trials that assess diagnostic accuracy, safety, and generalizability across diverse patient populations. Encouragingly, LCMs’ enhanced transparency, explainability, and ethical alignment may expedite regulatory approvals, including from the FDA, overcoming challenges faced by earlier opaque AI models[19,51,66,67]. It is notable that although over 1000 AI-enabled medical devices have received FDA clearance, none to date involve LLM-based methodologies, underscoring the hurdles faced by black-box, token-based AI systems in clinical validation[97]. LCMs represent a meaningful advance toward AI systems that emulate human clinical reasoning and deliver actionable insights in radiology. A nuanced understanding of the hype cycle and its attendant challenges promotes a balanced perspective, tempering undue optimism with cautious realism. The successful translation of LCMs from concept to clinical reality will require sustained interdisciplinary collaboration, adherence to robust standards for data and evaluation, ethical governance, and ongoing technical innovation.
Future directions and regulatory landscape
The future of LCMs in healthcare is vibrant and multi-faceted. Essential priorities include: (1) Developing hybrid LCM architectures combining symbolic and DL for optimal reasoning, interpretability, and clinical relevance; (2) Optimizing architectures to process the unique characteristics of medical data, such as high dimensionality and class imbalance; (3) Pioneering multimodal fusion techniques to align imaging, structured EHR, text, genomics, and sensor data for a holistic picture of patient health; (4) Integrating genomic data with imaging and clinical notes to enable personalized diagnostic and treatment strategies; (5) Advancing explainable AI methods, with clinician-in-the-loop development to provide actionable rationale and transparency; (6) Leveraging clinician expertise during model development to improve trust, transparency, and intuitive explanations; (7) Conducting large-scale, prospective, multi-center validation studies measuring clinical impact, workflow integration, cost-effectiveness, and outcomes compared to standard practice; and (8) Proactively addressing data privacy, robustness, regulatory compliance, and socio-technical challenges, including lifelong learning and real-time updates.
Regulatory considerations: Healthcare demands exceptionally high levels of accuracy and safety. Rigorous clinical validation is essential before deployment, particularly in radiology, where subtle visual details are crucial. Linking abstract concepts to specific anatomical and imaging findings remains non-trivial. Key considerations include: (1) Data privacy and safety: Compliance with the Health Insurance Portability and Accountability Act (United States)[204], the General Data Protection Regulation (Europe)[205], and the Digital Personal Data Protection Act, 2023 (India)[206]. FL can address these concerns while enabling collaborative research; (2) Data quality and bias: The FDA[207,208], European Medicines Agency[209], and Central Drugs Standard Control Organization of India[210] require careful curation of diverse datasets, bias detection, and mitigation strategies. Collaboration with clinicians from varied backgrounds is essential; (3) Transparency, explainability, ethical considerations, and interoperability: Adherence to standards (Health Level Seven International-The Standard, 2024) will also effectively be required for regulatory approval[211], and tools must provide interpretable outputs for clinical use; and (4) WHO guidance: WHO provides guidance on governance, ethics, and regulation to ensure AI use in healthcare is safe, effective, and trustworthy[212]. Chauhan et al[213] reviewed regulatory strategies across the European Union, FDA, and Central Drugs Standard Control Organization of India, emphasizing adaptive approaches, pre-certification programs, and compliance advice. FDA guidance for AI-enabled medical devices highlights best practices in ML development, version control, patient safety, and lifecycle data management[209]. The European Commission’s broader AI regulations address societal implications, requiring careful ethical oversight[214]. The ethical implications of using LCMs for diagnosis and treatment decisions, including issues of accountability and patient autonomy, will require careful consideration and the development of ethical guidelines specific to this technology in healthcare.
Current status of AI approvals: As of early 2025, the FDA has cleared over 1000 AI models for clinical use, mainly in radiology and cardiology[97]. These models assist in image analysis, diagnostic support, and treatment planning. However, no LLMs have yet received FDA approval. Contributing factors include their black-box nature, regulatory frameworks not designed for generative models, and challenges in clinical validation. Despite these hurdles, LCMs are expected to achieve faster regulatory approval when developed. Their structured conceptual reasoning, transparency, ethical design, and alignment with clinical workflows can directly address the concerns limiting LLM approval. Interpretable outputs and validated reasoning pathways make LCMs more amenable to regulatory scrutiny, potentially accelerating their safe adoption in healthcare.
Summary
The evolution of AI, from its early symbolic roots championed by McCarthy et al[10] to the current era of LLMs and the emerging LCMs, represents a transformative leap in technological capability[58,102,215-217]. As William Gibson observed, “The future is already here - it’s just not evenly distributed”, a sentiment particularly resonant with the nascent yet immensely promising stage of LCM development[67,68]. LCMs offer a path forward beyond the hallucinations and ambiguities of current AI models, providing robustness, deeper understanding, and trustworthiness, qualities critical in complex fields such as medicine[216,217], and thus represent a significant addition to our AI armamentarium.
Clinical integration and medical relevance: LCMs facilitate the integration of diverse data sources - including imaging, clinical parameters, and patient history, while incorporating expert knowledge and explainable AI techniques. This synergy of “wisdom combined with knowledge” directly addresses limitations in current medical AI, particularly in radiology, where decisions demand a holistic understanding of the patient’s clinical context. Historically, the breadth and complexity of clinical knowledge posed encoding challenges[19]. LCMs overcome these hurdles by representing nuanced clinical concepts, combining accumulated knowledge and meta-knowledge (problem-solving strategies) with AI’s computational versatility. This enables a shift from reactive diagnostics to predictive, personalized patient management. Explainable AI also mitigates the “black box” problem[28], fostering trust among clinicians and ensuring the responsible and ethical deployment of AI[88]. This aligns with the growing recognition of the need for multi-modal AI in healthcare, where seamless integration and reasoning across diverse data sources are paramount. Active involvement of radiologists and other clinicians ensures clinical relevance, supporting adoption and maximizing the utility of LCMs in practice.
From intelligence to general intelligence: LCMs unlock a new realm of possibility, where the canvas of imagination becomes the blueprint for AI reality. They are capable of learning from experience and adapting to novel scenarios without explicit programming, bringing us closer to AGI by incorporating reasoning, problem-solving, and creativity[65,66]. This progression toward AGI in healthcare emphasizes augmenting, not replacing, human expertise. Multi-modal reasoning, cross-domain knowledge integration, and coherence across extended contexts make LCMs revolutionary for radiology and allied fields, providing a roadmap toward cognitive AI, AGI, and beyond[65,66]. The emphasis remains on leveraging AI to empower human clinicians, leading to more informed decisions and ultimately superior patient care. Rigorous validation remains crucial. LCMs are still research concepts, and their real-world performance, scalability, and reliability in high-stakes domains must be empirically proven[72,147]. Investment in proof-of-concept studies and robust validation will solidify their clinical role.
Synergistic horizons - quantum computing
Quantum computing (QC) promises unprecedented computational power, accelerating the capabilities of AI in healthcare[19,51]. Advances such as 256-qubit superconducting quantum computers and topological qubits in 2D materials[218-220] provide stable, robust quantum states essential for practical AI applications. These advancements are crucial for addressing the current limitations of QC and paving the way for practical applications in healthcare[156,221]. QC is poised to revolutionize healthcare, including radiology and its allied fields, with its ability to process complex biological data, simulate molecular interactions, and optimize drug discovery at scales previously unimaginable, presenting unprecedented opportunities for personalized medicine and disease modeling[221].
LCM-quantum synergy: Combining LCMs with quantum computational power could: (1) Enable complex simulations of molecular interactions and disease processes, advancing drug discovery and personalized medicine; (2) Accelerate data analysis across genomics, medical imaging, and patient records, improving diagnostic speed and accuracy; and (3) Reveal novel insights in medical data, identifying biomarkers, therapeutic targets, and disease mechanisms, advancing toward artificial super intelligence[65,66,221].
Beyond quantum - towards artificial super intelligence and the infraspace
Future horizons include quantum-enhanced artificial super intelligence, potentially surpassing human cognition and operating near physical limits[222-225]. The profound implication is that “with quantum-enhanced AI, recursive self-improvement, a scenario where an AI system refines its own algorithms in a continuous loop, could occur at a pace and scale unimaginable with classical hardware”. This leap in computational power might allow AI systems to achieve levels of understanding and creativity surpassing human capabilities, while raising critical questions about control, ethics, and the very definition of intelligence[224-227]. The development of such advanced AI must be accompanied by robust ethical guidelines and control mechanisms to ensure beneficial outcomes.
Speculative manifestations include: (1) Quantum sovereign intelligence: Autonomous quantum entities capable of mastery at the quantum scale; (2) The omega sentience: Conceptual unified consciousness addressing fundamental scientific and philosophical questions; and (3) Post-quantum frontier (infraspace): Intelligence operating below the quantum level, manipulating the informational substrate of reality[228]. Additionally, bio-inspired computation, such as quantum super-radiance in human cells[229,230], may guide future AI paradigms, merging biological and quantum processing for unprecedented efficiency. Imagine an AI that mimics the quantum super-radiance of biological cells to achieve unprecedented processing power. Such bio-quantum AI could revolutionize not only healthcare but also our understanding of consciousness and biological intelligence itself.
CONCLUSION
The future of AI in radiology and healthcare stands at the exhilarating crossroads of QC, biological inspiration, and ever-advancing conceptual intelligence. As researchers probe the mysteries of the quantum realm and the untapped computational capacity of living cells, possibilities for medical imaging, both diagnostic and interventional, are rapidly expanding. Upcoming breakthroughs could make “even Sherlock Holmes’s legendary deductive powers seem quaint”, not because they are diminished, but because the tools at our disposal will be far more powerful, holistic, and accessible. Imagine an AI system that continuously ingests cross-continental radiology studies, EMRs, multiomics, and real-time biosensor data, alongside global epidemic alerts, then notifies city imaging departments, hospitals, rural health centers, and other stakeholders about emerging complications, outbreaks, or subclinical trends before clinical suspicion arises - what was once fiction now stands on the precipice of feasibility.
The climb from the symbolic “logic” of early AI, through ML, to LCMs’ growing “intuition”, is akin to a symphony expanding from a solo performance to a collaborative orchestra: Accessible, rich, and multidimensional. Truly, the “song” of modern AI is composed by entire ensembles, each player contributing wisdom from their own tradition. The time for bold, responsible innovation is now. For radiologists, LCMs and quantum-powered AI promise breathtaking acceleration in clinical, research, and workflow advances previously inconceivable. No longer do AI’s strengths lie solely in linear logic or brute-force computation; LCMs herald a leap into imaginative, multi-modal reasoning - integrating imaging, text, sensor streams, and molecular profiles for a richer diagnostic canvas. As Dickstein[230] is credited with observing, “Prediction is very difficult, especially about the future”, yet the momentum in this space suggests we are barreling toward previously unimaginable advances. As Albert Einstein wisely stated, “Imagination is more important than knowledge”. This burgeoning field invites us to move beyond the known limitations of traditional AI and conceptualize applications previously confined to science fiction.
Democratization of radiology and healthcare: Radiology will not only benefit from AI democratization but also serve as a central force driving it. Future-facing systems could allow a radiologist in Mumbai, an imaging technician in rural Africa, a trauma specialist in New York, a primary care doctor in rural Brazil, and even an astronaut on a Mars mission to collaborate through an LCM-powered imaging assistant or “holographic doctor”[231] - a virtual colleague capable of synthesizing literature, images, patient histories, biosensor feeds, research insights, and expert wisdom, translating findings into any language or context. Imagine a patient’s smartwatch detecting subtle heart irregularities, cross-referencing their medical history, and triggering proactive, understandable alerts, empowering personalized, preventative care at scale and extending AI’s benefits to a broader range of users without requiring deep technical expertise. This democratization is crucial for fostering widespread innovation and ensuring LCM benefits are broadly accessible. LCMs encourage us to “think left and think right and think low and think high”[232]. While cautious optimism and responsible development are vital, the foundation of LCMs lies in human creativity to envision and implement complex, transformative concepts. “Creativity is intelligence having fun” (Albert Einstein). Concept-based, multi-modal AI is here; humans need not learn coding, as machines handle that. Let human intelligence have fun. This vision holds immense potential for empowering individuals and driving innovation across all sectors.
Accelerated validation and responsible deployment: Historically, the validation of groundbreaking scientific technologies has often taken decades or even centuries, as exemplified by foundational theories in physics. For instance, Newton’s laws of motion were formulated in the late 1600s, but their full implications took years to unfold; Maxwell’s equations in the 1860s unified electricity and magnetism long before they found widespread experimental confirmation. Most notably, Einstein’s field equations, proposed in 1915, laid the foundation for general relativity, but precise experimental validations, including gravitational waves, occurred many decades later[233,234]. These examples illustrate how theoretical physics is widely respected for foundational concepts that often precede empirical proof by significant time spans. Similarly, concept-based AI models like LCMs represent an analogous leap in theoretical computer science. With accelerated computational innovation, and with pioneering independent validation by leaders such as Meta AI, whose recent publications mirror our core propositions and terminology concerning LCMs[67], LCMs are poised to rapidly transition from theory to practice. This means clinically relevant, multi-institutionally validated tools could reach our domains within a much shorter cycle.
While radiologists have traditionally been swift adopters of new imaging hardware (ultrasound, CT, MRI), they have shown more caution in embracing advanced AI software, seeking clinically validated and transparent tools that support rather than replace their expertise[235,236]. The field continues to rigorously evaluate ML and DL models, often focusing narrowly on these paradigms. Yet, more advanced AI paradigms like LLMs and especially LCMs offer substantially greater potential, enabling deeper semantic understanding, holistic data integration, and improved clinical decision-making. It is therefore imperative to overcome this inertia and adopt these transformative AI frameworks to catalyze the next massive leap in clinical radiology and patient care. The journey from theory to clinical impact will require collaboration among experts in AI, linguistics, and cognitive science, together with clinicians and patients, as well as those from specific application domains like radiology and other allied healthcare fields[67,68]. This must be accompanied by the establishment of clear ethical guidelines for deployment in critical applications[221].
The shift from token-based, single-task models to concept-based, multi-modal systems is the next major frontier in medical AI. It promises to transform the role of the radiologist from an image interpreter to a holistic patient diagnostician, working in a powerful human-AI partnership. Key hurdles remain: Ensuring robustness against bias, developing ethical norms, and sustaining transparent, explainable decision-making to maintain professional and patient trust. Progress in FL, radiologist-in-the-loop annotation, and alignment with regulatory expectations for clinical-grade AI will help mitigate development costs and accelerate reliable deployment. Robustness against imaging bias, clinically actionable explainability, and collaboration amongst radiologists, data scientists, and end-users are essential for safe translation into clinical workflows.
Radiologist input remains crucial for curating datasets, evaluating model relevance, and assessing the utility of concept-based reasoning in nuanced imaging scenarios such as rare disease detection or subtle lesion follow-up. As Thomas Edison aptly said, “The value of an idea lies in the using of it”. The challenge now is to translate LCM theory into safe, real-world systems that deliver on the promise of improved patient care.
Overcoming challenges for a holistic future: A holistic approach, combining diverse data, expert knowledge, and explainable AI, promises more accurate diagnoses, personalized treatments, and effective disease management. LCMs offer richer differential diagnoses, deeper pattern recognition across imaging and non-imaging data, and patient-specific longitudinal analysis spanning years of prior images and contextual reports. While current AI faces difficulties in complex reasoning, context-aware decision-making, and cross-modality integration, LCMs synthesize CT, MR, ultrasound, and molecular data, drawing not just from pixels but from clinical and biological context, raising both expectations and standards for radiology AI performance. Given the critical need for trust and transparency, enhancing explainability remains paramount for clinician adoption and regulatory compliance[136]. Demonstrating AI functionality clearly to professionals unfamiliar with the technology is crucial. The annotation and labeling required for training, especially when demanding human expertise, can be costly[136]; however, advances in semi-supervised and active learning techniques may help mitigate these costs and accelerate the development of robust LCMs. Radiology stands to benefit from FL, human-in-the-loop expert feedback, and interpretability methods highlighting why findings were flagged. These advancements are essential for building trust, calibrating clinical utility, and streamlining deployment in reading rooms. Emerging paradigms such as QC and advanced interpretability will further bridge the gap toward precision outcome modeling, including radiotherapy[136].
Imagine LCMs, powered by quantum breakthroughs, instantly analyzing a patient’s complete imaging archive, genomics, and real-time biosensor data to forecast pathology evolution, recommend tailored scan protocols, link imaging findings to prospective outcomes, or even optimize individualized treatment for complex conditions. AI systems could proactively suggest lifestyle interventions, optimize surgical plans with millimeter precision, or design novel drug molecules tailored to an individual’s biology. By embracing a creative, collaborative spirit and centering radiologist leadership in development and validation, LCMs can accelerate the arrival of patient-centered, precision imaging, tackling global challenges in cancer screening, infectious disease monitoring, and chronic illness management, while broadening access to world-class imaging expertise.
As we walk this path, let Einstein’s insight guide us: “Logic will get you from A to B. Imagination will take you everywhere”. By fostering collaboration, addressing ethical considerations proactively, and investing in rigorous research, we can usher in a new era of intelligent systems that truly augment human capabilities. For radiology, this evolution means smarter, more insightful, and ultimately more human-centered patient care. The journey from early AI to LCMs and beyond is a testament to human ingenuity. This ongoing evolution, driven by both scientific curiosity and practical necessity, ensures AI will continue to play an increasingly vital role in shaping the future of medicine and human well-being. LCMs represent not merely a technological transition but a profound exercise in humanity’s drive to heal, empower, and imagine a healthier tomorrow.
With foresight, rigor, and open-mindedness, the “possible” in healthcare stands ready to be redefined. As Clarke[237] famously posited: “Any sufficiently advanced technology is indistinguishable from magic”. With large conceptual models, radiologists and allied professionals are not merely building tools, they are crafting the next generation of “magic” for radiology and healthcare, transforming the seemingly impossible into clinical reality and elevating the impact of radiology at the heart of precision healthcare.
ACKNOWLEDGEMENTS
The authors wish to acknowledge Dr. Mohammed Mulla Jafarali Mamjiwala for his dedicated interest in the topic, passion for academic pursuits, and his help with the bibliography section. The authors sincerely hope this encouragement and exposure inspires Dr. Mohammed Mulla Jafarali Mamjiwala toward future academic endeavors.
Footnotes
Provenance and peer review: Invited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Radiology, nuclear medicine and medical imaging
Country of origin: India
Peer-review report’s classification
Scientific Quality: Grade A, Grade A
Novelty: Grade A, Grade A
Creativity or Innovation: Grade A, Grade A
Scientific Significance: Grade A, Grade A
P-Reviewer: Omullo FP, MD, Senior Researcher, Kenya S-Editor: Bai Y L-Editor: A P-Editor: Lei YY
Adams LC, Truhn D, Busch F, Kader A, Niehues SM, Makowski MR, Bressem KK. Leveraging GPT-4 for Post Hoc Transformation of Free-text Radiology Reports into Structured Reporting: A Multilingual Feasibility Study.Radiology. 2023;307:e230725.
McDuff D, Schaekermann M, Tu T, Palepu A, Wang A, Garrison J, Singhal K, Sharma Y, Azizi S, Kulkarni K, Hou L, Cheng Y, Liu Y, Mahdavi SS, Prakash S, Pathak A, Semturs C, Patel S, Webster DR, Dominowska E, Gottweis J, Barral J, Chou K, Corrado GS, Matias Y, Sunshine J, Karthikesalingam A, Natarajan V. Towards accurate differential diagnosis with large language models.Nature. 2025;642:451-457.
Zhong T, Zhao W, Zhang Y, Pan Y, Dong P, Jiang Z, Jiang H, Zhou Y, Kui X, Shang Y, Zhao L, Yang L, Wei Y, Li Z, Zhang J, Yang L, Chen H, Zhao H, Liu Y, Zhu N, Li Y, Wang Y, Yao J, Wang J, Zeng Y, He L, Zheng C, Zhang Z, Li M, Liu Z, Dai H, Wu Z, Zhang L, Zhang S, Cai X, Hu X, Zhao S, Jiang X, Zhang X, Liu W, Li X, Zhu D, Guo L, Shen D, Han J, Liu T, Liu J, Zhang T. ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Impression Generation on Multi-institution and Multi-system Data.IEEE Trans Biomed Eng. 2025;PP.
McCarthy J, Minsky ML, Rochester N, Shannon CE. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence, August 31, 1955.AI Magazine. 2006;27:12.
Wood R.
Historical Context and Evolution of AI/ML: A Journey Through Time. In: Wood R. Business Applications of Artificial Intelligence and Machine Learning. Oklahoma: Oklahoma State Regents for Higher Education, 2024.
Cho K, van Merrienboer B, Bahdanau D, Bengio Y.
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches. Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation; 2014 Oct, Doha, Qatar. Pennsylvania: Association for Computational Linguistics, 2014: 103-111.
Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D. A Survey of Methods for Explaining Black Box Models.ACM Comput Surv. 2019;51:1-42.
Bahdanau D, Cho K, Bengio Y.
Neural Machine Translation by Jointly Learning to Align and Translate. 2016 Preprint. Available from: arXiv: 1409.0473.
Nam Y, Kim DY, Kyung S, Seo J, Song JM, Kwon J, Kim J, Jo W, Park H, Sung J, Park S, Kwon H, Kwon T, Kim K, Kim N. Multimodal Large Language Models in Medical Imaging: Current State and Future Directions.Korean J Radiol. 2025;26:900-923.
Devlin J, Chang MW, Lee K, Toutanova K.
BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers); 2019 Jun; Minneapolis, Minnesota. Pennsylvania: Association for Computational Linguistics, 2019: 4171-4186.
Thoppilan R, De Freitas D, Hall J, Shazeer N, Kulshreshtha A, Cheng H, Jin A, Bos T, Baker L, Du Y, Li YG, Lee H, Zheng HS, Ghafouri A, Menegali M, Huang Y, Krikun M, Lepikhin D, Qin J, Chen D, Xu Y, Chen Z, Roberts A, Bosma M, Zhao V, Zhou Y, Chang CC, Krivokon I, Rusch W, Pickett M, Srinivasan P, Man L, Meier-Hellstern K, Morris MR, Doshi T, Delos Santos R, Duke T, Soraker J, Zevenbergen B, Prabhakaran V, Diaz M, Hutchinson B, Olson K, Molina A, Hoffman-John E, Lee J, Aroyo L, Rajakumar R, Butryna A, Lamm M, Kuzmina V, Fenton J, Cohen A, Bernstein R, Kurzweil R, Aguera-Arcas B, Cui C, Croak M, Chi E, Le Q.
LaMDA: Language Models for Dialog Applications. 2022 Preprint. Available from: arXiv: 2201.08239.
Nazi ZA, Peng W.
Large Language Models in Healthcare and Medical Domain: A Review. 2024 Preprint. Available from: arXiv: 2401.06775.
Chen ZM, Hernández Cano AH, Romanou A, Bonnet A, Matoba K, Salvi F, Pagliardini M, Fan S, Köpf A, Mohtashami A, Sallinen A, Sakhaeirad A, Swamy V, Krawczuk I, Bayazit D, Marmet A, Montariol S, Hartley MA, Jaggi M, Bosselut A.
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models. 2023 Preprint. Available from: arXiv: 2311.16079.
Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Large language models encode clinical knowledge.Nature. 2023;620:172-180.
Labrak Y, Bazoge A, Morin E, Gourraud P, Rouvier M, Dufour R.
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains. Findings of the Association for Computational Linguistics ACL 2024; 2024 Aug; Bangkok, Thailand. PA, United States: Association for Computational Linguistics, 2024: 5848-5864.
Xiong GZ, Jin Q, Lu ZY, Zhang AD.
Benchmarking Retrieval-Augmented Generation for Medicine. Findings of the Association for Computational Linguistics ACL 2024; 2024 Aug 11-16; Bangkok, Thailand. PA, United States: Association for Computational Linguistics, 2024: 6233-6251.
Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, Küttler H, Lewis M, Yih W, Rocktäschel T, Riedel S, Kiela D.
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. 2021 Preprint. Available from: arXiv: 2005.11401.
Pan LR, Zhao ZY, Lu Y, Tang KW, Fu LY, Liang QC, Peng SL. Opportunities and challenges in the application of large artificial intelligence models in radiology.Meta Radiol. 2024;2:100080.
Jiang LY, Liu XC, Nejatian NP, Nasir-Moin M, Wang D, Abidin A, Eaton K, Riina HA, Laufer I, Punjabi P, Miceli M, Kim NC, Orillac C, Schnurman Z, Livia C, Weiss H, Kurland D, Neifert S, Dastagirzada Y, Kondziolka D, Cheung ATM, Yang G, Cao M, Flores M, Costa AB, Aphinyanaphongs Y, Cho K, Oermann EK. Health system-scale language models are all-purpose prediction engines.Nature. 2023;619:357-362.
Mei LR, Yao JY, Ge YY, Wang YW, Bi BL, Cai YJ, Liu JZ, Li MY, Li ZZ, Zhang DZ, Zhou CL, Mao JY, Xia TZ, Guo JF, Liu SH.
A Survey of Context Engineering for Large Language Models. 2025 Preprint. Available from: arXiv: 2507.13334.
Xu S, Yang L, Kelly C, Sieniek M, Kohlberger T, Ma M, Weng WH, Kiraly A, Kazemzadeh S, Melamed Z, Park J, Strachan P, Liu Y, Lau C, Singh P, Chen C, Etemadi M, Kalidindi SR, Matias Y, Chou K, Corrado GS, Shetty S, Tse D, Prabhakara S, Golden D, Pilgrim R, Eswaran K, Sellergren A.
ELIXR: Towards a general purpose x-ray artificial intelligence system through alignment of large language models and radiology vision encoders. 2023 Preprint. Available from: arXiv: 2308.01317.
Gu J, Cho HC, Kim J, You K, Hong EK, Roh B.
CheX-GPT: Harnessing large language models for enhanced chest x-ray report labeling. 2024 Preprint. Available from: arXiv: 2401.11505.
Dhanaliwala AH, Ghosh R, Karn SK, Ullaskrishnan P, Farri O, Comaniciu D, Kahn CE.
General-purpose vs. domain-adapted large language models for extraction of data from thoracic radiology reports. 2024 Preprint. Available from: arXiv: 2311.17213.
Barrault L, Duquenne PA, Elbayad M, Kozhevnikov A, Alastruey B, Andrews P, Coria M, Couairon G, Costa-jussà MR, Dale D, Elsahar H, Heffernan K, Janeiro JM, Tran T, Ropers C, Sánchez E, Roman RS, Mourachko A, Saleem S, Schwenk H.
Large Concept Models: Language Modeling in a Sentence Representation Space. 2024 Preprint. Available from: arXiv: 2412.08821.
Ahmed H, Goel D.
The Future of AI: Exploring the Potential of Large Concept Models. 2025 Preprint. Available from: arXiv: 2501.05487.
Esmradi A, Yip DW, Chan CF.
A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models. Communications in Computer and Information Science. Singapore: Springer, 2024: 76-95.
Chen M, Tworek J, Jun H, Yuan Q, de Oliveira Pinto HP, Kaplan J, Edwards H, Burda Y, Joseph N, Brockman G, Ray A, Puri R, Krueger G, Petrov M, Khlaaf H, Sastry G, Mishkin P, Chan B, Gray S, Ryder N, Pavlov M, Power A, Kaiser L, Bavarian M, Winter C, Tillet P, Petroski Such FP, Cummings D, Plappert M, Chantzis F, Barnes E, Herbert-Voss A, Guss WH, Nichol A, Paino A, Tezak N, Tang J, Babuschkin I, Balaji S, Jain S, Saunders W, Hesse C, Carr AN, Leike J, Achiam J, Misra V, Morikawa E, Radford A, Knight M, Brundage M, Murati M, Mayer K, Welinder P, McGrew B, Amodei D, McCandlish S, Sutskever I, Zaremba W.
Evaluating Large Language Models Trained on Code. 2021 Preprint. Available from: arXiv: 2107.03374.
Guo Y, Guo M, Su J, Yang Z, Zhu M, Li H, Qiu M, Liu SS.
Bias in Large Language Models: Origin, Evaluation, and Mitigation. 2024 Preprint. Available from: arXiv: 2411.10915.
Huang L, Yu W, Ma W, Zhong W, Feng Z, Wang H, Chen Q, Peng W, Feng X, Qin B, Liu T. A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions.ACM Trans Inf Syst. 2025;43:1-55.
Mozes M, He XL, Kleinberg B, Griffin LD.
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities. 2023 Preprint. Available from: arXiv: 2308.12833.
Atil B, Chittams A, Fu L, Ture F, Xu L, Baldwin B.
LLM Stability: A detailed analysis with some surprises. 2025 Preprint. Available from: arXiv: 2408.04667.
Hendrycks D, Burns C, Basart S, Zou A, Mazeika M, Song D, Steinhardt J.
Measuring Massive Multitask Language Understanding. 2021 Preprint. Available from: arXiv: 2009.03300.
Bender EM, Gebru T, Mcmillan-Major A, Shmitchell S.
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. 2021 Mar 3-10; Canada. New York: Association for Computing Machinery, 2021: 610-623.
Faiz A, Kaneda S, Wang R, Osi R, Sharma P, Chen F, Jiang L.
LLM Carbon: Modeling the End-To-End Carbon Footprint of Large Language Models. 2024 Preprint. Available from: arXiv: 2309.14393.
Wu JD, Ji W, Fu HZ, Xu M, Jin YM, Xu YW. MedSegDiff-V2: Diffusion-Based Medical Image Segmentation with Transformer.Proc AAAI Conf Artif Intell. 2024;38:6030-6038.
Tong R, Xu T, Ju XX, Wang LR. Progress in Medical AI: Reviewing Large Language Models and Multimodal Systems for Diagnosis.AI Med. 2025;1:5.
Open AI, Achiam J, Adler S, Agarwal S, Ahmad L, Akkaya I, Aleman FL, Almeida D, Altenschmidt J, Altman S, Anadkat S, Avila R, Babuschkin I, Balaji S, Balcom V, Baltescu P, Bao H, Bavarian M, Belgum J, Bello I, Berdine J, Bernadett-Shapiro G, Berner C, Bogdonoff L, Boiko O, Boyd M, Brakman AL, Brockman G, Brooks T, Brundage M, Button K, Cai T, Campbell R, Cann A, Carey B, Carlson C, Carmichael R, Chan B, Chang C, Chantzis F, Chen D, Chen S, Chen R, Chen J, Chen M, Chess B, Cho C, Chu C, Chung HW, Cummings D, Currier J, Dai Y, Decareaux C, Degry T, Deutsch N, Deville D, Dhar A, Dohan D, Dowling S, Dunning S, Ecoffet A, Eleti A, Eloundou T, Farhi D, Fedus L, Felix N, Fishman SP, Forte J, Fulford I, Gao L, Georges E, Gibson C, Goel V, Gogineni T, Goh G, Gontijo-Lopes R, Gordon J, Grafstein M, Gray S, Greene R, Gross J, Gu SS, Guo Y, Hallacy C, Han J, Harris J, He Y, Heaton M, Heidecke J, Hesse C, Hickey A, Hickey W, Hoeschele P, Houghton B, Hsu K, Hu S, Hu X, Huizinga J, Jain S, Jain S, Jang J, Jiang A, Jiang R, Jin H, Jin D, Jomoto S, Jonn B, Jun H, Kaftan T, Kaiser Ł, Kamali A, Kanitscheider I, Keskar NS, Khan T, Kilpatrick L, Kim JW, Kim C, Kim Y, Kirchner JH, Kiros J, Knight M, Kokotajlo D, Kondraciuk Ł, Kondrich A, Konstantinidis A, Kosic K, Krueger G, Kuo V, Lampe M, Lan I, Lee T, Leike J, Leung J, Levy D, Li CM, Lim R, Lin M, Lin S, Litwin M, Lopez T, Lowe R, Lue P, Makanju A, Malfacini K, Manning S, Markov T, Markovski Y, Martin B, Mayer K, Mayne A, McGrew B, McKinney SM, McLeavey C, McMillan P, McNeil J, Medina D, Mehta A, Menick J, Metz L, Mishchenko A, Mishkin P, Monaco V, Morikawa E, Mossing D, Mu T, Murati M, Murk O, Mély D, Nair A, Nakano R, Nayak R, Neelakantan A, Ngo R, Noh H, Ouyang L, O'Keefe C, Pachocki J, Paino A, Palermo J, Pantuliano A, Parascandolo G, Parish J, Parparita E, Passos A, Pavlov M, Peng A, Perelman A, de Avila Belbute P, Petrov M, de Oliveira Pinto HP, Michael, Pokorny, Pokrass M, Pong VH, Powell T, Power A, Power B, Proehl E, Puri R, Radford A, Rae J, Ramesh A, Raymond C, Real F, Rimbach K, Ross C, Rotsted B, Roussez H, Ryder N, Saltarelli M, Sanders T, Santurkar S, Sastry G, Schmidt H, Schnurr D, Schulman J, Selsam D, Sheppard K, Sherbakov T, Shieh J, Shoker S, Shyam P, Sidor S, Sigler E, Simens M, Sitkin J, Slama K, Sohl I, Sokolowsky B, Song Y, Staudacher N, Such FP, Summers N, Sutskever I, Tang J, Tezak N, Thompson MB, Tillet P, Tootoonchian A, Tseng E, Tuggle P, Turley N, Tworek J, Uribe J FC, Vallone A, Vijayvergiya A, Voss C, Wainwright C, Wang JJ, Wang A, Wang B, Ward J, Wei J, Weinmann CJ, Welihinda A, Welinder P, Weng J, Weng L, Wiethoff M, Willner D, Winter C, Wolrich S, Wong H, Workman L, Wu S, Wu J, Wu M, Xiao K, Xu T, Yoo S, Yu K, Yuan Q, Zaremba W, Zellers R, Zhang C, Zhang M, Zhao S, Zheng T, Zhuang J, Zhuk W, Zoph B.
GPT-4 Technical Report. 2024 Preprint. Available from: arXiv: 2303.08774.
Zhao WX, Zhou K, Li J, Tang T, Wang X, Hou Y, Min Y, Zhang B, Zhang J, Dong Z, Du Y, Yang C, Chen Y, Chen Z, Jiang J, Ren R, Li Y, Tang X, Liu Z, Liu P, Nie JY, Wen JR.
A Survey of Large Language Models. 2025 Preprint. Available from: arXiv: 2303.18223.
AIbase.
Meta Launches 'Large Concept Models' (LCMs)! Breaking Through LLM Limitations and Leading a New Direction in AI Language Understanding. Dec 16, 2024. [cited 3 September 2025]. Available from: https://www.aibase.com/news/13985.
Li ZW, Xu Q, Zhang D, Song H, Cai YQ, Qi Q, Zhou R, Pan JT, Li ZF, Vu VT, Huang ZD, Wang T.
Grounding GPT: Language enhanced multi-modal grounding model. 2024 Preprint. Available from: arXiv: 2401.06071.
Zhang Z, Wang Y, Yao Q.
Searching meta reasoning skeleton to guide LLM reasoning. 2025 Preprint. Available from: arXiv: 2510.04116.
Wang HR, Shu K.
Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models. Findings of the Association for Computational Linguistics: EMNLP; 2023 Dec 2023; Singapore. Pennsylvania: Association for Computational Linguistics, 2023: 6288-6304.
Bolón-canedo V, Morán-fernández L, Cancela B, Alonso-betanzos A. A review of green artificial intelligence: Towards a more sustainable future.Neurocomputing. 2024;599:128096.
Abimbola A. A Review of the Historical Development and Future Significance of Artificial Intelligence in Radiology.medtigo J Med. 2024;2:e3062253.
Stanwick PA, Stanwick SD. The Rise and Fall of Eastman Kodak: Looking Through Kodachrome Colored Glasses.Am J Hum Soc Sci Res. 2020;4:219-224.
Christensen CM.
The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail. Harvard Business School. Boston: Harvard Business, 1997.
Higgins JM.
Innovate or Evaporate: Test and Improve Your Organization's Innovation Quotient. New York: New Management Publishing Company, 1995.
Ueda D, Katayama Y, Yamamoto A, Ichinose T, Arima H, Watanabe Y, Walston SL, Tatekawa H, Takita H, Honjo T, Shimazaki A, Kabata D, Ichida T, Goto T, Miki Y. Deep Learning-based Angiogram Generation Model for Cerebral Angiography without Misregistration Artifacts.Radiology. 2021;299:675-681.
Wisniewski AG, Shiraz Bhurwani MM, Sommer KN, Monteiro A, Baig A, Davies J, Siddiqui A, Ionita CN. Quantitative angiography prognosis of intracranial aneurysm treatment failure using parametric imaging and distal vessel analysis. Proc SPIE Int Soc Opt Eng. 2022;12036:120360D.
Fujimura S, Koshiba T, Kudo G, Takeshita K, Kazama M, Karagiozov K, Fukudome K, Takao H, Ohwada H, Murayama Y, Yamamoto M, Ishibashi T, Otani K. Development of Machine Learning Model for Selecting the 1st Coil in the Treatment of Cerebral Aneurysms by Coil Embolization. Annu Int Conf IEEE Eng Med Biol Soc. 2023;2023:1-4.
Bagcilar O, Alis D, Alis C, Seker ME, Yergin M, Ustundag A, Hikmet E, Tezcan A, Polat G, Akkus AT, Alper F, Velioglu M, Yildiz O, Selcuk HH, Oksuz I, Kizilkilic O, Karaarslan E. Automated LVO detection and collateral scoring on CTA using a 3D self-configuring object detection network: a multi-center study. Sci Rep. 2023;13:8834.
Park D, So K, Prabhakar SK, Kim C, Lee JJ, Sohn JH, Kim JH, Lee SH, Won DO. Early warning score and feasible complementary approach using artificial intelligence-based bio-signal monitoring system: a review. Biomed Eng Lett. 2025;15:717-734.
Karera A, Davidson F, Engel-Hills P. Operational challenges and collaborative solutions in radiology image interpretation: perspectives from imaging departments in a low-resource setting. J Med Radiat Sci. 2024;71:564-572.
Kim SH, Min JH, Lee JY. How to Differentiate Inactive from Active Disease in Patients of Primary Multidrug-resistant Tuberculosis with Persistent Cavity after Anti-tuberculous Therapy. Hong Kong J Radiol. 2014;17:240-246.
Memon S, Bibi S, He G. Integration of AI and ML in Tuberculosis (TB) Management: From Diagnosis to Drug Discovery. Diseases. 2025;13:184.
de Camargo TFO, Ribeiro GAS, da Silva MCB, da Silva LO, Torres PPTES, Rodrigues DDSDS, de Santos MON, Filho WS, Rosa MEE, Novaes MA, Massarutto TA, Junior OL, Yanata E, Reis MRDC, Szarf G, Netto PVS, de Paiva JPQ. Clinical validation of an artificial intelligence algorithm for classifying tuberculosis and pulmonary findings in chest radiographs. Front Artif Intell. 2025;8:1512910.
Fabila J, Garrucho L, Campello VM, Martín-Isla C, Lekadir K.
Federated learning in low-resource settings: A chest imaging study in Africa -- Challenges and lessons learned. 2025 Preprint. Available from: arXiv: 2505.14217.
European Union.
General Data Protection Regulation – GDPR. 2018. [cited 3 September 2025]. Available from: https://gdpr-info.eu.
Chauhan SB, Gaur R, Akram A, Singh I. Artificial Intelligence-Driven Insights for Regulatory Intelligence in Medical Devices: Evaluating EMA, FDA and CDSCO Frameworks. GlobalCE. 2025;7:11-24.
Rafalski K.
AGI vs ASI: Understanding the fundamental differences between artificial general intelligence and artificial superintelligence. Sep 9, 2025. [cited 10 September 2025]. Available from: https://www.netguru.com/blog/agi-vs-asi.
Kokotajlo D, Alexander S, Larsen T, Lifland E, Dean R.
AI 2027: Predictive scenario and analysis of superhuman artificial intelligence development and its global impact over the next decade. [cited 3 September 2025]. Available from: https://ai-2027.com/.