Copyright
©The Author(s) 2025.
World J Radiol. Nov 28, 2025; 17(11): 114754
Published online Nov 28, 2025. doi: 10.4329/wjr.v17.i11.114754
Figure 1 The conceptual shift from large language models to large concept models.
This diagram illustrates the fundamental difference in processing. Large language models operate via a sequential, token-based analysis, leading to potential context fragmentation and struggles with long-range dependencies. Large concept models operate on a relational, conceptual level, building a deeper relational understanding and maintaining coherence across longer contexts. LLM: Large language model; LCM: Large concept model.
Figure 2 Architectural contrast of data processing in large language models vs large concept models.
This diagram contrasts the linear, text-focused processing of large language models (LLMs) with the integrative, multimodal architecture of large concept models (LCMs). A: LLM - language-centric flow: Text input flows into a centralized language processing core, producing text-based output. This represents the LLMs’ sequential, language-driven pipeline; B: LCM - multimodal concept integration: Multiple input types - text, image, audio, structured data - converge into a more complex processing core capable of extracting and reasoning over abstract, cross-modal concepts. The output is richer and reflects a broader understanding that spans diverse modalities. LLM: Large language model; LCM: Large concept model.
Figure 3 Causes and types of large language model hallucinations and other forms of model incorrectness (such as illusion, delirium, extrapolation, delusion, confabulation).
LLM: Large language model.
Figure 4 Large concept models.
A: The detailed architecture of large concept models, illustrating language-agnostic, multimodal data flow with universal concept encoding, hierarchical reasoning, and multilingual output capability. This figure details the end-to-end processing pipeline of a large concept model, contrasting it with traditional language-centric artificial intelligence. The architecture is composed of two main sections: Top (multimodal data flow): Illustrates the flow from a Language-Agnostic Multimodal Input (accepting text, images, audio, video, and sensor data) to a Concept Encoder (like SONAR), which performs universal concept extraction. The encoded concept is then processed in a Concept Embedding Space via multimodal reasoning and diffusion processes. Finally, a Concept Decoder generates a Multimodal Output in any language or modality, demonstrating a true “any-input-to-any-output” capability. Bottom (core capabilities): Summarizes the three foundational pillars that this architecture enables: (1) Language-agnostic universal language understanding across hundreds of languages; (2) Multimodal integration for unified concept understanding and reasoning across diverse data types; and (3) Universal concepts, enabling high-level abstract, logical, and causal reasoning that is independent of culture or domain; B: Legend for Figure 4A, showing the color coding, visual elements, input modality icons, key processing components, and capability categories. LCM: Large concept model.
Figure 5 Strategies to address all forms of incorrectness in large concept models.
These strategies work together to address the full spectrum of incorrectness issues - from outright hallucinations to subtle misinterpretations, inappropriate confidence, and contextual misunderstandings. The goal isn’t just factual accuracy but a more reliable, trustworthy and useful artificial intelligence system overall. RAG: Retrieval-augmented generation; RLHF: Reinforcement learning from human feedback.
Figure 6 Tuberculosis primary complex - the classic Ghon complex.
A: As seen in a 5-year-old child: (1) Multiple Ghon foci in the right upper lobe; (2) Draining lymphatics heading towards; (3) Right hilar lymphadenopathy (white open arrowheads); and (4) Minimal (lamellar) effusion in the right horizontal fissure with minimal blunting of the right costo-phrenic angle. Additionally, a retrocardiac air bronchogram indicating tuberculosis (TB) bronchopneumonia (circled area) and hepatomegaly are noted. (Image courtesy Dr. Prakash V Vaidya, Child Health Clinic, Mumbai and Sr. Consultant Pediatrician, Fortis Hospital, Mumbai 400080); B: TB primary complex in a 5-year-old child: The classic Ghon complex: (1) Multiple Ghon foci in the right upper lobe; (2) Draining lymphatics [(1) and (2) are the areas marked with a red outline]; (3) Heading towards right hilar lymphadenopathy (large white open arrowheads); and (4) Minimal (lamellar) effusion in the right horizontal fissure (white arrows) with minimal blunting of the right costo-phrenic angle. Additionally, a retrocardiac air bronchogram indicating TB bronchopneumonia (circled area) and hepatomegaly are noted. (Image courtesy Dr. Prakash V Vaidya, Child Health Clinic, Mumbai and Sr. Consultant Pediatrician, Fortis Hospital, Mumbai 400080).
Figure 7 Computed tomography image.
A: Computed tomography appearance of the primary Ghon complex in childhood tuberculosis: Laterally positioned Ghon foci with sub-pleural involvement, with their draining lymphatics heading all the way up to right hilar lymphadenopathy. Additionally, note mediastinal lymphadenopathy (many lymph nodes show necrotic areas); B: Computed tomography image: Large thick-walled tuberculosis cavity communicating with the right main bronchus in a patient with multidrug-resistant tuberculosis. (Image courtesy Dr. Anagha Joshi, Prof and HOD, Radiology, LTMMC and LTMGH, Mumbai 400022).
Figure 8 Axial computed tomography images in soft tissue window and lung window in a 59-year-old female, with sputum culture positive for Mycobacterium tuberculosis.
A-C: A thick-walled cavity is demonstrated in the right upper lobe; D-F: After 4 months of anti-tuberculosis treatment, reduced wall thickness of the right upper lobe cavity is noted (from 12.5 mm to 8.14 mm) and the cavity appears much smoother. (Image courtesy Dr. Anagha Joshi, Prof and HOD, Radiology, LTMMC and LTMGH, Mumbai 400022).
Figure 9 Frontal chest radiograph.
A: Frontal chest radiograph of a 14-year-old male presenting with cough and fever. The radiograph demonstrates a thin-walled tuberculous cavity in the right upper lobe (white open arrowheads); B: Frontal chest radiograph of a 9-year-old male with vomiting and abdominal pain, subsequently diagnosed with multidrug-resistant (MDR) tuberculosis (TB). A miliary pattern is seen throughout both lung fields, along with multiple larger lesions indicating multiple developing cavities (circled area), minimal effusion in the right horizontal fissure (white arrows), and hepatomegaly. Multiple air-fluid levels are seen in the visualized abdomen, signifying intestinal obstruction due to TB adhesions (short black arrows). (Figure 9A and B courtesy Dr. Jairaj Nair, Prof and HOD, Chest Medicine, LTMMC and LTMGH, Mumbai 400022); C: Frontal chest radiograph of a 12-year-old boy with secondary hemophagocytic lymphohistiocytosis as a result of disseminated TB. Numerous miliary nodules are noted extensively in both lungs, with additional patchy airspace opacities in the medial aspect of the right lower lobe (note the air bronchogram). Calcified right hilar lymph nodes are also noted. He recovered well after treatment, including anti-TB treatment. (Image courtesy Dr. Prakash V Vaidya, Child Health Clinic, Mumbai and Sr. Consultant Pediatrician, Fortis Hospital, Mumbai 400080).
- Citation: Merchant SA, Merchant N, Varghese SL, Shaikh MJS. Large language models and large concept models in radiology: Present challenges, future directions, and critical perspectives. World J Radiol 2025; 17(11): 114754
- URL: https://www.wjgnet.com/1949-8470/full/v17/i11/114754.htm
- DOI: https://dx.doi.org/10.4329/wjr.v17.i11.114754
