Minireview | Open Access
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Psychiatry. Nov 19, 2025; 15(11): 108199
Published online Nov 19, 2025. doi: 10.5498/wjp.v15.i11.108199
Large language models in clinical psychiatry: Applications and optimization strategies
Yi-Fan Wang, Lin Lu, National Institute on Drug Dependence and Beijing Key Laboratory of Drug Dependence, Peking University, Beijing 100191, China
Ming-Da Li, Lin Lu, Wei Yan, Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing 100191, China
Su-Hong Wang, College of Future Technology, Peking University, Beijing 100871, China
Yin Fang, School of Public Health, North China University of Science and Technology, Tangshan 063210, Hebei Province, China
Jie Sun, Pain Medicine Center, Peking University Third Hospital, Beijing 100191, China
Lin Lu, Peking-Tsinghua Center for Life Sciences and PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China
ORCID number: Yi-Fan Wang (0009-0009-0578-2017); Lin Lu (0000-0003-0742-9072); Wei Yan (0000-0002-5866-6230).
Co-first authors: Yi-Fan Wang and Ming-Da Li.
Co-corresponding authors: Lin Lu and Wei Yan.
Author contributions: Wang YF and Li MD conducted the literature review, interpreted the data, and drafted the original manuscript; they contributed equally to this article and are the co-first authors of this manuscript; Wang SH, Fang Y, and Sun J revised the manuscript; Lu L and Yan W conceptualized and designed the study, supervised the work, and made critical revisions; they contributed equally to this article and are the co-corresponding authors of this manuscript; and all authors prepared the draft and approved the submitted version.
Supported by the STI2030-Major Projects, No. 2021ZD0203400 and No. 2021ZD0200800; and the National Natural Science Foundation of China, No. 82171477.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Wei Yan, Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Huayuan Bei Road, Beijing 100191, China. weiyan@bjmu.edu.cn
Received: April 14, 2025
Revised: May 27, 2025
Accepted: September 3, 2025
Published online: November 19, 2025
Processing time: 206 Days and 1.1 Hours

Abstract

Psychiatric disorders constitute a complex health issue, primarily manifesting as significant disturbances in cognition, emotional regulation, and behavior. However, owing to limited resources within health care systems, only a minority of patients can access effective treatment and care services, highlighting an urgent need for improvement. Large language models (LLMs), with their natural language understanding and generation capabilities, are gradually penetrating the entire process of psychiatric diagnosis and treatment, including outpatient reception, diagnosis and therapy, clinical nursing, medication safety, and prognosis follow-up. They hold promise for alleviating the severe shortage of health system resources and promoting equal access to mental health care. This article reviews the application scenarios and research progress of LLMs and explores optimization methods for LLMs in psychiatry. Based on these findings, we propose a clinical LLM for mental health built on the Mixture of Experts framework to improve the accuracy of psychiatric diagnosis and therapeutic interventions.

Key Words: Large language models; Clinical psychiatry; Mixture of experts; Mental health; Research progress

Core Tip: This article comprehensively reviews the application scenarios and research advancements of large language models (LLMs) in psychiatry, ranging from outpatient reception, diagnosis and therapy, clinical nursing, and medication safety to prognosis tracking. It explores optimization methods for LLMs in psychiatry that combine techniques such as pre-training, supervised fine-tuning, retrieval-augmented generation, agent systems, and prompt engineering. Based on the research findings, we propose a clinical LLM for mental health using the Mixture of Experts framework. This approach addresses the shortcomings of a single-LLM system and aims to improve the accuracy of psychiatric diagnosis and therapeutic interventions.



INTRODUCTION

Psychiatric disorders are complex and multidimensional health conditions characterized by persistent disturbances in thinking, emotion, behavior, or cognitive function, often accompanied by significant personal distress or social dysfunction[1,2]. The global prevalence of psychiatric disorders is approximately 12.5%, affecting nearly 970 million individuals worldwide[3]. From 2010 to 2021, the age-standardized disability-adjusted life years attributable to mental illness increased substantially, indicating a significant rise in the contribution of psychiatric disorders to the global health burden[4]. Despite the existence of effective preventive and therapeutic strategies, such as the World Health Organization’s Mental Health Gap Action Programme, around 71% of individuals suffering from psychiatric disorders still do not receive effective treatment[2,5-7].

The application of large language models (LLMs) offers a promising avenue to address this issue. LLMs are primarily neural network models based on the Transformer architecture, with billions or even tens of billions of parameters. After being trained on massive datasets, these models are capable of interpreting complex biomedical concepts and generating context-appropriate responses, effectively addressing various challenges in the biomedical field, such as medical image analysis and case report writing[8-12]. Recent studies have demonstrated that LLMs have considerable potential in clinical tasks, having passed multiple medical licensing exams and professional tests[13-16]. Optimized through techniques such as pre-training (PT), supervised fine-tuning (SFT), retrieval-augmented generation (RAG), agents, and prompt engineering (PE), LLMs have shown exceptional performance in diverse clinical tasks[17-27].

In recent years, LLMs have become a research hotspot in psychiatry, particularly for disease prediction, auxiliary diagnosis, and clinical decision support[19,28-36]. Studies show that LLM outputs align with human psychological inference behaviors, revealing their ability to track and comprehend mental states, a discovery that expands their psychiatric applications[19,32-36]. Psychiatric clinical workflows encompass triage, diagnosis, treatment, nursing, and follow-up, requiring comprehensive assessments of patients’ language, emotions, behavior, and social functioning for integrated physical-mental care[37,38].

The 2022 release of Generative Pre-trained Transformer (GPT)-3.5 advanced LLM development through instruction tuning and reinforcement learning from human feedback, significantly improving instruction comprehension and output reliability[39]. This article explores the value of LLMs in the clinical treatment of mental illnesses by reviewing relevant literature on the use of LLMs to assist in the clinical diagnosis and treatment of mental illnesses published since the release of GPT-3.5. The search was conducted using key terms such as “mental disorders”, “mental health”, and “large language models”. The article examines the application scenarios and research progress of LLMs in various aspects of medical care, including reception, diagnosis, treatment, and nursing, aiming to provide valuable insights for improving the clinical diagnosis and treatment of mental illnesses.

CLINICAL APPLICATION SCENARIOS OF LLMS
Outpatient reception

The outpatient reception process is characterized by its dual labor-intensive and knowledge-intensive nature[17]. It exerts a direct and profound influence on patient satisfaction and medical adherence, ultimately impacting patients’ health outcomes. This makes it one of the critical components of the overall healthcare delivery system[40,41]. In developed countries, the exorbitant costs associated with outpatient reception constitute a significant economic burden for patients[42], while in developing countries, inefficient management practices contribute significantly to the increased workload and psychological stress experienced by medical professionals[43].

According to research conducted by Taylor et al[44], LLMs can precisely triage patients to appropriate specialized mental health teams through analysis of their electronic health record data, providing them with more professional mental health services. Compared with traditional classification systems, LLMs can reduce the triage time per case from 10 minutes to 8 minutes, saving 329 person-weeks of human resources per month. This, in turn, enhances the empathic support provided by triage nurses and improves patient compliance[17]. The incidence of depressive symptoms among outpatient clinic patients is as high as 27%[45], highlighting the importance of empathic support during outpatient visits. Wan et al[17] established a Nurse-Site-Specific Prompt Engineering Chatbot collaborative triage model. Results from a randomized controlled trial showed that under this model, patient satisfaction was higher, the rate of repeated questions was significantly reduced, negative emotions decreased, and nurses’ empathy and the quality of answers to patient questions significantly improved. A study in the United States indicated that psychiatric outpatient clinics face challenges such as low accessibility, long waiting times, and significant geographical disparities[46], underscoring the need for LLMs to improve the situation. In summary, the rational utilization of LLMs in the triage process can effectively enhance triage accuracy, optimize doctor-patient communication paradigms, benefit both patients and nurses, and provide new paths and methods for improving medical service quality.

Diagnosis, treatment, and care

Diagnosis, treatment, and care are three closely interrelated clinical processes whose synergistic operation plays a decisive role in achieving ideal therapeutic outcomes[47-51]. The core task of diagnosis lies in the precise identification of the patient’s disease or health issues, clarifying the underlying causes, and laying a solid foundation for subsequent treatment strategies and care plans[48,49]. The fundamental goal of treatment is to effectively control disease progression, alleviate clinical symptoms, promote bodily recovery, or prevent the development of complications[47,50]. Care, which permeates the entire medical process, bears significant responsibilities such as monitoring patients’ health status, strictly implementing medical advice, providing psychological support, and conducting health education. High-quality nursing services contribute to enhancing treatment effectiveness, accelerating recovery processes, and significantly reducing the risk of complications[51]. Together, diagnosis, treatment, and care constitute the key elements of clinical decision-making.

Research indicates that GPT-4 has successfully passed psychiatry professional examinations in the United States and Europe[13], achieving overall performance superior to that of humans (85.8% vs 73.8%, P < 0.01). Specifically, GPT-4 demonstrates higher accuracy in behavioral, cognitive, and psychological domains (89.8% vs 76%, P < 0.01)[13]. This suggests that GPT-4 has the capability to manage complex medical cases and holds potential for real-world clinical application. Currently, LLMs are primarily applied in clinical diagnosis. For instance, Gargari et al[52] conducted a case study based on the Diagnostic and Statistical Manual of Mental Disorders-5, revealing GPT-4’s significant superiority in diagnosing bipolar disorder. A case study by Kim et al[29] found that LLMs outperform mental health and healthcare professionals in identifying obsessive-compulsive disorder. Ohse et al[53] demonstrated GPT-4’s ability to recognize the degree of social anxiety symptoms from semi-structured clinical interview texts. Levkovich[54] found that Gemini achieved a 100% accuracy rate in identifying simulated cases based on schizophrenia diagnostic codes, whereas GPT-4 achieved only 55%, indicating significant differences among LLMs in their capability to diagnose schizophrenia. However, it is important to interpret Gemini’s diagnostic results critically, as standardized simulated cases may not reflect real-world clinical scenarios. A notable characteristic of mental disorders is their reliance on nuanced and subjective information, such as linguistic subtleties, emotional expressions, social cognition, and behavioral cues, often embedded in unstructured patient narratives. Conceptual disorganization, poverty of content, language style, and disorders of logical thinking have been identified as predictors of psychotic episodes[55,56]. We anticipate that future LLMs will automatically extract complex linguistic features from natural language, facilitating the development of biomarkers to aid in the diagnosis of mental illnesses. Language is not only a biomarker but also has the potential to become a biosocial marker for psychosis[57]. LLMs may treat language as a product of intertwined biological and social processes, incorporating sociolinguistic analyses, such as power relations and identity construction, to achieve more accurate and equitable diagnoses of mental disorders. Furthermore, LLMs have the capability to integrate linguistic features with multidimensional biomarkers, including abnormalities in sleep cycles, variations in drug responses, neuroimaging findings, and levels of peripheral inflammatory factors, enabling the construction of cross-modal joint diagnostic models for mental illnesses. This combination strategy not only captures pathological language patterns from subjective narratives (e.g., deviations in semantic coherence) but also enhances diagnostic effectiveness using objective quantitative indicators of neuroimmune mechanisms.

In clinical treatment, LLMs prompted by evidence-based guidelines represent a promising and scalable strategy for clinical decision support. Perlis et al[28] conducted a controlled study across 50 clinical vignettes of bipolar disorder, finding that guideline-augmented LLMs show high consistency with experts and are significantly better than the baseline model and community clinicians experienced in treating bipolar disorder. In the nursing process, LLMs primarily assist in answering patient inquiries and providing advisory care. Lin et al[58] employed LLMs to analyze conversation records from psychotherapy sessions for anxiety, depression, schizophrenia, and other mental illnesses. This approach enables the identification of distinct thematic characteristics across different psychiatric conditions, allowing therapists to promptly assess the quality of the therapeutic relationship and offering clear, actionable insights to enhance the effectiveness of psychological treatments. Roy et al[59] developed ProKnow LLMs for the mental health nursing domain, utilizing professionally annotated datasets and algorithms to enhance the safety and interpretability of Natural Language Generation, achieving an 82% improvement in performance compared to simple pre-trained LLMs. Additionally, LLMs can aid in repetitive tasks during clinical diagnosis and treatment, such as classifying unstructured data from Electronic Health Records, generating case notes and discharge summaries, thereby reducing the workload of healthcare professionals[60,61]. Thus, the rational utilization of LLMs can not only enhance the efficiency of medical staff but also assist them in collaborative decision-making, ultimately improving the quality of medical services.

Medication safety

Medication errors, which are preventable mistakes that can occur during the prescribing, dispensing, and administration of medications, represent a significant challenge in healthcare. In the United States, community pharmacies experience approximately 51.5 million dispensing errors annually, resulting in at least 1.5 million adverse drug events and nearly 3.5 billion dollars in economic losses[62-66]. While not all medication errors lead to harm, approximately 1% of them result in adverse consequences[67], and the outcomes can be particularly severe in psychiatry[68]. These consequences include the exacerbation of psychiatric symptoms, impairment of cognitive function, and increased risk of suicide.

To address this challenge, researchers developed the Medication Direction Copilot (MEDIC) system[25], which mimics the reasoning process of pharmacists to accurately convey critical prescription information. The MEDIC system fine-tuned LLMs using 1000 expert annotations on drug usage from Amazon Pharmacy, enabling it to extract core medical instructions and generate comprehensive explanations[25]. Compared with two benchmark models (one utilizing 1.5 million drug labels and the other employing state-of-the-art LLMs), MEDIC demonstrated superior performance. Among 1200 prescriptions reviewed by experts, the near-miss events (errors detected and corrected before reaching patients) of the two benchmark systems were 1.51 times and 4.38 times higher than those of MEDIC, respectively[25]. Separate research found that ChatGPT can also reduce pharmacy dispensing errors to some extent, promptly and effectively identifying incorrect drug dosages, formulations, routes of administration, unit conversions, and labeling errors related to medication use[69]. Evidently, the rational application of LLMs in pharmacies can significantly reduce medication errors and prevent adverse outcomes.

Follow-up and long-term support for prognosis

Although there are no studies specifically examining the use of LLMs in prognosis follow-up and long-term support, their potential can be inferred from existing research. On the one hand, LLMs combined with online follow-up can establish timely and effective communication between healthcare providers and patients[70]. On the other hand, LLMs can analyze social media texts to conduct timely mental health assessments, such as detecting intimate partner violence and predicting mental health outcomes[71,72]. It should be noted that relying on social media data entails potential risks, such as privacy breaches and false positives, that require resolution. Simultaneously, the stigma associated with mental illness has been identified as a barrier to mental health treatment and recovery[73], and the use of LLMs offers a novel approach to long-term support for this population. Looking ahead, their application scenarios may expand to more forward-looking domains, including but not limited to building personalized relapse prediction models based on multimodal data integration and enabling in-depth health management such as dynamic emotional tracking using intelligent dialogue systems.

OPTIMIZATION METHODS FOR THE USE OF LLMS IN CLINICAL PSYCHIATRIC DIAGNOSIS AND TREATMENT

Based on their involvement in the medical process, LLMs can be categorized as follows: (1) Dialogue-based[16,20,23,29,52]: These are used for question-and-answer dialogues, responding to doctors’ queries in natural language and providing answers to patients under the supervision of doctors; (2) Data query-based[20,25,28,74]: These assist doctors in querying and summarizing relevant databases or knowledge bases, providing knowledge or data support for doctors’ decision-making; (3) Report interpretation-based[21,23]: These aid in the interpretation of laboratory reports, imaging studies, etc; (4) Data analysis-based[18,32,53]: These extract structured data from multimodal cases, feed the structured data into interpretable machine learning or deep learning models for data analysis, and transmit the analysis results back to LLMs for decision support based on comprehensive information; and (5) Case integration-based[27,60]: These assist in case classification, writing, summarization, and more. However, to achieve accurate application of LLMs in the clinical process, general LLMs typically require optimization. Research has shown that LLMs without enhancement by professional medical data possess a certain degree of clinical decision support capability but cannot accurately diagnose all conditions. The main issues manifest in three aspects: Information fabrication, non-compliance with diagnostic and treatment guidelines, and inclusion of harmful information, which can pose serious risks to patients’ health[75].
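To make the “data analysis-based” pattern above concrete, the following is a minimal sketch in which an LLM converts an unstructured note into structured features that an interpretable classifier then scores. The call_llm helper is a hypothetical stand-in for any chat-completion endpoint; the feature names, prompt, and toy training data are illustrative assumptions, not a validated clinical model.

```python
# Minimal sketch of the "data analysis-based" pattern: LLM extracts structured
# features from free text; an interpretable model does the actual scoring.
import json
from sklearn.linear_model import LogisticRegression

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: wire to a hosted or local model

def extract_features(note: str) -> list[float]:
    """LLM step: free-text clinical note -> structured feature vector."""
    prompt = (
        "Extract these fields from the clinical note and reply ONLY with JSON: "
        '{"anhedonia": 0 or 1, "sleep_disturbance": 0 or 1, '
        '"suicidal_ideation": 0 or 1, "symptom_weeks": integer}.\n'
        "Note: " + note
    )
    f = json.loads(call_llm(prompt))  # in practice, validate against a schema
    return [f["anhedonia"], f["sleep_disturbance"],
            f["suicidal_ideation"], f["symptom_weeks"]]

# Toy pre-extracted feature vectors with clinician-assigned labels; real use
# would require a curated, ethically approved dataset.
X = [[1, 1, 0, 6], [0, 0, 0, 1], [1, 1, 1, 10], [0, 1, 0, 2]]
y = [1, 0, 1, 0]

clf = LogisticRegression().fit(X, y)
# The coefficients stay inspectable, which is the point of routing LLM output
# through an interpretable model rather than letting the LLM classify directly.
print(dict(zip(["anhedonia", "sleep", "si", "weeks"], clf.coef_[0])))
```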

The functional enhancement of general LLMs can be categorized into two forms: Knowledge enhancement and data-driven enhancement. Knowledge enhancement refers to strengthening the medical capabilities of the model itself through the establishment of medical datasets and the adoption of methods such as PT and SFT. Alternatively, RAG techniques can be employed to provide LLMs with relevant professional knowledge, limiting their responses to a predefined knowledge framework and thus enhancing their professionalism[20,27,28]. Data-driven enhancement, on the other hand, involves extracting data from medical cases and analyzing it to further aid clinical decision-making[21]. There are two primary approaches to extracting data from medical cases. The first involves entity extraction from natural language text, encapsulating the result as structured data for use in existing clinical prediction models, which can be achieved through intelligent agents[15,18,32]. The second involves correlating multi-modal data such as medical images, sounds, and texts through methods like multi-modal alignment, a process that can be implemented via intelligent agents or by fine-tuning multi-modal LLMs[21,23,25]. Table 1 summarizes the technical features, benefits, and trade-offs of common LLM optimization strategies used in clinical settings; a minimal sketch of the RAG pattern follows the table.

Table 1 Technical features of large language model optimization technologies.

PT
Large-scale medical knowledge learning: Through training on massive medical literature and case data, the model captures medical language patterns and basic pathological features
Reducing annotation dependency: The model can utilize unannotated medical texts (such as electronic health records and papers) for initial training
Versatility foundation: It provides a general medical semantic understanding capability for subsequent tasks (such as diagnosis and report generation)

SFT
Precise task adaptation: Optimize model performance for specific medical tasks, such as disease classification and image recognition
High accuracy: Enhance the reliability of the model in specialized areas through professionally annotated data, such as cases labeled by doctors
Enhanced compliance: Adjust model outputs to meet privacy or ethical requirements, such as anonymization processes

Agent
Automated processes: Performing repetitive tasks such as medical record organization and appointment reminders to enhance healthcare efficiency
Multimodal interaction: Enabling patient-doctor communication and report interpretation through a combination of voice, text, and image
Real-time decision support: Dynamically providing diagnostic and treatment suggestions, such as drug titration, in conjunction with a rule engine

RAG
Real-time knowledge integration: Incorporate the latest medical databases, such as PubMed and clinical guidelines, to prevent outdated knowledge within the model
Evidence traceability: Generate results accompanied by references to facilitate verification of reliability by medical professionals
Mitigation of hallucination risk: Generate content based on authoritative knowledge bases to minimize the likelihood of the model fabricating medical information

PE
Output controllability: Structured instructions guide the model to generate standardized results
Flexible domain adaptation: Adjusting prompt words can quickly switch application scenarios
Reduced training costs: Optimizing performance on specific tasks (such as improving the accuracy of rare disease descriptions) without retraining the model
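As a minimal illustration of the RAG pattern summarized in Table 1, the sketch below retrieves the guideline snippets most relevant to a query and instructs the model to answer only from them. The bag-of-words retriever, placeholder snippets, and call_llm helper are simplifying assumptions; production systems would use dense embeddings, a vector store, and real guideline text.

```python
# Minimal RAG sketch: retrieve relevant excerpts, then ground the answer in them.
import math
from collections import Counter

GUIDELINE_SNIPPETS = [
    "Placeholder guideline excerpt on first-line pharmacotherapy selection.",
    "Placeholder guideline excerpt on psychotherapy indication criteria.",
    "Placeholder guideline excerpt on monitoring and follow-up schedules.",
]

def bow(text: str) -> Counter:
    """Tokenize into a bag-of-words vector (lowercased whitespace split)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k snippets most similar to the query."""
    q = bow(query)
    return sorted(GUIDELINE_SNIPPETS, key=lambda s: cosine(q, bow(s)), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: wire to a hosted or local model

def answer(query: str) -> str:
    """Ground the model's answer in the retrieved excerpts (the RAG step)."""
    context = "\n".join(retrieve(query))
    prompt = ("Answer using ONLY the guideline excerpts below and cite them.\n"
              f"Excerpts:\n{context}\n\nQuestion: {query}")
    return call_llm(prompt)

print(retrieve("first-line pharmacotherapy"))  # the retrieval step runs standalone
```

Constraining the prompt to retrieved excerpts is what gives RAG the evidence traceability and reduced hallucination risk noted in Table 1.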
OPTIMIZATION STRATEGIES FOR THE CLINICAL APPLICATION OF LLMS BASED ON THE MIXTURE OF EXPERTS ARCHITECTURE

LLMs have been widely adopted in various clinical psychiatric scenarios, effectively improving medical decision-making, enhancing psychotherapy techniques, and providing mental health services[76,77]. However, due to the inherent hallucinations of LLM technology, LLMs that have not been enhanced with medical data or knowledge still face issues such as non-compliance with guidelines in practical applications, limiting the safe use of general LLMs in medical settings[75]. While PT, SFT, and PE can enhance the generalization performance of LLMs in specific domains, there is still a need for intelligent agents and RAG as means of data and knowledge enhancement to increase the interpretability of LLMs. Furthermore, despite the wide range of application scenarios for LLMs, there are currently no LLMs specifically designed for the entire process of mental illness diagnosis and treatment. We speculate that there are two main reasons for this. Firstly, the etiologies of most mental illnesses are highly complex, involving biological, social, and psychological factors, with significant clinical heterogeneity. Additionally, the symptoms of mental illnesses often overlap strongly in diagnostic descriptions, making clinical decision-making relatively complex[76]. Secondly, the entire process of clinical diagnosis and treatment involves a vast amount of content, and building an LLM suitable for all clinical tasks requires a significant workload and faces difficulties in validation. Based on these considerations, we propose a comprehensive mental health support system based on a large model with a Mixture of Experts (MoE) architecture (Figure 1).

Figure 1
Figure 1 Comprehensive mental health support system based on a large model with Mixture of Experts architecture. Some of the design elements within it are officially licensed from the “Gaoding Design” platform (Licensee ID: 8032451674776512569). The integrated mental health support system relies on large language models that are trained to handle diverse clinical duties, including case structuring and interpretation of medical test results. These large language models operate in unison to manage various stages from outpatient reception to diagnosis, treatment, nursing care, and subsequent monitoring. Once established, this clinical framework can utilize internet-based methods to build a comprehensive mental health support system that integrates “hospital-society-family-school”, extending the hospital’s educational department structure into a unified management system.

The MoE architecture represents a divide-and-conquer approach to machine learning model design. Its central idea is to decompose complex tasks into multiple subtasks, which are then handled by different “expert” models; a “gating mechanism” then integrates the outputs from these experts to produce the final prediction[78]. The MoE architecture aims to enhance the model’s capacity and flexibility while minimizing computational resource wastage[78]. In our clinical diagnosis and treatment plan, we first segment clinical work and train distinct LLMs for specific medical tasks. These LLMs are then validated through retrospective and prospective studies to ensure their superior professional capabilities. Second, we utilize the MoE mechanism to integrate the various specialized models, enabling them to collaborate and jointly complete complex diagnostic and treatment processes.
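The gating idea can be shown in a minimal numerical sketch, with linear maps standing in for specialized clinical LLMs; the dimensions, random weights, and three-expert setup are illustrative assumptions only.

```python
# Minimal MoE sketch: a gate weights the experts; the output is their combination.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_experts = 8, 4, 3

experts = [rng.normal(size=(d_in, d_out)) for _ in range(n_experts)]  # stand-ins for expert models
gate_w = rng.normal(size=(d_in, n_experts))  # gating network parameters

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

def moe_forward(x: np.ndarray) -> np.ndarray:
    weights = softmax(x @ gate_w)                  # how much to trust each expert
    outputs = np.stack([x @ w for w in experts])   # each expert's prediction
    return np.tensordot(weights, outputs, axes=1)  # gated combination

x = rng.normal(size=d_in)  # e.g., an encoded patient representation
print(moe_forward(x))
```

In the deployed system envisioned here, each expert would be one of the validated task-specific clinical LLMs, and the gate would route a case toward, for example, the triage, diagnosis, or medication-safety expert.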

Once fully integrated, the hospital’s MoE diagnosis and treatment system is poised to expand beyond clinical settings. This framework could extend to “community-based care”, “family-centered interventions”, and “school mental health programs”, creating a collaborative, multi-dimensional support network for adolescents with mental health challenges. By combining online and offline approaches, we can create an efficient and collaborative “hospital-society-family-school” integrated mental health support system for adolescents. This system aims to comprehensively improve the accessibility, continuity, and effectiveness of mental health services.

While exploring the design of the hospital’s MoE diagnosis and treatment system, it is crucial to acknowledge the potential technical challenges and data limitations encountered in this process. Although MoE systems theoretically promise to deliver precise and personalized mental health services, clinical data are often limited by privacy protections, high annotation costs (requiring medical expert involvement), and data fragmentation (dispersal across various institutions), resulting in insufficient practically usable data. Moreover, most current research and practice rely heavily on retrospective case studies or virtual cases, often sourced from existing public datasets. This implies that the cases used for prediction may already be included in the training datasets of LLMs[74]. Such limitations in data sources can lead to overfitting during the model training phase. Given that virtual or retrospective case studies may have undergone standardized representation by medical professionals, the applicability of these findings to real-world scenarios where LLMs make independent decisions remains questionable. Furthermore, clinical data may be skewed towards specific populations (certain regions or ethnic groups), leading to poor model generalization. Ethical and compliance risks also exist, and the legitimate and compliant use of data is often a topic of technical discussion. Even if technically feasible, data issues may hinder LLMs from passing medical regulatory approvals. Potential solutions when advancing the expansion and application of MoE systems could include utilizing generative models (such as generative adversarial networks) to create simulated clinical data that meets privacy requirements[79,80]. Additionally, federated learning techniques could be employed to facilitate cross-institution collaborative training without sharing raw data[81]. Few-shot learning, combined with PE, can reduce reliance on large-scale data (see the sketch below)[82]. Multimodal data fusion, integrating language, imaging, case histories, and other multidimensional data, can enhance model robustness.
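A minimal sketch of few-shot prompting combined with PE, as mentioned above: a handful of worked examples embedded in the prompt substitutes for large-scale fine-tuning data. The example texts and labels are invented placeholders rather than clinical guidance, and call_llm is a hypothetical stand-in for any chat-completion endpoint.

```python
# Minimal few-shot prompting sketch: in-context examples replace training data.
FEW_SHOT_EXAMPLES = [
    ("I cannot feel joy in anything and I sleep all day.", "depressive symptoms"),
    ("I check the door lock forty times before leaving.", "obsessive-compulsive symptoms"),
]

def build_prompt(text: str) -> str:
    shots = "\n".join(f"Text: {t}\nLabel: {l}" for t, l in FEW_SHOT_EXAMPLES)
    return (f"Classify each text into a symptom category.\n{shots}\n"
            f"Text: {text}\nLabel:")

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: wire to a hosted or local model

print(build_prompt("I hear voices commenting on my actions."))  # prompt construction runs standalone
```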

It is important to clarify that LLMs cannot replace the central role of experts in the clinical diagnosis and treatment process; their primary function is to assist experts in their clinical work[83-86]. In other words, the promise of LLMs lies in freeing experts to focus on more complex clinical supervision tasks, thereby expanding the human resources available to provide quality mental health services. Simultaneously, the collaborative partnership between experts and LLMs can effectively alleviate the severe shortage of resources in the health system, enhancing overall service efficiency and quality. This “human-machine collaboration” model not only optimizes the allocation of medical resources but also provides patients with more precise, continuous, and efficient mental health support.

The application of LLMs in the medical field, while demonstrating transformative potential, is accompanied by multiple ethical challenges. The primary concern lies in patient privacy protection, which necessitates the use of technical measures such as data desensitization and encrypted storage to prevent the leakage of sensitive medical information[87,88]. Secondly, data biases may lead to misjudgments of minority groups or special cases by the models, highlighting the need to establish balanced datasets covering multi-dimensional populations and introduce fairness evaluation mechanisms[76,89]. The lack of decision transparency arising from the “black box” characteristics of the models demands the development of interpretability tools to enable traceable logical chains for diagnostic and treatment suggestions[90,91]. Clear attribution of responsibility requires defining the boundaries of rights and responsibilities among technology developers, medical institutions, and users, and establishing error tracing mechanisms and risk-sharing frameworks[92]. Over-reliance on technology may undermine the autonomy of medical decision-making, necessitating a balance between artificial intelligence assistance and professional judgment through human-machine collaborative design[76,89]. Furthermore, vigilance is needed to prevent the uneven distribution of technical resources from widening the medical gap, and policy guidance should ensure that less developed regions receive inclusive technical support[93]. Addressing these issues requires the construction of an interdisciplinary governance system that integrates technical ethics reviews, dynamic regulatory frameworks, and industry operation guidelines. This approach aims to strike a balance between technological innovation and humanistic care, ultimately achieving sustainable development of medical artificial intelligence that is “usable, trustworthy, and controllable”.

CONCLUSION

LLMs have demonstrated significant potential in the diagnosis and treatment of clinical psychiatry globally. The advantages of MoE-based LLMs also enable them to adapt to different healthcare systems, languages, and diagnostic cultures. On the one hand, they can effectively enhance the quality of medical services; on the other hand, they facilitate the provision of mental health services through more convenient means, promoting equity in mental health care. However, substantial challenges remain for their practical application, including technical issues with LLMs (such as algorithmic bias and hallucinated outputs), socio-ethical concerns (such as data privacy and racial bias), and decision accuracy in the real world. Technical issues related to LLMs can be optimized through knowledge and data augmentation, while social ethics and privacy concerns can be regulated through legislation. Nevertheless, the usability of LLMs in the real world still requires validation through prospective experiments. With reasonable LLM design, comprehensive legislation, and effective prospective clinical validation, the collaborative model between LLMs and experts is expected to significantly enhance clinical practice in psychiatry, thereby improving patient outcomes and advancing global mental health.

Footnotes

Provenance and peer review: Invited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Psychiatry

Country of origin: China

Peer-review report’s classification

Scientific Quality: Grade B, Grade B

Novelty: Grade B, Grade B

Creativity or Innovation: Grade A, Grade A

Scientific Significance: Grade A, Grade C

P-Reviewer: He R, PhD, Postdoctoral Researcher, Spain; Turan S, MD, PhD, Associate Professor, Türkiye. S-Editor: Bai Y. L-Editor: A. P-Editor: Wang CH

References
1. The Lancet. Brain health and its social determinants. Lancet. 2021;398:1021.
2. Charlson F, van Ommeren M, Flaxman A, Cornett J, Whiteford H, Saxena S. New WHO prevalence estimates of mental disorders in conflict settings: a systematic review and meta-analysis. Lancet. 2019;394:240-248.
3. GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Psychiatry. 2022;9:137-150.
4. GBD 2021 Risk Factors Collaborators. Global burden and strength of evidence for 88 risk factors in 204 countries and 811 subnational locations, 1990-2021: a systematic analysis for the Global Burden of Disease Study 2021. Lancet. 2024;403:2162-2203.
5. Ubaid M, Jadba G, Mughari H, Tabash H, Yaghi M, Aljaish A, Shahin U. Integration of mental health and psychosocial support services into primary health care in Gaza: a cross-sectional evaluation. Lancet. 2021;398 Suppl 1:S51.
6. Brohan E, Chowdhary N, Dua T, Barbui C, Thornicroft G, Kestel D; WHO mhGAP guideline team. The WHO Mental Health Gap Action Programme for mental, neurological, and substance use conditions: the new and updated guideline recommendations. Lancet Psychiatry. 2024;11:155-158.
7. de Jesus Mari J, Tófoli LF, Noto C, Li LM, Diehl A, Claudino AM, Juruena MF. Pharmacological and psychosocial management of mental, neurological and substance use disorders in low- and middle-income countries: issues and current strategies. Drugs. 2013;73:1549-1568.
8. Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med. 2023;29:1930-1940.
9. Simon E, Swanson K, Zou J. Language models for biological research: a primer. Nat Methods. 2024;21:1422-1429.
10. Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Large language models encode clinical knowledge. Nature. 2023;620:172-180.
11. Dunn C, Hunter J, Steffes W, Whitney Z, Foss M, Mammino J, Leavitt A, Hawkins SD, Dane A, Yungmann M, Nathoo R. Artificial intelligence-derived dermatology case reports are indistinguishable from those written by humans: A single-blinded observer study. J Am Acad Dermatol. 2023;89:388-390.
12. Kim SH, Wihl J, Schramm S, Berberich C, Rosenkranz E, Schmitzer L, Serguen K, Klenk C, Lenhart N, Zimmer C, Wiestler B, Hedderich DM. Human-AI collaboration in large language model-assisted brain MRI differential diagnosis: a usability study. Eur Radiol. 2025;35:5252-5263.
13. Schubert MC, Wick W, Venkataramani V. Performance of Large Language Models on a Neurology Board-Style Examination. JAMA Netw Open. 2023;6:e2346721.
14. Bernstein IA, Zhang YV, Govil D, Majid I, Chang RT, Sun Y, Shue A, Chou JC, Schehlein E, Christopher KL, Groth SL, Ludwig C, Wang SY. Comparison of Ophthalmologist and Large Language Model Chatbot Responses to Online Patient Eye Care Questions. JAMA Netw Open. 2023;6:e2330320.
15. Irmici G, Cozzi A, Della Pepa G, De Berardinis C, D'Ascoli E, Cellina M, Cè M, Depretto C, Scaperrotta G. How do large language models answer breast cancer quiz questions? A comparative study of GPT-3.5, GPT-4 and Google Gemini. Radiol Med. 2024;129:1463-1467.
16. Adams LC, Truhn D, Busch F, Dorfner F, Nawabi J, Makowski MR, Bressem KK. Llama 3 Challenges Proprietary State-of-the-Art Large Language Models in Radiology Board-style Examination Questions. Radiology. 2024;312:e241191.
17. Wan P, Huang Z, Tang W, Nie Y, Pei D, Deng S, Chen J, Zhou Y, Duan H, Chen Q, Long E. Outpatient reception via collaboration between nurses and a large language model: a randomized controlled trial. Nat Med. 2024;30:2878-2885.
18. Williams CYK, Zack T, Miao BY, Sushil M, Wang M, Kornblith AE, Butte AJ. Use of a Large Language Model to Assess Clinical Acuity of Adults in the Emergency Department. JAMA Netw Open. 2024;7:e248895.
19. McCoy TH Jr, Perlis RH. Dimensional Measures of Psychopathology in Children and Adolescents Using Large Language Models. Biol Psychiatry. 2024;96:940-947.
20. Ge J, Sun S, Owens J, Galvez V, Gologorskaya O, Lai JC, Pletcher MJ, Lai K. Development of a liver disease-specific large language model chat interface using retrieval-augmented generation. Hepatology. 2024;80:1158-1168.
21. Zhao Z, Wang S, Gu J, Zhu Y, Mei L, Zhuang Z, Cui Z, Wang Q, Shen D. ChatCAD+: Toward a Universal and Reliable Interactive CAD Using LLMs. IEEE Trans Med Imaging. 2024;43:3755-3766.
22. Benary M, Wang XD, Schmidt M, Soll D, Hilfenhaus G, Nassir M, Sigler C, Knödler M, Keller U, Beule D, Keilholz U, Leser U, Rieke DT. Leveraging Large Language Models for Decision Support in Personalized Oncology. JAMA Netw Open. 2023;6:e2343689.
23. Hu X, Gu L, Kobayashi K, Liu L, Zhang M, Harada T, Summers RM, Zhu Y. Interpretable medical image Visual Question Answering via multi-modal relationship graph learning. Med Image Anal. 2024;97:103279.
24. Zhou J, He X, Sun L, Xu J, Chen X, Chu Y, Zhou L, Liao X, Zhang B, Afvari S, Gao X. Pre-trained multimodal large language model enhances dermatological diagnosis using SkinGPT-4. Nat Commun. 2024;15:5649.
25. Pais C, Liu J, Voigt R, Gupta V, Wade E, Bayati M. Large language models for preventing medication direction errors in online pharmacies. Nat Med. 2024;30:1574-1582.
26. Rau A, Rau S, Zoeller D, Fink A, Tran H, Wilpert C, Nattenmueller J, Neubauer J, Bamberg F, Reisert M, Russe MF. A Context-based Chatbot Surpasses Trained Radiologists and Generic ChatGPT in Following the ACR Appropriateness Guidelines. Radiology. 2023;308:e230970.
27. Van Veen D, Van Uden C, Blankemeier L, Delbrouck JB, Aali A, Bluethgen C, Pareek A, Polacin M, Reis EP, Seehofnerová A, Rohatgi N, Hosamani P, Collins W, Ahuja N, Langlotz CP, Hom J, Gatidis S, Pauly J, Chaudhari AS. Adapted large language models can outperform medical experts in clinical text summarization. Nat Med. 2024;30:1134-1142.
28. Perlis RH, Goldberg JF, Ostacher MJ, Schneck CD. Clinical decision support for bipolar depression using large language models. Neuropsychopharmacology. 2024;49:1412-1416.
29. Kim J, Leonte KG, Chen ML, Torous JB, Linos E, Pinto A, Rodriguez CI. Large language models outperform mental and medical health care professionals in identifying obsessive-compulsive disorder. NPJ Digit Med. 2024;7:193.
30. Wang Y. Large language models for depression prediction. Proc Natl Acad Sci U S A. 2024;121:e2409757121.
31. Ferrario A, Sedlakova J, Trachsel M. The Role of Humanization and Robustness of Large Language Models in Conversational Artificial Intelligence for Individuals With Depression: A Critical Analysis. JMIR Ment Health. 2024;11:e56569.
32. Bauer B, Norel R, Leow A, Rached ZA, Wen B, Cecchi G. Using Large Language Models to Understand Suicidality in a Social Media-Based Taxonomy of Mental Health Disorders: Linguistic Analysis of Reddit Posts. JMIR Ment Health. 2024;11:e57234.
33. Strachan JWA, Albergo D, Borghini G, Pansardi O, Scaliti E, Gupta S, Saxena K, Rufo A, Panzeri S, Manzi G, Graziano MSA, Becchio C. Testing theory of mind in large language models and humans. Nat Hum Behav. 2024;8:1285-1295.
34. Qu Y, Du P, Che W, Wei C, Zhang C, Ouyang W, Bian Y, Xu F, Hu B, Du K, Wu H, Liu J, Liu Q. Promoting interactions between cognitive science and large language models. Innovation (Camb). 2024;5:100579.
35. Hagendorff T. Deception abilities emerged in large language models. Proc Natl Acad Sci U S A. 2024;121:e2317967121.
36. Lawrence HR, Schneider RA, Rubin SB, Matarić MJ, McDuff DJ, Jones Bell M. The Opportunities and Risks of Large Language Models in Mental Health. JMIR Ment Health. 2024;11:e59479.
37. Cox JA. Systems thinking approach to mental health services. Nat Rev Psychol. 2024;3:445.
38. Min W, Sun X, Tang N, Zhang Y, Luo F, Zhu M, Xia W, Zhou B. A new model for the treatment of type 2 diabetes mellitus based on rhythm regulations under the framework of psychosomatic medicine: a real-world study. Sci Rep. 2023;13:1047.
39. Burns C, Izmailov P, Kirchner JH, Baker B, Gao L, Aschenbrenner L, Chen Y, Ecoffet A, Joglekar M, Leike J, Sutskever I, Wu J. Weak-to-strong generalization: eliciting strong capabilities with weak supervision. 2023 Preprint. Available from: arXiv:2312.09390.
40. Kwame A, Petrucka PM. A literature-based study of patient-centered care and communication in nurse-patient interactions: barriers, facilitators, and the way forward. BMC Nurs. 2021;20:158.
41. Sharkiya SH. Quality communication can improve patient-centred health outcomes among older patients: a rapid review. BMC Health Serv Res. 2023;23:886.
42. Himmelstein DU, Jun M, Busse R, Chevreul K, Geissler A, Jeurissen P, Thomson S, Vinet MA, Woolhandler S. A comparison of hospital administrative costs in eight nations: US costs exceed all others by far. Health Aff (Millwood). 2014;33:1586-1594.
43. Guo SY, Yang TT, Dong SP. [Research advances on cost-efficiency measurement and evaluation of public hospitals in China]. Zhonguo Weisheng Zhengce Yanjiu. 2020;13:45-51.
44. Taylor N, Kormilitzin A, Lorge I, Nevado-Holgado A, Cipriani A, Joyce DW. Model development for bespoke large language models for digital triage assistance in mental health care. Artif Intell Med. 2024;157:102988.
45. Wang J, Wu X, Lai W, Long E, Zhang X, Li W, Zhu Y, Chen C, Zhong X, Liu Z, Wang D, Lin H. Prevalence of depression and depressive symptoms among outpatients: a systematic review and meta-analysis. BMJ Open. 2017;7:e017173.
46. Sun CF, Correll CU, Trestman RL, Lin Y, Xie H, Hankey MS, Uymatiao RP, Patel RT, Metsutnan VL, McDaid EC, Saha A, Kuo C, Lewis P, Bhatt SH, Lipphard LE, Kablinger AS. Low availability, long wait times, and high geographic disparity of psychiatric outpatient care in the US. Gen Hosp Psychiatry. 2023;84:12-17.
47. Gabbard's Treatments of Psychiatric Disorders, Fourth Edition. Am J Psychiatry. 2007;164:1620-1621.
48. Symons AB, Seller RH. Differential diagnosis of common complaints. 7th ed. Amsterdam: Elsevier, 2017.
49. Casey P, Kelly B. Fish's Clinical Psychopathology. 5th ed. Cambridge: Cambridge University Press, 2024.
50. Taylor DM, Barnes TRE, Young AH. The Maudsley Prescribing Guidelines in Psychiatry. Hoboken: John Wiley & Sons, 2021.
51. Vieira LC, Bocchi SCM, Macphee M, Spiri WC. Psychiatric Care Setting from the Perspective of Psychiatric Nursing Managers. Open Nurs J. 2025;19:e18744346363723.
52. Gargari OK, Fatehi F, Mohammadi I, Firouzabadi SR, Shafiee A, Habibi G. Diagnostic accuracy of large language models in psychiatry. Asian J Psychiatr. 2024;100:104168.
53. Ohse J, Hadžić B, Mohammed P, Peperkorn N, Fox J, Krutzki J, Lyko A, Mingyu F, Zheng X, Rätsch M, Shiban Y. GPT-4 shows potential for identifying social anxiety from clinical interview data. Sci Rep. 2024;14:30498.
54. Levkovich I. Evaluating Diagnostic Accuracy and Treatment Efficacy in Mental Health: A Comparative Analysis of Large Language Model Tools and Mental Health Professionals. Eur J Investig Health Psychol Educ. 2025;15:9.
55. Corcoran CM, Mittal VA, Bearden CE, E Gur R, Hitczenko K, Bilgrami Z, Savic A, Cecchi GA, Wolff P. Language as a biomarker for psychosis: A natural language processing approach. Schizophr Res. 2020;226:158-166.
56. Hartnagel LM, Ebner-Priemer UW, Foo JC, Streit F, Witt SH, Frank J, Limberger MF, Horn AB, Gilles M, Rietschel M, Sirignano L. Linguistic style as a digital marker for depression severity: An ambulatory assessment pilot study in patients with depressive disorder undergoing sleep deprivation therapy. Acta Psychiatr Scand. 2025;151:348-357.
57. Palaniyappan L. More than a biomarker: could language be a biosocial marker of psychosis? NPJ Schizophr. 2021;7:42.
58. Lin B, Bouneffouf D, Landa Y, Jespersen R, Corcoran C, Cecchi G. COMPASS: Computational mapping of patient-therapist alliance strategies with language modeling. Transl Psychiatry. 2025;15:166.
59. Roy K, Gaur M, Soltani M, Rawte V, Kalyan A, Sheth A. ProKnow: Process knowledge for safety constrained and explainable question generation for mental health diagnostic assistance. Front Big Data. 2022;5:1056728.
60. Cardamone NC, Olfson M, Schmutte T, Ungar L, Liu T, Cullen SW, Williams NJ, Marcus SC. Classifying Unstructured Text in Electronic Health Records for Mental Health Prediction Models: Large Language Model Evaluation Study. JMIR Med Inform. 2025;13:e65454.
61. Schwieger A, Angst K, de Bardeci M, Burrer A, Cathomas F, Ferrea S, Grätz F, Knorr M, Kronenberg G, Spiller T, Troi D, Seifritz E, Weber S, Olbrich S. Large language models can support generation of standardized discharge summaries - A retrospective study utilizing ChatGPT-4 and electronic health records. Int J Med Inform. 2024;192:105654.
62. Bates DW, Cullen DJ, Laird N, Petersen LA, Small SD, Servi D, Laffel G, Sweitzer BJ, Shea BF, Hallisey R. Incidence of adverse drug events and potential adverse drug events. Implications for prevention. ADE Prevention Study Group. JAMA. 1995;274:29-34.
63. Barker KN, Flynn EA, Pepper GA, Bates DW, Mikeal RL. Medication errors observed in 36 health care facilities. Arch Intern Med. 2002;162:1897-1903.
64. Flynn EA, Barker KN, Carnahan BJ. National observational study of prescription dispensing accuracy and safety in 50 pharmacies. J Am Pharm Assoc (Wash). 2003;43:191-200.
65. Campbell PJ, Patel M, Martin JR, Hincapie AL, Axon DR, Warholak TL, Slack M. Systematic review and meta-analysis of community pharmacy error rates in the USA: 1993-2015. BMJ Open Qual. 2018;7:e000193.
66. Odukoya OK, Stone JA, Chui MA. E-prescribing errors in community pharmacies: exploring consequences and contributing factors. Int J Med Inform. 2014;83:427-437.
67. Bates DW, Boyle DL, Vander Vliet MB, Schneider J, Leape L. Relationship between medication errors and adverse drug events. J Gen Intern Med. 1995;10:199-205.
68. Alshehri GH, Keers RN, Ashcroft DM. Frequency and Nature of Medication Errors and Adverse Drug Events in Mental Health Hospitals: a Systematic Review. Drug Saf. 2017;40:871-886.
69. Shin E, Hartman M, Ramanathan M. Performance of the ChatGPT large language model for decision support in community pharmacy. Br J Clin Pharmacol. 2024;90:3320-3333.
70. Ang BH, Gollapalli SD, Du M, Ng SK. Unraveling Online Mental Health Through the Lens of Early Maladaptive Schemas: AI-Enabled Content Analysis of Online Mental Health Communities. J Med Internet Res. 2025;27:e59524.
71. Al-Garadi MA, Kim S, Guo Y, Warren E, Yang YC, Lakamana S, Sarker A. Natural language model for automatic identification of Intimate Partner Violence reports from Twitter. Array (N Y). 2022;15:100217.
72. Xu X, Yao B, Dong Y, Gabriel S, Yu H, Hendler J, Ghassemi M, Dey AK, Wang D. Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data. Proc ACM Interact Mob Wearable Ubiquitous Technol. 2024;8:31.
73.  Pinto-Foltz MD, Logsdon MC. Reducing stigma related to mental disorders: initiatives, interventions, and recommendations for nursing. Arch Psychiatr Nurs. 2009;23:32-40.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 42]  [Cited by in RCA: 43]  [Article Influence: 2.7]  [Reference Citation Analysis (0)]
74.  Luo X, Rechardt A, Sun G, Nejad KK, Yáñez F, Yilmaz B, Lee K, Cohen AO, Borghesani V, Pashkov A, Marinazzo D, Nicholas J, Salatiello A, Sucholutsky I, Minervini P, Razavi S, Rocca R, Yusifov E, Okalova T, Gu N, Ferianc M, Khona M, Patil KR, Lee PS, Mata R, Myers NE, Bizley JK, Musslick S, Bilgin IP, Niso G, Ales JM, Gaebler M, Ratan Murty NA, Loued-Khenissi L, Behler A, Hall CM, Dafflon J, Bao SD, Love BC. Large language models surpass human experts in predicting neuroscience results. Nat Hum Behav. 2025;9:305-315.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 13]  [Cited by in RCA: 20]  [Article Influence: 20.0]  [Reference Citation Analysis (0)]
75.  Hager P, Jungmann F, Holland R, Bhagat K, Hubrecht I, Knauer M, Vielhauer J, Makowski M, Braren R, Kaissis G, Rueckert D. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nat Med. 2024;30:2613-2622.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 8]  [Cited by in RCA: 159]  [Article Influence: 159.0]  [Reference Citation Analysis (0)]
76.  Volkmer S, Meyer-Lindenberg A, Schwarz E. Large language models in psychiatry: Opportunities and challenges. Psychiatry Res. 2024;339:116026.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 19]  [Reference Citation Analysis (0)]
77.  van der Schyff EL, Ridout B, Amon KL, Forsyth R, Campbell AJ. Providing Self-Led Mental Health Support Through an Artificial Intelligence-Powered Chat Bot (Leora) to Meet the Demand of Mental Health Care. J Med Internet Res. 2023;25:e46448.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 20]  [Reference Citation Analysis (0)]
78.  Li Y, Jiang S, Hu B, Wang L, Zhong W, Luo W, Ma L, Zhang M. Uni-MoE: Scaling Unified Multimodal LLMs With Mixture of Experts. IEEE Trans Pattern Anal Mach Intell. 2025;47:3424-3439.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 3]  [Reference Citation Analysis (0)]
79.  Xu J, Zhang Z, Hu X. Extracting Semantic Knowledge From GANs With Unsupervised Learning. IEEE Trans Pattern Anal Mach Intell. 2023;45:9654-9668.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 5]  [Cited by in RCA: 2]  [Article Influence: 1.0]  [Reference Citation Analysis (0)]
80.  Sorin V, Barash Y, Konen E, Klang E. Creating Artificial Images for Radiology Applications Using Generative Adversarial Networks (GANs) - A Systematic Review. Acad Radiol. 2020;27:1175-1185.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 136]  [Cited by in RCA: 83]  [Article Influence: 16.6]  [Reference Citation Analysis (0)]
81.  Hanser T. Federated learning for molecular discovery. Curr Opin Struct Biol. 2023;79:102545.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 11]  [Reference Citation Analysis (0)]
82.  Li W, Wang L, Zhang X, Qi L, Huo J, Gao Y, Luo J. Defensive Few-Shot Learning. IEEE Trans Pattern Anal Mach Intell. 2023;45:5649-5667.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 3]  [Reference Citation Analysis (0)]
83.  van Heerden AC, Pozuelo JR, Kohrt BA. Global Mental Health Services and the Impact of Artificial Intelligence-Powered Large Language Models. JAMA Psychiatry. 2023;80:662-664.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 26]  [Reference Citation Analysis (0)]
84.  McCoy LG, Ci Ng FY, Sauer CM, Yap Legaspi KE, Jain B, Gallifant J, McClurkin M, Hammond A, Goode D, Gichoya J, Celi LA. Understanding and training for the impact of large language models and artificial intelligence in healthcare practice: a narrative review. BMC Med Educ. 2024;24:1096.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 9]  [Reference Citation Analysis (0)]
85.  Reverberi C, Rigon T, Solari A, Hassan C, Cherubini P; GI Genius CADx Study Group, Cherubini A. Experimental evidence of effective human-AI collaboration in medical decision-making. Sci Rep. 2022;12:14952.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 55]  [Cited by in RCA: 69]  [Article Influence: 23.0]  [Reference Citation Analysis (0)]
86.  Ahuja AS. The impact of artificial intelligence in medicine on the future role of the physician. PeerJ. 2019;7:e7702.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 153]  [Cited by in RCA: 299]  [Article Influence: 49.8]  [Reference Citation Analysis (0)]
87.  Tangsrivimol JA, Darzidehkalani E, Virk HUH, Wang Z, Egger J, Wang M, Hacking S, Glicksberg BS, Strauss M, Krittanawong C. Benefits, limits, and risks of ChatGPT in medicine. Front Artif Intell. 2025;8:1518049.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 16]  [Reference Citation Analysis (0)]
88.  Malgaroli M, Schultebraucks K, Myrick KJ, Andrade Loch A, Ospina-Pinillos L, Choudhury T, Kotov R, De Choudhury M, Torous J. Large language models for the mental health community: framework for translating code to care. Lancet Digit Health. 2025;7:e282-e285.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 2]  [Cited by in RCA: 9]  [Article Influence: 9.0]  [Reference Citation Analysis (0)]
89.  Kagan BJ, Mahlis M, Bhat A, Bongard J, Cole VM, Corlett P, Gyngell C, Hartung T, Jupp B, Levin M, Lysaght T, Opie N, Razi A, Smirnova L, Tennant I, Wade PT, Wang G. Toward a nomenclature consensus for diverse intelligent systems: Call for collaboration. Innovation (Camb). 2024;5:100658.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 2]  [Reference Citation Analysis (0)]
90.  Guo Z, Lai A, Thygesen JH, Farrington J, Keen T, Li K. Large Language Models for Mental Health Applications: Systematic Review. JMIR Ment Health. 2024;11:e57400.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 26]  [Reference Citation Analysis (0)]
91.  Sun J, Dong QX, Wang SW, Zheng YB, Liu XX, Lu TS, Yuan K, Shi J, Hu B, Lu L, Han Y. Artificial intelligence in psychiatry research, diagnosis, and therapy. Asian J Psychiatr. 2023;87:103705.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 1]  [Cited by in RCA: 55]  [Article Influence: 27.5]  [Reference Citation Analysis (0)]
92.  Shumway DO, Hartman HJ. Medical malpractice liability in large language model artificial intelligence: legal review and policy recommendations. J Osteopath Med. 2024;124:287-290.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 12]  [Cited by in RCA: 11]  [Article Influence: 11.0]  [Reference Citation Analysis (0)]
93.  Timilsina M, Buosi S, Razzaq MA, Haque R, Judge C, Curry E. Harmonizing foundation models in healthcare: A comprehensive survey of their roles, relationships, and impact in artificial intelligence's advancing terrain. Comput Biol Med. 2025;189:109925.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 1]  [Reference Citation Analysis (0)]