Validation of a generative artificial intelligence tool for the critical appraisal of articles on the epidemiology of mental health: Its application in the Middle East and North Africa - 03/04/25

DOI: 10.1016/j.jeph.2025.202990

Moussa Cheima ^a, Altayyar Sarah ^a, Vergonjeanne Marion ^a,^b, Gelle Thibaut ^a,^⁎, Preux Pierre-Marie ^a,^b
^a Inserm U1094, IRD UMR270, University Limoges, CHU Limoges, EpiMaCT - Epidemiology of Chronic Diseases in Tropical Zone, Institute of Epidemiology and Tropical Neurology, Limoges, France
^b CHU Limoges, Clinical Data and Research Center CDCR, Limoges, France

^⁎Corresponding author.

Abstract

Mental health disorders account for a high number of disability-adjusted life years in the Middle East and North Africa. This burden has led to a surge in related publications, prompting researchers to use AI tools such as ChatGPT to reduce the time spent on routine tasks. Our study aimed to validate an AI-assisted critical appraisal (CA) tool by comparing it with human raters.

We developed two customized GPT models using ChatGPT-4. One model was tailored to appraise studies with the Newcastle-Ottawa Scale (NOS) or the Jadad scale; the other assessed studies against the STROBE or CONSORT guidelines.

Agreement between human CA and our GPTs was moderate to good for the NOS in cohort, case-control and cross-sectional studies and for the Jadad scale, with an ICC of 0.68 [95 % CI: 0.24–0.82], 0.69 [95 % CI: 0.31–0.88], 0.76 [95 % CI: 0.47–0.90] and 0.84 [95 % CI: 0.57–0.94], respectively. Agreement between the two methods was also moderate to substantial for STROBE in cross-sectional, cohort and case-control studies and for CONSORT in trials, with a kappa (K) of 0.63 [95 % CI: 0.56–0.70], 0.57 [95 % CI: 0.47–0.66], 0.48 [95 % CI: 0.38–0.50] and 0.70 [95 % CI: 0.63–0.77], respectively. Our custom GPT models produced hallucinations in 6.5 % and 4.9 % of cases, respectively. Human raters took an average of 19.6 ± 4.3 min per article, whereas our customized GPTs took only 1.4 min.
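As an illustration of the chance-corrected agreement behind the K values above, here is a minimal sketch. It assumes K denotes Cohen's kappa computed on binary checklist items (the abstract does not specify the variant). The function name `cohen_kappa` and the ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters gave the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings for ten checklist items (1 = reported, 0 = not reported).
# Illustration only: these are NOT data from the study.
human = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
gpt = [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
kappa = cohen_kappa(human, gpt)
print(f"kappa = {kappa:.2f}")
```

Kappa discounts the agreement two raters would reach by chance alone, which is why it can be much lower than the raw percentage of matching items.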

ChatGPT could be a useful tool for handling repetitive tasks, yet its effective application relies on the critical expertise of researchers.

Keywords: Artificial intelligence, ChatGPT, Critical appraisal, Mental health, MENA

Outline

Introduction

Materials and methods

Study type

Study period and material

Development of customized GPT models

Validation process

Statistical analysis

Results

Characteristics of studies

Evaluating GPT models in CA

Agreement between manual and customized GPT CA for the Jadad scale

Agreement between manual and customized GPT CA of articles for the NOS adapted to cross sectional studies

Agreement between manual and customized GPT CA for the NOS for cohort studies

Agreement between manual and customized GPT CA for the NOS for case control studies

Agreement between manual and customized GPT evaluations across STROBE and CONSORT guidelines

Discussion

Role of ChatGPT in preliminary assessment

Effectiveness with simplified tools such as the Jadad scale

Importance of vigilance and human expertise

Hallucinations of ChatGPT

Ethics and transparency

Strengths and limitations

Perspectives

Conclusion

Vol 73 - N° 2

Article 202990 - April 2025