Deep learning-based natural language processing for detecting medical symptoms and histories in emergency patient triage - 12/02/24

DOI: 10.1016/j.ajem.2023.11.063
Siryeol Lee, B.S. a, Juncheol Lee, Ph.D. b, Juntae Park, B.S. c, Jiwoo Park, B.S. d, Dohoon Kim, B.S. e, Joohyun Lee, Ph.D. a, c, ⁎, Jaehoon Oh, Ph.D. b, ⁎⁎
a Department of Applied Artificial Intelligence, Hanyang University ERICA, Ansan, Republic of Korea 
b Department of Emergency Medicine, College of Medicine, Hanyang University, Seoul, Republic of Korea 
c School of Electrical Engineering, Hanyang University ERICA, Ansan, Republic of Korea 
d Department of Emergency Medicine, Hanyang University Hospital, Seoul, Republic of Korea 
e Department of Translational Medicine, Biomedical Science and Engineering, Hanyang University, Seoul, Republic of Korea

⁎ Correspondence to: Joohyun Lee, School of Electrical Engineering, Hanyang University, 55 Hanyangdaehak-ro, Sangnok-gu, Ansan 15588, Republic of Korea.
⁎⁎ Correspondence to: Jaehoon Oh, Department of Emergency Medicine, College of Medicine, Hanyang University, 222-1 Wangsimni-ro, Seongdong-gu, Seoul 04763, Republic of Korea.

Abstract

Objective

The manual recording of electronic health records (EHRs) by clinicians in the emergency department (ED) is time-consuming and challenging. In light of recent advancements in large language models (LLMs) such as GPT and BERT, this study aimed to design and validate LLMs for automatic clinical diagnoses. The models were designed to identify 12 medical symptoms and 2 patient histories from simulated clinician–patient conversations within 6 primary symptom scenarios in emergency triage rooms.

Materials and method

We developed classification models by fine-tuning BERT, a transformer-based pre-trained model. We subsequently analyzed these models using eXplainable artificial intelligence (XAI) and the Shapley additive explanation (SHAP) method. A Turing test was conducted to ascertain the reliability of the XAI results by comparing them to the outcomes of tasks performed and explained by medical workers. An emergency medicine specialist assessed the results of both XAI and the medical workers.
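
As a rough illustration of the classification task described above, the sketch below shows how one score per label could be turned into a set of detected symptoms and histories. It is an assumption-laden sketch, not the authors' implementation: the abstract does not say whether a single multi-label head or separate classifiers were used, the label names are hypothetical placeholders (only the counts, 12 symptoms and 2 histories, come from the abstract), and the 0.5 threshold is an arbitrary choice.

```python
import math

# Hypothetical label names: the abstract gives only the counts
# (12 medical symptoms and 2 patient histories), not the labels themselves.
LABELS = [f"symptom_{i}" for i in range(1, 13)] + ["history_1", "history_2"]


def sigmoid(x: float) -> float:
    """Squash a raw score (logit) into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits, threshold=0.5):
    """Return the labels whose sigmoid probability reaches the threshold.

    Each label is decided independently, so one conversation can be
    tagged with several symptoms and histories at once (assumed setup).
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Made-up logits for one conversation, standing in for a fine-tuned model's output.
example_logits = [2.0, -1.5] + [-3.0] * 10 + [0.7, -0.2]
predicted = decode(example_logits)
print(predicted)
```

Treating each label independently is one common way to frame such a task, since a patient can report several symptoms in the same conversation.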

Results

We fine-tuned four pre-trained LLMs and compared their classification performance. The KLUE-RoBERTa-based model demonstrated the highest performance (F1-score: 0.965, AUROC: 0.893) on human-transcribed script data. The XAI results using SHAP showed an average Jaccard similarity of 0.722 when compared with explanations of medical workers for 15 samples. The Turing test results revealed a small 6% gap, with XAI and medical workers receiving mean scores of 3.327 and 3.52, respectively.
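
The two comparison figures above can be made concrete with a short sketch. The Jaccard similarity of two sets is the size of their intersection divided by the size of their union; the example assumes the explanations are compared as sets of highlighted words, which the abstract does not state, and the word sets are invented for illustration. The second part recomputes the gap between the two reported mean scores; the abstract does not say which score the 6% is relative to, so both denominators are shown.

```python
def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B| for two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


# Hypothetical word sets, one from a SHAP explanation and one from a medical worker.
model_tokens = {"chest", "pain", "since", "yesterday"}
clinician_tokens = {"chest", "pain", "yesterday", "severe"}
sim = jaccard(model_tokens, clinician_tokens)  # 3 shared words out of 5 distinct

# Mean Turing test scores as reported in the abstract.
xai_mean, human_mean = 3.327, 3.52
gap = human_mean - xai_mean
gap_vs_xai = gap / xai_mean      # gap relative to the XAI score
gap_vs_human = gap / human_mean  # gap relative to the medical workers' score

print(f"Jaccard similarity: {sim:.3f}")
print(f"Gap relative to XAI score: {gap_vs_xai:.1%}")
print(f"Gap relative to medical workers' score: {gap_vs_human:.1%}")
```

Under either denominator the difference rounds to roughly 5.5% to 5.8%, which is consistent with the "small 6% gap" reported.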

Conclusion

This paper highlights the potential of LLMs for automatic EHR recording in Korean EDs. The KLUE-RoBERTa-based model demonstrated superior classification performance. Furthermore, XAI using SHAP provided reliable explanations for model outputs. The reliability of these explanations was confirmed by a Turing test.

Highlights

The data was collected from simulated clinician-patient conversations.
The fine-tuned large language model identifies medical information included in electronic health records.
The outcomes of the model were interpreted through eXplainable AI.
The Turing test was conducted to demonstrate the reliability of the eXplainable AI results.

Keywords: Natural language processing, Electronic health record, Large language models, eXplainable artificial intelligence, Turing test


© 2023 Elsevier Inc. All rights reserved.
Vol 77

P. 29-38 - March 2024