Development and Validation of a Natural Language Processing Model to Identify Low-Risk Pulmonary Embolism in Real Time to Facilitate Safe Outpatient Management - 18/07/24
Abstract |
Study objective |
This study aimed to (1) develop and validate a natural language processing model to identify the presence of pulmonary embolism (PE) based on real-time radiology reports and (2) identify low-risk PE patients based on previously validated risk stratification scores using variables extracted from the electronic health record at the time of diagnosis. The combination of these approaches yielded a natural language processing-based clinical decision support tool that can identify patients presenting to the emergency department (ED) with low-risk PE as candidates for outpatient management.
Methods |
Data were curated from all patients who received a PE-protocol computed tomography pulmonary angiogram (PE-CTPA) imaging study in the ED of a 3-hospital academic health system between June 1, 2018 and December 31, 2020 (n=12,183). The “preliminary” radiology reports from these imaging studies, which were made available to ED clinicians at the time of diagnosis, were adjudicated as positive or negative for PE by the clinical team. The reports were then divided into development, internal validation, and temporal validation cohorts to train, test, and validate a natural language processing model that could identify the presence of PE based on unstructured text. For risk stratification, patient- and encounter-level data elements were curated from the electronic health record and used to compute a real-time simplified pulmonary embolism severity index (sPESI) score at the time of diagnosis. Chart abstraction was performed on all low-risk PE patients admitted for inpatient management.
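As an illustration of the risk stratification step, the following is a minimal sketch of an sPESI calculation using the published sPESI criteria (one point each for age over 80 years, history of cancer, chronic cardiopulmonary disease, heart rate of 110 beats/min or higher, systolic blood pressure below 100 mm Hg, and oxygen saturation below 90%; a score of 0 is low risk). The `PatientSnapshot` class, its field names, and the helper functions are hypothetical and are not taken from the study's analytic code; how each element was actually derived from the electronic health record is not described in this abstract.

```python
from dataclasses import dataclass


@dataclass
class PatientSnapshot:
    """Hypothetical container for values available at the time of diagnosis."""
    age_years: float
    history_of_cancer: bool
    chronic_cardiopulmonary_disease: bool  # heart failure or chronic lung disease
    heart_rate_bpm: float
    systolic_bp_mmhg: float
    oxygen_saturation_pct: float


def spesi_score(p: PatientSnapshot) -> int:
    """Return the sPESI score: one point per criterion met (range 0 to 6)."""
    criteria = [
        p.age_years > 80,
        p.history_of_cancer,
        p.chronic_cardiopulmonary_disease,
        p.heart_rate_bpm >= 110,
        p.systolic_bp_mmhg < 100,
        p.oxygen_saturation_pct < 90,
    ]
    # Booleans sum as 0/1, so this counts the criteria that are met.
    return sum(criteria)


def is_low_risk(p: PatientSnapshot) -> bool:
    """A patient is low risk by sPESI only when no criterion is met."""
    return spesi_score(p) == 0


if __name__ == "__main__":
    # Illustrative values only; not patient data from the study.
    example = PatientSnapshot(
        age_years=55,
        history_of_cancer=False,
        chronic_cardiopulmonary_disease=False,
        heart_rate_bpm=88,
        systolic_bp_mmhg=124,
        oxygen_saturation_pct=97,
    )
    print(spesi_score(example), is_low_risk(example))
```

In a real-time setting, the choice of which vital sign measurement to use (for example, the first or the most abnormal value before diagnosis) would change the score, so that choice is left to the caller in this sketch.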
Results |
When applied to the internal validation and temporal validation cohorts, the natural language processing model identified the presence of PE from radiology reports with an area under the receiver operating characteristic curve of 0.99, sensitivity of 0.86 to 0.87, and specificity of 0.99. Across cohorts, 10.5% of PE-CTPA studies were positive for PE, of which 22.2% were classified as low-risk by the sPESI score. Of all low-risk PE patients, 74.3% were admitted for inpatient management.
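For readers less familiar with the reported metrics, the sketch below shows how sensitivity and specificity follow from comparing model output against adjudicated labels. The function name and the toy labels are hypothetical and do not reproduce the study's data or code; the area under the receiver operating characteristic curve additionally requires the model's continuous scores and is not computed here.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) from binary labels and predictions.

    y_true: adjudicated labels (1 = PE present, 0 = PE absent)
    y_pred: model classifications using the same coding
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")

    sensitivity = tp / (tp + fn)  # share of true PE reports that were flagged
    specificity = tn / (tn + fp)  # share of negative reports correctly cleared
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy labels for illustration only; not study data.
    adjudicated = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(adjudicated, predicted))
```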
Conclusion |
This study demonstrates that a natural language processing-based model utilizing real-time radiology reports can accurately identify patients with PE. Further, this model, used in combination with a validated risk stratification score (sPESI), provides a clinical decision support tool that accurately identifies patients in the ED with low-risk PE as candidates for outpatient management.
Supervising editor: Stephen Schenkel, MD, MPP. Specific detailed information about possible conflict of interest for individual editors is available at editors. |
|
Author contributions: AS and SJ developed and wrote the initial proposal for this project. KA, EHW, WR, BH, MG, and MN contributed to data gathering, data curation, and model development. KA drafted and incorporated revisions for all versions of the manuscript. Edits to the manuscript were provided by all authors. AS, TH, CB, SF, BJT, and SJ served as clinical liaisons for the project and provided suggestions and feedback for model and project development. MS takes responsibility for the study as a whole. |
|
Data sharing statement: The data dictionary and analytic code for this investigation are available on request from the date of article publication by contacting Mark Sendak, MD, at mark.sendak@duke.edu. |
|
Authorship: All authors attest to meeting the four ICMJE.org authorship criteria: (1) Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND (2) Drafting the work or revising it critically for important intellectual content; AND (3) Final approval of the version to be published; AND (4) Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. |
|
Funding and support: By Annals’ policy, all authors are required to disclose any and all commercial, financial, and other relationships in any way related to the subject of this article as per ICMJE conflict of interest guidelines (see www.icmje.org/). Funding for this study was provided by the Duke Institute for Health Innovation Award (2020). The authors have declared that no competing interests exist. |
|
Presentation information: This study was presented at the Machine Learning for Healthcare Conference (virtual) on August 6, 2021. |
|
Please see page 119 for the Editor’s Capsule Summary of this article. |
|
A podcast for this article is available at www.annemergmed.com. |
|
Readers: click on the link to go directly to a survey in which you can provide feedback to Annals on this particular article. |
Vol 84 - No. 2
P. 118-127 - August 2024