Developing and Evaluating Large Language Model-Generated Emergency Medicine Handoff Notes.

Title: Developing and Evaluating Large Language Model-Generated Emergency Medicine Handoff Notes.
Publication Type: Journal Article
Year of Publication: 2024
Authors: Hartman V, Zhang X, Poddar R, McCarty M, Fortenko A, Sholle E, Sharma R, Campion T, Steel PAD
Journal: JAMA Netw Open
Volume: 7
Issue: 12
Pagination: e2448723
Date Published: 2024 Dec 02
ISSN: 2574-3805
Keywords: Adult, Aged, Cohort Studies, Documentation, Electronic Health Records, Emergency Medicine, Female, Humans, Male, Middle Aged, Natural Language Processing, Patient Handoff, Patient Safety
Abstract

IMPORTANCE: An emergency medicine (EM) handoff note generated by a large language model (LLM) has the potential to reduce physician documentation burden without compromising the safety of EM-to-inpatient (IP) handoffs.

OBJECTIVE: To develop LLM-generated EM-to-IP handoff notes and evaluate their accuracy and safety compared with physician-written notes.

DESIGN, SETTING, AND PARTICIPANTS: This cohort study used EM patient medical records with acute hospital admissions that occurred in 2023 at NewYork-Presbyterian/Weill Cornell Medical Center. A customized clinical LLM pipeline was trained, tested, and evaluated to generate templated EM-to-IP handoff notes. Using both conventional automated methods (ie, recall-oriented understudy for gisting evaluation [ROUGE], bidirectional encoder representations from transformers score [BERTScore], and source chunking approach for large-scale inconsistency evaluation [SCALE]) and a novel patient safety-focused framework, LLM-generated handoff notes vs physician-written notes were compared. Data were analyzed from October 2023 to March 2024.

EXPOSURE: LLM-generated EM handoff notes.

MAIN OUTCOMES AND MEASURES: LLM-generated handoff notes were evaluated for (1) lexical similarity with respect to physician-written notes using ROUGE and BERTScore; (2) fidelity with respect to source notes using SCALE; and (3) readability, completeness, curation, correctness, usefulness, and implications for patient safety using a novel framework.
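
As an illustration of the lexical-similarity measure named above, the following is a minimal sketch of an n-gram ROUGE score in Python. It is not the study's evaluation code: the abstract does not say which ROUGE variant or tokenizer was used, so the whitespace tokenization, the function names `ngrams` and `rouge_n`, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 for two strings.

    Assumption: lowercase whitespace tokenization, which is simpler than
    the preprocessing a production ROUGE package would apply.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both texts (Counter intersection takes the minimum).
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not taken from the study data.
generated = "patient admitted for chest pain"
physician = "patient admitted with chest pain"
unigram_scores = rouge_n(generated, physician, n=1)
bigram_scores = rouge_n(generated, physician, n=2)
```

In this sketch a higher score only means more shared wording with the reference note. It says nothing about clinical correctness, which is why the study pairs such metrics with a physician-rated safety framework.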

RESULTS: In this study of 1600 EM patient records (832 [52%] female and mean [SD] age of 59.9 [18.9] years), LLM-generated handoff notes, compared with physician-written ones, had higher ROUGE (0.322 vs 0.088), BERTScore (0.859 vs 0.796), and SCALE scores (0.691 vs 0.456), indicating the LLM-generated summaries exhibited greater similarity and more detail. As reviewed by 3 board-certified EM physicians, a subsample of 50 LLM-generated summaries had a mean (SD) usefulness score of 4.04 (0.86) out of 5 (compared with 4.36 [0.71] for physician-written) and mean (SD) patient safety scores of 4.06 (0.86) out of 5 (compared with 4.50 [0.56] for physician-written). None of the LLM-generated summaries were classified as a critical patient safety risk.

CONCLUSIONS AND RELEVANCE: In this cohort study of 1600 EM patient medical records, LLM-generated EM-to-IP handoff notes were determined superior compared with physician-written summaries via conventional automated evaluation methods, but marginally inferior in usefulness and safety via a novel evaluation framework. This study suggests the importance of a physician-in-loop implementation design for this model and demonstrates an effective strategy to measure preimplementation patient safety of LLMs.

DOI: 10.1001/jamanetworkopen.2024.48723
Alternate Journal: JAMA Netw Open
PubMed ID: 39625719
PubMed Central ID: PMC11615705
Grant List: R01 AG076998 / AG / NIA NIH HHS / United States
