TY - JOUR
T1 - Enhancing Coronary Revascularization Decisions
T2 - The Promising Role of Large Language Models as a Decision-Support Tool for Multidisciplinary Heart Team
AU - Sudri, Karin
AU - Motro-Feingold, Iris
AU - Ramon-Gonen, Roni
AU - Barda, Noam
AU - Klang, Eyal
AU - Fefer, Paul
AU - Amunts, Sergei
AU - Attia, Zachi Itzhak
AU - Alkhouli, Mohamad
AU - Segev, Amitai
AU - Cohen-Shelly, Michal
AU - Barbash, Israel Moshe
N1 - Publisher Copyright:
© 2024 American Heart Association, Inc.
PY - 2024/11/1
Y1 - 2024/11/1
N2 - BACKGROUND: While clinical practice guidelines advocate for multidisciplinary heart team (MDHT) discussions in coronary revascularization, variability in implementation across health care settings remains a challenge. This variability could potentially be addressed by large language models like ChatGPT, offering decision-making support in diverse health care environments. Our study aims to critically evaluate the concordance between recommendations made by MDHT and those generated by large language models in coronary revascularization decision-making. METHODS: From March 2023 to July 2023, consecutive coronary angiography cases (n=86) that were referred for revascularization (either percutaneous or surgical) were analyzed using both ChatGPT-3.5 and ChatGPT-4. Case presentations included demographics, medical background, a detailed description of angiographic findings, and SYNTAX score (Synergy Between Percutaneous Coronary Intervention With Taxus and Cardiac Surgery; I and II), presented in 3 different formats. The recommendations of the models were compared with those of an MDHT. RESULTS: ChatGPT-4 showed high concordance with decisions made by the MDHT (accuracy 0.82, sensitivity 0.8, specificity 0.83, and kappa 0.59), while ChatGPT-3.5 (0.67, 0.27, 0.84, and 0.12, respectively) showed lower concordance. Entropy and Fleiss kappa of ChatGPT-4 were 0.09 and 0.9, respectively, indicating high reliability and repeatability. The best correlation between ChatGPT-4 and MDHT was achieved when clinical cases were presented in a detailed context. Specific patient subgroups yielded high accuracy (>0.9) with ChatGPT-4, including those with left main disease, 3-vessel disease, and diabetes. CONCLUSIONS: The present study demonstrates that advanced large language models like ChatGPT-4 may be able to predict clinical recommendations for coronary artery disease revascularization with reasonable accuracy, especially in specific patient groups, underscoring their potential role as a supportive tool in clinical decision-making.
KW - artificial intelligence
KW - coronary angiography
KW - coronary artery diseases
KW - delivery of health care
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85209220943&partnerID=8YFLogxK
U2 - 10.1161/circinterventions.124.014201
DO - 10.1161/circinterventions.124.014201
M3 - Article
C2 - 39502077
AN - SCOPUS:85209220943
SN - 1941-7640
VL - 17
SP - e014201
JO - Circulation: Cardiovascular Interventions
JF - Circulation: Cardiovascular Interventions
IS - 11
ER -