Battle of the (Chat)Bots: Comparing Large Language Models to Practice Guidelines for Transfusion-Associated Graft-Versus-Host Disease Prevention

Laura D. Stephens; Jeremy W. Jacobs; Brian D. Adkins; Garrett S. Booth

doi:10.1016/j.tmrv.2023.150753

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Published guidelines and clinical practices vary when defining indications for irradiation of blood components for the prevention of transfusion-associated graft-versus-host disease (TA-GVHD). This study assessed irradiation indication lists generated by multiple artificial intelligence (AI) programs, or chatbots, and compared them to 2020 British Society for Haematology (BSH) practice guidelines. Four chatbots (ChatGPT-3.5, ChatGPT-4, Bard, and Bing Chat) were prompted to list the indications for irradiation to prevent TA-GVHD. Responses were graded for concordance with BSH guidelines. Chatbot response length, discrepancies, and omissions were noted. Chatbot responses differed, but all were relevant, short in length, generally more concordant than discordant with BSH guidelines, and roughly complete. They lacked several indications listed in BSH guidelines and notably differed in their irradiation eligibility criteria for fetuses and neonates. The chatbots variably listed erroneous indications for TA-GVHD prevention, such as patients receiving blood from a donor who is of a different race or ethnicity. This study demonstrates the potential use of generative AI for transfusion medicine and hematology topics but underscores the risk of chatbot medical misinformation. Further study of risk factors for TA-GVHD, as well as the applications of chatbots in transfusion medicine and hematology, is warranted.

Original language	English (US)
Article number	150753
Journal	Transfusion Medicine Reviews
Volume	37
Issue number	3
DOIs	https://doi.org/10.1016/j.tmrv.2023.150753
State	Published - Jul 2023
Externally published	Yes

Keywords

Artificial intelligence
Blood transfusion
Medical ethics
Transfusion medicine

ASJC Scopus subject areas

Hematology
Clinical Biochemistry
Biochemistry, medical

Cite this

@article{16c3133d9eb14c96a467704fe31b99d2,
  title = "Battle of the (Chat)Bots: Comparing Large Language Models to Practice Guidelines for Transfusion-Associated Graft-Versus-Host Disease Prevention",
  abstract = "Published guidelines and clinical practices vary when defining indications for irradiation of blood components for the prevention of transfusion-associated graft-versus-host disease (TA-GVHD). This study assessed irradiation indication lists generated by multiple artificial intelligence (AI) programs, or chatbots, and compared them to 2020 British Society for Haematology (BSH) practice guidelines. Four chatbots (ChatGPT-3.5, ChatGPT-4, Bard, and Bing Chat) were prompted to list the indications for irradiation to prevent TA-GVHD. Responses were graded for concordance with BSH guidelines. Chatbot response length, discrepancies, and omissions were noted. Chatbot responses differed, but all were relevant, short in length, generally more concordant than discordant with BSH guidelines, and roughly complete. They lacked several indications listed in BSH guidelines and notably differed in their irradiation eligibility criteria for fetuses and neonates. The chatbots variably listed erroneous indications for TA-GVHD prevention, such as patients receiving blood from a donor who is of a different race or ethnicity. This study demonstrates the potential use of generative AI for transfusion medicine and hematology topics but underscores the risk of chatbot medical misinformation. Further study of risk factors for TA-GVHD, as well as the applications of chatbots in transfusion medicine and hematology, is warranted.",
  keywords = "Artificial intelligence, Blood transfusion, Medical ethics, Transfusion medicine",
  author = "Stephens, {Laura D.} and Jacobs, {Jeremy W.} and Adkins, {Brian D.} and Booth, {Garrett S.}",
  note = "Publisher Copyright: {\textcopyright} 2023 The Author(s)",
  year = "2023",
  month = jul,
  doi = "10.1016/j.tmrv.2023.150753",
  language = "English (US)",
  volume = "37",
  journal = "Transfusion Medicine Reviews",
  issn = "0887-7963",
  publisher = "W.B. Saunders Ltd",
  number = "3",
}

TY  - JOUR
T1  - Battle of the (Chat)Bots
T2  - Comparing Large Language Models to Practice Guidelines for Transfusion-Associated Graft-Versus-Host Disease Prevention
AU  - Stephens, Laura D.
AU  - Jacobs, Jeremy W.
AU  - Adkins, Brian D.
AU  - Booth, Garrett S.
PY  - 2023/7
Y1  - 2023/7
N2  - Published guidelines and clinical practices vary when defining indications for irradiation of blood components for the prevention of transfusion-associated graft-versus-host disease (TA-GVHD). This study assessed irradiation indication lists generated by multiple artificial intelligence (AI) programs, or chatbots, and compared them to 2020 British Society for Haematology (BSH) practice guidelines. Four chatbots (ChatGPT-3.5, ChatGPT-4, Bard, and Bing Chat) were prompted to list the indications for irradiation to prevent TA-GVHD. Responses were graded for concordance with BSH guidelines. Chatbot response length, discrepancies, and omissions were noted. Chatbot responses differed, but all were relevant, short in length, generally more concordant than discordant with BSH guidelines, and roughly complete. They lacked several indications listed in BSH guidelines and notably differed in their irradiation eligibility criteria for fetuses and neonates. The chatbots variably listed erroneous indications for TA-GVHD prevention, such as patients receiving blood from a donor who is of a different race or ethnicity. This study demonstrates the potential use of generative AI for transfusion medicine and hematology topics but underscores the risk of chatbot medical misinformation. Further study of risk factors for TA-GVHD, as well as the applications of chatbots in transfusion medicine and hematology, is warranted.
AB  - Published guidelines and clinical practices vary when defining indications for irradiation of blood components for the prevention of transfusion-associated graft-versus-host disease (TA-GVHD). This study assessed irradiation indication lists generated by multiple artificial intelligence (AI) programs, or chatbots, and compared them to 2020 British Society for Haematology (BSH) practice guidelines. Four chatbots (ChatGPT-3.5, ChatGPT-4, Bard, and Bing Chat) were prompted to list the indications for irradiation to prevent TA-GVHD. Responses were graded for concordance with BSH guidelines. Chatbot response length, discrepancies, and omissions were noted. Chatbot responses differed, but all were relevant, short in length, generally more concordant than discordant with BSH guidelines, and roughly complete. They lacked several indications listed in BSH guidelines and notably differed in their irradiation eligibility criteria for fetuses and neonates. The chatbots variably listed erroneous indications for TA-GVHD prevention, such as patients receiving blood from a donor who is of a different race or ethnicity. This study demonstrates the potential use of generative AI for transfusion medicine and hematology topics but underscores the risk of chatbot medical misinformation. Further study of risk factors for TA-GVHD, as well as the applications of chatbots in transfusion medicine and hematology, is warranted.
KW  - Artificial intelligence
KW  - Blood transfusion
KW  - Medical ethics
KW  - Transfusion medicine
UR  - http://www.scopus.com/inward/record.url?scp=85171741387&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85171741387&partnerID=8YFLogxK
U2  - 10.1016/j.tmrv.2023.150753
DO  - 10.1016/j.tmrv.2023.150753
M3  - Article
C2  - 37704461
AN  - SCOPUS:85171741387
SN  - 0887-7963
VL  - 37
JO  - Transfusion Medicine Reviews
JF  - Transfusion Medicine Reviews
IS  - 3
M1  - 150753
ER  - 
