Performance of Google Bard and ChatGPT in mass casualty incidents triage

Gan RK; Ogbodo JC; Wee YZ; Gan AZ; González PA

doi:10.1016/j.ajem.2023.10.034

Gan RK ¹, Ogbodo JC ², Wee YZ ³, Gan AZ ⁴, González PA ⁵

Affiliations

¹ Unit for Research in Emergency and Disaster, Faculty of Medicine and Health Sciences, University of Oviedo, Oviedo 33006, Spain
² Unit for Research in Emergency and Disaster, Faculty of Medicine and Health Sciences, University of Oviedo, Oviedo 33006, Spain; Department of Primary Care and Population Health, Medical School, University of Nicosia, Nicosia 2408, Cyprus
³ Faculty of Computing & Informatics, Multimedia University, 63100 Cyberjaya, Selangor, Malaysia
⁴ Tenghilan Health Clinic, Tuaran 89208, Sabah, Malaysia; Hospital Universiti Sains Malaysia, 16150 Kota Bharu, Malaysia
⁵ Unit for Research in Emergency and Disaster, Faculty of Medicine and Health Sciences, University of Oviedo, Oviedo 33006, Spain

Am J Emerg Med, 2024 Jan;75:72-78.

PMID: 37967485 DOI: 10.1016/j.ajem.2023.10.034

Abstract

AIM: To evaluate and compare the performance of ChatGPT, Google Bard, and medical students in performing START triage in mass casualty incidents.

METHOD: We conducted a cross-sectional analysis comparing ChatGPT, Google Bard, and medical students in mass casualty incident (MCI) triage using the Simple Triage And Rapid Treatment (START) method. A validated questionnaire of 15 diverse MCI scenarios was used to assess triage accuracy, and responses underwent content analysis in four categories: "Walking wounded," "Respiration," "Perfusion," and "Mental Status." The results were compared statistically.
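The four content-analysis categories correspond to the decision steps of START. The Python sketch below illustrates that decision flow under stated assumptions: the thresholds (respiratory rate above 30 per minute, capillary refill above 2 seconds, absent radial pulse) and the colour tags follow the START algorithm as it is commonly described, not anything given in this abstract, and the `Casualty` fields and `start_triage` function are hypothetical names introduced for illustration. It is not the questionnaire or scoring procedure used in the study.

```python
from dataclasses import dataclass


@dataclass
class Casualty:
    """Hypothetical record of the findings START asks about."""
    can_walk: bool
    breathing: bool                     # breathing spontaneously on first check
    breathes_after_airway_opened: bool  # breathing once the airway is repositioned
    respiratory_rate: int               # breaths per minute
    radial_pulse: bool
    capillary_refill_s: float           # seconds
    obeys_commands: bool


def start_triage(c: Casualty) -> str:
    """Return a START colour tag (assumed thresholds, see lead-in)."""
    # Step 1: walking wounded
    if c.can_walk:
        return "GREEN"   # minor
    # Step 2: respiration
    if not c.breathing:
        return "RED" if c.breathes_after_airway_opened else "BLACK"
    if c.respiratory_rate > 30:
        return "RED"     # immediate
    # Step 3: perfusion
    if not c.radial_pulse or c.capillary_refill_s > 2:
        return "RED"
    # Step 4: mental status
    if not c.obeys_commands:
        return "RED"
    return "YELLOW"      # delayed


# Example: a non-ambulatory casualty breathing at 36/min is tagged immediate.
print(start_triage(Casualty(False, True, True, 36, True, 1.5, True)))
```

Each scenario in a triage questionnaire has one expected tag, so a response can be scored as correct or incorrect against the output of a flow like this one.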

RESULT: Google Bard achieved a significantly higher accuracy (60%) than ChatGPT (26.67%; p = 0.002). Medical students in a previous study achieved an accuracy of 64.3%, which did not differ significantly from that of Google Bard (p = 0.211). Qualitative content analysis of the "Walking wounded," "Respiration," "Perfusion," and "Mental Status" categories indicated that Google Bard outperformed ChatGPT.
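As a quick arithmetic check, the reported accuracies are consistent with whole numbers of correct answers out of the 15 scenarios. The counts used below (9 and 4) are inferred from the percentages and are an assumption; the abstract does not state them, and it does not say how many times each scenario was presented. The sketch does not attempt to reproduce the reported p-values.

```python
SCENARIOS = 15  # number of MCI scenarios stated in the abstract


def accuracy_percent(correct: int, total: int = SCENARIOS) -> float:
    """Accuracy as a percentage, rounded to two decimals."""
    return round(100 * correct / total, 2)


# Assumed counts, inferred from the reported percentages.
bard_accuracy = accuracy_percent(9)      # 60.0
chatgpt_accuracy = accuracy_percent(4)   # 26.67

print(bard_accuracy, chatgpt_accuracy)
```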

CONCLUSION: Google Bard was superior to ChatGPT in correctly performing mass casualty incident triage. Google Bard achieved an accuracy of 60%, while ChatGPT achieved only 26.67%, a statistically significant difference (p = 0.002).
