A recent research paper titled “Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study,” published in the Journal of Medical Internet Research, evaluates the utility of ChatGPT in clinical decision-making. ChatGPT, a large language model (LLM) based on OpenAI’s Generative Pre-trained Transformer 3.5 (GPT-3.5), was tested on 36 clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual. The study assessed its performance across the stages of clinical decision support: generating differential diagnoses, recommending diagnostic tests, reaching a final diagnosis, and proposing management, based on patient demographics and case specifics.
The findings showed that ChatGPT achieved an overall accuracy of 71.7% across all vignettes, performing best on final diagnoses (76.9% accuracy). It performed worst at generating initial differential diagnoses (60.3% accuracy). Accuracy was consistent across patient age and gender, suggesting broad applicability in varied clinical contexts. This performance was measured without internet access: ChatGPT relied solely on training data extending through 2021.
ChatGPT’s utility was evaluated by presenting each clinical workflow component as a successive prompt, allowing the model to integrate information from earlier parts of the conversation into later responses. This approach mirrors the iterative nature of clinical medicine, where new information continuously updates prior hypotheses.
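The successive-prompt setup described above can be sketched in a few lines of Python. Note that the vignette text, the stage prompts, and the `ask_model` stub below are illustrative stand-ins, not the study’s actual materials or OpenAI’s API; the point is only how the full conversation history accumulates so that each later answer can build on earlier ones.

```python
# Sketch of successive prompting over one clinical vignette.
# ask_model is a placeholder for a chat-model call; a real implementation
# would send the entire `history` list to the model on every turn.

def ask_model(history, prompt):
    """Append the prompt and a (stubbed) model reply to the running history."""
    reply = f"[model response to: {prompt}]"
    history.append({"role": "user", "content": prompt})
    history.append({"role": "assistant", "content": reply})
    return reply

# Hypothetical vignette text, standing in for an MSD Clinical Manual case.
vignette = "A 45-year-old presents with chest pain radiating to the left arm..."
history = [{"role": "user", "content": vignette}]

# Each component of the clinical workflow becomes a follow-up prompt
# in the same conversation, mirroring the iterative clinical process.
for stage in [
    "List an initial differential diagnosis.",
    "Which diagnostic tests would you order?",
    "What is the most likely final diagnosis?",
    "Outline a management plan.",
]:
    print(ask_model(history, stage))
```

Because every turn is appended to `history`, the model’s answer at the management stage is conditioned on the vignette plus all prior questions and answers, just as new information updates prior hypotheses in clinical practice.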
The study is significant as it presents first-of-its-kind evidence on the potential use of AI tools like ChatGPT throughout the entire clinical workflow. It highlights the model’s ability to adapt and respond to changing clinical scenarios, a crucial aspect of patient care. This research opens new possibilities for AI assistance in healthcare, potentially enhancing decision-making, treatment, and care in various medical settings.