Discussion
AI is becoming increasingly involved in everyday life and in clinical practice. It is essential to teach healthcare professionals how to use this tool, to understand its advantages and risks, and to learn how to integrate it into their everyday work.
Applications of AI and chatbots such as ChatGPT range from clinical practice to surgery to patient counselling. Recent studies show that ChatGPT could help draft responses to patients’ questions that physicians could then edit; in one study, a chatbot generated responses to patient questions in an online forum that were rated as more empathetic than physicians’ responses.15 ChatGPT could also help give patients dietary advice, although professional review should always be considered, because its efficacy appears reduced in complex situations that require a tailored approach.16 Recent studies have also shown that AI could be a useful tool in surgery as a guide for identifying anatomical structures during laparoscopic procedures, which has the potential to reduce adverse events during surgery.17
Our study compared the performance of paediatric surgery residents with that of ChatGPT in answering multiple-choice questions and then analysed the residents’ perception of AI before and after they were confronted with the test results. We observed a statistically significant difference between the results obtained by surgical residents and both versions of ChatGPT, and between ChatGPT-3.5 and ChatGPT-4.0, with the latter outperforming the former. ChatGPT’s accuracy seemed to decline on more complex or specialised questions, particularly those involving clinical cases. In fact, most ChatGPT errors occurred when multiple data points had to be considered and especially when clinical scenarios were complex. However, residents showed lower accuracy on these questions as well and were outperformed by ChatGPT. ChatGPT-3.5 showed lower accuracy than ChatGPT-4.0 when answering questions about medical definitions, while both versions performed better than residents. Our results align with other findings: one study showed that AI achieved high accuracy in different medical specialities, reaching 97% on multiple-choice questions from The New England Journal of Medicine quiz. In another study, ChatGPT outperformed residents nationally on Plastic Surgery In-Service Examinations.18 Furthermore, it has been shown that when ChatGPT is provided with specific knowledge, its performance improves, approaching human-level accuracy.19
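To make the kind of accuracy comparison reported above concrete, the following is a minimal sketch using hypothetical counts, not the study’s data; the question count, the correct-answer counts and the choice of a Fisher exact test are all assumptions for illustration, as the paper does not specify its statistical method here.

```python
# Minimal sketch with hypothetical data: comparing overall accuracy of
# residents and a chatbot on the same set of multiple-choice questions.
from scipy.stats import fisher_exact

n_questions = 80                         # assumed size of the question set
correct = {"residents": 52, "gpt4": 68}  # hypothetical correct-answer counts

# 2x2 contingency table: correct vs incorrect answers per responder
table = [
    [correct["residents"], n_questions - correct["residents"]],
    [correct["gpt4"], n_questions - correct["gpt4"]],
]
odds_ratio, p_value = fisher_exact(table)

print(f"residents' accuracy: {correct['residents'] / n_questions:.1%}")
print(f"ChatGPT-4 accuracy:  {correct['gpt4'] / n_questions:.1%}")
print(f"Fisher exact test p = {p_value:.3f}")
```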
The limited number of participants in our study makes it difficult to draw definitive conclusions, and this outcome should not lead us to assume that AI can outperform healthcare professionals in everyday clinical practice. It is essential to recognise that the randomly selected questions primarily assess participants’ factual knowledge: this method may not accurately estimate AI’s performance in real-life scenarios, but it indicates that residents and doctors could rely on this tool for guidance and support while remaining mindful of its limitations. It must also be acknowledged that, to account for the limitations of ChatGPT-3.5, image-based questions were excluded. This further restricted the ability to simulate real-life scenarios, limiting how accurately ChatGPT’s efficacy could be assessed. We must remember that ChatGPT was not explicitly designed for this purpose and was trained on heterogeneous sources, some of which may not be reliable. Consequently, the information provided could be inaccurate or even misleading, and the system needs continuous updating. It must also be kept in mind that AI has the potential to reproduce biases present in its training data.20
In our study, participants were also asked to complete the UTAUT2 questionnaire both before (T1) and after (T2) being confronted with their own test results and those of ChatGPT. The UTAUT2 questionnaire is an acceptance model designed by Venkatesh et al11 12 to understand attitudes towards technology; it evaluates several constructs that together define the acceptance of a technology. Interestingly, all items of the UTAUT2 questionnaire showed higher ratings at T2 than at T1, except for fear of technology, which decreased. Among the included residents, the predominant finding was their high perception of the potential of AI: performance expectancy (PE), the degree to which users expect a technology to benefit them in performing certain activities, had the highest mean value at both T1 and T2, and its rating was notably higher at T2, almost reaching 4 out of 5 points. Participants also perceived AI as simple to use, as reflected by the high value of effort expectancy, the degree of ease associated with consumers’ use of technology. Interestingly, its value increased from 3.15 at T1 to 3.42 at T2, indicating that residents found AI even easier to use after being presented with the test results. Price value, which reflects how consumers balance benefits against cost, rose from a mean of 2.63/5 at T1 to 3.06 at T2, a difference that appears to be statistically significant. Fear of technology was also lower at T2, although this difference was not statistically significant. These results suggest that participants’ initial beliefs about the high cost of AI relative to its benefits were challenged by the efficacy demonstrated by ChatGPT in answering the test questions and by its performance compared with the residents. Interestingly, this knowledge also reduced the fear of AI at T2.
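As an illustration of the T1 versus T2 comparison described above, the following minimal sketch applies a Wilcoxon signed-rank test, a non-parametric paired test suited to ordinal scores, to one hypothetical UTAUT2 item rated twice by the same residents; the ratings are invented for illustration and are not the study’s data, and the study’s actual statistical method may differ.

```python
# Minimal sketch with hypothetical data: paired pre/post comparison of one
# UTAUT2 item (1-5 Likert scale) rated by the same residents at T1 and T2.
from scipy.stats import wilcoxon

t1 = [2, 3, 2, 3, 3, 2, 3, 2, 3, 3]  # hypothetical ratings before seeing the results
t2 = [3, 3, 3, 4, 3, 3, 3, 3, 4, 3]  # hypothetical ratings after seeing the results

# Non-parametric paired test; zero differences are dropped by default
stat, p_value = wilcoxon(t1, t2)

print(f"mean at T1: {sum(t1) / len(t1):.2f}")
print(f"mean at T2: {sum(t2) / len(t2):.2f}")
print(f"Wilcoxon signed-rank p = {p_value:.3f}")
```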
The limitations of this study include a low response rate among residents (42%), which may introduce selection bias by predominantly including individuals with a more favourable opinion of AI. The relatively small sample size of the cohort must also be acknowledged, as it may reduce the statistical power of the findings. However, despite its limited size, our cohort represents nearly half of all Italian paediatric surgery residents, making it a strong representation of this small population. Additionally, the participating residents were from various residency years, making training level a variable that could influence their education and their ability to answer the questions. Unfortunately, the limited number of participants prevented a subgroup analysis by residency year. It would be interesting to repeat this study in the future with a larger number of residents, and even specialists, to compare performance and results. Moreover, although the proposed questions were primarily drawn from the European Board of Pediatric Surgery 2017 and EPSITE tests, they were randomly selected from larger question pools without validation for difficulty level. This may raise concerns about the consistency and reliability of the assessment.
The results of this survey highlight that learning about the capabilities of ChatGPT and the potential of integrating AI into clinical care could lead to a shift in perception, resulting in a more positive attitude towards AI. The statistically significant change in residents’ perceptions of AI after exposure to comparative performance data with ChatGPT highlights the fundamental role of direct experience and evidence in shaping attitudes towards technology. Integrating AI into paediatric surgery residency programmes could significantly affect residents’ training and preparedness for future clinical practice: residents could gain valuable experience in using advanced technologies for patient care, decision-making and research.
AI can also serve as an educational tool and allow objective evaluation of residents’ skills and performance, particularly in minimally invasive surgery training, through video-based assessments and as an aid in laparoscopic and robotic surgery or in virtual reality simulation.21 22 By incorporating motion-tracking systems, AI can analyse factors such as dissection speed and precision, as well as biometric data, to assess surgical performance.23–26 Additionally, some studies have explored AI systems designed to provide feedback and tutoring to surgical residents, while others have examined the use of AI to clarify and break down the steps of specific surgical procedures for training purposes.27–30 AI systems such as ChatGPT can also serve as valuable support tools for students and residents preparing for examinations: they can provide quick insights into specific questions, reducing the time needed for extensive research while enhancing understanding of complex topics.18 Moreover, AI could serve as a valuable tool in various disciplines and contexts, including the development of educational programmes for healthcare workers and patients to address vaccine hesitancy, a critical issue, with the aim of improving awareness and attitudes.31 A recent review highlights the role of chatbots in fostering a vaccine-literate environment by combating misinformation and enhancing communication with healthcare professionals.32 The integration of AI into education is further justified by the estimate that the doubling time of medical knowledge fell from 50 years in the 1950s to 3.5 years in the 2010s. This expansion of knowledge will force medical schools and residency programmes to redefine the essential core of what students must learn and how they should learn it.33

The integration of AI into different medical fields should also be accompanied by guidelines on ethical aspects, privacy, data security and patient autonomy. Clinicians, computer scientists and ethicists must work together to develop AI tools that are ethically safe and scientifically accurate. Paediatric surgery residents’ input and ideas will be invaluable in creating a future in which AI enhances, rather than supplants, the work of paediatric surgeons.