Since the late 2022 release of OpenAI’s ChatGPT, a publicly available natural language processing tool that can generate human-like intellectual content and responses, journal editors, including those at the Journal of Graduate Medical Education (JGME), have intensified debates around the fair, appropriate use of this technology for submitted manuscripts.1-4 We have recently expanded JGME’s overall policy of transparency to include artificial intelligence (AI): “Authors must communicate any use of artificial intelligence technology and similar tools, whether for writing assistance, storage, interpretation of qualitative research data, statistical analysis of numerical data, creation of visual imagery, or other uses.”5 In addition, listed JGME authors must be human. Why? Only humans can be accountable for all aspects of the work, a critical authorship criterion.6 Beyond these considerations of authorship and transparency, we are curious about potential AI uses in graduate medical education (GME) for both the process and content of training.7-9
In 2020, we asked 40 thought leaders in medical education what they believed would be the most significant transformation in GME over the next decade: AI was a major theme.10 Three years later, AI use in daily life is everywhere, from cell phones and credit checks to weather and climate predictions. AI examples include natural language processing, speech recognition, and the creation of written works (see if you can tell the difference between human-generated and AI-generated text in Box 1). The volume of health information has increased beyond the capacity of humans to sift and digest: in 1950 the doubling time of medical information was an estimated 50 years; by 2010 it had fallen to 3.5 years, and the estimate for 2020 was 73 days.11 However, only a few undergraduate medical schools have added AI experiences, and the Accreditation Council for Graduate Medical Education Milestones do not include AI competencies.11 Currently, AI is being introduced into learning and clinical diagnosis support, automated assessments of clinical skills, setting educational goals, and designing curricula and performance assessments.12,13
Box 1
Prompt: Briefly describe the differences between formative and summative feedback in residency education.
Response #1: Formative feedback and summative feedback are 2 distinct types of assessment in residency education.
Formative feedback is designed to provide ongoing support and guidance to learners during their educational journey. It focuses on identifying strengths and areas for improvement, with the goal of helping residents develop and grow. Formative feedback is usually given throughout the learning process and is more subjective and descriptive in nature. It emphasizes constructive criticism, suggestions for improvement, and actionable recommendations. The primary purpose of formative feedback is to enhance learning and facilitate skill development.
On the other hand, summative feedback is given at the end of a specific period or task, such as an examination, rotation, or residency program. It aims to evaluate the overall performance and achievement of learners against predetermined standards or objectives. Summative feedback tends to be more objective and evaluative in nature, focusing on measuring competency or proficiency levels. It often involves grading or rating systems and provides a comprehensive assessment of a resident’s abilities and readiness for advancement or certification.
In summary, formative feedback is ongoing, developmental, and serves to guide and support learners, whereas summative feedback is final, evaluative, and provides an overall judgment of performance. Both types of feedback are valuable in residency education as they help foster growth, identify areas of improvement, and measure competence and progress.
Response #2: Formative feedback in graduate medical education (GME) occurs throughout clinical and other rotations, often in the form of immediate feedback regarding performance. For example, after a difficult patient interview or procedure, the attending physician may ask the resident what they think were aspects that went well, or less well. The attending physician can corroborate, enhance, or correct the resident’s self-evaluation, while also setting a few goals for future encounters. Formative feedback also occurs at scheduled times, such as a mid-rotation meeting, with attention to specific competencies assigned to the current experience or assignments. Thus, formative feedback provides ongoing information to the resident to improve performance as well as reinforce strong performances.
Summative feedback is associated with end-of-rotation assessments or biannual milestones assessments, akin to “grades” in school. Most assessment experts now consider that all assessments should primarily align with formative, rather than summative feedback approaches, to be more effective and support a growth mindset in GME trainees.
Scholars have hailed the potential benefits of AI, such as improving manuscripts for authors writing in a second language, saving researchers time when creating research documents or conducting literature searches, and freeing trainees from memorization to focus on reasoning, counseling, and shared decision-making with patients.11,14 Other scholars are concerned that AI tools in GME present serious risks. These include breaches of trainee and patient confidentiality, failure to develop key competencies as those tasks are outsourced to AI tools, erroneous evidence and assessment summaries, and “paper mills,” or the fabrication of fake research to pad resumes and grant applications.15-18 Because AI is wholly dependent on the quality and inherent biases of the available training data, it may generate inaccurate syntheses, which could be missed by inexperienced trainees and beleaguered faculty. In addition, AI could negatively affect trainees’ ability to develop mental models or learn key foundational skills, such as clinical reasoning in treating complex patients. Many patients, with their unique values, social supports, finances, and medical histories, may not fit neatly into AI-derived diagnoses and management plans. In 2023, the optimal interactions between physicians and AI, patients and AI, and trainees and AI are unknown, yet graduating residents and fellows will increasingly manage these interactions and confront numerous ethical sequelae.
Medical education is notorious for diving headlong into the next innovation, often with scanty evidence. But as Dr Rachel Ellaway points out, “the genie is out of the bottle.”3 We need research that examines the best uses of AI in GME. This research should include exploration of both the use of AI during training and a greater understanding of the ethical, beneficial uses of AI in medical education research. Which research tasks remain uniquely human? Which AI educational functions require minimal human oversight? Will AI replace some physician skills, rendering them obsolete and no longer taught in GME? Will AI technology transform cognitive skills, similar to robotics in surgery? We invite reports of how GME trainees, programs, faculty, and institutions are learning about and using AI tools (Box 2).
Box 2
Curriculum design, implementation, and evaluation
Analysis of faculty clinical teaching performance to provide feedback
Formative feedback on communication and interpersonal skills, based on AI analysis of videos of learner interactions with patients or colleagues
Analysis of electronic health record (EHR) data to understand trainee clinical diagnostic reasoning
Visual summary representation of resident assessments—text, audio, image, video—for an assessment portfolio, for competency committee decisions and individual learning goals
Creating faculty avatars for coaching, advising, or assessment of learners linked to specific competencies and performance standards
Resident AI technology skills that increase time with patients without lowering patient and resident satisfaction, quality of patient care, or resident learning
Resident and faculty skills in detecting and avoiding chatbot or other AI-generated “hallucinations,” such as fake citations, fabricated resident history and physical examinations, and other materials
Interactive AI for simulations
Faculty and trainee skills to efficiently verify evidence compiled by AI technology
Effects of AI technology on trust in resident-patient relations
Effects of AI on professional identity formation in GME trainees
Of substantial concern is whether for-profit entities, which have contributed to the growing disparities in US health care, will encourage problematic AI use in medical education and patient care. We prefer that ethics and evidence lead the way. The US health care system is increasingly a “flailing medical industry shaped by profit rather than ethics.”19,20 Thus, we need AI tools that can enhance health equity. AI is hyped as a way to generate faster, more accurate diagnoses; reduce physician error; reduce or eliminate repetitive, mundane physician tasks, such as electronic health record (EHR) documentation; and reduce the high cost of health care.11 Some of us are wary of these claims. They are reminiscent of those touted for EHRs, which have now been shown to increase physician documentation time and burnout, especially for that endangered species known as the primary care physician.21
What do most GME educators want from AI? Our guess is they desire more time for teaching and tools that will help more learners reach “mastery” competency levels. They also want their trainees well prepared for future AI uses, in a medical utopia rather than a dystopian Brave New World.22
Answers to Box 1: #1: Chatbot; #2: Human (JGME editor)