The year 2024 has been full of turbulence, with a constant barrage of serious world events. Meanwhile, the Journal of Graduate Medical Education (JGME) published a Climate Change and Graduate Medical Education supplement and welcomed a host of new editors, including 7 new resident editors. Amid this change and turmoil, JGME senior editors invite you to consider these notable non-JGME articles for a quiet read, in a calm moment (Table). As before, we have not applied a rigorous scientific method for selection and, due to publication deadlines, articles published in December 2024 could not be considered. Nevertheless, we think these are worthy of your attention. Let us know your favorite 2024 articles and what you think of ours by emailing [email protected] or posting on Bluesky and tagging @jgmejournal.bsky.social.

Table: JGME Editors’ Picks for Best Non-JGME Articles of 2024
Some of the educational problems that intrigue me the most concern complex phenomena involving the perceptions, emotions, and experiences of our trainees. Understanding these phenomena is crucial to meeting our residents’ educational needs and improving their performance. Often the process of understanding is a journey best guided by qualitative methods.1  Two such problems are (1) how can we harness the ability of feedback to improve performance in graduate medical education (GME) and (2) which factors most threaten well-being in modern GME?

This year Ginsburg et al contributed to our evolving understanding of the first question.2  Prior studies have explored factors that affect feedback-seeking behavior and suggest that learner-initiated feedback is more likely to be received and incorporated.3  However, much of what we know about feedback-seeking behavior relates to the experience of the learner, rather than how faculty perceive this behavior, which logically also impacts feedback conversations. In this constructivist grounded theory study, the authors interviewed faculty to explore how supervisors perceive trainee requests for feedback. Participants perceived 4 motivations for feedback-seeking behavior: seeking affirmation or praise, a desire to improve, fulfilling an administrative requirement, and hidden purposes (eg, making a good impression). Further, factors such as the timing of the request, the relationship with the learner, and learner reactions to the feedback influenced faculty perceptions of learner motivations as well as their emotional responses to requests. For me this article highlights that well-meaning but reactive solutions to the feedback problem—such as administrative requirements that learners ask for feedback—may not always result in feedback that benefits learner performance. The study also reinforces the complexity of this crucial component of medical education. The article is a great read because of the importance of the topic, beautiful writing, rigorous methods, and thought-provoking recommendations for learners and supervisors regarding feedback-seeking behavior.

In the literature, at meetings, and in conversations with trainees and colleagues, the topic of moral distress as a threat to well-being keeps coming up.4  In a qualitative hermeneutic phenomenology (HP) interview study, Chang and colleagues explore what physicians experience when they are unable to take the course of action that feels right and how those experiences influence their health care interactions.5  Although this study was done with practicing Canadian physicians, the insights gleaned are relevant to GME. Participants described moral distress that affected their sense-making and well-being for decades, including difficult emotions and often intrusive memories. HP interprets lived experiences to understand phenomena, such as emotions, while considering the participant’s role in groups or environments.6  HP studies often construct phenomenological examples by weaving together quotes and meaning from the dataset, to create a composite example that pulls the reader into the participant experience. Educators can get a sense of the composite participant experience of moral distress from the hermeneutic example provided in the article—an experience that will resonate with many of us. This article is a great example of how HP can be applied to education research and may prompt further discussions of how moral distress may affect trainees. A table in this article presents ways physicians find consolation and meaning in the midst of moral distress, which may guide educators in supporting trainees.

Finally, in addition to qualitative articles, I have a soft spot in my heart for a well-written research methods article, particularly when the method is relevant and underutilized in GME. An important part of any educational intervention, from procedure labs to competency-based medical education, is program evaluation.7  Whether for internal quality improvement or for research purposes, educators may find that traditional methods of program evaluation, such as Kirkpatrick’s model of outcomes, do not provide enough understanding of why and how interventions work or fail, which is needed to inform improvement efforts.8  Realist evaluation seeks to understand what works, for whom, and in what contexts, and is particularly well suited to educational interventions that may have variable success among different programs or specialties.9  In these situations, program evaluation that incorporates realist interviews may illuminate opportunities for revising the intervention, improving implementation, and supporting sustainability. Rees et al present a roadmap to realist interviews in health professions education through a critical analysis that defines realist evaluation and provides recommendations for conducting realist interviews.10  Program directors will find this approach helpful, whether applying it in a rigorous fashion for research studies or considering the framework for program evaluation.

In June 2020 we envisioned GME in 2030 based on interviews with GME thought leaders.11  Big data, artificial intelligence (AI), and technology were among our major drivers of change. We were spot on but a little late! In 2024, AI/large language model (LLM) medical education articles exploded. These articles demonstrated the power of AI to match or outperform humans on various tasks. Other articles focused on using AI/LLM to create assessments and provide automated feedback to learners, including use of virtual reality and simulations. Guides and practical strategies12-14; a checklist for reporting, reading, and evaluating AI research15; and review articles16-18 were also published.

An article from Mollick et al stands out for its vision and practical applications to education.19  This article describes a prototype for an AI-developed simulation in which the entirety of learning is through AI agents, from an instructional video outlining key task elements to an AI mentor who provides personalized guidance in preparation for the task. The learner then actively practices with 1 of 3 AI agent roles, and the session concludes with an AI evaluator offering specific, actionable feedback. An AI “progress agent” provides an overview of each learner’s progress during the tasks (like a teaching assistant) and passes it on to a human “insights” agent through concise summaries of the learner’s actions, decisions, strengths, and opportunities.
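For readers curious how such an agent pipeline might be wired together, below is a minimal, hypothetical Python sketch of the workflow described above (mentor, role-play partner, evaluator, and progress agent). The call_llm stub, the SimulationSession class, and all prompts are illustrative assumptions for exposition only; they are not the authors’ implementation.

from dataclasses import dataclass, field

def call_llm(system_prompt: str, user_message: str) -> str:
    # Stand-in for a call to any large language model API; returns a canned
    # string so the sketch runs without external services.
    return f"[response as '{system_prompt[:30]}...' to '{user_message[:30]}...']"

@dataclass
class SimulationSession:
    learner_id: str
    transcript: list = field(default_factory=list)  # (agent, message, reply) tuples

    def step(self, agent: str, system_prompt: str, message: str) -> str:
        reply = call_llm(system_prompt, message)
        self.transcript.append((agent, message, reply))
        return reply

session = SimulationSession(learner_id="resident-01")
# 1. AI mentor gives personalized preparation for the task.
session.step("mentor", "Coach the learner on leading a feedback conversation.",
             "What should I keep in mind before this conversation?")
# 2. Learner practices with one of several role-play agents.
session.step("roleplay", "Play a struggling intern receiving feedback.",
             "I wanted to talk about yesterday's handoff.")
# 3. AI evaluator returns specific, actionable feedback on the attempt.
session.step("evaluator", "Critique the learner's conversation and suggest one improvement.",
             str(session.transcript))
# 4. Progress agent condenses the session for a human 'insights' reviewer.
print(session.step("progress", "Summarize strengths, decisions, and opportunities for the educator.",
                   str(session.transcript)))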

This AI simulation is strongly grounded in good pedagogy and ethical considerations for AI use in education. It includes detailed technical specifics, including agent roles, ethics, and prompts, for tech geeks. However, the article does not overwhelm those of us newer to the field. This AI prototype should excite medical educators as we think about learner opportunities for safe, deliberate practice on complex entities, as well as for faculty to use simulation to practice their new roles as AI-informed educators. As outlined by GME thought leaders in 2020, this presents an energizing future if we have the will and the courage to engage.

Also worthy is a study by Cabral et al.20  In this study internal medicine residents (n=18) and attending physicians (n=21) at 2 academic medical centers and a chatbot (GPT-4 from OpenAI) worked through a case not previously seen or published. Some humans were randomly assigned to have the option to use the AI chatbot. The most intriguing results, besides AI outperforming physicians in processing medical data and clinical reasoning, were that humans performed about the same whether or not they used the chatbot, as few knew how to use the AI tool effectively. The good news? Our medical education colleagues are working to educate current and future physicians on AI as a co-intelligence.

If you are seeking to understand potential impacts and opportunities for AI in medical education, the article by Boscardin et al provides a nice introduction to ChatGPT, chatbots, and generative AI.21  The authors discuss AI competencies and AI literacy: understanding AI capabilities and tools, integrating AI tools into teaching, and examining AI and equity. They review the aims, potential impacts, and opportunities of generative AI for admissions, learning, assessment, and medical education research. It’s an easy, informative AI read.

With the advent of competency-based medical education (CBME), articles and conference plenaries have debated the value of time-defined medical training for undergraduate medical education (UME) and GME. From a growth mindset perspective, maintaining the expected (and contracted) time in GME, even if minimum competencies are achieved earlier, encourages further improvement of skills toward aspirational levels. Alternatively, from an efficiency and cost perspective, the shortest possible path through medical education attracts adherents. Related to this debate, medical schools are creating 3-year accelerated programs.

A historical context is helpful. Three-year accelerated medical school training has occurred before in the United States: during World War II and again in a second peak in the 1970s, both driven by fears of physician shortages. By 1973, one-third of all US medical students were in a 3-year program (a figure that also included 6-year combined college and medical school programs), supported in part by government capitated funding.22  Despite some evidence of similar student performance, these 3-year accelerated MD programs were soon abandoned due to faculty and student dissatisfaction and the end of government funding.22  Students reported exhaustion, and faculty noted the loss of key elements, such as ethics and patient safety.22  Also, one-quarter of 3-year students extended their time by 1 to 2 years.22  Because these 3-year accelerated programs added curricula in place of vacations, electives, and residency interview time, schools required more faculty; consequently, for most students the costs of a 3-year program were the same as those of a 4-year program. Some experimentation continued in the 1980s in collaboration with GME: 3+3 family medicine and internal medicine programs that combined medical school and residency (UME+GME). However, these programs experienced GME accreditation issues.22

In 2010 a third wave began: currently 32 US schools offer a 3-year program.23  Initially, most of these programs guaranteed placement in primary care residencies, such as family medicine, to increase the supply of physicians for underserved areas. Now programs may offer placement in any residency sponsored by the institution. With the elimination of residency interview and elective time, the goal is to select students who are certain of their career choice on matriculation. The 3-year curriculum is also entirely distinct from the 4-year curriculum.

Satyamorthi et al report preliminary results on 7 classes of 3-year program medical students (n=136) at the New York University Grossman School of Medicine (NYU), from 2013 through 2019, compared with their 4-year counterparts (n=681).24  NYU guarantees a position in an NYU residency for 3-year program graduates. The authors also compared the performance of the 3-year graduates as first-year residents with that of 4-year NYU graduates and graduates of other 4-year medical schools in NYU residency programs. For the UME comparisons, 3-year students performed similarly to their 4-year counterparts across a multitude of outcomes, with some statistically significant but very small magnitude differences. These included pre-clerkship examination scores, National Board of Medical Examiners subject examination scores, United States Medical Licensing Examination (USMLE) Step 1 and 2 scores, NYU clinical skills examination scores, peer assessments, patient logs, clerkship grades, and more. For the first-year resident performance comparisons, outcomes were also similar, with some statistically significant but very small differences. These comparisons included USMLE Step 3 scores, internal medicine milestones, rates of becoming chief residents, and evaluations of teaching and clinical reasoning. Of note, demographics on entry to NYU, such as gender, underrepresented in medicine status, MCAT percentiles, and college grade point average, did not differ between the groups; however, 3-year students were on average 5 months older than their counterparts. The 3-year and 4-year students also evaluated the overall quality of their medical education similarly. For NYU the yearly tuition costs are the same for both programs. The authors reported that bias on the part of NYU program directors and residents against opening their programs to 3-year students appears to have dissipated as the program has matured: the number of NYU residency positions open to 3-year students has increased.

My take on this comparison using numerous metrics is that the 3-year medical school option is feasible for some institutions, for well-qualified, perhaps older students who are certain of their specialty choice. I sincerely regret that the initial approach of using this pathway to bring more graduates into practice in shortage specialties and underserved populations has been transformed into a faster track into any specialty. There are certainly lessons here for GME.

Also worthy of your consideration is the Lancet Countdown on Health and Climate Change (Countdown).25  To monitor the health effects of climate change, the Countdown was established in 2015, the year the Paris Agreement to limit global multiyear mean heating to 1.5°C (2.7°F) by 2100 was signed by world leaders. The Countdown reports are created annually by more than 300 international researchers and health professionals and funded by Wellcome, a charitable foundation supporting health research. In 2023, the annual mean surface temperature reached 1.45°C above the pre-industrial mark. This 2024 Countdown report was issued before the conclusion of 2024: between May 2023 and April 2024 the global mean surface temperatures reached a record-breaking 1.61°C above pre-industrial times.25  The report tracks 15 indicators with enormous impacts on human health; 10 reached concerning levels at the time of the report.

The report is long, but the executive summary and introduction are short and helpful for bolstering the justification for program or institutional quality improvement, sustainability, or advocacy projects. Many medical trainees care deeply about global warming and need scholarship opportunities. The Countdown data are scary but essential information for those who care about the health of people and our planet.26

When not editing for JGME, I am a student affairs dean. I love the work because it allows me to build systems and structures that support hundreds of learners at a time while also offering very practical advice and help through many individual student meetings. If that mix excites you too, then you’ll find this article to be interesting and inspiring. Sebok-Syer et al provide a concrete framework for developing ways to use the large volumes of data obtained about our graduate medical trainees.27  They supplement the framework with an “action-oriented blueprint” that offers specific tips from 4 case examples at the authors’ institutions. Lest you fear that using big data in this way requires a host of resources, both financial and in expertise, the 4 case studies highlight a variety of levels of complexity and organizational design.

The authors walk the reader through the various considerations in designing and implementing a process to take advantage of data we already acquire about our trainees and break down the steps in a clear way. A helpful glossary of terms defines commonly used jargon in this domain, which allows those of us trying to bridge departments and constituents to better speak one another’s languages.

I hate waste, so the article’s exhortation for us to take the next step to fully use the data we already collect really speaks to me. However, the barriers are real, and the authors acknowledge this in a realistic and credible manner. “While we initially conceived this piece as a way of proposing guidelines to facilitate data sharing in medical education, no clear, uncomplicated approach applies in all situations. Data sharing is not a field of absolutes, but rather a field of ‘sort-ofs,’ ‘maybes,’ and ‘it depends.’”27  I congratulate the author team on an inspiring and persuasive piece, which strikes an authentic balance of blue-sky vision and practical guidance.

As a survey methodologist, I am always looking for examples of GME studies that integrate high-quality survey methods into their study design. This year the investigation by Heppe et al did just that: using survey methods to evaluate the effects of a novel 4 + 4 block scheduling model on internal medicine residents’ burnout, well-being, and self-reported professional engagement and clinical preparedness.4  This pre- and post-intervention survey study involved residents in a single academic internal medicine residency program (a study weakness) and compared the new 4 + 4 structure (4-week inpatient call-based rotations followed by 4-week ambulatory non–call-based rotations) to the prior 4 + 1 schedule (4 weeks inpatient, 1 week ambulatory). Data from 216 residents (69% response rate) across 3 years were analyzed, with burnout examined via the Maslach Burnout Inventory (MBI). Secondary professional and educational outcomes were collected using a 15-item questionnaire with some validity evidence.

The findings demonstrated significant reductions in emotional exhaustion, with a drop of 6.78 points, and depersonalization, with a decrease of 3.91 points, in the post-intervention cohort. These reductions were considerably larger than those observed in prior interventions aimed at mitigating burnout. Additionally, residents reported improvements in job satisfaction, professional engagement, and overall well-being, with relatively large effect sizes noted in their perceived ability to participate in scholarly activities and maintain work-life balance (eg, time for activities outside of clinic or with family and friends). Importantly, no adverse effects on in-training examination scores or perceived clinical skills acquisition were observed, which aligned with the intervention’s educational goals.

As a survey methodologist, I am particularly impressed by the robust design and application of survey methods in this study. The use of the MBI alongside a secondary questionnaire developed in a prior study enabled a fairly comprehensive evaluation of the intervention’s effects. The authors adhered to the Consensus-Based Checklist for Reporting of Survey Studies (CROSS), which helped to ensure methodological transparency. From my perspective, their approach exemplifies best practices in survey research, including a focus on survey tools with validity evidence and the use of adjusted statistical models to help minimize bias.

Finally, I appreciate that they integrated self-report survey tools into an intervention study. This allowed for a more nuanced understanding of how schedule restructuring might influence residents’ well-being, as well as their professional and personal lives. This alignment of quantitative rigor with practical application serves as a model for GME researchers. Furthermore, the study offers practical insights into reducing resident burnout through innovative scheduling: these findings have broad implications for residency programs striving to enhance trainee wellness without compromising educational quality.

1. Yarris LM, Balmer D, Gottlieb-Smith R, Sullivan GM. Editors’ guidance for submitting qualitative research to the Journal of Graduate Medical Education. J Grad Med Educ. 2024;16(3):246-250.
2. Ginsburg S, Lingard L, Sugumar V, Watling CJ. “I think many of them want to appear to have a growth mindset”: exploring supervisors’ perceptions of feedback-seeking behavior. Acad Med. 2024;99(11):1247-1253.
3. Crommelinck M, Anseel F. Understanding and encouraging feedback-seeking behaviour: a literature review. Med Educ. 2013;47(3):232-241.
4. Heppe D, Baduashvili A, Limes JE, et al. Resident burnout, wellness, professional development, and engagement before and after new training schedule implementation. JAMA Netw Open. 2024;7(2):e240037.
5. Chang DC, Kelly M, Eva KW. A phenomenological exploration of physicians’ moral distress: situating emotion within lived experiences. Acad Med. 2024;99(11):1215-1220.
6. van Manen M. Researching Lived Experience: Human Science for an Action Sensitive Pedagogy. State University of New York Press; 1990.
7. Balmer DF, Riddle JM, Simpson D. Program evaluation: getting started and standards. J Grad Med Educ. 2020;12(3):345-346.
8. Allen LM, Hay M, Palermo C. Evaluation in health professions education—is measuring outcomes enough? Med Educ. 2022;56(1):127-136.
9. Ellaway RH, Kehoe A, Illing J. Critical realism and realist inquiry in medical education. Acad Med. 2020;95(7):984-988.
10. Rees CE, Davis C, Nguyen VN, Proctor D, Mattick KL. A roadmap to realist interviews in health professions education research: recommendations based on a critical analysis. Med Educ. 2024;58(6):697-712.
11. Simpson D, Sullivan GM, Artino AR, Deiorio NM, Yarris LM. Envisioning graduate medical education in 2030. J Grad Med Educ. 2020;12(3):235-240.
12. Indran IR, Paranthaman P, Gupta N, Mustafa N. Twelve tips to leverage AI for efficient and effective medical question generation: a guide for educators using Chat GPT. Med Teach. 2024;46(8):1021-1026.
13. Masters K, Benjamin J, Agrawal A, MacNeill H, Pillow MT, Mehta N. Twelve tips on creating and using custom GPTs to enhance health professions education. Med Teach. 2024;46(6):752-756.
14. Noushad B, Van Gerven PW, De Bruin AB. Twelve tips for applying the think-aloud method to capture cognitive processes. Med Teach. 2024;46(7):892-897.
15. Masters K, Salcedo D. A checklist for reporting, reading and evaluating Artificial Intelligence Technology Enhanced Learning (AITEL) research in medical education. Med Teach. 2024;46(9):1175-1179.
16. Gordon M, Daniel M, Ajiboye A, et al. A scoping review of artificial intelligence in medical education: BEME guide no. 84. Med Teach. 2024;46(4):446-470.
17. Lucas HC, Upperman JS, Robinson JR. A systematic review of large language models and their implications in medical education. Med Educ. 2024;58(11):1276-1285.
18. Elendu C, Amaechi DC, Okatta AU, et al. The impact of simulation-based training in medical education: a review. Medicine (Baltimore). 2024;103(27):e38813.
19. Mollick E, Mollick L, Bach N, Ciccarelli LJ, Przystanski B, Ravipinto D. AI agents and education: simulated practice at scale. The Wharton School. Published June 26, 2024. Accessed December 11, 2024. doi:10.2139/ssrn.4871171
20. Cabral S, Restrepo D, Kanjee Z, et al. Clinical reasoning of a generative artificial intelligence model compared with physicians. JAMA Intern Med. 2024;184(5):581-583.
21. Boscardin CK, Gin B, Golde PB, Hauer KE. ChatGPT and generative artificial intelligence for medical education: potential impact and opportunity. Acad Med. 2024;99(1):22-27.
22. Schwartz CC, Ajjarapu AS, Stamy CD, Schwinn DA. Comprehensive history of 3-year and accelerated US medical school programs: a century in review. Med Educ Online. 2018;23(1):1530557.
23. Consortium of Accelerated Medical Pathway Programs (CAMPP).
24. Satyamorthi N, Marin M, Ludlow P, et al. Outcomes of accelerated 3-year MD graduates at NYU Grossman School of Medicine during medical school and early residency [published online ahead of print October 15, 2024]. Acad Med.
25. Romanello M, Walawender M, Hsu S-C, et al. The 2024 report of the Lancet Countdown on health and climate change: facing record-breaking threats from delayed action. Lancet. 2024;404(10465):1847-1896.
26. Climate Change and Graduate Medical Education. J Grad Med Educ. 2024;16(suppl 1).
27. Sebok-Syer SS, Smirnova A, Duwell E, et al. Sharing is caring: helping institutions and health organizations leverage data for educational improvement. Perspect Med Educ. 2024;13(1):486-495.