For several years the Journal of Graduate Medical Education (JGME) senior editors have assembled articles published in other health professions education publications in the past year that we think are worthy of notice. As we have different interests, areas of expertise, and definitions of usefulness, our conversations are animated and the resulting collection is eclectic. There is no scientific approach to this process; we simply like these articles and hope they may prove helpful to you, too (see Tables 1 and 2). Tell us what you think by tagging @JournalofGME on Twitter.

Table 1

Noteworthy Non-JGME Articles From 2022

Table 2

Honorable Mentions, Non-JGME Articles From 2022


The last time you took a survey, whether it was about faculty satisfaction or an evaluation of your hotel stay, did the survey ask you to agree or disagree with a set of statements? I would be willing to bet it did, since agree-disagree response options are the most commonly used format for assessing attitudes and opinions. For example, in a medical education study I led a few years ago, we found that 57% of published surveys included at least one agree-disagree item, and across all of the survey items reviewed in our sample, 45% of items used agree-disagree response categories.1  The ubiquity of agree-disagree items is not surprising, because they are easy to write: simply create a list of statements and then ask respondents to agree or disagree with those statements. However, the utility of such items and their psychometric properties have long been debated. In this review by Dykema and colleagues, Towards a Reconsideration of the Use of Agree-Disagree Questions in Measuring Subjective Evaluations,2  the authors focus on the measurement properties and potential limitations of agree-disagree items compared to what they term item-specific questions.

An example of an agree-disagree item compared to a corresponding item-specific question is:

  • Agree-disagree: To what extent do you agree or disagree with the following statement? I feel well prepared to perform laparoscopic surgery without supervision. (Strongly disagree, disagree, neutral, agree, strongly agree)

  • Item-specific: How well prepared are you to perform laparoscopic surgery without supervision? (Not at all well prepared, slightly well prepared, moderately well prepared, quite well prepared, extremely well prepared)

Agree-disagree items present respondents with a statement and then ask them to rate their level of agreement, whereas item-specific questions directly ask respondents about the underlying construct being assessed (in this case, perceived preparedness) using response categories tailored to match the construct.

Dykema and colleagues reviewed 20 experimental studies that directly compared agree-disagree and item-specific questions.2  Although the findings were mixed, most studies found that item-specific questions were associated with greater reliability and validity than agree-disagree items. The authors note several explanations for these results, which correspond to what we know about how respondents work through the cognitive steps needed to answer survey questions. Their reasons for survey designers to avoid agree-disagree items include the following: respondents are more likely to acquiesce (ie, agree) when answering agree-disagree items than when answering item-specific questions; agree-disagree items often present respondents with a mismatch between the item's underlying response dimension and the response options offered by the agree-disagree categories, which adds cognitive burden; and agree-disagree items use bipolar response options that present both ends of a negative-to-positive response dimension, whereas item-specific questions can be either bipolar or unipolar.

My take-home

Although experimental studies directly comparing agree-disagree and item-specific questions yielded mixed results, more studies found that item-specific questions are associated with desirable data quality and that agree-disagree items are associated with undesirable data quality. Therefore, like the authors of this excellent article, I recommend item-specific questions over agree-disagree items for most survey purposes. Read the article for yourself and make your own evidence-informed decisions the next time you design a graduate medical education (GME) survey for research or evaluation.

Imagine a world where residency applicants match to a GME program they love, after spending a reasonable amount of money on applications, and are welcomed enthusiastically by the program director as a top choice applicant.

Not possible? We edge closer to this dream as we gather more data on the Match innovations introduced in the last couple of application cycles. In The Otolaryngology Residency Program Preference Signaling Experience, Pletcher et al describe the first year of the otolaryngology preference signaling trial, in which applicants could signal up to 5 programs at the time of initial application to indicate special interest in those programs.3  In this relatively small, competitive specialty, 558 of 559 applicants used the signaling process. Surveyed program directors (52% response rate) reported that the rate of receiving an interview offer was higher from signaled programs (58%) than from non-signaled programs (14%; P<.001) and from the next non-signaled program (23%; P<.001; ie, the program an applicant would have signaled had a sixth signal been available). Interestingly, these differences were seen across the range of applicant competitiveness. Surveyed applicants (42% response rate) and program directors strongly favored continuing the program.

My take-home

This article lays the groundwork for essential outcomes-based research for all the Match innovations from the current and future cycles.

In late July 2022, Wisconsin sunrises and sunsets took on a beautiful yet unusual hue of red. While breathtaking, the colors were caused by smoke from wildfires on the West Coast and in Canada. In September 2021, more than 200 medical journals published a joint editorial that called climate change “the greatest threat to global public health.”4  Medical education journals have published articles that address training for mass trauma, wildfires, extreme weather events, pandemics, and vector-borne diseases, and they often link these phenomena to climate change. Currently, 15% of medical schools worldwide teach a climate and health curriculum,5  with a group of students leading an international planetary health report card.6  There are limited reports around GME7  and no student or resident accreditation requirements on this topic.

In 2022, Family Medicine published a commentary by DeMasi and colleagues, Climate Change: A Crisis for Family Medicine Educators, in which the authors issued a similar call to action.8  Yet this one stands out. After outlining the inequitable effects of climate change on patients and its adverse effects on health care clinicians, the authors describe why climate change is “in our lane” as educators, include resources with specifics for GME, and suggest actions we can take. Normally, calls to action don't actually push me into action. This one did, because links to the article were sent to me by individuals in multiple GME programs, both locally and nationally.

My take-home

This commentary is a short read that has prompted us to take real, climate-related educational action.

I was drawn to the article, Self-Assessment: With All Its Limitations, Why Are We Still Measuring and Teaching It? Lessons From a Scoping Review, by Yates and colleagues, because we use self-assessment ubiquitously for trainees and faculty.9  We do this despite research showing that external assessment is superior, for physicians in particular, because there is poor correlation of self-assessed learning with measures of competence.10  The authors' methods were also intriguing: from their original search, more than twice as many studies used self-assessment inappropriately vs appropriately, which piqued the authors' curiosity. A special treat in this article is the authors' self-assessment example—teaching a teenager to drive a car. Read this article for the vignette alone!

The authors examined the extent to which self-assessment is used for medical students in non-evidence-based ways, defined as using self-assessment as the sole or primary outcome measure for evaluating a program or intervention, or as a learning goal in itself (ie, improving the accuracy of self-assessment), because studies show that neither use is associated with improved performance or lifelong learning. The authors also consider overlapping concepts, such as self-evaluation, self-monitoring, and self-efficacy, which may obscure studies of self-assessment. They found that in 63 of the 207 articles (30%), self-assessment of knowledge or skills was the sole outcome measure for evaluating a program or intervention. In 62 studies (30%), self-assessment of confidence was measured; when confidence and competence were both measured, the correlation was variable, as found in prior studies. In 39 studies (19%), the study aim was limited to furthering the accuracy of self-assessment.

The authors were guided by Arksey and O'Malley's framework for scoping reviews, their protocol was registered on the Open Science Framework, and they used strong methods throughout.11 

Despite the focus on medical students in this article, there is abundant overlap with assessment methods used in GME. Many programs routinely measure self-assessed performance without additional measures, and JGME receives many papers in which self-assessed knowledge or skills are the sole outcome measure. It's worrisome that many articles in this review were published in “top tier” medical education journals and that there was no decline in the number of articles over the years of the review.

My take-home

Self-assessments of knowledge, skills, or confidence are rarely helpful, despite being easier to collect. Let's choose better outcomes and reduce our overall data collection load at the same time (Table 3).

Table 3

When and When Not to Use Self-Assessment


Wellness and burnout continue to be hot topics in GME, yet most studies are descriptive in nature. We know that burnout is an ongoing problem: we have measured its prevalence in many specialties, and we are starting to see studies that either explore factors related to burnout or evaluate interventions aimed at reducing it. However, deep understandings of the nature of burnout and well-being are still elusive, which limits our ability to design and implement changes that make a meaningful difference in trainees' experience of burnout. Vexing questions abound: Why do some trainees experience tremendous stress, but not burnout, while others struggle? What trainee, program, and culture factors are associated with resilience and burnout recovery? How do personal characteristics, program factors, workload, and other stressors interact with the experience of well-being?

In this qualitative study, Burnout, Wellbeing and How They Relate: A Qualitative Study in General Practice Trainees, Prentice et al apply a post-positivist epistemology and grounded theory approach to explore the concepts of well-being and burnout from the perspectives of Australian general practice trainees and registrars.12  Their subjects describe burnout as a syndrome that exists on a spectrum. Both trainees and registrars identified 7 relevant themes: altered emotion, compromised performance, disengagement, dissatisfaction, exhaustion, overexertion, and feeling overwhelmed. Well-being involved a complex interaction between factors in personal and professional domains, with an underlying “well-being reservoir” as an important facilitator of the perception of wellness. The authors propose an overarching explanatory model that centers around the observation that burnout occurs when a trainee's values and/or goals are no longer met, thereby depleting the well-being reservoir. The model proposes that unfulfillment of an individual's professional and personal values is the central process by which burnout develops, which has important implications for the role of values in future interventions to promote well-being and address burnout.

This study caught my eye for several reasons. First, it is a rigorous qualitative exploration of important phenomena: it adheres to published standards for quality and rigor in qualitative work13  and offers strong rationales for the research paradigm, approach, data collection, and analysis methods, as well as appropriate justification for controversial methodological decisions. Second, the article takes a step back to question the status quo. Rather than continuing to build on prior models and applying existing tools to inform the development of interventions, the researchers ask an important question: Do we really understand well-being and burnout in terms of how GME trainees experience these phenomena? Finally, this article is timely and relevant. The past 3 years have been hard: for trainees, for educators, for our loved ones, for our patients. Not a day goes by when the topics of well-being and burnout are not front and center in at least one of my conversations with friends, family, learners, or colleagues.

My take-home

The concepts and findings in this article resonate deeply, and the proposed model is a thought-provoking and worthwhile read for medical educators and leaders.

References

1. Artino AR, Phillips AW, Utrankar A, Ta AQ, Durning SJ. “The questions shape the answers”: assessing the quality of published survey instruments in health professions education research. Acad Med. 2018;93(3):456-463.

2. Dykema J, Schaeffer NC, Garbarski D, et al. Towards a reconsideration of the use of agree-disagree questions in measuring subjective evaluations. Res Social Adm Pharm. 2022;18(2):2335-2344.

3. Pletcher SD, Chang CWD, Thorne MC, et al. The otolaryngology residency program preference signaling experience. Acad Med. 2022;97(5):664-668.

4. Full list of authors and signatories to climate emergency editorial, September 2021. BMJ.

5. Omrani OE, Dafallah A, Castillo BP, et al. Envisioning planetary health in every medical curriculum: an international medical student organization's perspective. Med Teach. 2020;42(10):1107-1111.

6. Planetary Health Report Card. Accessed November 28, 2022. https://phreportcard.org/

7. Philipsborn RP, Sheffield P, White A, Osta A, Anderson MS, Bernstein A. Climate change and the practice of medicine: essentials for resident education. Acad Med. 2021;96(3):355-367.

8. DeMasi M, Chekuri B, Paladine H, et al. Climate change: a crisis for family medicine educators. Fam Med. 2022;54(9):683-687.

9. Yates N, Gough S, Brazil V. Self-assessment: with all its limitations, why are we still measuring and teaching it? Lessons from a scoping review. Med Teach. 2022;44(11):1296-1302.

10. Eva K, Regehr G, Gruppen LD. Blinded by “insight”: self-assessment and its role in performance improvement. In: Hodges BD, Lingard L, eds. The Question of Competence: Reconsidering Medical Education in the Twenty-First Century. Cornell University Press; 2012:131-154.

11. Arksey H, O'Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19-32.

12. Prentice S, Elliott T, Dorstyn D, Benson J. Burnout, wellbeing and how they relate: a qualitative study in general practice trainees. Med Educ. Published online ahead of print August 23, 2022. doi:10.1111/medu.14931

13. O'Brien BC, Harris IB, Beckman TJ, et al. Standards for reporting qualitative research: a synthesis of recommendations. Acad Med. 2014;89(9):1245-1251.