Prior research has shown a gender gap in the evaluations of emergency medicine (EM) residents' competency on the Accreditation Council for Graduate Medical Education (ACGME) milestones, yet the practical implications of this gap are not fully understood.
To better understand the gender gap in evaluations, we examined qualitative differences in the feedback that male and female residents received from attending physicians.
This study used a longitudinal qualitative content analysis of narrative comments by attending physicians during real-time direct observation milestone evaluations of residents. Comments were collected over 2 years from 1 ACGME-accredited EM training program.
In total, 1317 direct observation evaluations with comments from 67 faculty members were collected for 47 postgraduate year 3 EM residents. Analysis of the comments revealed that the ideal EM resident possesses many stereotypically masculine traits. Additionally, examination of a subset of the residents (those with 15 or more comments, n = 35) showed that when male residents struggled, they received consistent feedback from different attending physicians regarding aspects of their performance that needed work. In contrast, when female residents struggled, they received discordant feedback from different attending physicians, particularly regarding issues of autonomy and assertiveness.
Our study revealed qualitative differences in the kind of feedback that male and female EM residents received from attending physicians. The findings suggest that attending physicians should endeavor to provide male and female residents with consistent feedback and guard against gender bias in their perceptions of residents' capabilities.
Faculty feedback to male and female emergency residents appears to differ, but the educational and practical implications have not been explored.
A longitudinal qualitative content analysis of narrative comments by attending physicians during real-time direct observation milestone evaluations of emergency residents suggested gender bias.
Single site, single specialty study limits generalizability.
There were qualitative differences in the feedback faculty gave to male and female emergency residents, particularly around the domains of authority and assertiveness.
Despite achieving parity in medical school graduations,1 female physicians face barriers to advancement.2 They hold fewer faculty positions at academic institutions, earn lower adjusted incomes, and are in fewer positions of leadership in medical societies and departments than their male counterparts.1–7 A recent systematic review suggested that the greatest attrition in commitment to academia occurs during residency, possibly due to implicit gender bias and lack of support in the workplace, among other contributors.2
Few studies have examined the status of women in emergency medicine (EM),4 a specialty where gender parity has yet to be achieved. Women hold fewer than 25% of all faculty positions in emergency departments8 and represent only 38% of EM residents (up from 32% in 2003).1 It is particularly interesting to investigate gender inequality in EM, as it was one of the first specialties to evaluate residents' competency using the Accreditation Council for Graduate Medical Education (ACGME) Next Accreditation System milestones, a nationally standardized, longitudinal evaluation system.9 While assessment of competency is intended to have a “beneficial effect on learning,”9 a broad body of literature in the social sciences notes that status characteristics, such as gender, matter when evaluating someone's competency, even on theoretically objective standards and even when meritocracy is valued in organizations.10–12 Research has also found that evaluators struggle to assess competency or merit independent of gender and gendered expectations.13–16
A recent study by our group found an attainment gap between male and female EM residents in evaluations of performance on the nationally standardized milestones. In a longitudinal, multicenter study, we found that male residents, on average, had a higher rate of milestone attainment throughout residency, with a widening gap in evaluations that is substantial and statistically significant by postgraduate year 3 (PGY-3).17 This attainment gap was not dependent on either the gender of the attending physician doing the evaluation, or the gender pairing between the attending physician and the resident.17 While our prior work demonstrated a gender difference in numerical evaluations, in this study, our aim was to use qualitative data to better understand the lagging performance evaluations of female EM residents in PGY-3.
We conducted a qualitative content analysis of comments attached to numerical ACGME milestone-based evaluations of PGY-3 EM residents by attending physicians. We used a post-positivist research paradigm, wherein we acknowledge that perfect objectivity is never fully attainable, but rather a goal toward which we strive, by recognizing the influence of our characteristics, backgrounds, and values on knowledge production.18 Of the authors, 3 (A.S.M., T.M.J., M.O.) have extensive sociological training in qualitative methods and medical sociology, and 3 (A.D., D.M.O., V.M.A.) are clinicians with qualitative methods experience. Our collective expertise allowed us to approach data analysis rigorously and helped guard against potential disciplinary biases or knowledge gaps. The authors had no contact with the residents or faculty and are not affiliated with the hospital under study. This separation helped prevent subjectivity in the analysis; it also made it impossible to interpret the results with reference to the local institutional context or culture.
Study data were collected from a single 3-year ACGME-accredited EM training program that we call “University Hospital” (a pseudonym) from July 1, 2013, to July 1, 2015. A total of 1317 direct observation evaluations with comments were collected from 2 cohorts of PGY-3 EM residents, and these included evaluations of 47 PGY-3 residents by 67 faculty members.
Text comments were collected using InstantEval V2.0 (Monte Carlo Software LLC, Annandale, VA), a software application available on faculty's mobile devices and computers to facilitate real-time, direct observation milestone evaluations. Faculty could choose when to complete evaluations, whom to evaluate, and the number of evaluations to complete, although most training programs encouraged 1 to 3 evaluations per shift. Each evaluation consisted of an ACGME Emergency Medicine Milestone Project–based performance level19 on 1 of 23 possible individual EM subcompetencies. In certain cases, text comments were provided, which are the focus of this study. All names used in the text are pseudonyms to protect confidentiality.
This study was approved as exempt research by the University of Chicago Institutional Review Board.
Our post-positivist approach was guided by a sequential explanatory analytic design,20 in which qualitative methods are used to better understand previously established quantitative findings. This approach allowed us to analyze a large qualitative dataset efficiently. The previously established quantitative findings that guide our research are (1) that a gender gap exists in resident evaluations on ACGME standards in EM,17 and (2) that the gap is most substantial in PGY-3.17 We focused our analysis on PGY-3 and examined qualitative differences in the feedback male and female residents received in order to shed light on why evaluation gaps may exist.
All qualitative data were coded and analyzed in NVivo 11 (QSR International, Burlington, MA), a qualitative analysis software package. To guard against confirmation bias, we suppressed information about residents' and attending physicians' gender during all stages of coding. This process was imperfect, as some comments included gendered pronouns or names.
As a further guard against confirmation bias, we developed a multistage, multianalyst procedure for coding and analyzing the data. In the first stage, 5 team members engaged in simultaneous open coding of all comments to develop themes from the data21 and to ensure accurate understanding of comments. During this stage, we identified characteristics of residents that are valued in EM (Table 1) and 4 themes (strong criticism, praise, possesses valued personality traits, and lacks valued personality traits; Table 2). In the second stage, 3 team members conducted focused coding for these 4 themes. To limit confirmation bias in coding during this stage, 2 of the 3 team members coded every comment for our selected themes. Any discrepancies between codes were discussed by all 3 team members, and consensus was reached in all cases. In the third stage, we analyzed gender differences in the comments residents received. For residents who received at least 15 comments (n = 35), we conducted an in-depth analysis of how different attending physicians rated the same resident over the course of PGY-3.
This multistage coding approach produced a robust and thorough inquiry, allowing themes to be (re)interpreted from the data,18 while also systematically substantiating the validity of findings through rigorous collective analysis.
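The second-stage double coding described above can be illustrated with a minimal sketch. It assumes each coder's work is stored as a mapping from a comment identifier to the set of themes applied; the identifiers, theme labels, and the function name `find_discrepancies` are hypothetical and are not drawn from the study's NVivo project. The sketch only flags comments for team discussion; it does not resolve them, which in the study was done by consensus.

```python
def find_discrepancies(coder_a, coder_b):
    """Return {comment_id: themes applied by only one coder} for every
    comment on which the two coders' theme sets differ."""
    flagged = {}
    for comment_id in sorted(coder_a.keys() | coder_b.keys()):
        themes_a = coder_a.get(comment_id, set())
        themes_b = coder_b.get(comment_id, set())
        if themes_a != themes_b:
            # Symmetric difference: themes one coder applied and the other did not.
            flagged[comment_id] = themes_a ^ themes_b
    return flagged


# Hypothetical toy data, invented for illustration only.
coder_a = {
    "c1": {"praise"},
    "c2": {"strong criticism", "lacks valued traits"},
    "c3": set(),
}
coder_b = {
    "c1": {"praise"},
    "c2": {"strong criticism"},
    "c3": {"possesses valued traits"},
}

disagreements = find_discrepancies(coder_a, coder_b)
for comment_id, themes in disagreements.items():
    print(comment_id, "needs discussion:", sorted(themes))
```

In this toy example, comments c2 and c3 would be brought to all 3 team members for discussion, while c1 would stand as coded.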
Between 2013 and 2015, 31% (22 of 71) of all residents (PGY-1 through PGY-3) at University Hospital were female, and 43% (29 of 67) of attending physicians were female. Of the PGY-3 residents with at least 15 comments, 13 of 35 (37%) were female. This suggests a slight underrepresentation of female residents and an overrepresentation of female attending physicians at our study site compared with national averages,1,8 and a slight overrepresentation of female residents in our qualitative data compared with University Hospital's resident population (Table 2).
Characteristics Valued During EM Residency
We examined the 1317 comments made by attending physicians about PGY-3 residents for insights into traits valued in the emergency department. Although procedural and diagnostic skills and knowledge were important, the data revealed that residents must also possess certain personality traits (confident, hardworking, calm); practice styles (documents well, performs under pressure); and management styles (communicative, able to multitask) to be considered superlative, even in procedure-specific evaluations. Many of these characteristics have been identified in prior research as stereotypically masculine traits (Table 1).22–24
While the residents in our sample were repeatedly evaluated on these characteristics, few managed to embody them at all times. One resident at University Hospital who stood out as consistently meeting expectations was Keith, a chief resident who regularly received praise, such as the following example:
“Keith continues to perform at the top of his class with regard to managing the entire [department] and being able to direct and teach interns [without] losing speed. He is also able to work [with] nursing to find ways to increase throughput. His greatest advantage over his peers is his ability to problem solve in difficult practice environments [without] losing patience or getting frustrated.” (John, attending)
In addition to illustrating what high praise sounds like, Keith's evaluations indicated that this ideal was attainable.
Gender Differences in Feedback
To better understand the gender gap in milestone attainment, we examined how different attending physicians evaluated residents' performance on the milestones throughout PGY-3. For this analysis, we focused on residents with at least 15 comments (Table 2). Our primary finding was that female residents received less consistent feedback from attending physicians than male residents, particularly regarding personality traits valued in EM. A total of 62% (8 of 13) of female residents received both strong praise for their performance and strong criticism, compared with 45% (10 of 22) of male residents (Table 2). Half of male residents (11 of 22) received no negative comments regarding their possession of ideal EM personality traits from any attending physician, compared with 23% (3 of 13) of female residents. Only 1 female resident (Zoey) received no strong criticism and no negative personality comments, compared with 50% (11 of 22) of male residents. Sixty-two percent of female residents (8 of 13) were criticized multiple times for lacking valued EM personality traits compared with 36% of male residents (8 of 22).
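As a quick arithmetic check, the proportions in the preceding paragraph can be recomputed from the reported counts. The sketch below uses only counts stated in the text; the dictionary labels and the helper `pct` are shorthand introduced here, and rounding to whole percentages is assumed to match the paper's reporting.

```python
# Group sizes reported in the text (residents with at least 15 comments).
N_FEMALE = 13
N_MALE = 22

# (female count, male count) for each comparison, copied from the text.
# The labels are shorthand introduced for this sketch.
REPORTED_COUNTS = {
    "strong praise and strong criticism": (8, 10),
    "no negative personality comments": (3, 11),
    "criticized multiple times for lacking valued traits": (8, 8),
}


def pct(count, total):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


for label, (female, male) in REPORTED_COUNTS.items():
    print(
        f"{label}: female {pct(female, N_FEMALE)}% ({female} of {N_FEMALE}), "
        f"male {pct(male, N_MALE)}% ({male} of {N_MALE})"
    )
```

The recomputed values (62% vs 45%, 23% vs 50%, and 62% vs 36%) agree with the percentages reported above.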
The substantive content of the feedback also illustrated this gender gap. Male residents received consistent messages about what they needed to improve. As illustrated in the quotes that follow, multiple attending physicians noted the same concern about Matt's diagnostic skills and autonomy:
“I enjoy working with Matt … However, as a third year, I continue to be disappointed … I think he is capable of doing the job. However, I still find his clinical decisions are limited to asking what the staff wants to do (as opposed to even offering a suggestion and then discussion [sic] differences).” (Harrison, attending; emphasis added)
“You [Matt] need to take a more assertive leadership role. You have a lot of military experience, but at this point you should not defer so much to the attending. Confidently craft a plan and lay out to the attending how you will execute it. Not all people need a million [dollar] work up.” (David, attending; emphasis added)
“Overall good job, but keep working on expanding that differential as you head into attendinghood.” (Ken, attending; emphasis added)
Matt, along with other poorer-performing male residents (like Owen and Tyler; see Table 2), tended to receive reasonably consistent messaging from multiple attending physicians about what they needed to improve. Matt was even praised for his “self-awareness about … [his] limits,” though he never received praise for confidence or autonomy during PGY-3.
The comments directed toward female residents, however, often contained discordant feedback from different attending physicians, which generally focused on their lack of a specific personality characteristic, particularly autonomy and assertiveness. Consider the following comments (presented in chronological order) received by Emma, the female resident with the most instances of strong praise:
“[Emma is] progressing well, very thoughtful, reliable, appropriate confidence and autonomy.” (Harrison, attending; emphasis added)
“I would encourage Emma to be more assertive. During critical resuscitations, she should let those working around her know that she is the team leader.” (Adam, attending; emphasis added)
“[Emma] argues a lot with the attending, is very confident in her diagnosis, and has a hard time entertaining other possibilities.” (Hillary, attending; emphasis added)
As is evident in Emma's case, the discordance in feedback for female residents generally resulted from different attending physicians having dissimilar opinions, and not from the same attending physician providing a different commentary on the same topic on different occasions. Attending physicians were often consistent in their praise or sanctioning of resident personalities. For example, Harrison (quoted earlier) praised Emma twice for her “appropriate autonomy.”
Emma also received mixed reactions with regard to her openness to feedback, both in Hillary's comment above and in the following comments (again, in chronological order):
“Receptive with feedback and asks good questions on differential management. Aims at improving herself even though already a great performer.” (Sofia, attending; emphasis added)
“[Emma] can improve by being more open to feedback and constructive criticism and recognizing that others may be able to contribute to her learning and provide another perspective on practice and patient care.” (Frank, attending; emphasis added)
This tension between demonstrating autonomy while still being open to attending physicians' directions was a theme that appeared in feedback for multiple female residents but never appeared for male residents. Gabrielle, Tamara, Jane, and Beth were all both praised and criticized for their receptiveness to feedback and their autonomy. Beth, for example, received the following comments about her handling of constructive feedback (in chronological order):
“I think you [Beth] have a natural ability to communicate well with regard to difficult information/constructive feedback. You develop a very good report [sic] with students and your colleagues because you treat them with respect. It's a wonderful thing to work around. Thanks for the role-modeling of this intangible.” (Steven, attending; emphasis added)
“[Beth is] interested in learning despite of being ready for graduation.” (Martin, attending; emphasis added)
“[Beth] seemed to respond negatively to my input on her plans last shift … I know she's late in the third year and needs progressive autonomy, but she seemed to have a negative attitude toward supervision.” (Richard, attending; emphasis added)
By comparison, only 1 male resident (Owen) was criticized by an attending physician (Brian) for being “sensitive to feedback,” and 1 male resident was actually praised for the same behavior: “Doing well. [Oliver is] sometimes argumentative, but he is trying to assert his confidence” (Michael, attending). This further illustrates the likely salience of gender in how attending physicians interpret residents' ability to exhibit appropriate confidence and assertiveness.
Our goal for this study was to examine qualitative differences in the feedback that male and female EM residents received in order to better understand the emergence of a substantial gender gap in ACGME milestone attainment in PGY-3 established by our prior quantitative work.17 To do this, we analyzed text comments written by attending physicians for residents at the time of their evaluation for milestone attainment and identified 2 findings that are likely relevant to female residents' lagging performance evaluations and to gender equality in EM graduate medical education.
First, the data revealed that the ideal EM resident must possess many stereotypically masculine traits: he or she must be a calm, decisive, confident leader who efficiently manages scarce resources in order to achieve the best outcomes for patients.22–25 This ideal may be related to the specialty's historical roots in the military,26 which also has highly masculinized norms.27 Of course, it is also the case that in EM, hesitation may mean the difference between life and death and that some of these stereotypically masculine traits may serve EM physicians well.
Second, we found that when male residents struggled, they received consistent feedback from different attending physicians regarding the aspect of their performance that needed work. In contrast, different attending physicians had discordant views about what female residents needed to improve.
This inconsistent feedback for women was particularly apparent around issues of autonomy and leadership, which are personality characteristics that may be more challenging to improve than procedural skills. Female residents were frequently praised by 1 attending physician for their performance as autonomous leaders, only to be criticized by another for being argumentative.
Inconsistent feedback is worthy of attention for 2 reasons. First, because inconsistent feedback disproportionately appeared among female residents, and rarely among male residents, female residents may be receiving poorer-quality mentoring and instruction. This could have consequences for their ability to progress, particularly if that feedback is less related to their actual performance than it is for male residents. Prior research has shown that consistent and clear feedback is important for the improvement of performance.28 Future research should examine the consequences this gender gap in consistent feedback may have for actual learning.
Second, these critiques are highly gendered. “Assertiveness” and “autonomy” are more associated with masculinity than femininity; this is potentially an instance in which male residents have an easier time meeting expectations than their female colleagues.11 This is consistent with a much broader body of scholarship showing that women have a harder time being evaluated as competent on traits that are traditionally considered masculine,12,29 particularly in organizations that have generally been dominated by men.11 The emergency department fits both of these criteria, given the underrepresentation of women in EM1,8 and the masculine traits valued in this setting. Although our data did not allow evaluation of whether attending physicians adhere to biased beliefs about female residents, past research suggests that this may be the case. In evaluations of grant proposals,14 job applicants and tenure candidates,15 graduate student applicants,30 teaching,31 and mentoring ability,13 women, on average, are deemed less competent or ideal than their male counterparts.
It also is possible that female residents' performance is less consistent than that of male residents, and the comments received may be attending physicians' attempts to encourage improvement. Past research has shown that female residents reported feeling stressed or uncomfortable with having to violate stereotypically feminine behavioral norms, like using directive language and a loud voice while leading a code.32 This discomfort could trigger stereotype threat and harm their performance.33 Future research should continue to examine the complex processes that may undergird the gender gap in competency evaluations that research has found in the emergency department.17
There are limitations to our qualitative study. First, we lack the context in which the comments were made, although both male and female residents were evaluated against the same ACGME milestone criteria. Second, our data were drawn from a single hospital. Finally, it is unfortunate that we lack information in our data about residents' or attending physicians' race or ethnicity, which likely matters, given the underrepresentation of people of color in medicine.34
There are advantages to our analysis. First, faculty members were not being observed by researchers when making their evaluations of residents, which reduces the likelihood of a Hawthorne effect or social desirability bias. Second, previously published findings allow us to be more confident that factors such as the skill being evaluated (procedural or otherwise), the gender of the attending physician, or the local culture of the hospital likely do not fully explain the gender gap in evaluations, as those factors were controlled for in our statistical models and were not statistically significant.17
To further explore this important area, real-time observational research and interviews with EM residents and attending physicians would contribute important knowledge and complement this study. Examining other EM residency programs would also be informative, particularly if studies considered the role of the local emergency department culture.
Our research contributes to the understanding of gender inequality in graduate medical education. Namely, initiatives to reduce gender bias35–38 should sensitize faculty to potential biases in their perceptions of female residents' capabilities and to the importance of providing residents with consistent and clear feedback.
Funding: This project was supported by a grant from the National Center for Advancing Translational Sciences. Additional funding was provided by a University of Chicago Diversity Small Grant. Dr Jenkins received salary support from the Canadian Institutes of Health Research (No. 140811).
Conflict of interest: Drs O'Connor and Dayal co-developed InstantEval, which was used to collect the evaluation data in this study, and have a financial interest in this product. Dr Arora receives honoraria from the American Board of Internal Medicine.
A prior version of this study was presented at the Sociologists for Women in Society Winter Meeting, Albuquerque, New Mexico, February 9–12, 2017.
The authors would like to thank Keith Mausner, MD, FAAEM, for his insightful feedback on earlier drafts of this paper.