ABSTRACT
Background The proportion of women surgeons is increasing, but studies show that women in surgical residency are granted less autonomy than men.
Objective We utilized the Surgical Autonomy Program (SAP), an educational framework, to evaluate gender differences in self-reported autonomy, attending-reported autonomy, and operative feedback among US neurosurgical residents.
Methods The SAP tracks resident progression and guides teaching in neurosurgery. Surgeries are divided into zones of proximal development (opening, exposure, critical portion, and closure). Postoperatively, resident autonomy is rated on a 4-point scale by the resident and the attending for each part of the case, or zone. We utilized data from July 2017 to February 2024 from 8 institutions. Ordinal regression was used to evaluate the odds of self- and attending-evaluated autonomy, accounting for gender, training year, case difficulty, and institution. Differences between attending assessment and self-assessment were calculated across time. Chi-square analyses were used to measure any differences in feedback given to men and women.
Results From 128 residents (32 women, 25%), 11 894 cases were included. Women were granted less autonomy (OR 0.81; 95% CI 0.74-0.89; P<.001) and self-evaluated as having less autonomy (OR 0.73; 95% CI 0.67-0.80; P<.001). The odds of women operating at higher autonomy were similar to the odds of operating on a hard case compared to average difficulty (OR 0.77; 95% CI 0.71-0.84; P<.001). Men’s and women’s self-assessment became closer to attending assessment over time, with women improving more quickly for the critical portions of surgeries. Women residents received meaningful postoperative feedback on fewer cases (women: 74.2%, men: 80.5%; χ2=31.929; P<.001).
Conclusions Women operated with lower autonomy by both attending and self-assessment, but the assessment gap between genders decreased over time. Women also received less feedback from their attendings.
Introduction
Although more than half of medical students identify as women,1 they continue to be underrepresented within surgical subspecialties. Concerningly, gender disparities in operative training have been documented.2,3 If these disparities are not measured and remedied across all surgical specialties, they could lead to women feeling less prepared than men to practice independently at graduation.
Women general surgery residents experience less operative autonomy compared to their counterparts when controlling for level of training and case difficulty,2 and were perceived to need more intraoperative guidance than men, even though technical performance between genders was similar.3 One hypothesized reason for these differences is that women may be less confident than men, leading to them being perceived as less competent. Women trainees underrate their performance in multiple learning environments, including plastic surgery residency,4 dental school,5 and medical school.6 A main factor influencing autonomy in the operating room is prior resident performance.7 However, studies suggest that a trainee’s impression of autonomy impacts their confidence and competence, making it difficult to understand which is the driving force behind the gender disparities in the operating room.8 Implicit bias on the part of teachers may also play a role, with studies suggesting that both men and women attending physicians grant less autonomy to women.9,10
As of 2016, only 16% of neurosurgery residents in the United States identified as women,11,12 and there are even fewer who are board-certified attendings or leaders in the field, pointing to challenges with gender-based representation in the specialty.12-15 Drivers for gender differences include lack of mentorship, competing personal responsibilities including family obligations, and unconscious biases.16
This study aims to investigate the operative autonomy and educational feedback of women and men in neurosurgery residency by using the Surgical Autonomy Program (SAP), an education framework that aims to track and increase resident competency.17,18 The study also investigates gender disparities in metacognitive skill, or self-assessment capability, by comparing residents’ self-assessment to attending assessments.
KEY POINTS
Studies of surgical residents have reported differences between surgical training of women and men residents.
The Surgical Autonomy Program, tracking neurosurgical self-reported autonomy and attending-evaluated autonomy after procedures at 8 institutions, found that women residents reported and received substantially less autonomy as well as less feedback after procedures.
Although the difference in allowed autonomy decreased over time, these differences and decreased feedback suggest additional obstacles to achievement of surgical competence in women trainees.
Methods
Setting, Data, and Participants
Data were extracted from the SAP database for all cases between July 2017 and February 2024 at 8 neurosurgery departments across the United States. These departments were included as they had cases logged for all years of the standard 7-year neurosurgery residency. Resident and attending gender were provided by each residency program director based on individual reporting. SAP data were deidentified and provided to the researchers by a dedicated administrator at the SAP. The lead authors of the study were able to obtain the data through their leadership in the SAP program, and all participating programs in the SAP are able to obtain their institutional data upon request.
Postgraduate year (PGY) 1 residents were excluded, given that the majority of their time is spent managing patients medically. We excluded PGY-7, as this year is increasingly (but inconsistently) being used for fellowship training.19 Residents beyond the standard 7-year neurosurgery curriculum were excluded. Surgical procedures with fewer than 30 SAP entries logged in the database and all critical care cases were removed to capture only core neurosurgical procedures.
Surgical Autonomy Program
The SAP divides a surgical case into 4 sequential parts, or zones of proximal development (ZPD). ZPD1 is patient positioning and the opening sections of a procedure. ZPD2 includes the steps between opening and ZPD3. ZPD3 is the key and most complex portion of the case. ZPD4 represents the closing steps. As an example, for cerebral tumor resections, ZPD1 is defined as patient positioning and scalp opening, ZPD2 is skull and dura mater opening, ZPD3 is the tumor resection, and ZPD4 is dura, skull, and scalp closure.
Prior to each case, the resident and attending discuss which ZPD the resident intends to focus on. After each case, the resident and attending independently complete an SAP entry on the application by designating the ZPD of focus, assigning a TAGS score (T = Teach and Demonstrate, A = Advise and Scaffold, G = Guide and Monitor, S = Solo and Observe) for each ZPD, and rating the case’s difficulty (easy, average, hard). Residents have the least autonomy in “T”, where they assist. In “A” the attending actively guides the resident. In “G” the resident receives minimal guidance. The resident is fully independent in “S”. The residents rate the faculty’s intraoperative teaching, postoperative feedback, and use of the ZPDs. Screenshots of these questions on the SAP entry form are shown in the online supplementary data.
Residency programs, including their attending and resident physicians, are onboarded by the leadership of the SAP with instructions on how to utilize the ZPDs, TAGS scales, and the application itself. In a pilot study and a follow-up study of almost 5000 logged cases in the SAP (drawn from over 20 residents and 30 faculty), progression in the ZPD of focus and in the TAGS autonomy level independently correlated with the year in residency, providing evidence of consistency with yearly progression in residency.17,18 Additionally, case difficulty correlated with autonomy level, with residents operating at lower autonomy in more difficult cases.17,18 A similar autonomy-based educational framework for general surgery, System for Improving and Measuring Procedural Learning (SIMPL), provides some evidence of generalizability for SAP, and uses a 4-level autonomy scale similar to the TAGS scale, known as the Zwisch scale.20 The Zwisch scale has been correlated with PGY progression with high interrater reliability between attending evaluators, providing further evidence of the utility of these scales as grading methods.21
Outcomes and Analysis
Descriptive statistics were used to summarize surgeon gender, case categories, and case complexity. Ordinal logistic regressions were used to evaluate the odds of increased autonomy by attending assessment and resident self-assessment, accounting for gender, PGY, case difficulty, and ZPD of focus. Ordinal logistic regression was also used to evaluate odds of focusing within a higher ZPD given resident gender, attending gender, PGY, and case difficulty. The attending rating of case difficulty was used for analyses given the attendings’ experience in judging case complexity. Due to potential differences in SAP implementation in each institution, we also accounted for each neurosurgical department in the regression models. Intraoperative feedback and feedback on the SAP form were considered meaningful if the residents reported either “Significant feedback: stretching and affirming” or “Some feedback, but valuable.” Responses of “Some feedback, but limited” and “No significant feedback” were considered insignificant feedback. ZPDs were considered to be “used” if residents reported complete or partial ZPD use during the case. Chi-square analyses were used to measure differences in the level of intraoperative feedback, feedback on the SAP form, and intraoperative ZPD use.
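For readers less familiar with ordinal logistic regression, one common way to write such a model is the proportional-odds (cumulative logit) form sketched here. The text does not state which parameterization was fit, so the symbols are assumptions for illustration only: Y_i is the TAGS level of case i ordered T < A < G < S, θ_j are the cut points between adjacent levels, woman_i is an indicator for resident gender, and x_i collects the remaining covariates (PGY, case difficulty, ZPD of focus, department).

```latex
\[
\operatorname{logit} P(Y_i \le j)
  = \theta_j - \left( \beta_{\text{woman}}\,\text{woman}_i
  + \mathbf{x}_i^{\top}\boldsymbol{\gamma} \right),
  \qquad j \in \{\mathrm{T}, \mathrm{A}, \mathrm{G}\}
\]
\[
\mathrm{OR}_{\text{woman}} = e^{\beta_{\text{woman}}}
\]
```

Under this form, an odds ratio below 1 for women means lower odds of being rated at a higher autonomy level at every cut point, with the other covariates held fixed.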
To assess metacognition, the resident and attending TAGS on the ZPD of focus for each case were recoded as “T”=1, “A”=2, “G”=3, and “S”=4. A difference score was created by subtracting the attending score from the resident score. If the resident rated themselves higher than the attending, the value would be positive, and vice versa, while a value of zero indicated matching scores. For example, if the resident rated themselves as “A” but the attending rated them as “G”, the difference score is -1. Independent sample t tests were used to compare the average difference score between men and women in each year of residency and for each ZPD of focus. An ordinal regression analysis was used to evaluate change in the absolute difference between resident score and attending score given ZPD use, time, and gender. Time was calculated for each case as the number of months between the surgery date and the date of the resident’s first surgery. All statistical analyses were run using R statistical software, version 4.1 (The R Project). Due to the exploratory nature of this study, alpha was set to 0.05 for all analyses.
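The difference score described above can be illustrated with a minimal Python sketch. Only the T/A/G/S recoding and the direction of the subtraction (resident minus attending) come from the text; the names TAGS_CODES, difference_score, and absolute_gap are hypothetical, and this is not the code used in the study, which was analyzed in R.

```python
# Illustrative sketch only; helper names are hypothetical, not from the study.
# Recoding taken from the Methods: T=1, A=2, G=3, S=4.
TAGS_CODES = {"T": 1, "A": 2, "G": 3, "S": 4}


def difference_score(resident_tag: str, attending_tag: str) -> int:
    """Resident score minus attending score for the ZPD of focus.

    Positive: the resident rated their autonomy higher than the attending did.
    Negative: the resident rated their autonomy lower.
    Zero: the two ratings match.
    """
    return TAGS_CODES[resident_tag] - TAGS_CODES[attending_tag]


def absolute_gap(resident_tag: str, attending_tag: str) -> int:
    """Size of the disagreement regardless of direction."""
    return abs(difference_score(resident_tag, attending_tag))


# Worked example from the text: resident "A", attending "G".
print(difference_score("A", "G"))  # -1
```

The signed score corresponds to the quantity compared between men and women with t tests, while the absolute gap corresponds to the outcome of the ordinal regression on ZPD use, time, and gender.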
This study received Institutional Review Board approval from Duke University (Pro00102120).
Results
Demographics
There were 11 894 cases logged by 128 residents and 171 attendings. Twenty-five percent of residents (n=32) and 12.3% of attendings (n=21) were women. Most surgeries were categorized as spine (37%, 4375) or cranial tumor (37%, 4367). Most cases were average in difficulty (68%, 8090). For complete demographics, see Table 1.
Resident Autonomy by Gender
After controlling for attending gender, PGY, case difficulty, and department, attendings rated women residents as operating within lower autonomy levels (OR 0.81; 95% CI 0.74-0.89; P<.001), and women residents also self-rated as operating with less autonomy (OR 0.73; 95% CI 0.67-0.80; P<.001) (Table 2). Women were about as likely to focus on a higher ZPD, a marker of resident progression, as men (OR 0.96; 95% CI 0.89-1.05; P=.39) (Table 3).
Feedback and ZPD Use
There was a small but significant difference in intraoperative feedback. Women residents reported meaningful intraoperative feedback for 90.1% of cases (1513 of 1678), while men reported this for 94% of cases (6226 of 6620) (χ2=31.486, P<.001). Women residents reported receiving meaningful feedback on the SAP form for 74.2% of cases (1245 of 1678), while men reported this for 80.5% of cases (5329 of 6620) (χ2=31.929, P<.001). Women residents reported that the attendings used ZPDs throughout the case for 74.6% of cases (1251 of 1678), while men reported this for 87.7% of cases (5805 of 6620) (χ2=180.46, P<.001).
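The chi-square statistics above can be recomputed from the reported counts. The following Python sketch assumes a 2x2 test with Yates continuity correction, which is the default for 2x2 tables in R's chisq.test; the text does not state whether the correction was applied, so that choice and the function name yates_chi_square are assumptions.

```python
# Illustrative sketch only: recomputes 2x2 chi-square statistics from the
# counts reported in the text. The Yates continuity correction is an
# assumption; the article does not say whether it was applied.

def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2 (Yates), never letting it go below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


WOMEN_CASES, MEN_CASES = 1678, 6620

# (women with outcome, men with outcome) as reported in the Results.
reported = {
    "intraoperative feedback": (1513, 6226),
    "feedback on SAP form": (1245, 5329),
    "ZPD use": (1251, 5805),
}

results = {}
for label, (women_yes, men_yes) in reported.items():
    results[label] = yates_chi_square(
        women_yes, WOMEN_CASES - women_yes,
        men_yes, MEN_CASES - men_yes,
    )
    print(f"{label}: chi-square = {results[label]:.3f}")
```

Under that assumption, the three statistics round to 31.486, 31.929, and 180.46, matching the values reported above.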
Metacognitive Skill
Residents reported lower autonomy than their attendings, but metacognition improved over time for all ZPDs. The Figure illustrates the average difference score across each year of residency by gender, based on ZPD of focus. Women residents appear to improve at a quicker rate than men for ZPD3, the most complex portion of the case. For ZPD3, there was a significant difference in the mean difference score during PGY-2 (Men: -0.69, Women: -1.06; t=3.39; P<.001) and during PGY-3 (Men: -0.41, Women: -0.57; t=-2.57; P<.01).
In a multivariable ordinal regression analysis, time and ZPD use alone did not have a significant effect on the absolute difference between resident and attending autonomy assessment (Table 4). Woman gender increased the odds that there would be a larger difference between resident and attending assessment when ZPD is not used and time is zero (OR 1.61; 95% CI 1.17-2.21; P<.001). Utilization of the ZPD framework is a significant factor that narrows the disparity between resident and attending assessment for men and women residents, as seen by the interaction term between gender and ZPD use (OR 0.69; 95% CI 0.51-0.92; P<.05) (Table 4).
Discussion
This is one of the first studies in neurosurgery analyzing gender differences in resident autonomy. After accounting for attending gender, case complexity, year of residency, ZPD, and department, our results suggest that throughout residency, women operate with less autonomy compared to men, both through attending assessment and self-assessment.
Interestingly, the odds ratio for women operating within a higher autonomy level than men is similar to the odds ratio of residents operating on a hard case compared to an average case. Also of note is that while the gender of the attending affected the attending assessment significantly, this variable did not significantly affect resident self-assessment. This suggests that women attendings give women residents less autonomy, but that women residents self-assess similarly regardless of attending gender. Future research could elucidate why this is the case; perhaps some of the intrinsic gender biases that were introduced to women throughout their careers continue to affect their teaching and assessment styles when they become attendings. Overall, these results are consistent with studies in other surgical subspecialties, where women operated with lower autonomy than men, despite no difference in performance.2,3,22
Residents, regardless of gender, rated their autonomy lower than their attendings’ ratings throughout residency; however, this gap decreased as training progressed. Previous studies have established that surgical trainees’ self-assessment is discordant with attending assessment.23-27 However, the ability to self-assess has been shown to improve over time in surgical residencies, as demonstrated in our study.23,24,26,27 Although there were only a handful of time points in which the average difference score between men and women residents significantly differed, there is a clear self-assessment gap between genders. Men generally rated themselves closer to their attendings’ assessment of autonomy. Nonetheless, women appeared to have a more rapid improvement in their metacognitive skill when evaluating their performance in ZPD3, the most complex portion of the case. Interpretation of ZPD1 is difficult beyond PGY-2, given that residents rarely focus on opening and positioning after that initial year of surgical training.
The self-assessment gap between men and women might be reduced in the future by providing feedback to all residents. Women residents reported receiving significantly less feedback than their men counterparts, which could explain some of our findings. Without critical intraoperative and postoperative feedback, residents cannot accurately judge their own performance. Feedback can delineate what is expected from the attending surgeon and provide specific areas to focus on for upcoming surgeries.
We found that using the ZPDs during a case significantly impacted resident self-assessment. Although women progressed through ZPDs similarly to their colleagues, they were less likely to actually have the ZPDs used as a teaching tool during their cases. When the ZPDs were implemented during surgery, the difference between the attending and resident assessment for women improved compared to when the zones were not used. Thus, while women residents have a larger self-assessment gap than men, the use of guided teaching can help reduce such gaps.
These results support the idea that trainees are better able to judge their progression and performance when receiving structured teaching and feedback. Standardized methods of assessment, such as the SAP, may facilitate easier self-assessment by providing objective milestones for focus and improvement.25,28 Previous studies have shown that deliberate training through structured methods improves learners’ self-assessment.28 An additional method that has been shown to help the development of metacognitive skill is observation of videotaped performance.26 While videotaping procedures for residents to view was not part of this study, it would be an interesting exercise to incorporate into the progression through the SAP.
Limitations and Future Directions
This study contains some key limitations. Departments may have differences in how consistently SAP is implemented, leading to missing data among all surgical cases with resident involvement. There is also no process to monitor SAP implementation at various sites; however, we attempted to address this limitation by accounting for department in the ordinal regression models.
This article considers the attendings’ assessment of autonomy to be the gold standard, given their experience evaluating learners. We did not require more than one attending surgeon to evaluate the resident on each case, precluding interrater variability analysis. As the academic year progresses, familiarity with the SAP, TAGS, and the ZPDs likely influences the assessments. However, given the number of surgical cases and distinct attendings included, we believe these are minimal limitations.
Implicit biases held both by trainees and attending surgeons of gender roles in a specialty traditionally dominated by men might have influenced the autonomy given to different residents, but this cannot be directly measured by a tool such as the SAP. We did not have information on faculty leadership positions such as program director or chairperson, but these individuals might be better attuned to biases in teaching and thus evaluate residents differently. Future studies could also qualitatively analyze intraoperative teaching to delineate any tangible differences in training experience, thus helping surgical departments develop strategies to mitigate differences. Finally, future studies should also expand outside neurosurgery to make these results more generalizable.
Conclusions
Women neurosurgery residents operated with lower autonomy than men residents as determined by attending and self-assessment in the SAP, received less feedback, and were less likely to be taught in a structured approach through the ZPDs. While women had a larger gap between self and attending assessment, the use of ZPDs was demonstrated to reduce such gaps. Both genders demonstrated improvement in their self-assessment gap over time through training within the SAP.
References
Editor’s Note
The online supplementary data contains visuals from the Surgical Autonomy Program.
Author Notes
Funding: This publication was made possible (in part) by Grant Number TL1 TR002555 from the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health (NIH). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NCATS or NIH.
Conflict of interest: Rajeev Dharmapurikar MS, Shivanand P. Lad, MD, PhD, and Michael M. Haglund, MD, PhD, MACM, all have financial equity in the Surgical Autonomy Program.
Disclaimer: Bradley A. Dengler, MD, participated in this research purely on behalf of the neurosurgical residency program at his institution. This article does not reflect the views and opinions of the United States Military or US Government. This research was not funded or endorsed by the US Military or US Government.
Elayna P. Kirsch, MD, and Vishal Venkatraman, MD, MHSc, are graduates of Duke University School of Medicine and are completing their residency training elsewhere. This research does not reflect the views of their current academic institutions.
This work was previously presented at the Congress of Neurological Surgeons Annual Meeting, October 8-12, 2022, San Francisco, California, USA.