ABSTRACT
Continuity of supervision is assumed to be necessary within competency-based medical education, although most of the evidence comes from undergraduate medical education rather than graduate medical education (GME). This evidence gap must be addressed to justify the time and effort needed to redesign GME programs to support continuity of supervision.
To examine differences in assessment behaviors of continuous supervisors (CS) versus episodic supervisors (ES), using completed formative assessment forms, FieldNotes, as a proxy.
FieldNotes entered by CS and ES for family medicine residents (N=186) across 3 outpatient teaching sites over 3 academic years (2015-2016, 2016-2017, 2017-2018) were examined using 2-sample proportion z-tests to determine differences on 3 FieldNote elements: competency (Sentinel Habit [SH]), Clinical Domain (CD), and Progress Level (PL).
Sixty-nine percent (6104 of 8909) of total FieldNotes were analyzed. Higher proportions of CS-entered FieldNotes indicated SH3 (Managing patients with best practices), z=-3.631, P<.0001; CD2 (Care of adults), z=-8.659, P<.0001; CD3 (Care of the elderly), z=-4.592, P<.0001; and PL3 (Carry on, got it), z=-4.482, P<.0001. Higher proportions of ES-entered FieldNotes indicated SH7 (Communication skills), z=4.268, P<.0001; SH8 (Helping others learn), z=20.136, P<.0001; CD1 (Doctor-patient relationship/ethics), z=14.888, P<.0001; CD9 (Not applicable), z=7.180, P<.0001; and PL2 (In progress), z=5.117, P<.0001.
The type of supervisory relationship impacts assessment: there is variability in which competencies are paid attention to, which contexts or populations are included, and which progress levels are chosen.
Examining differences in assessment behaviors of continuous supervisors versus episodic supervisors, using completed formative assessment forms, FieldNotes, to help determine whether time and effort needed to redesign graduate medical education (GME) programs to support continuity of supervision can be justified.
At the GME level, type of supervisory relationship impacts assessment in terms of competencies paid attention to, contexts or populations included, and progress levels chosen.
Our study was conducted at a single institution and in a single specialty, which may limit generalizability, particularly to programs with longer training periods where there is more opportunity to interact with residents over time.
This study provides evidence supporting the benefits of continuity of supervision for competency assessments. Without understanding different supervisory relationships and their effects on workplace-based assessments in GME, conclusions from feedback and assessment in competency-based medical education may be flawed.
Introduction
Continuity of supervision is one of the assumptions of competency-based medical education (CBME),1-4 but is the call for continuity of supervision supported by evidence? Evidence supporting the benefits of continuity of supervision for competency assessments comes primarily from undergraduate medical education (UME),5-12 with little evidence from graduate medical education (GME).13 Thus, it is not clear whether the investment of resources and time to restructure GME to enhance continuity of supervision is justified.
The benefits of integrating continuity of supervision into CBME assessments derive primarily from UME work demonstrating that relationships between supervisors and learners can support learners' attitudes toward assessment,10,14,15 willingness to seek out feedback,16 and incorporation of feedback into learning.17 Relationships have also been highlighted for effective coaching,18,19 and supervisor-learner relationships appear integral to the development of trust.20 There is also heterogeneity in the definitions and contexts for continuity of supervision in studies of competency assessments. UME studies focus on yearlong longitudinal integrated clerkships versus clerkship rotations.21 In GME, even 5 sessions with the same supervisor may be considered “high continuity.”22 Without understanding different supervisory relationships and their effects on workplace-based assessments in GME, conclusions from feedback and assessment in CBME may be flawed.
Methods
Setting and Participants
This study was conducted in 2020 in a large university-based Canadian 2-year family medicine residency program (note: 2 years is standard in Canada), with 70 to 80 residents per year (60 to 65 in the urban stream and 12 to 16 in the rural stream). Upon entry, residents are assigned to a continuity supervisor for the 2 years. During the first year, residents are assigned to a 6-month continuous experience with the continuity supervisor and then a scheduled half-day per week in this “home site,” with the same supervisor for the remainder of residency. All other supervisors are non-continuous, “episodic” supervisors, including other family medicine supervisors.
Data Source
The residency program uses electronic forms called FieldNotes for formative, work-based assessments.25,26 Supervisors complete a FieldNote after direct observation of a resident by selecting one each of the following elements: competency (labeled Sentinel Habits [SH]), which is based on the Assessment Objectives for Certification in Family Medicine (provided as online supplementary data)27; clinical context or population (labeled Clinical Domain [CD] of family medicine; Table); and judgement of the level of competence demonstrated (Progress Level [PL]; Table). We examined the SH, CD, and PL on all FieldNotes entered online for all residents (N=186) by their supervisors (N=79) across 3 academic teaching sites and 3 consecutive cohorts within academic years 2015-2016, 2016-2017, and 2017-2018. FieldNotes were de-identified by replacing resident and supervisor names with a unique identifier code. FieldNotes were also coded according to the type of supervision (continuous supervisor [CS] or episodic supervisor [ES]).
Statistical Analysis
Our initial exploratory data analysis consisted of visualizations and 2-sample proportion z-tests (using Bonferroni correction for Type I error, ie, a per-comparison threshold of α/n=.005, with α=.05 and n=10, the number of comparisons we made) to compare frequencies of SHs, CDs, and PLs between FieldNotes completed by CS versus ES.28 The Bonferroni correction is more rigorous than the Tukey test, which is more tolerant of Type I errors, but less conservative than Scheffé's method, providing a balanced control for Type I errors. A test of multicollinearity on the variables SH, CD, and PL was performed using the variance inflation factor. None of the variance inflation factor values were large enough to warrant concern that these variables are highly correlated with each other, supporting our assumption of independence. All statistical analyses were conducted using RStudio.
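The comparison described above can be illustrated with a minimal Python sketch of a pooled 2-sample proportion z-test checked against the Bonferroni-adjusted threshold. This is not the authors' RStudio code: the function name `two_proportion_z_test`, the pooled standard error, the two-sided P value, and the counts are all assumptions made for illustration. Only α=.05 and n=10 come from the text.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z-test for equality of two proportions.

    Returns (z, two-sided p-value). z is positive when the proportion
    in group 1 (x1/n1) exceeds the proportion in group 2 (x2/n2).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


ALPHA = 0.05          # familywise Type I error rate (from the text)
N_COMPARISONS = 10    # number of comparisons (from the text)
BONFERRONI_ALPHA = ALPHA / N_COMPARISONS  # per-comparison threshold

# Hypothetical counts, NOT study data: FieldNotes selecting one element.
es_with_element, es_total = 300, 2000
cs_with_element, cs_total = 450, 4000

z, p = two_proportion_z_test(es_with_element, es_total,
                             cs_with_element, cs_total)
significant = p < BONFERRONI_ALPHA
print(f"z={z:.3f}, p={p:.2e}, significant={significant}")
```

With group 1 set to ES and group 2 to CS, a positive z indicates a higher ES proportion. This ordering is a choice made in the sketch.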
This study was approved by the University of Alberta Health Research Ethics Board.
Results
The original dataset included 8909 FieldNotes. We excluded 2175 notes that were self-entered by a resident or entered by a supervisor who did not serve as a CS for any resident, as well as notes for which it was uncertain which supervisor was assigned to a resident. The analyses included 6104 FieldNotes (69% of total FieldNotes; Figure 1).
Flow Diagram of Inclusion and Exclusion of FieldNotes
Abbreviations: ES, episodic supervisor; CS, continuous supervisor.
Analysis of 6104 FieldNotes showed differences in the proportion of SHs, CDs, and PLs between FieldNotes entered by CS and ES. Using the 2-sample test of proportions, higher proportions of CS-entered FieldNotes indicated SH3 (Managing patients with best practices), z=-3.631, P<.0001; CD2 (Care of adults), z=-8.659, P<.0001; CD3 (Care of the elderly), z=-4.592, P<.0001; and PL3 (Carry on, got it), z=-4.482, P<.0001 (Figures 2-4). Higher proportions of ES-entered FieldNotes indicated SH7 (Communication skills), z=4.268, P<.0001; SH8 (Helping others learn), z=20.136, P<.0001; CD1 (Doctor-patient relationship/ethics), z=14.888, P<.0001; CD9 (Not applicable), z=7.180, P<.0001; and PL2 (In progress), z=5.117, P<.0001.
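As a reading aid for the z statistics above, the following Python sketch converts each reported z value to a two-sided normal tail probability and compares it with the Bonferroni-adjusted threshold of .05/10 described in the Methods. The z values are copied from the text. The helper names, the use of a two-sided test, and the sign convention (negative z meaning a higher CS proportion, inferred from how the results are reported) are assumptions.

```python
import math

# Per-comparison threshold: alpha=.05 divided by n=10 comparisons.
BONFERRONI_ALPHA = 0.05 / 10

# z statistics as reported in the Results.
reported_z = {
    "SH3": -3.631, "CD2": -8.659, "CD3": -4.592, "PL3": -4.482,
    "SH7": 4.268, "SH8": 20.136, "CD1": 14.888, "CD9": 7.180,
    "PL2": 5.117,
}


def two_sided_p(z):
    """Two-sided tail probability of a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def higher_group(z):
    """Assumed sign convention: negative z means CS proportion is higher."""
    return "CS" if z < 0 else "ES"


# Map each element to (group with higher proportion, passes threshold?).
summary = {
    label: (higher_group(z), two_sided_p(z) < BONFERRONI_ALPHA)
    for label, z in reported_z.items()
}

for label, (group, passes) in summary.items():
    print(f"{label}: higher in {group}, below threshold: {passes}")
```

Under these assumptions, every reported comparison falls below the adjusted threshold, with four elements higher for CS and five higher for ES.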
Distribution of Proportion of Sentinel Habits Between Continuous and Episodic Supervisor-Entered FieldNotes
aP<.0001.
Distribution of Proportion of Clinical Domains Between Continuous and Episodic Supervisor-Entered FieldNotes
aP<.0001.
Distribution of Proportion of Progress Levels Between Continuous and Episodic Supervisor-Entered FieldNotes
aP<.0001.
Discussion
Although both continuous and episodic supervisors completed FieldNotes across the range of Sentinel Habits, Clinical Domains, and Progress Levels, we found differences in specific competencies, clinical context or population, and competence levels assessed by CS versus ES.
Assessment information collected with FieldNotes is used in a similar way to the information collected by entrustable professional activity (EPA) assessments in many CBME programs.26,29 With both FieldNotes and EPAs, residents are assessed by multiple assessors over multiple instances, and these assessments lead to summative decisions. The study results suggest that different supervisory relationships, by focusing on different resident behaviors, may enhance the assessment program. The differences found between CS and ES suggest that the supervisory relationship may affect what assessors choose to assess. In this study, assessors in ES relationships were more likely to assess SHs or CDs where competency can be demonstrated in one or a few observations, such as Verbal or written communications (SH7) or Doctor-patient relationships/ethics (CD1).
In contrast, CS-entered FieldNotes were more likely to assess managing patients using available best practices (SH3), the application of medical knowledge in the management of patients. This is a complex competency that incorporates multiple elements and likely requires that a resident be observed repeatedly over time across multiple contexts. This is more likely to occur when there is a continuous supervisory relationship. Further, the higher proportion of FieldNotes from CS with the highest judgement of competence (PL3: Carry on, got it) likely reflects trust that supervisors develop when they have a sustained, longitudinal relationship with a resident.30
The difference in judgement of competence is particularly important, as it suggests supervisors are less likely to indicate that a resident has demonstrated competence when they work with a resident only a few times or intermittently. This finding is important to explore further. Currently, many GME programs incorporate primarily ES because of curricular and workplace-based structures.31-33 If our finding proves to be common across programs, residents may be less likely to be deemed competent or entrusted because they are being assessed by an ES who has not had the opportunity to develop a relationship that allows for understanding of a resident's progression toward competence and for the development of trust over multiple entrustment experiences. Assessment of competence in GME can be high stakes, even leading to shortening or lengthening of training. It is important that we have a clear understanding of how different supervisory relationships impact assessment behavior.
This study is limited by its use of a single institution and specialty, which may limit generalizability, particularly to programs with longer training periods. We looked for patterns of differences in assessment between CS and ES and did not control for other factors, such as the type of clinical experience, assessor training, or years in program. In addition to workplace-based assessment, the FieldNotes included assessments of simulated patient care sessions and resident teaching sessions, which likely present fewer domains for assessment. Also, this study compared the frequency of assessment areas, not the quality of supervisory assessments. Future research will use learning analytics to further examine similarities and differences between CS and ES assessments, including the amount and quality of feedback captured on the FieldNotes, and whether there is evidence that feedback loops are closed.34 We will also collaborate with other residency programs that have similar datasets to examine whether our findings can be replicated.
Conclusions
The type of supervisory relationship impacts assessment: for example, supervisors who had a continuous relationship with a learner were more likely to comment on complex competencies, while episodic supervisors were more likely to comment on competencies that require fewer observations. Depending on the type of supervisory relationship, there is also variability in the contexts or populations included and in the progress levels chosen on FieldNotes.
References
The authors would like to thank Delane Linkiewich for support in preparing data for analyses.
Author notes
Editor's Note: The online version of this article contains further data from the study.
Funding: This work was funded by the Social Sciences and Humanities Research Council of Canada.
Competing Interests
Conflict of interest: The authors declare they have no competing interests.
This work was previously presented as a poster abstract at the virtual International Conference on Residency Education (ICRE), September 25, 2020, and as an education free standing paper at the virtual Family Medicine Forum, November 4, 2020.