Background

A vital element of the Next Accreditation System is measuring and reporting educational Milestones. Little is known about changes in Milestones levels during the transition from residency to fellowship training.

Objective

Evaluate the Accreditation Council for Graduate Medical Education (ACGME) Milestones' ability to provide a linear trajectory of professional development from general pediatrics residency to neonatal-perinatal medicine (NPM) fellowship training.

Methods

We identified 11 subcompetencies that were the same for general pediatrics residency and NPM fellowship. We then extracted the last residency Milestone level and the first fellowship Milestone level for each subcompetency from the ACGME's Accreditation Data System on 89 subjects who started fellowship training between 2014 and 2018 at 6 NPM fellowship programs. Mixed-effects models were used to examine the intra-individual changes in Milestone scores between residency and fellowship after adjusting for the effects of the individual programs.

Results

A total of 1905 subcompetency Milestone levels were analyzed. The average first fellowship Milestone levels were significantly lower than the last residency Milestone levels (residency, mean 3.99 [SD = 0.48] vs fellowship 2.51 [SD = 0.56]; P < .001). Milestone levels changed by an average of -1.49 (SD = 0.65) from the last residency to the first fellowship evaluation. Significant differences in Milestone levels were seen in both context-dependent subcompetencies (patient care and medical knowledge) and context-independent subcompetencies (professionalism).

Conclusions

Contrary to providing a linear trajectory of professional development, we found that Milestone levels were reset when trainees transitioned from general pediatrics residency to NPM fellowship.

Objectives

We sought to compare the last assigned Milestone levels upon graduation from general pediatrics residency to the first assigned Milestone levels as first-year neonatal-perinatal medicine (NPM) fellows.

Findings

Average first fellowship Milestone levels were significantly lower than the last residency Milestone levels, and differences in Milestone levels were seen in both context-dependent subcompetencies (patient care and medical knowledge) and context-independent subcompetencies (professionalism).

Limitations

The study is limited by a retrospective observational design with a small sample size within a single specialty.

Bottom Line

Our results suggest Milestone levels are reset during the transition from general pediatrics residency to NPM fellowship and do not create a logical trajectory of professional development in essential elements of competency.

Preparing residents and fellows for independent practice is the primary goal of graduate medical education (GME). Competency-based medical education (CBME), an outcomes-based approach to training, was introduced to achieve this goal.1–4 In 2013, the Accreditation Council for Graduate Medical Education (ACGME) implemented the Next Accreditation System (NAS) in an effort to advance CBME. A key element of the NAS is measuring and reporting educational Milestones.5 Milestones are defined by specialty-specific narrative descriptions of the trajectory of professional development.5 The goal of the Milestones framework is to “create a logical trajectory of professional development in essential elements of competency”5 and to connect medical education programs from undergraduate to graduate, and into practice and maintenance of certification.6 The Milestone levels are numeric ratings based on a developmental framework from “beginning resident” (level 1) to “aspirational” (level 5).6–12 Level 4 generally denotes a level at which trainees are “ready for unsupervised practice.”6–12

Kogan et al identified 4 themes influencing how faculty judge and numerically rate residents' clinical skills. The themes include variable frames of reference, a high degree of inference, variable approaches to synthesizing data into numerical ratings, and factors external to resident performance.13 According to Kogan and colleagues, these themes highlight the variability in the process of assigning numerical ratings based on clinical observations. They also challenge the internal structure (an important source of validity) of numerically based assessments of clinical skills.14,15 Despite these concerns, several studies have shown a predictable upward trend of Milestone levels during residency training.8,9,16–21 Reports from the ACGME suggest a similar upward trend of Milestone levels during subspecialty fellowship training.22 There is an evidence gap, however, regarding changes in Milestone levels during the transition from residency to subspecialty fellowship training.

Our study aim was to compare the last assigned Milestone levels of trainees upon graduation from general pediatrics residency to their first assigned Milestone levels as first-year neonatal-perinatal medicine (NPM) fellows. We hypothesized that Milestone level changes would show a linear trajectory of professional development as the trainees advanced from graduating residents to first-year fellows.

Setting and Participants

Study participants included a convenience sample of NPM fellows who started training between 2014 and 2018 at 6 NPM fellowship programs. Fellowship programs were selected to represent a range of sizes (small to large) and geographic diversity: Northeast, Midwest, South, and West.

Interventions

The study followed a multicenter retrospective cohort design. From the 21 general pediatrics subcompetencies11  and the 21 pediatrics subspecialty subcompetencies,12  we identified 11 subcompetencies that were identical between the 2 training contexts (provided as online supplementary data). The 11 subcompetencies included a combination of “context-independent” and “context-dependent” subcompetencies.23 

Context-independent subcompetencies are those in which the assessment of Milestone levels should not change based on the context of the training environment. As noted by Heath et al, the competencies of professionalism and interpersonal and communication skills are examples of context-independent subcompetencies.23  In contrast, the subcompetencies of patient care and medical knowledge are examples of context-dependent subcompetencies for which the assessment of Milestone levels may change based on the context of the training environment.23  Other subcompetencies in systems-based practice and practice-based learning and improvement are more ambiguous.23  There were no identical interpersonal and communication skills subcompetencies between general pediatrics residency and NPM fellowship. Therefore, the only context-independent subcompetencies we were able to analyze were for professionalism.

Outcomes Measured

Each site investigator created a list of all trainees in their NPM fellowship program from 2014 to 2018. They then retrieved each fellow's Milestone levels on the 11 studied subcompetencies from the ACGME Accreditation Data System website at 2 time points: the last assessment during residency and the first assessment during fellowship. Milestone scores were deidentified after extraction and uploaded to a central database for analysis. Data were collected between March and November 2019 and managed using REDCap electronic data capture tools hosted at the University of Washington.24,25 Subjects who completed only 2 years of pediatrics residency before entering NPM fellowship were excluded from the study.

Analysis of the Outcomes

As suggested by the ACGME, Milestone levels were treated as ordinal data and reported as means and standard deviations (SD).26,27 Mixed-effects models were used to examine the intra-individual changes in Milestone scores between residency and fellowship after adjusting for the effects of the individual programs. For each individual, a difference in Milestone performance was calculated for each Milestone, and then an average was taken across all Milestones. Using this average difference for each individual, linear models were used to examine the difference in Milestone performance between residency and fellowship across programs, as well as differences in Milestone performance across years after adjusting for the program as a factor. Upper and lower quartiles were determined based on the top 75th percentile and bottom 25th percentile Milestone levels. Milestone changes were calculated as the numerical change between the last residency Milestone level and the first fellowship Milestone level. For example, if a trainee's subcompetency Milestone level was 3.0 on their last residency evaluation and 4.0 on their first fellowship evaluation, the Milestone level change was +1; if a trainee was at level 4.0 on their last residency evaluation and then level 3.0 on their first fellowship evaluation, the change was -1. A P value of < .05 was considered statistically significant. Statistical analyses were conducted using RStudio version 1.2.5033.
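The change calculation described above can be illustrated with a minimal Python sketch. This is not the authors' analysis code (the study used R, and the mixed-effects modeling is not reproduced here). The subcompetency labels, the example levels, and the choice to represent “not yet assessable” ratings as None and skip them are assumptions made only for illustration.

```python
from statistics import mean


def milestone_changes(last_residency, first_fellowship):
    """Return per-subcompetency changes (first fellowship minus last residency).

    Both arguments map a subcompetency label to a numeric Milestone level.
    A level of None stands in for "not yet assessable" and is skipped
    (an assumption of this sketch, not a documented study rule).
    """
    changes = {}
    for name, residency_level in last_residency.items():
        fellowship_level = first_fellowship.get(name)
        if residency_level is None or fellowship_level is None:
            continue
        changes[name] = fellowship_level - residency_level
    return changes


def average_change(last_residency, first_fellowship):
    """Average the per-subcompetency changes for one trainee."""
    changes = milestone_changes(last_residency, first_fellowship)
    return mean(changes.values()) if changes else None


# The two worked cases from the text: 3.0 -> 4.0 is +1, 4.0 -> 3.0 is -1.
print(milestone_changes({"PC1": 3.0}, {"PC1": 4.0}))  # {'PC1': 1.0}
print(milestone_changes({"PC1": 4.0}, {"PC1": 3.0}))  # {'PC1': -1.0}

# Hypothetical trainee with one rating that could not be assessed.
residency = {"PC1": 4.0, "MK1": 4.0, "PROF1": 4.5, "SBP1": 3.5}
fellowship = {"PC1": 2.5, "MK1": 2.0, "PROF1": 3.0, "SBP1": None}
print(average_change(residency, fellowship))
```

A negative average indicates that the trainee's first fellowship levels were, on balance, lower than their last residency levels.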

The study was approved by the Institutional Review Board at each participating site with a waiver of signed informed consent.

Demographic data on the study subjects and the fellowship programs are provided in Table 1. Last residency and first fellowship Milestone levels were available for 89 trainees. The initial data set contained 1957 subcompetency Milestone levels. Of these, 52 Milestone levels were rated as “not yet assessable.” The final number of subcompetency Milestone levels analyzed was 1905.

Table 1

Subject (n = 89) and Program (n = 6) Demographics


The average last residency Milestone levels were significantly higher than the first fellowship Milestone levels (residency, mean Milestone level 3.99 [SD = 0.48] vs fellowship 2.51 [SD = 0.56]; P < .001). Average Milestone levels changed by a mean of -1.49 (SD = 0.65) from last residency to first fellowship assessment. The Figure shows the differences in average last residency and first fellowship Milestone levels in the 5 ACGME core competencies. Table 2 shows the differences in last residency and first fellowship Milestone levels for the 11 subcompetencies analyzed in this study after adjusting for the fellowship training program in the mixed-effects model.

Figure

Comparison of Average Milestone Rating in Each ACGME Core Competency for 89 Trainees Based on Last Assessment During Residency and First Assessment During Fellowship

Note: Patient care included 3 subcompetencies; medical knowledge included 1 subcompetency; systems-based practice included 2 subcompetencies; practice-based learning and improvement included 2 subcompetencies; professionalism included 3 subcompetencies.

Table 2

Changes in Milestone Levels From Last Evaluation as Graduating Resident to First Evaluation as Neonatal-Perinatal Medicine Fellow


There were some statistically significant differences in last residency and first fellowship Milestone levels across the 6 training programs (Table 3). However, no significant change was noted in Milestone levels over time (Table 4).

Table 3

Differences in Milestone Ratings Across Programs

Table 4

Differences in Milestone Ratings Over Time


There was no difference in first fellowship Milestone levels or Milestone level changes between trainees who did residency and fellowship in the same program and those who did not (same program, mean Milestone level 2.52 [SD = 0.54] vs different program 2.51 [SD = 0.57], P = .50; same program, mean Milestone level change -1.48 [SD = 0.49] vs different program -1.49 [SD = 0.66], P = .50).

Fellows who had been chief residents or NICU/newborn hospitalists before fellowship had significantly lower Milestone levels as fellows and a larger decrease in Milestone levels than those who had not (first fellowship mean Milestone level for chief/hospitalist, 2.42 [SD = 0.58] vs not 2.53 [SD = 0.55], P = .010; chief/hospitalist, mean Milestone level change -1.68 [SD = 0.69] vs not -1.45 [SD = 0.64], P < .001).

Residents with Milestone levels in the top quartile had a greater decrease in Milestone levels on first fellowship evaluation than residents in the bottom quartile (top 75th percentile, mean change -1.88 [SD = 0.59] vs bottom 25th percentile, -1.32 [SD = 0.59], P < .001). There was no clear relationship between having a last residency Milestone level in the top or bottom quartile and having a first fellowship Milestone level in the top or bottom quartile. Of the 20 residents with Milestone levels in the top 75th percentile, only 2 (10%) also had Milestone levels in the top 75th percentile on first fellowship evaluation. Of the 20 residents with Milestone levels in the bottom 25th percentile, 8 (40%) also had Milestone levels in the bottom 25th percentile on first fellowship evaluation.
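The quartile comparison above can be sketched in a few lines of Python. This is an illustrative sketch only: the trainee identifiers and levels are hypothetical, and the cutoff rule (Python's default `statistics.quantiles` method, with ties at a cutoff counted as inside the quartile) is an assumption, since the text does not state how the percentile boundaries were computed.

```python
from statistics import quantiles


def quartile_groups(levels):
    """Split trainees into bottom- and top-quartile sets.

    `levels` maps a trainee identifier to that trainee's mean Milestone level.
    Returns (bottom_ids, top_ids).
    """
    q1, _median, q3 = quantiles(levels.values(), n=4)
    bottom = {trainee for trainee, level in levels.items() if level <= q1}
    top = {trainee for trainee, level in levels.items() if level >= q3}
    return bottom, top


def share_retained(before, after):
    """Fraction of the `before` group that is also in the `after` group."""
    return len(before & after) / len(before) if before else None


# Hypothetical mean levels for 8 trainees at the two time points.
residency_means = {"A": 3.2, "B": 3.5, "C": 3.8, "D": 4.0,
                   "E": 4.1, "F": 4.3, "G": 4.5, "H": 4.8}
fellowship_means = {"A": 2.0, "B": 2.9, "C": 2.4, "D": 2.6,
                    "E": 3.1, "F": 2.2, "G": 2.5, "H": 3.0}

res_bottom, res_top = quartile_groups(residency_means)
fel_bottom, fel_top = quartile_groups(fellowship_means)

print(share_retained(res_top, fel_top))        # share of top-quartile residents still in the top quartile
print(share_retained(res_bottom, fel_bottom))  # share of bottom-quartile residents still in the bottom quartile
```

In the study, the corresponding shares were 2 of 20 (10%) for the top quartile and 8 of 20 (40%) for the bottom quartile.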

We studied the ability of the ACGME Milestone levels to accurately measure professional development along the continuum from residency to subspecialty fellowship training. As opposed to seeing a plateau or increase in Milestone level, we found a significant drop in Milestone levels in all 11 subcompetencies studied during the transition from residency to fellowship. The drop was seen in both context-dependent and context-independent subcompetencies. These data suggest a general “resetting of the bar” when trainees move from pediatrics residency to NPM fellowship.

In a study by Li and colleagues of 2030 pediatrics residents across 47 programs, the investigators found significant variation in end-of-year Milestone levels, but noted that the variability decreased during training.28 They reported that 79% of graduating third-year pediatrics residents received a Milestone level of ≥ 3 in all 21 pediatrics subcompetencies, and 21% received a level of ≥ 4 in all subcompetencies. The authors concluded that most graduating pediatrics residents were still advancing on the Milestone continuum and that a Milestone level of ≥ 3 at graduation was a realistic target for pediatrics residents. The Pediatrics Milestone Project supports a target Milestone level of 3 at graduation.11 In our study, the average last residency Milestone level was 3.99 (SD = 0.48), and the average first fellowship Milestone level was 2.51 (SD = 0.56). Therefore, the average first-year NPM fellow Milestone level was lower than the target resident graduation level. The mean Milestone level changed by -1.49 (SD = 0.65) from residency graduation to the start of fellowship. This decrease was seen in both context-dependent and context-independent subcompetencies. One could argue that this resetting of the bar is appropriate for the context-dependent core competencies of patient care and medical knowledge. For context-dependent subcompetencies, it is logical that a graduating pediatrics resident would be level 4 in general pediatrics patient care and medical knowledge, but not level 4 in neonatology patient care and medical knowledge. We find it hard to argue, however, that a resetting of the bar is appropriate for the 3 context-independent subcompetencies of professionalism we studied. We cannot explain the drop of 1.27 Milestone levels in professional conduct, 1.48 levels in trustworthiness, and 1.59 levels in capacity to accept that ambiguity is part of clinical medicine. Moreover, we cannot explain why the fellows in our study who had been chief residents or NICU/newborn hospitalists before starting fellowship had a larger drop in Milestone levels than trainees who went directly into fellowship after residency.

Heath and colleagues evaluated Milestone levels of internal medicine subspecialty fellows in the first 6 months of fellowship training as an indicator of response process validity.23 The subcompetencies chosen were context-independent and included professionalism and interpersonal and communication skills. The authors assumed that those subcompetencies would not be affected by changes in the evaluation context of residency versus fellowship. The investigators found that 34% of professionalism subcompetencies and 26% of interpersonal and communication skills subcompetencies were scored at less than the resident “graduation target” of 4 during the first 6 months of fellowship training. These findings raise concerns about the validity of the Milestone system in specific medical subspecialties.

Our findings are consistent with those of Heath et al and provide additional evidence that Milestone levels are systematically reset when trainees begin fellowship. Why might clinical competency committees (CCCs) reset Milestone levels at the start of fellowship training? One reason is that CCCs want to allow for an upward progression in Milestone levels during training. In a 3-year NPM fellowship, this means that each fellow needs to start at a level < 4 to progress to the suggested target level of 4 at graduation.12 This issue is a specific concern with the 11 subcompetencies used in the current study, which were the same for residents and fellows. In those 11 subcompetencies, a “ceiling effect” restricts an upward progression during fellowship. Another reason that CCCs might reset Milestone levels at the start of fellowship is a misapplication of the Milestones. An example of misapplication is to assign levels based on the year of training, rather than individual characteristics of the trainee. In a study of family medicine Milestones by Peabody et al, the authors found that the year of training was the primary factor in assigning Milestone levels.20 This same phenomenon may be present in NPM fellowships. Misunderstanding of the Milestone levels by the CCC is another potential explanation. While each Milestone level we studied had a behavioral anchor, there is some degree of interpretation involved (eg, what is “professional conduct,” “trustworthiness,” or “capacity to accept ambiguity”?). To our knowledge, no psychometric analysis has been done on the pediatrics and pediatrics subspecialty Milestones. However, psychometric analysis of the family medicine Milestones supports the idea that some descriptors cause confusion.20

This study has several limitations. The sample of 89 trainees was small, which increases the risk of bias. Further studies in larger cohorts of NPM fellows would offer additional validity evidence. Since each medical specialty and subspecialty has unique subcompetencies and Milestone levels, our study results may not be generalizable to other medical specialties and subspecialties. Due to the small number of training programs in this study, we could not perform subgroup analysis regarding the impact of program size or geographic location. Such programmatic factors may influence resident and fellow evaluation and should be controlled for in future studies.

Efforts can be taken at the local level to improve the quality of Milestones data. Kinnear et al examined available evidence and developed a set of practical tips to maximize the value of CCCs. Some tips to improve the CCC output included using assessment data from multiple sources, conducting regular CCC member training, and engaging the committee in continuous quality improvement.29 Efforts to improve Milestone-based assessment are underway at the national level. The ACGME is currently working on “Milestones 2.0.”30 The Milestones 2.0 process started in 2016 with the creation of Harmonized Milestones for interpersonal and communication skills, practice-based learning and improvement, professionalism, and systems-based practice.30 The ability of Milestones 2.0 to accurately demonstrate professional development from residency to fellowship training remains to be seen.

Our results suggest Milestone levels are reset when trainees transition from general pediatrics residency to NPM fellowship training. These findings challenge the idea that the Milestones create a logical trajectory of professional development in essential elements of competency.

The authors would like to thank the clinical competency committees for their diligent work in this study; Tommy Wood, BM, BCh, PhD, for conducting the statistical analysis; and the Institute for Translational Health Science for allowing the use of REDCap electronic data capture tools.

1. Frank JR, Snell LS, Cate OT, et al. Competency-based medical education: theory to practice. Med Teach. 2010;32(8):638–645.
2. Frank JR, Mungroo R, Ahmad Y, Wang M, De Rossi S, Horsley T. Toward a definition of competency-based education in medicine: a systematic review of published definitions. Med Teach. 2010;32(8):631–637.
3. Hawkins RE, Welcher CM, Holmboe ES, et al. Implementation of competency-based medical education: are we addressing the concerns and challenges? Med Educ. 2015;49(11):1086–1102.
4. ten Cate O, Scheele F. Competency-based postgraduate training: can we bridge the gap between theory and clinical practice? Acad Med. 2007;82(6):542–547.
5. Nasca TJ, Philibert I, Brigham T, Flynn TC. The next GME accreditation system—rationale and benefits. N Engl J Med. 2012;366(11):1051–1056.
6. Allen S. Development of the family medicine milestones. J Grad Med Educ. 2014;6(1 suppl 1):71–73.
7. Accreditation Council for Graduate Medical Education. ACGME Milestones Project: Lessons Learned and What's Next. 2020.
8. Hauer KE, Clauser J, Lipner RS, et al. The internal medicine reporting milestones: cross-sectional description of initial implementation in US residency programs. Ann Intern Med. 2016;165(5):356–362.
9. Hauer KE, Vandergrift J, Hess B, et al. Correlations between ratings on the resident annual evaluation summary and the internal medicine milestones and association with ABIM certification examination scores among US internal medicine residents, 2013–2014. JAMA. 2016;316(21):2253–2262.
10. Philibert I, Brigham T, Edgar L, Swing S. Organization of the educational milestones for use in the assessment of educational outcomes. J Grad Med Educ. 2014;6(1):177–182.
11. Accreditation Council for Graduate Medical Education and The American Board of Pediatrics. Pediatric Milestones Project. 2021.
12. Accreditation Council for Graduate Medical Education and The American Board of Pediatrics. The Pediatrics Subspecialty Milestone Project. 2021.
13. Kogan JR, Conforti L, Bernabeo E, Iobst W, Holmboe E. Opening the black box of clinical skills assessment via observation: a conceptual model. Med Educ. 2011;45(10):1048–1060.
14. Messick S. Validity. In: Linn RL, ed. Educational Measurement. 3rd ed. New York, NY: American Council on Education and Macmillan; 1989:13–103.
15. Kane MT. Validation. In: Brennan RL, ed. Educational Measurement. 4th ed. Westport, CT: Praeger; 2006:17–64.
16. Beeson MS, Carter WA, Christopher TA, et al. The development of the emergency medicine milestones. Acad Emerg Med. 2013;20(7):724–729.
17. Beeson MS. The emergency medicine milestones: with experience comes suggestions to improve. Acad Emerg Med. 2016;23(12):1434–1436.
18. Beeson MS, Holmboe ES, Korte RC, et al. Initial validity analysis of the emergency medicine milestones. Acad Emerg Med. 2015;22(7):838–844.
19. Aagaard E, Kane GC, Conforti L, et al. Early feedback on the use of the internal medicine reporting milestones in assessment of resident performance. J Grad Med Educ. 2013;5(3):433–438.
20. Peabody MR, O'Neill TR, Peterson LE. Examining the functioning and reliability of the family medicine milestones. J Grad Med Educ. 2017;9(1):46–53.
21. Turner TL, Bhavaraju VL, Luciw-Dubas UA, et al. Validity evidence from ratings of pediatric interns and subinterns on a subset of pediatric milestones. Acad Med. 2017;92(6):809–819.
22. Accreditation Council for Graduate Medical Education. Milestones National Report 2019. 2021.
23. Heath JK, Dine CJ. ACGME Milestones within subspecialty training programs: one institution's experience. J Grad Med Educ. 2019;11(1):53–59.
24. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–381.
25. Harris PA, Taylor R, Minor BL, et al. The REDCap consortium: building an international community of software partners. J Biomed Inform. 2019;95:103208.
26. Accreditation Council for Graduate Medical Education. Milestones Annual Report 2016. 2021.
27. Peterson LE, Rankin W. Are Milestones really measuring development? J Grad Med Educ. 2017;9(3):310–312.
28. Li ST, Tancredi DJ, Schwartz A, et al. Competent for unsupervised practice: use of pediatric residency training milestones to assess readiness. Acad Med. 2017;92(3):385–393.
29. Kinnear B, Warm EJ, Hauer KE. Twelve tips to maximize the value of a clinical competency committee in postgraduate medical education. Med Teach. 2018;40(11):1110–1115.
30. Edgar L, Roberts S, Holmboe E. Milestones 2.0: a step forward. J Grad Med Educ. 2018;10(3):367–369.

Author notes

Editor's Note: The online version of this article contains the general pediatrics and pediatrics subspecialty subcompetencies used in the study.

Funding: The authors report no external funding source for this study.

Competing Interests

Conflict of interest: The authors declare they have no competing interests.
