Abstract
Resident well-being impacts competence, professionalism, career satisfaction, and the quality of care delivered to patients.
We established normative scores, reported evidence of the relationship of the Physician Well-Being Index (PWBI) score to other variables and of consequence validity for the PWBI in a national sample of residents, and evaluated the performance of the index after replacing the original fatigue item with an item not associated with driving a car.
We conducted a cross-sectional survey study of a national sample of 20 475 residents. The survey included the PWBI, instruments assessing mental quality of life (QOL) and fatigue, and items on recent suicidal ideation and medical error. Fisher exact test or Wilcoxon/2-sample t test procedures were used with a 5% type I error rate and a 2-sided alternative.
Of 7560 residents who opened the e-mail to participate in the study, 1701 (22.5%) completed the survey. Residents with low mental QOL, high fatigue, or recent suicidal ideation were more likely to endorse each of the PWBI items and a greater number of total items (all P < .001). At a threshold score of ≥ 5, the PWBI's specificity for identifying residents with low mental QOL, high fatigue, or recent suicidal ideation was 83.6%. PWBI score also stratified residents' self-reported medical errors. The PWBI performed similarly using either fatigue item.
The 7-item PWBI appears to be a useful screening index to identify residents whose degree of distress may negatively impact the quality of care they deliver.
What was known
Earlier work shows some residents are in distress during training, with a negative impact on competence, career satisfaction, and quality of care.
What is new
A 7-item instrument identified residents in distress, with a high distress score also associated with a greater number of residents' self-reported medical errors.
Limitations
Survey study assessed association, but not the direction of the relationship or causality; low response rate and the potential for respondent bias.
Bottom line
A brief screening tool with adequate construct validity may be used to identify residents who may benefit from added resources, or in resident self-assessment and subsequent help seeking.
Editor's Note: The online version of this article contains details about the Physician Well-Being Index used in this study.
Introduction
Poor mental health among residents can impair their competency, professionalism, career satisfaction, and the quality of care delivered to patients.1–6 Burnout and other types of distress increase the risk of alcohol abuse, suicidal ideation, and motor vehicle accidents.7–10 Unfortunately, many trainees hesitate to seek help, and some opt to self-prescribe antidepressants,9,11–13 which is a risky and inappropriate strategy.14–17 Barriers to help seeking may include self-doubt about the need for treatment, perceived stigma, concerns about negative consequences, and a professional culture of stoicism.9,18,19
Existing tools to evaluate distress are long, have complex scoring, and typically measure only 1 domain of distress (eg, fatigue, burnout). We first developed the Medical Student Well-Being Index (MSWBI), a 7-item instrument to screen for multiple dimensions of distress that commonly affect medical students. Evidence of content-related validity, internal structure validity, and reliability of the MSWBI, as well as how the MSWBI score relates to 3 clinically relevant outcomes (low mental quality of life [QOL], suicidal ideation, or serious thoughts of dropping out of medical school) and methods for establishing cut-scores in a national sample of medical students, have been published.20,21 A slightly modified version, the Physician Well-Being Index (PWBI), was subsequently developed and tested in a national sample of 6994 practicing physicians in 2011.22 That study provided further validity evidence by showing that the PWBI score stratifies not only mental QOL and recent suicidal ideation but also degree of fatigue, career satisfaction, and important clinical practice measures (intent to leave current practice and self-reported medical error) in a diverse sample of physicians who had completed training and were in practice.
In the present study, we extend this work by gathering validity evidence in a national sample of residents, as the working conditions, work-hour expectations, and distress levels of residents are distinct from those of both medical students and practicing physicians. We report evidence of the relationship of the index score to other variables and of consequence validity for the PWBI in this national sample of residents. We also evaluate the performance of the index after replacing the original fatigue item with a new fatigue item that does not depend on regularly driving a car.
Methods
Participants
As previously reported,23 in 2012 we surveyed 20 475 US residents/fellows (“residents”) listed in the Physician Masterfile (PMF), which contains nearly all residents independent of American Medical Association (AMA) membership, who had an e-mail address on file with the AMA, and permitted it to be used for correspondence. Invitations were sent to these residents; 7560 residents opened at least 1 e-mail invitation. We considered those who opened an e-mail to have received an invitation to participate in the study.24 Participation was voluntary and all responses were anonymous.
Study Measures
The survey included the PWBI; items inquiring about demographics, specialty area, recent suicidal ideation, and medical error; and standardized instruments to measure QOL.
Physician Well-Being Index
We have previously reported the methods used to develop the 7-item PWBI. Details are provided as online supplemental material. Briefly, the index is intended to include the domains of burnout, depression, stress, fatigue, and mental and physical QOL. It consists of 7 yes/no items and respondents receive a score from 0 to 7 based on responses.21 Our previous studies suggest a threshold score of ≥ 4 for medical students and practicing physicians.20,22 At a threshold score of ≥ 4, the specificity for detecting medical students and practicing physicians with low mental QOL was 87.7% and 81.0%, respectively, and the sensitivity was 59.2% and 73.3%, respectively.
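The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: it assumes that each "yes" response contributes 1 point (the text states only that scores range from 0 to 7), and the function names `pwbi_score` and `screens_positive` are hypothetical. The item wording is in the online supplemental material and is not reproduced here.

```python
from typing import Sequence

NUM_ITEMS = 7  # the PWBI consists of 7 yes/no items


def pwbi_score(responses: Sequence[bool]) -> int:
    """Return the index score, assuming 1 point per 'yes' (True) response."""
    if len(responses) != NUM_ITEMS:
        raise ValueError(f"expected {NUM_ITEMS} responses, got {len(responses)}")
    return sum(1 for answer in responses if answer)


def screens_positive(score: int, threshold: int) -> bool:
    """Return True when the score is at or above the chosen threshold."""
    if not 0 <= score <= NUM_ITEMS:
        raise ValueError("score must be between 0 and 7")
    return score >= threshold


# Hypothetical respondent who answers 'yes' to 5 of the 7 items.
example_responses = [True, True, False, True, True, False, True]
example_score = pwbi_score(example_responses)
```

The threshold is left as a parameter because the appropriate cutoff differs by population: the earlier studies suggested ≥ 4 for medical students and practicing physicians.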
Fatigue
As not all residents drive cars regularly, we explored whether the performance of the PWBI improved when an alternative fatigue item not dependent on a need to drive (“have you fallen asleep while sitting inactive in a public place”) was used in place of the original PWBI fatigue item (“have you fallen asleep while stopped in traffic or driving”). Both items stemmed from the Epworth Sleepiness Scale,25 which was identified during our original instrument development phase to capture the intended construct.26 All participants were asked both fatigue items, and index scores were calculated for each participant using both the original and the alternative fatigue item to determine whether either item improved performance.
Other Study Measures
Residents rated their mental QOL and level of fatigue over the past week on a standardized linear analog scale. This scale has evidence of validity in a variety of medical conditions and populations27,28 and has been used in other studies of residents.5 The item inquiring about recent suicidal ideation within the last 12 months is similar to questions used in large US epidemiologic studies29,30 and has been used in previous samples of physicians.9 One item identical to that used in a previous study of residents and practicing physicians inquired about self-perceived medical errors in the last 3 months.5,31
Relationship to Other Variables
As distress has multiple dimensions (eg, depression, burnout, fatigue, QOL) there is not a single gold standard for defining “severe distress.” In this study, we assessed the ability of the PWBI to:
Identify residents with low mental QOL defined by a score ≥ 1/2 SD below the sex-matched general population norm5 (a clinically meaningful effect size32)
Identify residents who had high level of fatigue defined by a score ≥ 1/2 SD below the sex-matched general population norm (lower scores indicate higher fatigue)33
Identify residents who reported suicidal ideation within the last 12 months as an alternative measure of distress because it represents a clinically relevant outcome that warrants individualized counseling
Stratify residents' likelihood of reporting a recent major medical error to identify those in need of support that could reduce inappropriate self-blame
The Mayo Clinic Institutional Review Board approved the study.
Statistical Analysis
We used basic descriptive statistics and Fisher exact test or Wilcoxon/2-sample t test procedures, as appropriate. We used a 5% type I error rate and a 2-sided alternative. We calculated the sensitivity, specificity, and likelihood ratios (LRs) associated with PWBI scores for outcomes of interest. Lastly, we used a hypothetical cohort to explore the practicality of using the PWBI at various cut points. We conducted all analyses using SAS version 9 (SAS Institute Inc, Cary, NC).
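For readers less familiar with these screening metrics, the sketch below shows how sensitivity, specificity, and LRs follow from a 2 × 2 table of screening result against outcome. It is written in Python for illustration only (the study analyses were run in SAS), the function and class names are hypothetical, and the counts in the example are invented rather than taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningMetrics:
    sensitivity: float
    specificity: float
    lr_positive: float
    lr_negative: float


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> ScreeningMetrics:
    """Compute standard screening metrics from 2 x 2 table counts.

    tp/fn: residents with the outcome who screen positive/negative.
    fp/tn: residents without the outcome who screen positive/negative.
    Assumes specificity is strictly between 0 and 1 so both LRs are defined.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return ScreeningMetrics(sensitivity, specificity, lr_positive, lr_negative)


# Invented counts, NOT study data: 100 residents with and 100 without the outcome.
example = screening_metrics(tp=70, fp=20, fn=30, tn=80)
```

With these invented counts the sensitivity is 0.70, the specificity is 0.80, and the positive and negative LRs are about 3.5 and 0.375.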
Results
Of the 7560 residents who opened the e-mail invitation, 1701 residents (22.5% participation rate) completed the survey. The demographics of responders in comparison to the 129 608 residents listed in the PMF were generally similar, although fewer responders were male (participants 49% versus PMF 54%).23 The specialty distribution of responders was similar to US residents in general,34 with the exception of slightly fewer responders in family medicine (6.9% versus 10.0%) and a larger proportion in obstetrics-gynecology (11.0% versus 5.2%). Rates of burnout, symptoms of depression, recent suicidal ideation, high fatigue, and QOL have been previously reported23 and were similar to findings in other studies with substantially higher participation rates.2
Mental QOL
Residents with low mental QOL were more likely to endorse each PWBI item (table 1) as well as a greater total number of items (mean 5.0 [SD 1.5] versus mean 2.7 [SD 2.0], P < .001). As the number of PWBI items endorsed increased so did the odds of having low mental QOL. The likelihood ratio for low mental QOL among residents with PWBI scores ≥ 5 was 3.42 in comparison to 0.37 for those with scores < 5.
Using exact PWBI scores, which provide a way to estimate an individual resident's risk of distress after they complete the PWBI, the LR of low mental QOL ranges from 0.06 to 5.66 (table 2). Assuming a 28.9% prevalence of low mental QOL (ie, the approximate prevalence in the overall 2012 AMA sample23 and in previous large studies of physicians6) as the pretest probability, the PWBI exact score can lower the posttest probability to 2.4%, or raise it to 70.0%.
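The conversion from pretest to posttest probability used here is the standard odds form of Bayes' theorem: convert the pretest probability to odds, multiply by the LR, and convert back. A minimal Python sketch follows; the function name `posttest_probability` is hypothetical, and the inputs are the pretest prevalence and the LR range reported above.

```python
def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability and an LR into a posttest probability."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Values taken from the text: 28.9% pretest probability, LR range 0.06 to 5.66.
lowest = posttest_probability(0.289, 0.06)
highest = posttest_probability(0.289, 5.66)
```

With the rounded LRs from the text, this gives roughly 2.4% and 70%, in line with the reported posttest probabilities; small discrepancies in the last digit would be expected from rounding of the published LRs.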
Fatigue and Suicidal Ideation
Residents with high levels of fatigue or recent suicidal ideation were more likely to endorse each PWBI item and a greater number of total items (all P < .001). As the number of PWBI items endorsed increased, so did the odds of high fatigue (odds ratio [OR] 0.34–5.189) and suicidal ideation (OR 0.22–4.65). Assuming a prevalence of 32.5% for high fatigue5 as the pretest probability, the PWBI exact score can lower the posttest probability to 15.4% or raise it to 68.7% (table 2). Similarly, using a prevalence of 7.98% for recent suicidal ideation9,23 as the pretest probability, the PWBI exact score can lower the posttest probability to < 1%, or raise it to 25%.
For each outcome (ie, low mental QOL, high level of fatigue, and recent suicidal ideation), the likelihood ratio was less than 1 for those with an exact score of 4 or below and greater than 1 for those with exact scores of 5, 6, or 7, suggesting 5 is the inflection point.
Threshold Scores
Threshold scores (eg, ≥ 1, ≥ 2) estimate the risk of distress in the group scoring at or above a specific threshold. Thus, they can be useful for establishing a cutoff score to identify a subset of residents who may benefit from further evaluation or support. At a threshold score of ≥ 5, the specificity of the PWBI for identifying residents with low mental QOL was 79.4% and the sensitivity was 70.3%.
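As a consistency check, the sensitivity and specificity at the ≥ 5 threshold can be converted to likelihood ratios with the standard definitions. The worked arithmetic below uses only the two percentages reported above:

```latex
\begin{align*}
\mathrm{LR}_{+} &= \frac{\text{sensitivity}}{1 - \text{specificity}}
                 = \frac{0.703}{1 - 0.794}
                 = \frac{0.703}{0.206} \approx 3.41 \\
\mathrm{LR}_{-} &= \frac{1 - \text{sensitivity}}{\text{specificity}}
                 = \frac{1 - 0.703}{0.794}
                 = \frac{0.297}{0.794} \approx 0.37
\end{align*}
```

These values agree, to within rounding, with the LRs of 3.42 and 0.37 reported for low mental QOL at scores ≥ 5 and < 5.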
Next, we examined the prevalence of high fatigue or suicidal ideation among residents who had index scores ≥ 5 but did not have low mental QOL. These residents would be considered “false positives” based on their mental QOL score alone. Among the 244 residents with index scores ≥ 5 who did not have low mental QOL, 84 had high fatigue and 27 reported recent suicidal ideation, suggesting that they were experiencing substantial distress despite their mental QOL scores. When only residents with PWBI scores ≥ 5 who had none of these 3 outcomes (low mental QOL, high fatigue, or recent suicidal ideation) were considered false positives, the specificity increased to 83.6%.
PWBI Scores and Practice-Related Outcome
Residents who reported a recent major medical error were also more likely to endorse each of the individual PWBI items and a greater number of total items (both P < .001). Assuming an estimated prevalence of 16.3% for recent major medical error5,23 as the pretest probability, the PWBI exact score can lower the posttest probability to 4.8% or raise it to 39.3% (table 2).
Alternative Fatigue Item
Overall, the PWBI performed similarly using either fatigue item. For example, the posttest probability for high fatigue using the alternative fatigue item ranged from 16.5% (score of 0) to 65.4% (score of 7), and sensitivity and specificity for high fatigue differed by only 2.0% to 3.0%.
Use of the PWBI to Screen a Hypothetical Cohort of Residents
Lastly, we modeled the outcome of screening a hypothetical cohort of 100 residents using the PWBI and a PWBI score ≥ 5 to identify residents at risk. Assuming a prevalence of low mental QOL, fatigue, and recent suicidal ideation similar to that of the 2012 national sample of US residents, 35 would have a score ≥ 5. Of these 35 residents, 26 would have low mental QOL, recent suicidal ideation, or high fatigue.
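The cohort arithmetic can be approximated from prevalence, sensitivity, and specificity. The Python sketch below is one plausible reconstruction, not the modeling procedure actually used in the study (which is not described in detail); the function name is hypothetical. It uses the figures reported for low mental QOL alone (28.9% prevalence, 70.3% sensitivity, and 79.4% specificity at a score ≥ 5).

```python
def expected_screening_counts(
    cohort_size: int, prevalence: float, sensitivity: float, specificity: float
) -> dict:
    """Expected numbers of screen-positive residents in a hypothetical cohort."""
    with_outcome = cohort_size * prevalence
    without_outcome = cohort_size - with_outcome
    true_positives = with_outcome * sensitivity
    false_positives = without_outcome * (1 - specificity)
    return {
        "screen_positive": true_positives + false_positives,
        "true_positives": true_positives,
        "false_positives": false_positives,
    }


# Inputs are the low mental QOL figures reported in the text for a score >= 5.
counts = expected_screening_counts(100, 0.289, 0.703, 0.794)
```

With these inputs, about 35 of 100 residents would be expected to score ≥ 5, consistent with the figure above, and about 20 of them would have low mental QOL. The count of 26 in the text refers to the broader composite outcome (low mental QOL, recent suicidal ideation, or high fatigue), which this sketch does not model.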
Discussion
In this cohort of 1701 residents representing all specialties and training stages, the PWBI stratified residents' well-being and identified those with low mental QOL, high fatigue, and recent suicidal ideation. At a threshold score of ≥ 5, the PWBI's specificity for identifying residents with low mental QOL is 79.4% and the sensitivity is 70.3%. The specificity improved to 83.6% when we defined false positives as not having low mental QOL, high fatigue, or recent suicidal ideation, a reasonable approach because each of these outcomes warrants identification and response regardless of the mental QOL score. At this threshold score the PWBI's specificity is comparable to that of other established and widely used screening instruments.35–40 Notably, the threshold score of ≥ 5 is different from the threshold score previously identified in large samples of medical students and practicing physicians.20,22 This reinforces the need to evaluate the PWBI in a large sample of residents as was done in this study. We also found that replacing the original fatigue item with a new fatigue item not dependent on a need to drive did not substantially alter the performance of the PWBI.
Favorable characteristics of the PWBI include its brevity, ease of administration, simple scoring strategy, breadth of dimensions of distress covered, and ability to identify residents whose degree of distress places them at risk for committing a medical error. Hence, it is well suited for use as a brief self-assessment tool that residents could regularly use to assess their current level of distress and gain insight into when their distress is placing them at risk for potentially serious consequences. Residents who completed the PWBI could also receive immediate personalized feedback about how their level of distress compares to peers, and information about local mental health and wellness resources. Thus, it has the potential to help residents self-monitor, promote their own wellness, and recognize when they need help.
Residency programs could use the PWBI as part of a screening process to improve resource allocation to those residents in greatest need of support. Such a proactive approach has the potential to lead to earlier intervention when a resident's distress may be less severe, and before it has led to adverse consequences. Screening approaches may need to be pursued through an employee assistance program or a provost who is not connected to the residency program to ensure confidentiality is maintained. Based on this study's results, a threshold score of 5 or greater appears reasonable. Alternatively, aggregate, deidentified resident scores on the PWBI could be provided to residency programs to give global information regarding trainee well-being.
This study has several limitations. First, distress is a multidimensional construct and no gold standard exists for measuring it. We examined 3 clinically relevant dimensions of distress (low mental QOL, high fatigue, and suicidal ideation) with potential for serious personal and professional consequences. The PWBI may not stratify risk for other dimensions of distress. However, as mean mental QOL decreased in a stepwise fashion with each 1-point increase in PWBI score, the PWBI appears to be a powerful risk stratification tool. Second, the PWBI is a screening tool and not a diagnostic instrument. Hence, it is intended to improve residents' self-awareness, provide calibration relative to peers, and identify those who may benefit from further evaluation or support. Third, we conducted a cross-sectional study, with the potential for selection bias. Future studies are needed to evaluate the consequences of residents completing the PWBI (including subsequent allocation of resources or help seeking), as well as the instrument's ability to assess change over time. Strengths include our methodological approach of evaluating the instrument in a national sample of residents while simultaneously using well-established metrics and identifying residents whose degree of distress is associated with clinically important outcomes and relevant practice-related risks.
Conclusion
Our results offer further evidence of construct validity of the PWBI in a selected national sample of residents. Further research is needed to determine the utility of using the PWBI as a screen to identify residents who may benefit from individual resources or for residents to self-assess their level of distress, and whether screening with the PWBI facilitates help-seeking behaviors.
References
Author notes
All authors are at the Mayo Clinic. Liselotte N. Dyrbye, MD, MHPE, is Associate Professor of Medicine, Division of Primary Care Internal Medicine, Department of Medicine, College of Medicine; Daniel Satele, BA, is Statistician in Biomedical Statistics and Informatics, Department of Health Sciences Research; Jeff Sloan, PhD, is Professor of Biostatistics and Oncology, Department of Health Sciences Research; and Tait D. Shanafelt, MD, is Professor of Medicine, Division of Hematology, Department of Medicine, College of Medicine.
Funding: Funding for this study was provided by the American Medical Association and the Mayo Clinic Department of Medicine Program on Physician Well-Being.