Recent studies showed that psychological safety is important to resident perception of the work environment, and improved psychological safety improves resident satisfaction survey scores. However, there is no evidence in medical education literature specifically addressing relationships between psychological safety and learning behaviors or its impact on learning outcomes.
We developed and gathered validity evidence for a group learning environment assessment tool using Edmondson's Teaming Theory and Webb's Depth of Knowledge model as a theoretical framework.
In 2018, investigators developed the preliminary tool. The authors administered the resulting survey to neonatology faculty and trainees at Baylor College of Medicine morning report sessions and collected validity evidence (content, response process, and internal structure) to describe the instrument's psychometric properties.
Between December 2018 and July 2019, 450 surveys were administered, and 393 completed surveys were collected (87% response rate). Exploratory factor analysis and confirmatory factor analysis testing the 3-factor measurement model of the 15-item tool showed acceptable fit of the hypothesized model with standardized root mean square residual = 0.034, root mean square error of approximation = 0.088, and comparative fit index = 0.987. Standardized path coefficients ranged from 0.66 to 0.97. Almost all absolute standardized residual correlations were less than 0.10. Cronbach's alpha scores showed internal consistency of the constructs. There was a high correlation among the constructs.
Validity evidence suggests the developed group learning assessment tool is a reliable instrument to assess psychological safety, learning behaviors, and learning outcomes during group learning sessions such as morning report.
The literature demonstrates psychological safety is important to resident perception of the work environment and can enhance resident satisfaction survey scores. However, there is no evidence in medical education literature specifically addressing relationships between psychological safety and learning behaviors or its impact on learning outcomes.
A group learning environment assessment tool using Edmondson's Teaming Theory as a framework.
This was a single-center study, which limits generalizability. The survey relied on self-reported data and may have been influenced by social desirability and acquiescence biases.
The tool is a reliable instrument to assess psychological safety, learning behaviors, and learning outcomes during group learning sessions.
Psychological safety has recently garnered more attention within medical education, as it has been shown to improve residents' perception of their work environment.1 It refers to the perception that a learner is free to take interpersonal risks such as reporting mistakes or problems, or sharing new ideas without feeling they will be penalized for highlighting their vulnerability.1,2 Early work by Edmondson showed that observed medical teams displaying higher levels of teamwork disclosed more medical errors in order to encourage learning and improvement.3 These observations led to the development of “Teaming Theory” in which psychological safety and team learning behaviors are core constructs.3 Although learners in any environment naturally seek to minimize interpersonal risks, environments like morning report (typically a case-based teaching session for faculty and trainees at academic institutions4), which are marked by constant evaluation and hierarchy, can make disclosing medical errors and knowledge deficits challenging.5,6
Although there is extensive evidence that shows psychological safety improves team performance in the business literature, there is limited evidence in medical education. Studies suggested psychological safety may be relevant to resident perception of clinical experiences,1 and it correlates with resident satisfaction survey scores.7 Additionally, one study reported “toxic” work environments—“(negative) interactions with faculty/attendings” and “attendings who berate resident physicians”—as the most critical factor influencing medical trainee burnout.8 Others have highlighted ensuring the psychological safety of learning environments as a key factor in facilitating learning.9
We posit that psychological safety experienced by medical trainees and faculty is an essential, but currently overlooked, component for assessing quality of group learning environments such as morning report. Using Teaming Theory as part of our theoretical framework,3,10 we aimed to develop a tool to assess the group learning environment experienced during a morning report, including its impact on learning outcomes, and to gather validity evidence to support the interpretation of the results derived from this tool.
We used a systematic approach to survey creation to develop our group learning environment assessment tool and Messick's framework to guide the accrual of validity evidence to support the interpretation of assessment results.11–13 Content and response process evidence was obtained during tool development. Internal structure evidence was obtained through psychometric evaluation.
Tool Development and Theoretical Framework
First, we affirmed the need for assessing psychological safety within the Baylor College of Medicine neonatology section through a comprehensive review of the literature,1–3,5,6,9,14–17 informal interviews with fellows and personal communication with faculty and experts on medical education and teamwork within and outside of the institution, and a review of our Accreditation Council for Graduate Medical Education survey results from 2016 to 2018. Through an iterative process by investigators and experts, we developed a theoretical framework (Figure 1) based on synthesis of the literature using a “Teaming” theory (team psychological safety and team learning behaviors models3) as well as the addition of Webb's Depth of Knowledge model.18 Edmondson defines team psychological safety as “a shared belief that the team is safe for interpersonal risk taking” and team learning behaviors as “an ongoing process of reflection and action, characterized by asking questions, seeking feedback, experimenting, reflecting on results, and discussing errors or unexpected outcomes of actions.”3 Together, the team psychological safety and team learning behaviors delineate desirable learning attributes of interest. The depth of knowledge model offers a framework for learning outcomes and requires learners to demonstrate their level of understanding and their ability to transfer knowledge between various contexts (ie, morning report to the bedside).
This theoretical framework guided the tool development. Item creation was based on Edmondson's prior psychological safety survey,3 which resulted in 18 items covering 3 domains: team psychological safety, team learning behaviors, and depth of knowledge. Each item was written as a short phrase asking participants to respond on a 5-point Likert scale. To ascertain content validity, the tool went through an iterative process using expert consensus of faculty in the pediatric sections of neonatology, critical care, hospital medicine, and medical education. In total, we reworded 3 items and added 3 items. We then performed cognitive interviews with end users to assess response process validity. Ten faculty members (neonatology, hospital medicine, and critical care) and 10 fellows (neonatology, cardiology, and critical care) offered feedback on clarity, understanding, and interpretation of tool items and format. As a result, examples were added to each item in the depth of knowledge domain. The tool was then piloted in December 2018 and revised before administration via paper form.
Setting and Participants
After development, the tool was administered to the neonatology section morning report participants using purposive sampling of morning report on days 5 and 26 of each 4-week service block as well as 2 to 3 additional randomized days during each block. The study period began in December 2018 and concluded in July 2019. Morning report is held daily from 8:00 to 8:45 am, during which the post-call fellow presents overnight admissions and any active or clinically challenging patients. It serves as an overnight handoff to the day team and offers opportunities to review imaging and discuss diagnostic and clinical challenges. Participants include on-service fellows, faculty, radiologists, and other learners (off-service fellows and faculty, medical students, and neonatal nurse practitioners). At the conclusion of morning report, participants were invited to complete the tool reflecting on their experiences during the session.
We conducted exploratory factor analysis (EFA) to examine the factor structure. Confirmatory factor analysis (CFA) was then performed to assess how well our final model fit the data, followed by measurement of internal consistency (Cronbach's alpha, inter-factor correlations, and item-total correlations) to determine validity evidence for internal structure. Sample size was determined using the participants-to-items criterion because communalities were not known a priori. We aimed for a minimum of 180 responses for EFA (10 participants per item for the 18-item draft) and a minimum of 200 responses for CFA.19,20 The Kaiser-Meyer-Olkin measure was used to determine sampling adequacy.21
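The sample-size rule and the Kaiser-Meyer-Olkin (KMO) measure described above can be sketched as follows. This is a minimal illustration, not the authors' SPSS procedure: the function names are hypothetical, and the sketch assumes Likert responses are treated as numeric and summarized with Pearson correlations.

```python
import numpy as np


def min_sample_size(n_items: int, participants_per_item: int = 10) -> int:
    """Participants-to-items rule of thumb: a fixed number of respondents per item."""
    return n_items * participants_per_item


def kmo(data: np.ndarray) -> float:
    """Overall KMO measure for an (n_respondents, n_items) array.

    Assumption: Pearson correlations on numerically coded responses.
    """
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations are obtained from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    # KMO approaches 1 when partial correlations are small relative to raw ones.
    return float(r2 / (r2 + p2))


# 18 draft items at 10 participants per item
print(min_sample_size(18))  # 180
```

Values of KMO closer to 1 suggest that the items share enough common variance for factor analysis to be worthwhile.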
Exploratory factor analysis determined whether the tool items intended to assess the proposed 3 domains would load on their respective factors. Factors were estimated with the weighted least squares mean- and variance-adjusted (WLSMV) estimator and rotated with an oblique (Geomin) rotation that provided the best-defined factor structure. We planned a priori to interpret an item as loading on a factor if the rotated factor pattern loading was ≥ 0.50 for that factor.22
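The a priori loading rule can be illustrated with a short sketch. The loading values below are made up for illustration and are not taken from Table 1; the function name is hypothetical, and applying the cutoff to the absolute value of the loading is an assumption not stated in the text.

```python
import numpy as np


def assign_items(loadings: np.ndarray, cutoff: float = 0.50) -> list[list[int]]:
    """For each item (row), list the factors (columns) it loads on.

    Assumption: the cutoff is applied to the absolute rotated loading.
    """
    return [
        [int(j) for j in np.flatnonzero(np.abs(row) >= cutoff)]
        for row in loadings
    ]


# Hypothetical rotated pattern loadings for 3 items on 3 factors
example = np.array([
    [0.82, 0.05, 0.10],   # loads on factor 0 only
    [0.12, 0.71, 0.03],   # loads on factor 1 only
    [0.55, 0.60, 0.02],   # meets the cutoff on two factors (cross-loading)
])
print(assign_items(example))  # [[0], [1], [0, 1]]
```

An item that meets the cutoff on more than one factor, like the third row, is the kind of cross-loading that would prompt a closer look at the item.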
Confirmatory factor analysis was then conducted to assess how well the model fit another set of data. Multiple fit indices, measuring the degree to which the factor model reproduced the empirical covariance matrix, were computed and compared against recommended cutoffs. Cronbach's α coefficient was calculated to determine internal consistency of items within a factor (≥ 0.7 satisfactory).23 Homogeneity of factors was examined using item-total correlations (≥ 0.40 indicates all items are positively correlated).24 In addition, all corrected item-total correlations were assessed, with values above 0.2 indicating an acceptable correlation between each item and the overall score. Statistical significance was set at P ≤ .05.
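For readers less familiar with these reliability statistics, the following sketch shows how Cronbach's α and corrected item-total correlations are conventionally computed. It is an illustration under the assumption that responses are coded numerically in an (n_respondents, n_items) array; the function names are hypothetical, and this is not the SPSS routine used in the study.

```python
import numpy as np


def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return float((k / (k - 1)) * (1 - item_vars.sum() / total_var))


def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the remaining items."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])
```

The correction removes each item from the total before correlating, so an item is not credited for correlating with itself.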
We performed factor analyses in Mplus version 8.3 (Muthén & Muthén, Los Angeles, CA) and used SPSS 25.0 (IBM Corp, Armonk, NY) to assess internal consistency and the Kaiser-Meyer-Olkin Measure of Sampling Adequacy.
The Institutional Review Board of Baylor College of Medicine approved the study.
The tool comprises 3 domains: team psychological safety, team learning behaviors, and depth of knowledge (Figure 2). We administered 450 surveys and received 393 completed surveys (87% response rate) from 25 morning report sessions. The Kaiser-Meyer-Olkin value of 0.87 (> 0.6) indicated sampling adequacy for EFA, and EFA on the first 190 responses suggested 3 factors based on eigenvalues (≥ 1) and the scree plot. Given very high cross-loadings on team psychological safety items 6 and 7, both were deleted to reduce redundancy among the items. Because of a very high correlation coefficient (0.9) between depth of knowledge items 3 and 4, and because we learned that many respondents had difficulty answering item 4, item 4 was removed. The rotated factor loadings are presented in Table 1. The high loadings (> 0.5) for all factors indicated a pivotal relationship between the factor and variable.25 All items clustered with like items, consistent with the proposed 3 domains.
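The arithmetic behind the response rate and the EFA/CFA split, along with the eigenvalue (≥ 1) rule, can be sketched as follows. The counts come from the text; the function name is hypothetical, and the use of Pearson correlations is a simplifying assumption, since the reported analysis was run in Mplus with a WLSMV estimator.

```python
import numpy as np

# Counts reported in the text
administered, completed, efa_n = 450, 393, 190

response_rate_pct = round(100 * completed / administered)
cfa_n = completed - efa_n
print(response_rate_pct, cfa_n)  # 87 203


def n_factors_kaiser(data: np.ndarray, threshold: float = 1.0) -> int:
    """Count eigenvalues of the item correlation matrix at or above the threshold.

    Assumption: Pearson correlations on numerically coded responses.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # symmetric matrix, real eigenvalues
    return int(np.sum(eigenvalues >= threshold))
```

In practice the eigenvalue count is read alongside the scree plot, as was done here, rather than used on its own.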
Confirmatory factor analysis on the remaining 203 observations showed mediocre to excellent fit indices for the 3-factor model: standardized root mean square residual of 0.034 (< 0.055 = ideal25), root mean square error of approximation of 0.088 (0.08–0.10 = mediocre), and comparative fit index of 0.987 (> 0.96 = excellent25). Taken together, these findings indicate that the overall structure of the model fits the data acceptably. Structural equation modeling provided standardized path coefficients and significance levels for parameter estimates of the associations as shown in Figures 3a and 3b. The results indicate a significant effect of team psychological safety on team learning behaviors (β = 0.75, P < .0001) and a significant effect of team learning behaviors on depth of knowledge (β = 0.66, P < .0001). Team psychological safety had no significant direct effect on depth of knowledge (β = 0.143, P = .09), so that path was excluded from the final model. All path coefficients were high (0.66 to 0.97) with favorable t values (> 2.58) supporting the 3-factor model.25 Almost all standardized residual correlations were small (< 0.10),26 indicating a good fit of the model.
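As a compact check of how the reported fit indices compare with the cutoffs cited in this paragraph, consider the sketch below. The function name is hypothetical, and the RMSEA band is assumed to be the conventional 0.08 to 0.10 range for mediocre fit.

```python
def check_fit(srmr: float, rmsea: float, cfi: float) -> dict[str, bool]:
    """Compare fit indices with the cutoffs cited in the text.

    Assumption: RMSEA between 0.08 and 0.10 is labeled mediocre.
    """
    return {
        "srmr_ideal": srmr < 0.055,
        "rmsea_mediocre": 0.08 <= rmsea <= 0.10,
        "cfi_excellent": cfi > 0.96,
    }


# Values reported for the 3-factor model
print(check_fit(srmr=0.034, rmsea=0.088, cfi=0.987))
# {'srmr_ideal': True, 'rmsea_mediocre': True, 'cfi_excellent': True}
```

Reading several indices side by side, rather than relying on a single one, is what supports the overall judgment of acceptable fit.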
Factor descriptives, Cronbach's alpha, and inter-factor correlations are given in Table 2. Cronbach's alpha values for all 3 factors indicated good to excellent internal consistency reliability (≥ 0.8). The correlations among the 3 factors were high, ranging from 0.66 to 0.70. For each of the 15 retained items, item-total score correlations were positive and significant (P < .001), affirming the significant contribution of each item to the total score of the tool.
We developed a group learning environment assessment tool using “Teaming” theory as a theoretical framework. The tool content was guided by this framework based on 3 existing domains: team psychological safety,3 team learning behaviors,3 and depth of knowledge.18 Through the development process and psychometric evaluation, we gathered validity evidence pertaining to content, response process, and internal structure.13 To our knowledge, this is the first study offering empirical evidence in medical education addressing psychological safety and learning behaviors and outcomes in group learning sessions. Although the setting of this study is a morning report session, the systematic approach we described can be used to guide an adaptation of the instrument for other forms of group learning sessions (case conferences, morbidity and mortality conferences, bedside teaching, etc) in medical training that require an educational dialogue27 to facilitate optimal learning. We propose this tool can be administered to faculty and trainees to assess the quality of group learning conferences and inform program improvement efforts. We suggest administering the tool 2 to 4 times per year or as needed to assess psychological safety within these sessions.
Psychological safety is a vital component to group learning and team behaviors.14 It allows learners to take interpersonal risks, exhibit curiosity, ask questions, and show their vulnerabilities of knowledge deficits.6 In a recent qualitative study in medical education, safe learning environments not only built trust and set clear expectations for learners, but also encouraged critical thinking.28 Psychologically safe environments can have profound effects on learning, growth potential, and physician burnout, and may help distinguish Socratic teaching from “pimping.”8,9,17,28
In addition to bringing attention to psychological safety domains for assessing education process within medical training, this is the first study to evaluate psychological safety at an institutional level and examine the relationships between psychological safety, learning behaviors, and learning outcomes (which have been shown to be critical relationships in organizational work teams).2,3,15,16 We did not find a direct association between team psychological safety and depth of knowledge, which may be due to rating errors. Upon reviewing the raw data, there appears to be acquiescence bias in the depth of knowledge domain (ie, giving the same rating to each depth of knowledge item). A relative ranking scale for this domain may have provided more accurate data than the Likert scale we used, as it would have encouraged participants to identify how much of each level they perceived during morning report.
This study has limitations. Self-reporting of the team learning behaviors and depth of knowledge poses threats to validity. Social desirability bias, acquiescence bias, and survey fatigue could have influenced the results. This is a single center study, which limits generalizability. Future studies should investigate the impacts of psychological safety on observed learning behaviors and performance measures. Finally, we did not include validity evidence pertaining to relations to other variables and consequences.
In this study, we developed a theory-informed group learning environment assessment tool. We demonstrated acceptable validity evidence to support the use of the tool as a means of assessing psychological safety and its consequences (learning behaviors and learning outcomes) during group learning sessions such as morning report.
Funding: This project was funded by the Evangelina “Evie” Whitlock Fund, Texas Children's Hospital Educational Grant.
Conflict of interest: The authors declare they have no competing interests.
This work was previously presented at the Pediatric Academic Societies Annual Meeting, Baltimore, MD, April 24–May 1, 2019.
The authors would like to thank Drs. K. Suresh Gautham, B. Gandhi, J. Garcia-Prats, S. Parmekar, and G. Singhal (Baylor College of Medicine) for their valuable and constructive suggestions during the planning and development of this research work. Their willingness to give their time so generously has been very much appreciated.