ABSTRACT
Handoff communication skills are an essential component of resident training. There are no previous systematic reviews of feedback and evaluation tools for physician handoffs.
We performed a systematic review of articles focused on inpatient handoff feedback or assessment tools.
The authors conducted a systematic review of English-language literature published from January 1, 2008, to May 13, 2015 on handoff feedback or assessment tools used in undergraduate or graduate medical education. All articles were reviewed by 2 independent abstractors. Included articles were assessed using a quality scoring system.
A total of 26 articles with 32 tools met inclusion criteria, including 3 focused on feedback, 8 on assessment, and 15 on both feedback and assessment. All tools were used in an inpatient setting. Feedback and/or assessment improved the content or organization measures of handoff, while process and professionalism measures were less reliably improved. The Handoff Clinical Evaluation Exercise or a similar tool was used most frequently. Of included studies, 23% (6 of 26) were validity evidence studies, and 31% (8 of 26) of articles included a tool with behavioral anchors. A total of 35% (9 of 26) of studies used simulation or standardized patient encounters.
A number of feedback and assessment tools for physician handoffs have been studied across several specialties, although research on each individual tool remains limited. These tools may assist medical educators in assessing trainees' handoff skills.
Introduction
Handoffs, the “process of transferring primary authority and responsibility for providing clinical care to a patient from 1 departing caregiver to 1 oncoming caregiver,”1 have been demonstrated to be a significant causative factor in medical errors.2
Educators have noted that feedback3 and assessment4 are essential facilitators of learning.5 The Accreditation Council for Graduate Medical Education (ACGME) requires programs to monitor handoffs6 to ensure resident competence in this vital communication skill. To provide effective resident monitoring, programs will need handoff feedback and assessment tools.
Methods
Literature Search
An experienced medical librarian (E.M.J.) conducted a comprehensive literature search for English-language articles published on inpatient, shift-to-shift handoffs between January 1, 2008, and May 13, 2015, in Ovid MEDLINE, Ovid MEDLINE In-Process & Other Non-Indexed Citations, Journals@Ovid, CINAHL (EBSCOhost), and “ePub ahead of print” in PubMed. We chose relevant controlled vocabulary and keywords to capture the concepts of handoff, including its multiple synonyms (provided as online supplemental material).
All article titles were independently reviewed for inclusion by at least 2 trained reviewers (from the following group: J.D., C.E., M.M., L.A.R.). If either reviewer selected a reference, the full text was ordered for further review. Using this strategy, 1497 articles were obtained. The percent agreement on initial independent selection of articles for further review was 94%. Interrater reliability using Cohen's kappa was κ = 0.72 (P < .001).
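For readers unfamiliar with these agreement statistics, the following is a minimal sketch of how percent agreement and Cohen's kappa would be computed from 2 reviewers' include/exclude screening decisions; the counts shown are hypothetical and do not reflect the actual screening data of this review.

```python
# Illustrative calculation of percent agreement and Cohen's kappa for
# two reviewers' include/exclude screening decisions.
# The 2 x 2 counts below are hypothetical, not the study's actual data.

def cohens_kappa(both_include, r1_only, r2_only, both_exclude):
    """Return (percent_agreement, kappa) for a 2 x 2 agreement table."""
    n = both_include + r1_only + r2_only + both_exclude
    observed = (both_include + both_exclude) / n          # p_o, raw agreement
    r1_include = (both_include + r1_only) / n             # reviewer 1 inclusion rate
    r2_include = (both_include + r2_only) / n             # reviewer 2 inclusion rate
    expected = (r1_include * r2_include
                + (1 - r1_include) * (1 - r2_include))    # p_e, chance agreement
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

if __name__ == "__main__":
    p_o, kappa = cohens_kappa(both_include=150, r1_only=40,
                              r2_only=35, both_exclude=1050)
    print(f"percent agreement = {p_o:.1%}, kappa = {kappa:.2f}")
```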
All full-text articles were reviewed by teams of 2 trained reviewers (from the following: J.D., C.R., C.E., M.M.). In cases where reviewers disagreed, articles were discussed by the team until consensus was reached. To identify other relevant articles, the reference sections of all included articles were checked by 2 independent research assistants (C.E. and M.M.).
Inclusion and Exclusion Criteria
At the outset, we developed a comprehensive systematic review protocol, including operational definitions, inclusion and exclusion criteria, and search strategy details. Feedback was defined as any formative process of providing information or constructive criticism that could help improve handoff performance. Assessment was defined as a summative process of assessing performance related to knowledge, content, attitudes, behaviors, or skills.
Articles were eligible for review if they included medical students', residents', fellows', or attending physicians' inpatient, shift-to-shift handoffs; reported quantitative or qualitative research data; and focused on feedback or assessment tools aimed at the learner. Exclusion criteria included articles that focused on interhospital or intrahospital transfer, were anecdotal or had no data, or were letters to the editor, commentaries, editorials, or newsletter articles.
Abstraction Process
The team used an iterative process to develop and pilot test an abstraction form designed to confirm final eligibility for full review, assess article characteristics, and extract data relevant to the study. Each article was independently abstracted by 2 of 3 trained reviewers (J.D., C.E., M.M.). The 2 abstractors, along with an author independent of the abstraction process (L.A.R.), discussed and combined the 2 abstractions into a final version. All abstraction disagreements were minor and were resolved during discussions between the reviewers.
Quality Assessment
The team used the Medical Education Research Study Quality Instrument (MERSQI) developed by Reed et al10 to assess quality. It is an 18-point, 6-domain instrument designed specifically for medical education research. The 6 domains are study design, sampling, type of data, validity of assessment instruments' scores, data analysis, and outcomes evaluated. Since its introduction in 2007, multiple studies have shown evidence of its validity and reliability.10–12 Studies were quality scored on each item via team consensus to arrive at final MERSQI scores. As described in its original use,10 the total MERSQI score was calculated as the percentage of total achievable points. This percentage was then adjusted to a standard denominator of 18 to allow for comparison of MERSQI scores across studies.
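As a concrete illustration of this scoring arithmetic, the following minimal sketch assumes the standard 6-domain, 18-point MERSQI structure; the domain scores and the choice of which item is "not applicable" are invented for the example and are not taken from any included study.

```python
# Illustrative MERSQI rescaling: the total score is expressed as the fraction
# of achievable points and rescaled to the standard 18-point denominator, so
# that studies with "not applicable" items remain comparable across studies.
# The domain scores below are invented for this example.

def adjusted_mersqi(domain_scores, domain_maxima):
    """Rescale achieved points (out of applicable points) to an 18-point scale."""
    achieved = sum(domain_scores)
    achievable = sum(domain_maxima)  # items judged not applicable are excluded
    return 18 * achieved / achievable

if __name__ == "__main__":
    # Hypothetical study in which one of the six 3-point domains is not applicable:
    scores = [2.0, 1.5, 3.0, 2.0, 2.5]   # achieved points in applicable domains
    maxima = [3.0, 3.0, 3.0, 3.0, 3.0]   # achievable points (15 of 18)
    print(f"adjusted MERSQI = {adjusted_mersqi(scores, maxima):.1f}")  # 13.2
```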
Response rate is the proportion of those eligible who completed the posttest or survey. For intervention studies, this is the proportion of those enrolled who completed the intervention assessment. For outcomes, handoff demonstration measures were considered skill acquisition if the handoff measure was done once during an intervention, and behavioral demonstration if there were multiple measurements over time in an actual health care setting. If a study measured multiple levels of outcomes, it was given the score corresponding to the highest level of outcome it measured.
Types of Data Reported
We categorized data reported into 4 types: content, process, handoff organization, and professionalism. These were defined as (1) content, which describes items included in the handoff related to a patient's health-related history, treatment management or planning, or hospital course or updating these items; (2) process, which evaluates or assesses environmental or other components of a quality handoff (eg, limiting interruptions, quiet location); (3) handoff organization, which describes adherence to a predefined order of handoff items, patients to be handed off, or coherence and understandability of handoff presentation; and (4) professionalism, which describes provider conduct and appropriateness in the health care setting and relationships with colleagues.
Validity evidence was grouped according to the 5-category validity framework developed by Beckman et al13 and expanded by Cook and Lineberry14: content, internal structure, response process, relationships with other variables, and consequences.
Content included face validity, adapting items from an existing instrument, stakeholder review, literature search, or previous publication. Internal structure included all forms of reliability, factor analysis, or internal consistency. Pilot testing was included as part of response process, whether data of the pilot were reported or not. Relationships with other variables was shortened to “relational” and included correlation to any outside factor or tool. Consequences included any potential objective change or outcome (regardless of whether there was a change or not and regardless of whether the change was intended or not) after feedback or assessment was implemented, as well as any impact on the evaluator or evaluee.13,14
Results
Our search strategy yielded 10 774 unique articles (total with duplicates 13 019). After reviewing the search, we identified 26 articles (32 tools) published between January 1, 2008, and May 13, 2015, that focused on inpatient handoff feedback or assessment tools (figure). Of these articles, 3 were relevant to feedback only,27,29,34 8 to assessment only,15,18,19,21,23,31,33,35 and 15 to feedback and assessment (tables 1 and 2).16,17,20,22,24–26,28,30,32,36–40 Copies of some tools are available from the authors on request.
Mnemonic expansions for the tools listed in tables 1 and 2 are as follows (the SIGNOUT expansion varies slightly in wording among the included studies and is consolidated here):
iCATCH, i: identify (name, medical record number, date of admission, code status); C: chief complaint or presenting symptoms; A: active problem list; T: therapies and interventions (planned for the next 24 hours); C: clinical trajectory and condition (sick or not sick; response to therapy; helping the receiving caregiver anticipate problems); H: help me (encourage questions and dialogue).
SIGNOUT, S: sick or not sick, including code/do-not-resuscitate status; I: identifying patient information (name, medical record number); G: general hospital course (reason for admission); N: new events of the day; O: overall health status (getting better or worse); U: upcoming possibilities with plan and rationale; T: tasks to complete overnight, with time for questions.
dINAMO, d: doctor, remember; I: identify (age, sex, name); N: needs of the patient (chief complaints); A: analysis (state of the evaluation); M: medical management (planned evaluation or treatment); O: organization (planned transfer, discharge).
I-PASS, I: illness severity; P: patient summary; A: action list; S: situation awareness and contingency plans; S: synthesis by receiver.
The mean quality score of the studies was 12.2 (SD = 2.4; range = 7–16.5; possible maximum = 18). The consistently lowest-scoring domains were study design (mean = 1.5, SD = 0.62), outcome (mean = 1.7, SD = 0.42), and sampling (mean = 1.7, SD = 0.54). Ten studies (38%) reported funding; however, the mean quality score was identical for funded and unfunded studies (12.2).
Most of the studies occurred in the United States (22 of 26, 85%).15–34,37,40 Only 2 studies occurred entirely outside of the United States,35,36 and 2 more occurred in both Canada and the United States.38,39 Study designs varied across the articles. The most commonly used design was pre-post intervention (11 of 26, 42%).15,17,24,28,30,32,33,36–39 Other designs included validity evidence only (6 of 26, 23%);19,21,26,29,31,35 randomized controlled trial (2 of 26, 7.7%);16,25 posttest study (2 of 26, 7.7%);20,40 observational study (2 of 26, 7.7%);18,23 and matched group design with random assignment to control and trained groups (1 of 26, 3.8%).22 The studies included the specialties of internal medicine (12 of 26, 46%);15,16,18,19,23–27,31,34,37 pediatrics (3 of 26, 12%);33,38,39 pediatric cardiac critical care (1 of 26, 3.8%);21 surgery (1 of 26, 3.8%);29 emergency medicine (1 of 26, 3.8%);36 and gastroenterology (1 of 26, 3.8%).40 Several of the studies used participants from more than 1 specialty (7 of 26, 27%).17,20,23,28,30,32,35 Interns and residents were the participants most frequently studied (21 of 26, 81%),15–18,20,22–27,29–34,36–39 but studies also included attending physicians (7 of 26, 27%),19,21,27,31,36,38,40 fellows (2 of 26, 7.7%),21,40 medical students (2 of 26, 7.7%),27,28 nurse practitioners (1 of 26, 3.8%),31 and physician assistants (1 of 26, 3.8%).31 One study focused on physicians but also included pharmacists, nurses, psychologists, and educators (1 of 26, 3.8%).35
Feedback
Feedback methods varied. Most often, feedback was provided 1-on-1 to learners (15 of 18, 83%).17,20,22,24,25,28–30,32,34,36–40 However, 17% (3 of 18) of the articles reported that feedback was provided in group sessions as part of an intervention or curriculum.16,26,27 All but 1 article with feedback26 showed statistically significant improvements in at least 1 component assessed.
The most commonly used method was to provide feedback to the learner once or during 1 session (11 of 18 studies, 61%).18,22,26,28–30,32,37–40 Some studies provided feedback to learners more than once (7 of 18 studies, 39%).16,20,24,25,27,34,36 Studies providing feedback over time showed varied results, ranging from significant increases in handoff providers' satisfaction with the quality of their own verbal handoffs from preintervention to postintervention20 and significant improvements in all measured content and organization elements (2 of 3, 67%),29 to mixed results in which some elements improved (inclusion of advance directives and anticipatory guidance) while organization and readability did not (1 of 3, 33%).24
Of the 18 studies, 3 (17%) provided feedback for several weeks or months.24,27,34 All reported some improvements over time, with 1 study documenting statistically significant improvement in overall quality score.27
Feedback provided to the learners usually included content of the handoff (17 of 18, 94%).16,17,20,24–30,32,34,36–40 All studies that compared content with a control group or preintervention baseline showed improvement.16,24,27,30,36,39 Fewer studies provided feedback on the process of the handoff (6 of 18, 33%).16,20,28,37–39
The specific content outcomes measured after feedback varied. Code status was the item that most frequently showed statistically significant improvement in inclusion during handoffs after feedback.24,30,37,39 Other items that were often statistically improved after feedback were medications,16,30,39 anticipatory guidance,24,27,30,34,37,39 and diagnostic tests/results.27,36,39 Occasionally, some content items were omitted more frequently after feedback, such as major medical problems16 or asking whether the receiver had any questions.35
Assessment
The assessment process was measured in heterogeneous ways across studies. The Handoff Clinical Evaluation Exercise (CEX) or tools based on it were the most commonly used.18,19,28,31 Articles with assessment tools used several types of outcome measures, including content-based (22 of 23, 96%);15–21,23–26,28,30–33,35–40 process-based (11 of 23, 48%);16,18–20,28,31,35,37–40 perception of professionalism (11 of 23, 48%);18–20,22,26,28,31,35,38–40 and organizational measures (17 of 23, 74%).15,16,18–26,28,31,33,35,38,39
Five articles included more than 1 assessment tool (table 2): 1 with self-perception and receiver-perception of handoff;15 1 with verbal and written assessment;30 1 with separate tools for the giver and receiver;31 and 2 with 3 tools (1 each for printed, verbal giver, and verbal receiver).38,39 One study used a single tool in a global assessment of a trainee in roles of both sender and receiver.18
Feedback and Assessment
In 7 studies, the person providing feedback and/or assessment received training.16,25,28,30,37–39 Of the studies that contained both feedback and assessment, 4 had tools exclusively for feedback,16,36,38,39 although many studies used their assessment tools as a feedback guide.17,20,23–28,30,33
Seven studies assessed the accuracy of handoff content: 4 embedded this assessment in the tool,20,25,38,39 2 used independent retrospective chart review,30,34 and 1 queried senior faculty.36 In addition, 7 studies used tools that assessed whether or not the content of the handoff was updated.16,18,23–26,37
Learners were evaluated using audiotapes16 and videotapes24,33,37 in several studies. In 2 of the studies using videotape, learners were able to review the recordings for educational purposes.33,37 Two studies used real patient handoffs, 1 with audiotape16 and 1 with videotape,37 and 2 used simulated handoffs.22,33 All 4 demonstrated significant improvements, either in pre-post comparisons33,37 or when compared with a control group.16,22 The observed simulated handoff experience was used in 2 studies,17,28 and the objective structured clinical examination was used in 1 study.40 Overall, 9 studies used some form of simulation, standardized patient encounter, or standardized resident encounter.17,22,26,28,32,33,38–40 Three studies combined educational/simulation testing with workplace-based testing.37–39
Six articles focused solely on describing or offering validity evidence for a tool.19,21,26,29,31,35 Other studies, not specifically aimed at validation, also reported various types of validity evidence (table 2). Eight articles used behavioral anchors for at least some levels of tool items,18–20,22,23,26,28,31 with 2 using anchors for all levels.20,23
Discussion
Our systematic review of the literature yielded 26 articles and 32 tools relevant to feedback and assessment of inpatient handoff communication. The interventions and outcomes measured varied widely across the studies. As expected, most articles showed that using feedback and/or assessment improved the content or organization measures included in the respective tools. Process and professionalism measures were less reliably improved. Two studies measured perceived safety,32,34 and 1 study measured actual patient outcomes (medical errors and adverse events).39
Handoff communication errors have been linked to adverse patient outcomes, which has led to a national focus on the need to improve handoff communication. However, the existing literature on handoff feedback and assessment tools has not demonstrated a clear link between use of these tools and improved patient outcomes. Although Starmer and colleagues39 demonstrated improved patient outcomes, their study evaluated a bundle of interventions rather than a handoff feedback/assessment tool alone, so the contribution of the tool itself cannot be isolated.
The tools identified were diverse. One reason for this is that different specialties and institutions may require different types of handoffs with different relevant information. To address this, some handoff experts have proposed the concept of flexible standardization, a core set of universally accepted components that can be modified as needed for a specific institution or specialty.41–43 The same principle would apply to feedback and assessment tools. In addition, patient handoffs must balance consistent content with the flexibility required by diverse patient scenarios, and feedback and assessment tools should address this dynamic tension.
The Handoff CEX or tools based on it are the most widely studied tools we identified; however, even these tools require further research to confirm their effectiveness. Due to the recent nature of this body of literature (2009–2015), and the relatively small number of studies (26) and tools identified (32), it is too early to definitively identify the best tools for particular disciplines and/or learner levels. We hope that with time and further study a rich body of feedback and assessment tools for handoffs will develop.
Overall, the items included in the assessment tools were mainly content based, followed by organizational measures. Professionalism and process-based measures were used less often in evaluating learners. If the goal of providing feedback and/or assessment is to improve handoff content, then checklist tools assessing presence/absence will suffice. However, we believe that there are factors other than content that make a quality handoff. While process, organization, and professionalism can be assessed using dichotomous (yes, no) or categorical (never, rarely, occasionally, usually, always) scoring, learners may benefit more from tools with descriptive behavioral anchors. We identified 8 tools with at least some behavioral anchors.18–20,22,25,26,28,31
Handoff is a skill that requires deliberate practice to master. In fact, it is 1 of the most important skills for incoming interns to learn before residency.44 Simulation, standardized patient encounters, and role play would be ideal modalities for safely teaching and assessing this important skill. Indeed, 9 of 26 (35%) studies in this review used some form of simulation or standardized encounter.17,22,26,28,32,33,38–40 One of these 9 studies (11%)28 used medical students, and 3 (33%)22,26,33 specifically mentioned including interns. In the future, the use of simulation or objective structured clinical examinations to assess graduating medical students' and interns' competency in handoffs may help ensure patient safety.
It is recognized that regular feedback is important in the acquisition of clinical skills.3,45 However, only 39% (7 of 18) of feedback articles provided feedback more than once. One study34 introduced a new electronic handover system and showed that implementing the electronic system without feedback increased omissions of both allergies and code status. When feedback was implemented, allergy and code status omissions were reduced, and inclusion of patient location, patient identification information, and anticipatory guidance improved.34 Doers et al27 suggested that providing feedback to medical students, residents, and attending physicians once a month was an effective way to sustain improvements in handoff quality, and Dine et al26 showed that at least 10 peer assessments during a single rotation and 12 to 24 across multiple rotations were needed to adequately assess handoff skills. Clearly, more research is needed to determine how much feedback is sufficient.
Handoffs require mastery of a complex set of diverse skills (eg, communication, teamwork, prioritization, organization). Aylward and colleagues20 identified handoffs as an example of an entrustable professional activity (EPA), an activity requiring multiple tasks and responsibilities that faculty can progressively entrust learners to perform independently.46 Handoffs, viewed as EPAs, require feedback over time; however, this will require adequate faculty development and time to provide the needed feedback and assessment. This creates an entirely new set of issues, as faculty may have different ideas about what constitutes an effective versus ineffective handoff. In addition, effective feedback requires specific skills that faculty may not possess. Finally, there are competing demands on faculty time. Each of these will need to be addressed by medical education leadership.
Who evaluates learners may play a role in the validity and reliability of the assessment. Of the 26 studies, 7 explicitly stated that the person providing assessment or feedback received training.16,25,28,30,37–39 Using standardized videos and the Handoff CEX tool, Arora et al19 found that internal medicine faculty could reliably discriminate different levels of performance in each domain. Peer assessments, while feasible, show evidence of leniency,18,31 and their impact on resident workload is unclear.18 These studies suggest that well-trained or experienced external observers are necessary to ensure adequate assessment of learners' handoff skills.
Funding is an important consideration in medical education studies, and it can impact study quality.10 However, in our review the mean quality score was 12.2 (of a possible 18) for both funded and unfunded studies. Fewer than half of the studies reported receiving project or author funding (10 of 26, 38%), and only 1 of the funded studies measured patient outcomes. Showing benefit to patients is the ultimate goal; however, funding studies that measure this can be quite expensive. It will be important in the future to identify handoff measures that are proven both to improve the handoff itself and to translate into improved patient safety.
This review is limited by the search strategies used. Some relevant studies may have been quality improvement studies, which may not be reported in the peer-reviewed literature.47 Although our comprehensive search strategy to identify relevant articles minimizes the risk of missing germane articles, it does not eliminate the possibility. Finally, the heterogeneity of the studies in both methodology and interventions limits the conclusions that can be drawn.
Conclusion
We identified 26 studies on handoff feedback and assessment containing 32 tools. These tools were exclusively hospital based but spanned many specialties. No single tool arose as best for any particular specialty or use. Assessment and ongoing feedback are important components for improving physician handoffs. The tools we identified or their components can be used as templates for medical educators wishing to develop handoff feedback and assessment tools that incorporate institutional and specialty-specific needs.
References
Author notes
Funding: Catherine Roach was supported by a Foundation for Anesthesia Education and Research Medical Student Anesthesia Research Fellowship. This research was partially supported by the University of Alabama at Birmingham Medical Student Summer Research Program (NIH/NHLBI T35HL007473).
Competing Interests
Conflict of interest: The authors declare they have no competing interests.
Some preliminary data were presented at the Minority Health Disparities Resource Center Summer Research Display, Birmingham, Alabama, July 2014; the Dale J. Benos Medical Student Research Day, Birmingham, Alabama, October 28, 2014; the American Society of Anesthesiologists Conference, San Diego, California, October 24, 2015; and the Accreditation Council for Graduate Medical Education Annual Educational Conference, National Harbor, Maryland, February 26, 2016.
The authors would like to thank Amos J. Wright, medical librarian, Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham.
Editor's Note: The online version of this article contains a table of literature search methods and an annotated bibliography of handoff feedback and evaluation tools.