Context.—The rate of surgical pathology report defects is an indicator of quality and it affects clinician satisfaction.
Objective.—To establish benchmarks for defect rates and defect fractions through a large, multi-institutional prospective application of a standard taxonomy.
Design.—Participants in a 2011 Q-Probes study of the College of American Pathologists prospectively reviewed all surgical pathology reports that underwent changes to correct defects and reported details regarding the defects.
Results.—Seventy-three institutions reported 1688 report defects discovered in 360 218 accessioned cases, for an aggregate defect rate of 4.7 per 1000 cases. Median institutional defect rate was 5.7 per 1000 (10th to 90th percentile range, 0.9–13.5). Defect rates were higher in institutions with a pathology training program (8.5 versus 5.0 per 1000, P = .01) and tended to be higher when a set percentage of cases was reviewed after sign-out, although this association did not reach statistical significance (median, 6.7 versus 3.8 per 1000, P = .10). Defect types were as follows: 14.6% misinterpretations, 13.3% misidentifications, 13.7% specimen defects, and 58.4% other report defects. Overall, defects were most often detected by pathologists (47.4%), followed by clinicians (22.0%). Misinterpretations and specimen defects were most often detected by pathologists (73.5% and 82.7%, respectively, P < .001), while misidentifications were most often discovered by clinicians (44.6%, P < .001). Misidentification rates were lower when all malignancies were reviewed by a second pathologist before sign-out (0.0 versus 0.6 per 1000, P < .001), and specimen defect rates were lower when intradepartmental review of difficult cases was conducted after sign-out (0.0 versus 0.4 per 1000, P = .02).
Conclusion.—This study provides benchmarking data on report defects and defect fractions using standardized taxonomy.
Reports are the principal product of surgical pathology. Review of revised surgical pathology reports reveals a spectrum of defects in the surgical pathology process, documenting both relatively innocuous alterations (eg, correcting patient age) and drastic changes in diagnosis that affect patient management (eg, discovery that a malignant lesion has been mistaken for a benign one). Defects are discovered both unsystematically, when clinicians, office staff, or patients happen upon them, and systematically, through surgical pathology quality assurance exercises (eg, reviews of cases with particular attributes), conferences (eg, tumor boards), or mandatory review of surgical pathology slides and reports at the time of patient referral to other institutions.
Meier et al1,2 proposed a taxonomy that divided report defects into 4 categories: (1) misinterpretations, (2) misidentifications, (3) specimen defects, and (4) other defects that involve nondiagnostic information and are not included in the first 3 categories. Misinterpretations include both inaccurate diagnoses (eg, false positives and false negatives) and misclassifications (eg, recognition of the general sort of malignancy, but assignment of a specific name that later requires revision). Misinterpretations also include revisions in secondary diagnostic elements (eg, tumor grade or stage, margin status, lymph node status) when these features have implications for patient classification and management. Misidentifications include errors in patient identification, tissue identification, specimen laterality, and specimen anatomic localization. Specimen defects include lost specimens, inadequate specimen volume or specimen size, specimens with inadequate or discrepant measurements, and inadequately representative specimens, as well as specimens with inadequate or absent ancillary studies when such studies were warranted. After defects in these first 3 categories have been sorted out, the remaining defects are limited to defects in nondiagnostic information: missing or erroneous nondiagnostic information (eg, wrong clinician, wrong procedure, wrong date of procedure), dictation or transcription errors that do not affect the diagnosis, and failures of electronic formatting or transmission of reports.
Separation of report defects into the 4 categories allows for calculation of defect rates (number of report defects per total number of reported cases) and defect fractions (the percentages of defects that fall into each of the 4 defect categories). Such rates and fractions index problems in surgical pathology and allow monitoring of the effectiveness of targeted interventions to reduce defects. This Q-Probes study represents the first large, prospective multi-institutional study to use this taxonomy to define rates and fractions across institutions and to test associations of these 2 quality measures with demographic, institution, practice, and quality monitoring variables.
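Both measures are simple arithmetic. As an illustration (not part of the study's own tooling), the following Python sketch computes a defect rate per 1000 cases and the defect fractions from a list of categorized defects, using this study's aggregate figures as input:

```python
from collections import Counter

def defect_rate_per_1000(n_defects: int, n_cases: int) -> float:
    """Report defects per 1000 accessioned cases."""
    return 1000.0 * n_defects / n_cases

def defect_fractions(defect_categories: list[str]) -> dict[str, float]:
    """Percentage of all defects falling into each taxonomy category."""
    counts = Counter(defect_categories)
    total = len(defect_categories)
    return {category: 100.0 * n / total for category, n in counts.items()}

# Aggregate figures reported in this study: 1688 defects in 360 218 cases
print(round(defect_rate_per_1000(1688, 360_218), 1))  # 4.7 per 1000
```

The same functions, run on a single institution's data, yield the institution-level rates and fractions that this study benchmarks.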
This College of American Pathologists Q-Probes study was conducted during the third quarter of 2011. Volunteer participants prospectively reviewed all surgical pathology reports in their institutions that underwent changes to correct defects of any sort. Included in the review were all cases accessioned as surgical pathology specimens, including bone marrow. Cytology specimens were excluded. Any surgical pathology report that was changed after the case was finalized or signed out in order to correct any defect in the original report was included in the study. Each participant collected such cases for a 3-month period or until 50 reports with defects were identified. For each report included in the study, the type of report defect was categorized as a misinterpretation, misidentification, specimen defect, or other defect type. When reports required revision of multiple defects, the report's study classification was according to the first defect that occurred in the sequence of the surgical pathology process. Additional information was also collected about the defect event: how the report defect was discovered (by chance or as part of a systematic effort to prevent or discover defects), by whom it was discovered (pathology staff versus clinicians), and whether or not a change in diagnosis resulted from the report defect. Surgical pathology cases referred from outside institutions were excluded. Finally, to assemble potential stratifying variables, basic institutional demographic data were collected, and each participating institution completed a questionnaire regarding various laboratory practices and policies that relate to prevention, detection, and handling of report defects.
To classify the report defects, study participants applied the following definitions.
Defects Involving Misinterpretations
Misinterpretations involved revision of diagnostic information at 2 levels. At the primary level, they included inaccurate diagnoses (eg, false positives and false negatives) and misclassifications (eg, cases sorted to the appropriate diagnostic category but requiring reclassification within that category). At the secondary level, misinterpretations involved changes in modifying factors (eg, tumor grade or stage, margin status, or lymph node status) when these features had implications for patient classification and management.
Defects Involving Misidentifications
Errors in identification were at the levels of patient, tissue type (eg, colon versus stomach), laterality of a specimen (eg, right breast versus left breast), or specific location in an organ (eg, skin of shoulder versus skin of thigh).
Defects Involving Specimens
Such defects involved loss of a specimen, inadequate size of specimen, lack of critical measurements in specimen description, and inadequate gross sampling of a lesion. Specimen defects also included lack of critical ancillary studies or wrong ancillary studies selected.
Other or Remaining Defects
These were all the remaining defects that did not fall into the first 3 categories; report changes were placed into this fourth category only after possible sorting into 1 of the first 3 categories was explicitly excluded. These defects typically involved missing or erroneous nondiagnostic information, dictation and typographic errors, and failures in computer formatting or transmission. Cases in which multiple defect types occurred were categorized by the defect that occurred earliest in the sequence of the surgical pathology process.
The performance indicators for this Q-Probes study were the overall rate of surgical pathology report defects (per 1000 reports) and the defect-specific rates (per 1000 reports). Overall defect rates were calculated from the number of specimen accessions in the study period and the total number of defects discovered. By using the total number of defects as the denominator, the fractions of defects falling into each of the 4 categories were calculated. Defect rates and fractions were then tested for associations with institutional demographic and practice variables (teaching status, residency program presence, practice location, accessioned cases, number of pathologists, and number of clinical conferences attended each week). Practice variables included mechanisms of defect discovery, defect discoverer (pathologist versus clinician), whether intradepartmental or interdepartmental conferences contributed to the discovery, and systematic pre–sign-out efforts to prevent defects. Owing to the skewed distributions, these rates were normalized with a log transformation, and the adjusted rates were used for the regression analyses. A stepwise approach was used for the analysis: individual associations between the metrics and the demographic and practice variables were first tested by using Kruskal-Wallis tests for discrete-valued independent variables and regression analysis for continuous independent variables. Variables with significant associations (P < .10) in the univariate analyses were then introduced into a forward-selection multivariate regression model. A significance level of .05 was used for this final model. All analyses were performed with SAS 9.1 (SAS Institute, Cary, North Carolina).
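The univariate screening step can be sketched in a few lines. The following Python illustration (an assumption-laden stand-in for the original SAS analysis, using SciPy rather than SAS procedures) tests a discrete practice variable with a Kruskal-Wallis test and a continuous variable by regression on log-transformed rates, flagging variables that pass the P < .10 screen:

```python
import numpy as np
from scipy import stats

ALPHA_SCREEN = 0.10  # univariate threshold for entry into the multivariate model

def screen_discrete(rate_groups: list[list[float]]) -> tuple[float, bool]:
    """Kruskal-Wallis test of institutional defect rates across the levels
    of a discrete variable (eg, training program present versus absent)."""
    _, p = stats.kruskal(*rate_groups)
    return p, p < ALPHA_SCREEN

def screen_continuous(x: list[float], rates: list[float]) -> tuple[float, bool]:
    """Regression of log-transformed defect rates on a continuous variable.
    The 0.1 offset before the log is an assumption to handle zero rates;
    the study does not state how zeros were treated."""
    log_rates = np.log(np.asarray(rates) + 0.1)
    result = stats.linregress(x, log_rates)
    return result.pvalue, result.pvalue < ALPHA_SCREEN
```

Variables that pass either screen would then be offered to a forward-selection multivariate model, mirroring the stepwise approach described above.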
Seventy-three institutions submitted data for this Q-Probes study. Seventy (95.9%) were located in the United States and 3 in Saudi Arabia. Thirty-two (43.8%) were teaching hospitals, and 25 (34.2%) had pathology residency training programs.
The College of American Pathologists had inspected 66 participating institutions (90.4%) and the Joint Commission, 9 (12.3%), within the previous 2 years. Forty institutions (59.7%) were urban; 17, suburban (25.4%); and 10, rural (14.9%). Fifty-four (81.8%) were nongovernmental institutions. The most common institution types were voluntary, nonprofit hospitals (54.5%), followed by independent laboratories (9.1%), university hospitals (7.6%), and veterans hospitals (7.6%). Distribution of hospital bed size was as follows: 0 to 150, 33.9%; 151 to 300, 37.5%; 301 to 450, 14.3%; 451 to 600, 5.4%; greater than 600, 8.9%.
The participating laboratories in the prior year (2010) accessioned a median of 10 968 surgical pathology cases (10th to 90th percentile range, 3672–43 104). The median number of full-time equivalent pathologists was 4 (10th to 90th percentile range, 2–12), and the median number of clinical conferences attended by department pathologists per week was 2 (10th to 90th percentile range, 0–7).
Participants from the 73 institutions identified 1688 surgical pathology report defects by reviewing 360 218 surgical pathology accessioned cases, for an aggregate defect rate of 4.7 per 1000. The median institutional defect rate was 5.7 per 1000 (10th to 90th percentile range, 0.9–13.5) (Table 1). The goal for participants was to collect up to 50 reports with defects during a 3-month period. Twenty of 73 participants (27%) reached 50 cases, and the average number of reported cases was 24. Complete data for categorization of types of report defects were available for 1665 (98.6%) of the defects from 70 participants (96%). The studied defects fell into the 4 categories in the following way: 14.6% misinterpretations, 13.3% misidentifications, 13.7% specimen defects, and 58.4% other defects (Table 2). Among 247 misinterpretations studied, the most common defect was change in specific disease type within the same general category (eg, changing one benign entity to another benign entity) (44.1%), followed by change in stage (24.2%), false-negative diagnoses (11.8%), change in grade (8.6%), change in margin status (8.6%), and false-positive diagnoses (5.7%). Among 225 misidentifications, the most common were of the patient (31.5%), followed by anatomic localization (25.5%), laterality (23.0%), and tissue type (20.0%). Among 231 specimen defects, the most common involved additional testing whose results did not alter the diagnosis (72.2%), followed by inadequate gross sampling (7.4%), inadequate microscopic sampling (5.7%), and missing required or appropriate ancillary studies whose inclusion did alter the diagnosis (5.2%).
The remaining 985 cases were classified as other report defects involving nondiagnostic information. Within this final category, defects had the following distribution: typographic errors (45.3%), missing or erroneous nondiagnostic information (32.1%), failures in computer formatting or transmission (1.5%), and “not specified” (21.1%). The 441 cases involving dictation and typographic errors made this sort of defect the most common cause of revised reports overall (26.5%). Among participating institutions, the median rate of misinterpretation was 0.8 per 1000 (10th to 90th percentile range, 0.0–2.2), median rate of misidentification was 0.6 per 1000 (10th to 90th percentile range, 0.0–1.8), and median rate of specimen defects was 0.3 per 1000 (10th to 90th percentile range, 0.0–3.1). The residual category of other defects involving nondiagnostic information showed a median of 2.6 per 1000, with a wider range among institutions (10th to 90th percentile, 0.1–8.9) (Table 3).
As detailed in Table 4, the most common means of discovering report defects was “following review of the report, not slides” (24.9%), followed by “don't know how error was detected” (16.4%), “clinician requested review” (11.4%), and “following review of ancillary studies” (9.7%). “Review for tumor board” contributed only 3.6% of defects, which was roughly equal to discovery “by chance” (3.3%).
Overall, report defects were most often discovered by pathologists (47.4%), followed by clinicians (22.0%), other nonphysician clinical staff (9.5%), other nonphysician pathology staff (9.1%), patients (0.3%), and others (11.7%).
Defect Distribution by Organ Site and Disease Category
The most common organ sites with report defects were skin (18.2%), followed by breast (17.7%), lower gastrointestinal tract (10.5%), female genital tract (10.4%), and upper gastrointestinal tract (7.6%). Cases with report defects most commonly involved neoplastic disease, as original diagnoses of benign and malignant neoplasms accounted for 61.6% of defect cases.
Relation of Defect Fractions to Defect Discoverers and Organ Sites
Misinterpretations and specimen defects were more often discovered by pathologists (73.5% and 82.7%, respectively). Misidentifications were most often discovered by clinicians (44.6%). These differences were statistically significant (P < .001, χ2 test) (Table 5). In Table 6 the defect fractions for the 5 most common organ sites are reported: breast, female genital tract, upper gastrointestinal tract, lower gastrointestinal tract, and skin, with all other organs grouped as “other.” The most common source of misinterpretation was “other” (38.5%), followed by skin (20%) and breast (16%). The most common source of misidentifications was skin (42.6%), the most common site for specimen defects was breast (41.6%), and the most common site for the residual category of other nondiagnostic information defects was “other” (40.3%). These differences also were statistically significant (P < .001, χ2 test). It should be noted that participants did not collect information about the relative number of each organ type evaluated during the study period. Thus, the relative proportions of defects by specimen type may be biased by the proportion of those specimens reported in this data collection.
Activities That May Affect Detection of Report Defects
Pathology-only case review conferences were reported in 45 of 73 participating institutions (61.6%). These pathology conferences were held pre–sign-out in 35 of these institutions (77.8%), and most often occurred daily. Post–sign-out pathology conferences were held in 33 of the 45 institutions (73.3%); these were most often conducted on a monthly basis (Table 7). Clinical multidisciplinary conferences were more common and were reportedly attended by pathologists in 62 of 73 participating institutions (84.9%). The most common were general tumor board (83.9%), followed by breast (71.0%), gastrointestinal (46.8%), heart and lungs (41.9%), and gynecologic (40.3%) tumor board conferences. Conference frequency ranged from weekly to quarterly; no daily clinical conferences were reported.
Pre–sign-out strategies used to prevent report defects are detailed in Table 8. Sixty-three of 70 institutions (90%) require a name and a unique identification number for each patient specimen. While most had second patient identifiers on slides (75.7%), a minority had second identifiers on tissue blocks (40.0%). The most common pre–sign-out audit systems were intradepartmental consultation for difficult cases (85.5%), followed by second pathologist review of all initial malignant diagnoses (43.5%), second pathologist review of all outside cases in which there is a diagnostic disagreement (40.6%), second pathologist review of all cases on a predetermined list of case types (26.1%), second pathologist review of all malignancies (20.3%), and review of a set percentage of cases by a second pathologist (15.9%). One laboratory reported that a second pathologist reviewed all cases. In 12 laboratories (17.6%) a targeted percentage of cases underwent review before sign-out, with a median target of 10% of cases (10th to 90th percentile range, 2%–80%) (Table 9). Similarly, 28 institutions (45.6%) performed a check of protocols for accuracy of nondiagnostic elements before case sign-out, with a median target of 97% of cases reviewed (10th to 90th percentile range, 1%–99%). This quality control check was most often performed by transcriptionists (33.3%), followed by pathologist assistants (21.2%), and other ancillary personnel (21.2%). A small minority of laboratories reported no pre–sign-out audit systems in place (5.8%). Not quite half of laboratories tracked the rate of intradepartmental case review for each pathologist (48.6%).
Post–sign-out strategies used to prevent report defects are detailed in Table 10. The most common post–sign-out strategies were slide review of cases presented at clinical case conferences (60.9%), followed by second pathologist review of a set percentage of cases (37.7%), intradepartmental consultation for difficult cases (34.8%), second pathologist review of initial malignant diagnoses (13.0%), and second pathologist review of all malignant diagnoses (10.1%). Post–sign-out review of reports for accuracy of nondiagnostic elements was conducted in 25.7% of laboratories, most often by a pathologist (38.9%). A small minority of laboratories reported no post–sign-out audit system (8.7%). Table 11 details institutional percentiles for percentage of cases that underwent post–sign-out review. Compared with the pre–sign-out approach, more laboratories had a policy of post–sign-out slide review (34 versus 15), but the median percentage of cases reviewed was lower (6% versus 10%).
Participants also reported on other laboratory practices related to surgical pathology reports. Almost half (47.1%) of surgical pathology departments tracked individual pathologist report defect data for use in peer review programs. Incident reports were generated for report defects in slightly more than half of institutions (52.2%). Nearly all participants (97.1%) reported routinely making an attempt to assess the clinical effect of report defects. A minority of laboratories did not use diagnostic synoptic reporting (10.1%).
The rate and types of surgical pathology report defects were tested for associations with institutional demographic and practice variables by using multivariate regression models; a P value <.05 was considered statistically significant (Table 12). Higher rates of surgical pathology report defects occurred in laboratories with a pathology resident/fellow training program (median, 8.5 versus 5.0 per 1000; P = .01). Lower rates of surgical pathology reports with misidentification defects occurred in laboratories that systematically reviewed all malignancies by a second pathologist before sign-out (median, 0.0 versus 0.6 per 1000; P < .001). Lower rates of reports with specimen defects occurred in laboratories with intradepartmental review of difficult cases after sign-out (median, 0.0 versus 0.4 per 1000; P = .02). Institutions that had a set percentage of cases reviewed by a second pathologist tended to have higher defect rates, but this association did not reach statistical significance (median, 6.7 versus 3.8 per 1000; P = .10). Findings were similar for laboratories conducting a quality control check of protocols for accuracy of nondiagnostic elements after sign-out (median, 7.8 versus 5.3; P = .10, Kruskal-Wallis test). No other pre–sign-out or post–sign-out strategies showed statistically significant associations with defect rates.
This Q-Probes study established prevailing rates of report defects in surgical pathology and identified demographic and practice variables that influence those rates. Regarding demographic variables, higher overall defect rates tended to occur in laboratories with a pathology resident/fellowship training program. The higher defect rate in training programs underscores the additional monitoring challenges a training program may create for attending pathologists. Lower rates of report defects with misidentification errors tended to occur in laboratories that review all malignancies by another pathologist before sign-out. Thus, while double review of cases is generally thought of as a mechanism for avoiding misinterpretations, it also proved effective as a second check of protocols and slides that uncovered labeling errors. Lower rates of report defects involving specimen defects tended to occur in laboratories that have intradepartmental review of difficult cases after sign-out. This is another double check, this time on specimen adequacy.
Tracking of defects in surgical pathology reporting is an important element in any laboratory quality improvement program, and some have suggested that databases be established with such information.3,4 However, wide variation in collection methodology and in terminology makes interinstitutional comparisons impossible. For this study, the data collection and terminology were based on a taxonomy that Meier et al1 proposed, dividing report defects into 4 categories, or defect fractions: (1) misinterpretations, (2) misidentifications, (3) specimen defects, and (4) other defects involving nondiagnostic information. In a 430-case validation set reviewed by 4 observers from 4 institutions, the taxonomy showed excellent interobserver agreement, with a median κ statistic of 0.8780. Calculation of defect fractions, in addition to overall defect rates, allows for monitoring of effects of various interventions. As the first large, prospective multi-institutional study to use this taxonomy, this Q-Probes study provides important benchmarking information on defect rates and defect fractions.
Participants in our study reported a median defect rate of 5.7 per 1000 cases, but with wide variation (10th to 90th percentile range, 13.5–0.9). Six Sigma benchmarking has been used as a quality measure in laboratories,5 and if we consider report defects in this context, the aggregate defect rate in our study (1688 per 360 218 reports) was 4686 defects per million, corresponding to a sigma metric of 4.1 (10th to 90th percentile range, 3.8–4.7). Many industrial processes fall into the 3 to 4 sigma range, but a sigma metric of 5 is considered by many to be the goal of health care processes.5 With this objective in mind, the report defect rate from this study indicates room for improvement.
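The sigma metric quoted above can be reproduced in a few lines. This Python sketch (an illustration, not the study's own calculation) converts a defect count into defects per million and applies the conventional 1.5-sigma long-term shift used in Six Sigma benchmarking:

```python
from statistics import NormalDist

def sigma_metric(defects: int, opportunities: int) -> float:
    """Process sigma from defects per million opportunities (DPMO),
    including the conventional 1.5-sigma long-term shift."""
    dpmo = 1_000_000 * defects / opportunities
    # inv_cdf gives the z-score of the defect-free yield; add the shift
    return NormalDist().inv_cdf(1 - dpmo / 1_000_000) + 1.5

# Aggregate figures from this study: 1688 defects in 360 218 reports
print(round(sigma_metric(1688, 360_218), 1))  # 4.1
```

As a sanity check, the textbook Six Sigma anchor of 308 537 DPMO corresponds to a sigma metric of 2.0 under the same convention.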
How does our study's defect rate compare with the pathology literature? In many instances, variation in terminology and collection methods makes direct comparison challenging, but the overall defect rate we found appears to be near the high end of what has been reported in studies with methodology most similar to ours. Using the taxonomy applied in this study, Meier et al1 reported a lower overall defect rate (ranging from 2.6 to 4.8 per 1000) during a period of 4 years at a single institution. After implementation of real-time editing of amended reports during the introduction of a lean process improvement program,6,7 Meier et al2 documented amendment rates initially increasing from 4.8 to 10.1 per 1000, then showing a subsequent incremental decrease to 5.6 per 1000 as lean processes took hold. A 1996 Q-Probes study involving 1 667 547 accessioned surgical pathology cases in 359 laboratories also revealed a lower aggregate amended report rate (1.9 per 1000) and a median institutional rate of 1.46 per 1000 (10th to 90th percentile range, 0.22–4.75).8 Two studies with different collection methods have shown higher amendment rates. Renshaw and Gould9 reported a higher amendment rate (0.5%–4.4%) in a single-institution study of a mix of 2659 surgical pathology and nongynecologic cytology cases. Also, a 2003 Q-Probes multi-institutional study of post–sign-out review of a mix of surgical pathology and cytopathology cases revealed an amendment rate of 3.4%.10
In our study, defect fractions, in aggregate, were as follows: 14.6% misinterpretations, 13.3% misidentifications, 13.7% specimen defects, and 58.4% other defects. By comparison, the single-institution study by Meier et al,1 which used the same taxonomy, reported higher fractions of misinterpretations and misidentifications (22.9%–27.9% misinterpretations, 19.6%–38.2% misidentifications, 4.2%–10.4% specimen defects, and 27.9%–48.0% other defects in nondiagnostic information), but the rates of these defects were similar to those of the current study. In the current multi-institutional study, the median defect fraction rates per 1000 were as follows: 0.8 misinterpretations, 0.6 misidentifications, 0.3 specimen defects, and 2.6 other defects in nondiagnostic information. These rates fall within the ranges of the single-institution amendments reported by Meier et al1 (0.65–1.14 misinterpretations, 0.78–1.34 misidentifications, 0.12–0.43 specimen defects, and 0.91–2.32 other defects). After implementation of a lean process improvement program, using the taxonomy to track defect rates and distributions over time, Meier et al2 reported that misinterpretations incrementally decreased from 18.3% to 3.0% and misidentifications incrementally decreased from 15.6% to 8.7%, while specimen defects remained variable from 1.9% to 11.0%, with a reciprocal increase in other nondiagnostic report defects from 64.2% to 83.0%. In terms of specific misidentifications, our multi-institutional study revealed 31.5% patient misidentification, 25.5% anatomic localization, 23.0% laterality, and 20.0% tissue-type misidentifications. The single-institution study by Meier et al2 reported a higher proportion of patient misidentification (50%), lower proportion of anatomic localization (12.5%), similar laterality (26.8%), and lower tissue-type misidentifications (10.7%). The differences likely reflect wider variation in practice in our multi-institutional study.
The 1996 Q-Probes study of amended reports used different terminology and evaluated 4 types of amendments: revision of final diagnosis, revision of preliminary diagnosis, correction of patient identification, and correction of clinically significant information other than the diagnosis.8 The reported amendment types were as follows: 38.7% to correct diagnosis (akin to the misinterpretation fraction), 19.2% to correct patient identification errors, and 26.5% to change clinically significant information other than the diagnosis.
Misinterpretation is the defect fraction most widely considered in the surgical pathology literature. This fraction has been referred to as “misdiagnosis,” “diagnostic disagreement,” “diagnostic discrepancy,” “diagnostic error,” and “diagnostic change,” though these terms are not always applied with equivalent meaning or clinical significance. Consequently, misinterpretation rates vary considerably depending on methods of case selection and any efforts made to delineate the clinical severity of error. In our study, the aggregate misinterpretation rate was 0.69 per 1000 and the institutional median was 0.80 (10th to 90th percentile range, 0.0–2.2). In a literature review of error detection in anatomic pathology, Zarbo et al11 noted a wide range of reported interpretive error rates, from 1.2 to 50 per 1000 cases, and higher overall rates with retrospective review detection methods. In one single-institution study of 2659 surgical pathology and nongynecologic cytology cases, Renshaw and Gould9 found that diagnostic disagreements were even higher (ranging from 2.2%–6.9%). In a review of 4 published series, Renshaw et al12 reported overall error rates ranging from 0.0% to 2.36%, with clinically significant rates from 0.34% to 1.19%. A retrospective, blinded review of 592 biopsy cases was associated with a 4.2% diagnostic disagreement rate, with differences in diagnostic threshold being the most common cause.13 Another blinded review of 5000 biopsies (mostly skin) showed a disagreement rate of 8.88%, but most disagreements were threshold issues and the “true error” rate was assessed as 0.10%.14 In a study of post–sign-out case review, Raab et al15 found a diagnostic error rate ranging from 1.7% to 3.7%, with major errors ranging from 0.3% to 0.5%, and minor errors ranging from 1.2% to 3.3%.
Defect Discovery Mechanisms
In our study, report defects were discovered through a variety of mechanisms, the most common of which was review of the report without the slides (24.9%), followed by “don't know” (16.4%) and clinician-requested review (11.4%). Review for tumor board and discovery “by chance” made roughly equal contributions (3.6% and 3.3%, respectively). Compared with prior studies, our data show a lesser contribution from clinician requests and conference reviews. In the 1996 Q-Probes study of amended reports, the most common discovery method was clinician-requested review (20.5%).8 Meier et al1,2 reported that conference review had detected 10% to 20% of overall defects and 42.4% to 83.0% of misinterpretations. In our study all forms of quality assurance slide review combined accounted for only 2.3% (38 of 1665) of the discovered defects, whereas the 1996 Q-Probes study estimated that 30% of errors were detected through slide review.8 In our study 5.5% of report defects were discovered through review of additional clinical information previously unknown to the pathologist, similar to the 10% rate noted in the 1996 Q-Probes study.8 Another Q-Probes study of the necessity of clinical information in surgical pathology showed that 0.73% of cases required additional clinical information for diagnosis (10th to 90th percentile range, 0.08%–3.01%).16 When additional information was obtained, there was a substantial change in diagnosis in 6.1% of those cases.
In our study, 94.2% of participants reported some form of pre–sign-out auditing for errors. The most common strategies were intradepartmental consultation for difficult cases (85.5%), followed by second pathologist review of all initial malignant diagnoses (43.5%), second pathologist review of all outside cases in which there is a disagreement (40.6%), second pathologist review of all cases on a predetermined list of case types (26.1%), second pathologist review of all malignancies (20.3%), and review of a set percentage of cases by a second pathologist (15.9%). In addition, 17.6% of laboratories indicated that a target percentage of cases undergo pre–sign-out review, with a median target of 10% of cases (10th to 90th percentile range, 2%–80%). In 45.6% of participating institutions there is a pre–sign-out check of case protocols for accuracy of nondiagnostic elements, with a median target of 97% of cases reviewed (10th to 90th percentile range, 1%–99%).
These data indicate an increase in auditing programs since the 1996 Q-Probes study of amended reports.8 In that study a process of pre–sign-out slide review was reported in 31.9% of laboratories, and post–sign-out review in 37.3%. Specifically, in the 1996 study, 9.9% reviewed a set percentage of cases before sign-out, 33.1% reviewed a set percentage after sign-out, 2.0% reviewed all cases before sign-out, 4.0% reviewed all cases after sign-out, 26.0% reviewed all malignancies before sign-out, 8.2% reviewed all malignancies after sign-out, and 24.0% reviewed all conference cases. Among laboratories with a slide review program in 1996, the most common proportion of reviewed cases was 6% to 10%, similar to our study. The 1996 Q-Probes study also reported how various methods of slide review contributed to error detection, with 3.7% of errors detected by review of a set percentage of cases, 3.5% by review for clinical conferences, 0.7% by review of all cases, and 0.5% by review of all malignancies. Neither the current study nor the 1996 study revealed a statistically significant relationship between defect rates and various pre–sign-out audits.
The current study suggests some targets for pre–sign-out intervention to address defect rates. For example, the most common organ sites for overall defects were the skin (18.2%) and breast (17.7%), and 61.6% of cases with defects had neoplastic diagnoses. Among defect fractions, misinterpretations were most commonly found in skin (19.8%) and breast (16.2%) specimens, suggesting targets for pre–sign-out case review. Misidentifications were by far most common among skin specimens and may be a target for better tracking programs. Similarly, specimen defects were by far most common among breast specimens, potentially indicating the need for more standardized gross sampling procedures. However, the efficacy of focusing on particular specimen types depends on the relative rates of errors for those specimen types. An institution is best served by determining which specimen types are relatively greater sources of error in its own practice and then focusing surveillance and corrective efforts appropriately.
Audit programs applied before pathologist case review can also affect report defects. Obtaining adequate clinical information is another important pre–sign-out strategy, and in our study 5.5% of defects were discovered as a result of additional clinical information. As mentioned above, a prior Q-Probes study16 emphasized the necessity of clinical information and its role in diagnostic changes. Specifically, additional clinical information was associated with higher rates of diagnostic change for specimens from the small bowel (11.6%), lung (10.1%), and ovary (14.9%); for malignant neoplasms (9.8%); for cases with associated therapy changes (11.0%); and for resections (9.1%) compared with biopsies (5.6%). These data suggest targeted review of the patient electronic record in some instances. It has also been recommended that second identifiers be placed on both tissue blocks and slides in an effort to limit patient and specimen identification errors. Of the participating laboratories in our study, 75.5% report a second identifier on glass slides, but only 40% include a second identifier on surgical pathology blocks. Before case evaluation by a pathologist, a quality control check of protocols for accuracy of nondiagnostic elements is performed in fewer than half of laboratories (45.6%). A prior study of mislabeling in surgical pathology specimens, blocks, and slides reported an aggregate mislabel rate of 4.2 per 1000, roughly equal to the overall defect rate in our study of report defects.17 In that study, mislabeling was discovered at all stages of the process, from accessioning through pathologist sign-out, emphasizing the importance of the “prepathologist” phase of potential report defects. Makary et al18 reviewed surgical specimen submissions and found an identification error rate of 4.3 per 1000 (5.12 per 1000 in outpatient settings, 3.46 per 1000 in operating rooms). The errors more often arose from biopsy procedures (59.3%), and the most common organ sites were breast, skin, and colon.
Unfortunately, Meier et al2 reported that during a 4-year interventional period, accession redesign had little impact on report defects, with rates fluctuating rather than steadily declining, and standardization of specimen accession and gross examination reduced only those specimen defects surrounding ancillary testing. In addition, specific interventions at the collection level had little effect on patient misidentification, and clinician education about specimen identification had only a modest impact on misidentifications.2
Post–Sign-Out Discovery of Defects
Five general mechanisms have been identified as plausible ways to discover defects: (1) pathologist case review without additional information or material; (2) review of the case with inclusion of additional information or material but without clinician prompting; (3) review for preparation or presentation at conference with clinicians; (4) clinician-initiated review or reconsideration of the case; and (5) as the result of external consultation.11 In our study, no post–sign-out audit systems showed statistically significant differences in regard to defect rates, but post–sign-out review of a set percentage of cases by a second pathologist was associated with a higher rate of defects (6.7 versus 3.8 per 1000).
The significant downside of a post–sign-out audit is that errors may be discovered relatively late in the course of patient treatment, and having 2 different diagnoses in the patient record creates confusion. However, the post–sign-out approach has benefits, including the ability to review more cases without disrupting workflow, with the potential to uncover more errors and discover more opportunities for improvement.19 Renshaw et al12 reviewed 4 series that measured error rates in surgical pathology and used the reported error rates to determine how many cases would need to be reviewed to show a significant difference from published error rates. Because published error rates in surgical pathology are relatively low, a large number of cases must be reviewed to demonstrate even a substantial difference in performance. To evaluate false negatives, the required case numbers ranged from 1351 to 22 580, and up to approximately 50 000 cases were needed to detect significant differences in error rates attributable to diagnostic threshold and tumor type. It is also worth keeping in mind that case review itself is imperfect, as Renshaw et al13 reported 98% sensitivity for blinded review in biopsy cases.
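The scale of review implied by these numbers can be illustrated with a standard one-sample, two-proportion sample-size approximation (a simplification of the kind of calculation Renshaw et al12 describe; the benchmark and observed rates below are illustrative assumptions, not figures from the study):

```python
from statistics import NormalDist

def cases_needed(p0, p1, alpha=0.05, power=0.80):
    """Approximate number of reviewed cases required to distinguish a
    local error rate p1 from a published benchmark rate p0, using a
    two-sided normal-approximation test."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z(power)            # quantile corresponding to desired power
    numerator = (z_alpha * (p0 * (1 - p0)) ** 0.5
                 + z_beta * (p1 * (1 - p1)) ** 0.5) ** 2
    return numerator / (p1 - p0) ** 2

# Illustrative: distinguishing a 1.0% local error rate from a 0.5%
# benchmark already requires review of roughly 2000 cases.
n = cases_needed(0.005, 0.01)
```

Because both rates are small, the (p1 − p0)² denominator dominates: halving the detectable difference roughly quadruples the required number of cases, which is why tens of thousands of reviews may be needed for rarer error types.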
Relation of Report Defects to Patient Harm
Prevention of patient harm is the primary goal in any quality program. Our Q-Probes study did not ask participants to determine the clinical effect of each report defect, though 97.1% of laboratories reported that they routinely attempt to assess the clinical effect of report defects. For misinterpretations, determining the true clinical effect of a report defect would require case-by-case review, but some inferences can be drawn from our data. For example, change in stage, change in margin status, false positives, and false negatives are all diagnostic changes with a high likelihood of clinical significance, as are misidentification of the patient and misidentification of laterality. These defect types accounted for 225 cases (13.5% of defects) in this study, an overall aggregate rate of 0.62 per 1000 cases.
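As a simple cross-check of this arithmetic, the counts reported in this study reproduce the quoted rates (assuming, consistent with the fractions cited earlier, that defect fractions are computed over the 1665 classified defects):

```python
# Aggregate counts reported in this study
total_cases = 360_218        # accessioned surgical pathology cases
total_defects = 1688         # all report defects discovered
classified_defects = 1665    # defects with a classified type/discovery method
high_impact_defects = 225    # stage/margin changes, false positives/negatives,
                             # patient and laterality misidentifications

defect_rate_per_1000 = 1000 * total_defects / total_cases                  # ≈ 4.7
high_impact_fraction_pct = 100 * high_impact_defects / classified_defects  # ≈ 13.5
high_impact_rate_per_1000 = 1000 * high_impact_defects / total_cases       # ≈ 0.62
```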
Participants were asked to collect report defects through whatever mechanisms currently existed in each institution. That is, this study made no effort to standardize the methods of case identification, and all data were self-reported. Consequently, it is difficult to know whether our reported defect rates represent the true error rate in laboratory practice. The laboratories in our study likely use a spectrum of passive and active surveillance programs for defects. Meier et al1 noted that passive monitoring captured only 2.8 amendments per 1000 cases, while active monitoring captured 4.8 per 1000, indicating that passive monitoring missed roughly 40% of amendments. In addition, variation in application of terminology and assignment of the amended cases to our study taxonomy may have created inconsistency in our data collection. Use of an independent editor of amendments has been recommended to allow consistent application of the taxonomy.2 Most importantly, in this context, variable use of amended reports versus addendum reports also potentially affected our data. This study included any surgical pathology report that was changed in any way after the case was finalized in order to correct any defect in the original report. In this data collection, 9.8% of defects involved the addition of information whose results did not alter the diagnosis. Such information could plausibly be designated as addenda. Removing those cases from the data analysis leaves a median institutional defect rate of 5.1 per 1000 (10th to 90th percentile range, 12.6–0.9). Several authors,1,20 including the Association of Directors of Anatomic and Surgical Pathology (ADASP), have acknowledged the difficulty of separating these 2 types of modified reports and have pointed out the need for standardized use of the addendum and amendment functions.
Amendments Versus Addenda
In one ADASP-sponsored published questionnaire to which 41 laboratories contributed, pathologists disagreed widely about which changes in diagnostic information should prompt the designation of amendment versus addendum.21 While 97% of respondents agreed that a change in diagnosis from malignant to benign or vice versa required an amendment, there was considerable disagreement about “lesser” changes. The ADASP and others advocate that all alterations other than pure additions to reports should be designated as amendments.1,2,20
Others have pointed out that inappropriate amendment/addendum labeling can lead to underestimated amendment rates and, most concerning, underestimated rates of misinterpretation error. Recently, Finkelstein et al22 reported a review of report addenda during a 15-year period at their institution, finding that the percentage of cases with addenda had increased incrementally from 0.9% to 8.6%. As expected, the increase in addenda was primarily due to increased use of ancillary studies, such as molecular tests, immunohistochemistry, and cytogenetics. However, in their assessment, 5.6% of the addendum reports should have been amendments. In another audit of addendum reports, Meier et al2 reported that 10% of addenda should have been amendments. This raises concern that focusing on report amendments, especially misinterpretations, and using these rates as measures of practitioner quality in ongoing performance assessment may be counterproductive, tempting pathologists to place some of these errors in the addendum category, which is generally unmonitored. Thus, a quality improvement program is likely better served by monitoring both amendments and addenda. Moreover, monitoring of addenda themselves can help detect weaknesses in organization and workflow.
Our data are further limited by the lack of a timeline for defect discovery and by collection of information on only the first discovered defect for each case. This could potentially bias both the true defect rate and the type of defects discovered. For example, discovery of a simple typographic error could have prevented reporting of a misinterpretation error that was discovered at a later time. Timing of defect discovery may also be important, since late discovery of a clinically significant error may have grave consequences for a patient.
In conclusion, rates and distribution of surgical pathology report defects are indicators of quality that can be compared across institutions to produce performance benchmarks. Standardization of terminology is important for communication about these defects, for relating them to features of surgical pathology practice, and for assuring customer confidence in the surgical pathology process. Furthermore, for laboratories to contribute to surgical pathology error databases, standardization of error definitions, terminology, and methods for discovery and collection is essential. Such standards would be a step toward the ultimate goal of unifying the literature on errors in anatomic pathology.23 This Q-Probes study represents the first large-scale, prospective, multi-institutional application of a taxonomy for report defects and provides benchmarks for monitoring of those defects. It found a median institutional defect rate of almost 6 defective reports per 1000 reports issued. Regarding defect fractions, the fractions of misinterpretations, misidentifications, and specimen defects were about the same (13%–15%). Important correlates were that misinterpretations and specimen defects were more often discovered by pathologists, misidentifications were most often discovered by clinicians, and second pathologist review of malignancies and intradepartmental review of difficult cases lowered misidentification and specimen defect rates, respectively, but neither these nor other strategies had an impact on the misinterpretation fraction.
The authors thank the participating institutions for their time and diligence in collecting the data for this study.
The authors have no relevant financial interest in the products or companies described in this article.