The College of American Pathologists requires synoptic reports for specific types of pathology reports.
To compare the accuracy and speed of information retrieval in synoptic reports of different formats.
We assessed the performance of 28 nonpathologists from 4 user groups (cancer registrars, MDs, medical non–MDs, and nonmedical personnel) at identifying specific information in variously formatted synoptic reports, using a computerized quiz that measured both accuracy and speed.
There was no significant difference in the accuracy of data identification between user groups or between formats. While there were significant differences in raw time between users, these were eliminated when normalized times were used. Compared with the standard format of a required data element (RDE) and response on 1 line, both a list of responses without an RDE (21%, P < .001) and a paired response with more concise text (33%, P < .001) were significantly faster. In contrast, both the 2-line format (RDE header on one line, response indented on the second line) (12%, P < .001) and a report with the RDE-response pairs in random order (16%, P < .001) were significantly slower.
There are significant differences in ease of use by nonpathologists between different synoptic report formats. Such information may be useful in deciding between different format options.
Synoptic reporting of all tumor excisions is required by the College of American Pathologists (CAP). The CAP specifically requires that each element in a synoptic report be reported in a required data element (RDE) pair consisting of the element and the corresponding response (CAP Laboratory Accreditation Process Checklist question ANP.12385).1 While there are significant data supporting the use of checklists in general,2–14 and their use to improve the completeness of surgical pathology reporting specifically,15–33 there are far fewer data concerning the significance of the specific formats. Valenstein34 examined structured reporting and focused on diagnostic headlines, white space, standardized layout, continuity over time, and reduction of clutter but provided few data addressing the specific formats. In addition, previous studies have examined formatting features of the checklist itself, rather than of the synoptic report, that affect the completion rate of the final report.24 Whether different formats are associated with differences in reader accuracy or ease of use is not known. To address this, we created a computer-based quiz that measured both accuracy and speed to determine if different formats were associated with differences in performance by nonpathologists who may read such reports.
To test the accuracy and speed of identification of specific data elements in a synoptic report, a Python script was written that provided instructions and a test platform for these quizzes. Specifically, the participant is shown a specific phrase that may or may not be in a synoptic report. When the user presses “enter,” the synoptic report appears on the screen and the timer starts. The user then examines the report to determine whether the phrase is present. If it is present, the user enters the number “2”; if it is not, he/she enters “1” and then presses “return.” The timer stops when return is entered. The program automatically records the time and whether the answer was correct, and these data are then transferred to a comma-separated values file for further analysis.
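The trial loop described above can be sketched as follows. This is our own minimal illustration, not the authors' actual script (which is not published); the function and field names are hypothetical, and the input and clock sources are injectable only so the sketch can be exercised programmatically.

```python
import csv
import time

def run_trial(phrase, report, answer_key, get_input=input, clock=time.perf_counter):
    """Run one quiz trial: show the target phrase, wait for the participant
    to press Enter, then display the report and time the response.
    `answer_key` is "2" if the phrase is present in the report, "1" if not.
    `get_input` and `clock` default to the console and a monotonic timer."""
    print(f"Find this phrase: {phrase}")
    get_input()                      # participant presses Enter to reveal report
    start = clock()                  # timer starts when the report appears
    print(report)
    response = get_input().strip()   # "2" = present, "1" = absent
    elapsed = clock() - start        # timer stops when return is entered
    return {"phrase": phrase,
            "response": response,
            "correct": response == answer_key,
            "seconds": elapsed}

def save_results(rows, path):
    """Write trial results to a comma-separated values file for analysis."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["phrase", "response", "correct", "seconds"])
        writer.writeheader()
        writer.writerows(rows)
```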
For the purposes of this study, we chose the urinary bladder checklist from CAP. We included 8 elements: histologic type, histologic grade, microscopic tumor extension, adequacy of material for determining muscularis propria invasion, associated epithelial lesions, lymphovascular invasion, procedure, and margins. However, we tested and varied only the first 4 elements, using only 2 possible responses from the same checklist. The participant's task was to determine which of the 2 possible responses for each of these 4 elements was included in the report.
The formats were tested in 3 different quizzes, each given in sequence and at the same sitting. The first quiz contained 12 questions each from 3 different formats (36 total questions). These included a standard format that presented the exact RDE and the response, copied directly from the checklist, on a single line (standard format; Figure 1). This format served as the control in all 3 quizzes. The second format included only the same responses as in the standard format, 1 per line, without the RDE descriptor (list format; Figure 2). The final format was the one that is often used in the CAP electronic Cancer Checklist (eCC) (depending on the specific Information Technology vendor), in which the RDE is on one line and the response is indented on the following line (2-line format; Figure 3). The order of the elements was exactly the same in every case (though this order did not match the order in the checklist). The same questions and answers were asked for all 3 formats, allowing for the use of a paired t test in analysis. However, the order of the questions was randomized for each participant.
Quiz 2 contained 16 questions using the standard format, and 16 questions with the same format but for which the text had been edited to make it as concise as possible (concise-text format; Figure 4). All other features were the same as in quiz 1.
Quiz 3 contained 16 questions using the standard format in the same order as usual, and 16 questions in which the order of the elements in the synoptic report was randomized (random-order format). Unlike the previous tests, the first 16 questions followed the standard format, and the second 16 questions were the randomized questions. This was done because preliminary testing suggested that when the 2 formats are mixed together randomly, it is not possible to recognize the standard order. However, within each group the order of the test questions remained randomized.
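The per-participant randomization in quiz 1 (the same 12 questions under each of the 3 formats, with the presentation order shuffled anew for each participant) can be sketched as below. The function name and structure are our own illustration, not the authors' code.

```python
import random

def build_quiz1(questions, formats, seed=None):
    """Pair every question with every format (12 questions x 3 formats =
    36 items in quiz 1) and shuffle the presentation order for one
    participant. A seed may be passed for reproducible testing."""
    rng = random.Random(seed)
    items = [(q, f) for q in questions for f in formats]
    rng.shuffle(items)  # order randomized per participant
    return items
```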
Twenty-eight participants completed all 3 quizzes. They were all nonpathologists and included 7 cancer registrars, 7 MDs (all internists), 7 non-MD medical personnel (3 physician assistants, 4 laboratory technologists), and 7 nonmedical personnel (administrative assistants, other professionals). We specifically excluded pathologists from this testing, since we wanted to measure the performance of a user other than a pathologist.
We did note that participants got faster from quiz 1 to quizzes 2 and 3 as they gained practice with the test. As a result, no comparison was made between the 3 quizzes, and all 3 quizzes were always taken in the same order. In addition, there was a wide range of speed for the different users. To allow comparison between these users, times were normalized to the mean of the standard format for each user. As a result, the normalized time for the standard format was the control with a normalized time of 1, and the time for all other formats was in comparison with that time.
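The normalization step above amounts to dividing each participant's mean time per format by that participant's own mean standard-format time, so the control is 1.0 by construction. A minimal sketch (our own naming, assuming raw times in seconds):

```python
def normalize_times(times_by_format, control="standard"):
    """Normalize one participant's mean time per format to the mean of
    his/her own standard-format trials: the control format becomes 1.0
    and every other format is expressed relative to it.
    `times_by_format` maps format name -> list of raw times (seconds)."""
    baseline = sum(times_by_format[control]) / len(times_by_format[control])
    return {fmt: (sum(ts) / len(ts)) / baseline
            for fmt, ts in times_by_format.items()}
```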
Statistical analysis was performed by using a 2-tailed χ2 test and a paired t test, as appropriate, with a P value significance threshold of .05.
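For reference, the two test statistics named above can be computed as follows. This is an illustrative pure-Python sketch, not the authors' analysis pipeline; in practice the P values would come from statistical software or tables, so only the statistics themselves are computed here.

```python
import math
from statistics import mean, stdev

def paired_t(standard, variant):
    """Paired t statistic for per-question times under two formats
    (same questions in both conditions). Returns (t, degrees of freedom);
    the P value is then looked up from the t distribution."""
    diffs = [a - b for a, b in zip(standard, variant)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table [[a, b], [c, d]],
    eg, correct/incorrect counts for two user groups."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den
```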
There was no significant difference in accuracy between any of the 4 user groups (Table 1).
When raw times were used, there were significant differences between the 4 user groups (Table 2). MDs took the longest time, while nonmedical personnel took the shortest time. However, when times were normalized to the corresponding standard format, these differences disappeared (Table 3). As a result, the data for all users were analyzed together.
There was no significant difference in accuracy between any format (Table 4).
Results for speed are shown in Table 5. In the first quiz, compared with the normalized time for the standard format, the 2-line format was significantly slower (12%, P < .001). In contrast, compared with the standard format, the list format was significantly faster (21%, P < .001). In the second quiz, compared with the standard format, the concise format was significantly faster (33%, P < .001). In the third quiz, compared with the standard format, the random-order format was significantly slower (16%, P < .001).
There are several possible measures of success in synoptic reporting in surgical pathology: completeness rate, accuracy and ease of use for the pathologists, accuracy and ease of use for the reader, and clinical outcome. As previously mentioned, there is an abundance of evidence that mandating and requiring a specific list of elements for inclusion in synoptic reporting, along with the provision of a checklist of those items, can successfully improve the completeness of reports.15–33
There are also published data concerning the accuracy of the data included in a synoptic report. Specifically, the words no and not are rarely but consistently left off of reports, reducing accuracy, and ensuring that the phrases in the checklist contain as few of these elements as possible improves accuracy.35 Reducing the number of required elements is also associated with a reduction in the number of clerical errors in a report.36 In addition, there are data concerning the effect of formatting of the checklist for the pathologist who generates the synoptic report on the completeness of that report: easily identified and consistently formatted styles to identify the required elements improve completeness.24 Finally, while there is an association in the use of synoptic reporting with improvement in some measures of quality associated with clinical therapy and outcome over time, such as the number of lymph nodes obtained from colon carcinoma specimens,33 it is difficult to be sure whether the change in performance is related to synoptic reporting. There are multiple other quality efforts that have taken place during the same time period as when these results were measured, including specific recommendations for how many lymph nodes should be obtained from these specimens,37 which are confounding variables. No randomized controlled trial for the clinical impact of synoptic reporting has been performed, and the reliance on historical controls may make demonstrating this association difficult.
Nevertheless, one of the main reasons for selecting a particular format is its ability to easily and accurately convey information to the reader. This study is the first to specifically compare these features in several different formats. The results would support the general hypothesis that reports that are short, simple, and consistently formatted are more easily read than those that are not. Our study suggests that none of the formats we have examined are associated with significant differences in the accuracy of identification of data. However, several formats were significantly faster and others were significantly slower than the standard format we took as our control. Some of these changes can be implemented by pathology groups and still be in compliance with the CAP laboratory accreditation process requirements, while others cannot. Additional studies to confirm these findings may be of value in refining the exact criteria for a synoptic report.
Perhaps more importantly, there are data entry programs available for creation of synoptic reports, including the eCC from CAP. These programs offer advantages that free text cannot, including forcing functions to ensure completeness and direct creation of a structured data set that can be transmitted to other sites without additional work by the IT department.28,38 At present these programs appear to most often use the 2-line format we tested (though this is vendor dependent), which did not perform as well as several other choices. Since pathologists have very little ability to customize the features of these programs, it seems likely that programs using formats that are preferred by both pathologists and end users, and that are associated with increased accuracy, would be more successful.
In this study, we did not specifically examine the performance of a narrative diagnosis versus a synoptic report. Nevertheless, in this study, with the 2-line format where the information is split over several lines, it took significantly longer to identify data than with the standard format. It appears likely that a narrative text would also suffer from at least as much difficulty, but we did not specifically test this hypothesis.
Finally, we suggest that tools such as the quiz we have developed here may be of value in providing quantitative data for the evaluation of proposed changes to these synoptic reports. While there are other sources of qualitative and quantitative data for this evaluation, the data this tool collects are objective, quantitative, and relatively easy and inexpensive to obtain.
There are several limitations to the current study. First, data recognition is just one element of the user interaction with a synoptic report. User preference is an important consideration, which can be assessed by surveys. Comprehension is another facet, which was not tested in the current study. Indeed, several of the nonmedical participants noted that while they could perform the study, they really did not understand the meaning of the phrases for which they were looking. While some format changes (eg, the concise format) had a large effect for every participant, others were less consistent. It is likely that individual users may have different performance results. In addition, we only examined 1 specific checklist and restricted our study to synoptic reports with 8 elements. The results may vary with different elements of the same checklist, other checklists, and with reports of shorter or longer length. Indeed, the header part of the RDE may have a greater impact on the results with longer rather than shorter reports. Randomizing the order of the elements is likely more problematic with longer reports as well. Finally, we specifically designed the test to examine the speed of identification. As such, we wanted a response (entering 1 or 2) that was simple and consistent between questions. If, instead of giving the question, we had asked what the result of a particular element was, the results may also have been different, but the speed of this process would have been much more difficult to measure in a consistent way.
In conclusion, there are significant differences in ease of use by nonpathologists between different synoptic report formats. The data and study design we describe may be useful in deciding between different format options.
The authors have no relevant financial interest in the products or companies described in this article.