Auditors must ensure that their audit plans budget sufficient time for key audit steps. Research has shown that insufficient audit time budgets can be detrimental to audit quality. We examine whether framing audit steps negatively (e.g., assess whether management's assumptions are not appropriate) increases time budgets—particularly for steps in which the auditor perceives that performance quality is less verifiable, and thus most at risk of being performed with low quality. First, we report the results of analyses indicating that, in practice, audit steps are predominantly framed positively, potentially resulting in smaller time budgets. We then report the results of an experiment in which 50 experienced audit managers budget time for an audit program that tests a Level-3 fair-value estimate. Prior research and Public Company Accounting Oversight Board (PCAOB) inspections indicate that this is a challenging audit area, vulnerable to allegations of low audit quality. The results support our predictions and suggest that reframing audit steps negatively would increase audit time budgets—an audit quality indicator—particularly for less-verifiable steps.
The PCAOB identified the audit time budget as an indicator of audit quality: higher audit time budgets indicate higher audit quality (PCAOB 2015a). More specifically, the PCAOB stated that the amount and allocation of hours to various parts of the audit convey the effort put forth by the auditor, all of which are important inputs into audit quality. This position is supported by prior research (e.g., see Knechel, Krishnan, Pevzner, Shefchik, and Velury  for a review). We propose that two key features fundamental to the auditing context—the framing of audit steps and the perceived verifiability of the performance quality of audit steps—may affect audit budgeting judgments. We examine the effect of audit step frame and verifiability in the setting of auditing fair values, which is an area where step verifiability tends to be low (Bell and Griffin 2012; Christensen, Glover, and Wood 2012; Cannon and Bedard 2017) and auditors are vulnerable to providing insufficient effort (PCAOB 2015b, 2015c, 2015d, 2015e).
Framing refers to alternative yet equivalent ways to describe the same decision task, with alternative frames accentuating positive or negative characteristics of the object of that decision task (Levin, Schneider, and Gaeth 1998). Typically, auditing standards and audit steps are framed positively. For example, the audit is planned and effort is allocated to determine whether a recorded fair value, as well as underlying assumptions, is reasonable, rather than not reasonable (AS 2502.28-29, PCAOB 2015f; ISA 540.6, IAASB 2008). We manipulate how steps are framed and, based on prior research in psychology, predict that auditors will budget more audit hours when steps are framed negatively. Examining this effect of audit step frame is important: responses from 93 experienced auditors and examination of two Big 4 firms' audit manuals show that, in practice, audit steps are predominantly framed positively, potentially resulting in smaller audit time budgets. We further predict (based on an untested suggestion from the psychology literature) that framing effects are more pronounced for audit steps where auditors believe that the quality of performance is harder to verify. Thus, a negative frame should prompt auditors to budget more hours in order to examine audit steps for which they believe the quality of work will be harder to verify (hereafter, less-verifiable steps).
We test our predictions with an experiment in which 50 audit managers from two Big 4 audit firms plan an audit of an asset classified as Level-3 in the fair-value hierarchy.1 Their primary task is to determine the number of audit hours necessary for 15 audit steps that, according to their firms' audit manuals, are typically used to audit management's process for determining fair values. We manipulate step frame (positive/negative) between participants and elicit each auditor's perceptions of step verifiability for every one of the 15 steps. We use audit managers as participants because managers tend to be primarily responsible for both planning the audits and reviewing the work of the audit seniors, who usually perform the audit steps relevant to fair-value audits (Griffith, Hammersley, and Kadous 2015).
As predicted, results indicate a significant main effect of step frame on auditors' planning judgments, with auditors allotting more hours under a negative frame than under a positive frame. Results also provide some support for a significant interaction between frame and perceived verifiability of steps, such that a negative frame increases the planned audit budget more for steps perceived as having lower verifiability (e.g., steps examining management's assumptions).
Our study contributes to the existing audit research literature on framing (e.g., Bedard and Graham 2002; Fukukawa and Mock 2011; Mock and Fukukawa 2016). First, our study extends this literature by showing that the framing of audit steps affects time budgets in the setting of fair-value auditing. We believe that examining the impact of frame in the setting of fair-value auditing is of particular interest to research and practice. Prior research indicates that auditors view fair-value estimates as having a very high risk of misstatement (e.g., see Bell and Griffin 2012; Cannon and Bedard 2017). Thus, when auditing fair values, auditors may assign time budgets that already reflect a very high-risk assessment. This could result in time budgets that are already very large and unlikely to increase further when audit steps are framed negatively.
Second, our study provides some initial evidence that lower verifiability of audit steps may increase this framing effect. More specifically, for steps that reviewers perceive as being more difficult to verify, a negative frame is particularly effective in increasing budgeted audit hours—a measure identified by the PCAOB as one of the key inputs into audit quality (PCAOB 2015a).
The remainder of this paper proceeds as follows: The second section provides background and hypotheses. The third section describes our experimental method. The fourth section provides results and analyses, and the fifth section provides conclusions, implications, and directions for future research.
BACKGROUND AND HYPOTHESES
Fair Values and Audit Planning
Current U.S. and international auditing standards require auditors to determine whether management's estimates are “reasonable” (AS 2502.28-29, PCAOB 2015f; ISA 540.6, IAASB 2008). Fair-value estimates are particularly challenging for auditors because of the high amount of uncertainty present in such estimates (e.g., see Cannon and Bedard 2017; PCAOB 2015b, 2015c, 2015d, 2015e).2 In the fair-value hierarchy used by the FASB and IASB, Level-3 estimates involve the greatest amount of subjectivity because they are based on inputs that are not observable (FAS 157 and IFRS 13). Thus, auditing Level-3 fair-value estimates involves significant ranges of potentially acceptable conclusions, providing latitude for psychological factors to affect judgment.
Both U.S. and international standards allow three approaches for auditing fair-value estimates: (1) auditing the management process for determining fair values, (2) developing independent estimates, and (3) auditing subsequent events (AS 2502.23, PCAOB 2015f; ISA 540, IAASB 2008). Recent research indicates that auditors predominantly choose the approach of auditing management's process (Griffith et al. 2015; Cannon and Bedard 2017).
In addition to identifying what audit steps to perform, auditors budget the amount of time necessary for subordinates to carry out these steps. The PCAOB specified that an audit's time budget is an input to the audit's quality because the allocation of hours to various parts of the audit conveys the effort the auditor is willing to devote to each part (PCAOB 2015a). The PCAOB (2015a) stated that “the hours that levels of an audit team devote to risk areas can suggest whether audit managers have staffed the audit appropriately to reflect the risk areas identified during the planning phase of the audit, and the extent to which senior members of the team have focused sufficiently on those areas.” This position is supported by prior research indicating that higher audit budgets positively affect audit quality (e.g., see Knechel et al. for a review). For example, research has indicated that time budget pressures may result in trade-offs of audit effectiveness for efficiency (McDaniel 1990) and increase the likelihood of reducing audit quality by acts such as signing off on audit workpapers prematurely, or failing to perform necessary audit steps (e.g., Alderman and Deitrick 1982; Kelley and Margheim 1990; Reckers, Wheeler, and Wong-On-Wing 1997; Asare, Trompeter, and Wright 2000; Agoglia, Hatfield, and Lambert 2015). Thus, time budgets are a reasonable proxy for audit quality.
Effect of Step Frame on Audit Time Budgeting Judgments
We investigate whether audit time budgeting judgments are affected by the way in which audit steps are presented, or “framed,” to the auditor preparing the budget. As stated earlier, framing refers to alternative ways of describing the same decision task, with the resulting frames typically differing according to whether they evoke positive or negative connotations (Tversky and Kahneman 1981). Bonner (2007) notes that “attribute framing” is most closely associated with potential framing effects in auditing, where a frame changes only one attribute of a decision task. For example, a positive frame for an audit step is “Assess whether a calculation is accurate.” A negative frame for the same step is “Assess whether a calculation is not accurate.” Importantly, positive and negative frames should be viewed as alternative yet equivalent ways of referring to the same event (e.g., an amount that is not “accurate” is “not accurate”), so framing effects can be unambiguously attributed to psychological processes, as opposed to alternative frames that ask about different events.
Research in psychology suggests that negative wording of a task should encourage auditors to budget more audit hours to that task. Levin et al. (1998) review research in psychology, showing that attribute framing effects occur due to a shift in attention, with people focusing on frame-consistent information and seeking to support the outcome implied by the frame. This effect occurs because framing impacts the encoding and representation of information in memory (Levin and Gaeth 1988). Positive encoding emphasizes positive aspects of information in memory and negative encoding emphasizes negative aspects.3 Thus, psychology research on framing suggests that positive frames may cause auditors to focus on supporting client assertions, while negative frames may cause auditors to complete a more critical and thorough examination of evidence.
Prior research in auditing provides mixed support for this claim and has generally found that framing audit tasks negatively has no effect on auditors' judgment (e.g., Kida 1984; Trotman and Sng 1989; Asare 1992). This is presumably because auditors pay attention to negative information regardless of question framing (Smith and Kida 1991). However, more recent studies suggest that framing may affect auditor judgment under some circumstances. Chung et al. (2013) review research in accounting on the role of biases in auditor judgments and speculate that, in some settings, a positive frame may lead auditors to seek information confirming client assertions and engage in less thorough information processing. At the same time, a negative frame may encourage auditors to avoid seeking information that confirms a client's assertions and to engage in information processing that is more thorough.
Consistent with this suggestion, Martin, Rich, and Wilks (2006) warn that the step wording for auditing complex estimates may encourage auditors to search for evidence that corroborates management's assertions, suggesting a possibility that positively worded audit steps in the setting of auditing complex estimates may lead to under-auditing. However, Fukukawa and Mock (2011) do not find a frame effect on risk assessment in valuation assertion testing. In a follow-up study, Mock and Fukukawa (2015) find a frame effect—but only after including additional information that made the setting less vague. However, the frame manipulation in these studies was achieved by general, rather than attribute, frame manipulation. Frame manipulation in these studies changed more than one attribute, manipulating attention on positive versus negative aspects of the task and manipulating the task's wording to be expressed in noun versus verb form.4 Such compound manipulation is appropriate for some experiments but precludes the conclusion of which manipulated attribute is responsible for the framing effect.
Bedard and Graham (2002; hereafter, BG) contribute to this research by suggesting that one reason for the prior mixed results on frame effects could be weak frame manipulations. This explanation is consistent with the suggestion in psychology literature: attribute framing requires a manipulation that is strong enough to affect the encoding and representation of information relevant to the audit step (Levin and Gaeth 1988). In their study, BG use a compound manipulation of frame wherein they vary the wording of individual risk statements, and also manipulate the decision aid's orientation on losses by way of mentioning versus not mentioning the risks of litigation and reputation damage. The results show that, for high-risk clients, auditors identify more risks under a negative decision aid orientation than under a positive one. We extend this line of research by manipulating only one attribute (positive versus negative frame) in the wording of each audit step of a Level-3 fair-value audit program. This allows us to attribute any observed effect to the positive versus negative frame manipulation.
In sum, prior research suggests that a negative frame will lead reviewers to expect preparers to process risks related to the audit step more thoroughly and critically, requiring more audit time. Thus, we expect a negative frame to result in a larger audit time budget:
H1: Auditors assign more audit hours when steps are framed negatively than when steps are framed positively.
We believe that examining step framing for the setting of fair-value auditing is an important extension of prior studies on the effect of framing. These prior studies suggest that auditors view the risks of material misstatement in fair values as being very high (e.g., see Christensen et al. 2012; Griffith et al. 2015) and potentially irreducible (e.g., see Bell and Griffin 2012; Cannon and Bedard 2017). Given the auditors' widespread view that the risk of misstatement in fair-value estimates is very high, auditors may tend to develop large time budgets that are not sensitive to frame variations in the audit steps. When a time budget is very high by default, framing audit steps negatively may not increase the time budget further. The settings examined by the earlier studies (e.g., going concern analysis and analysis of accounts receivable) may have been more sensitive to the changes in audit step frames. This is because auditors may not generally view those areas by default as presenting a high risk of material misstatement and requiring high time budgets. Thus, based on the prior research, it is unclear whether we would find support for the framing effect predicted in H1 in the setting of fair-value auditing.
In addition, we believe that examining how a negative frame affects time budgeting is important because audit steps are typically framed positively in practice. We gathered two types of additional data to support this assertion. First, a group of 93 auditors (with an average of six years of public accounting experience) from six large firms responded to a survey question about how audit procedures are usually framed in practice. Auditors selected their response from seven options in counterbalanced order anchored at “Procedures are always framed positively” and “Procedures are always framed negatively.” Overall, 91.4 percent of the respondents indicated that procedures tend to be framed positively. Second, two Ph.D. students rated the extent to which audit steps in manuals from two Big 4 firms for auditing fair-value estimates are framed positively versus negatively.5 Results indicated that 64 percent of all audit steps are framed (the other 36 percent do not appear to be framed), and that 94 percent of these framed steps are framed positively. These additional analyses support the notion that positively framed steps are the norm, both generally in auditing and specifically in the guidance that participants' firms use to indicate steps for auditing fair-value estimates.
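The manual-coding percentages can be combined into shares of all audit steps. The short calculation below is only an arithmetic sketch based on the two figures reported above (64 percent framed; 94 percent of framed steps positive); the variable names are illustrative and the derived shares are not reported by the authors.

```python
# Shares reported in the text for the two firms' fair-value audit manuals.
framed_share = 0.64            # steps that carry any frame
positive_given_framed = 0.94   # framed steps that are framed positively

# Derived shares of ALL steps (illustrative arithmetic, not reported values).
positive_share = framed_share * positive_given_framed
negative_share = framed_share * (1 - positive_given_framed)
unframed_share = 1 - framed_share

print(f"positively framed: {positive_share:.1%}")
print(f"negatively framed: {negative_share:.1%}")
print(f"not framed:        {unframed_share:.1%}")
```

Under these figures, roughly six in ten of all manual steps are framed positively, while fewer than one in twenty are framed negatively.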
Interaction between Step Frame and Step Verifiability
We also investigate whether the effect of frame on audit planning judgments differs in a predictable manner, depending on auditors' perceptions of a step's verifiability. Psychology literature suggests that a framing effect requires a range of reasonable choices available to the decision maker, and that this effect has the potential to be more significant when the range of choices is greater (e.g., Levin et al. 1986; Beach, Puto, Heckler, Naylor, and Marble 1996). Auditors may view some steps as more difficult to verify because those steps are more complex: the auditors performing those steps may have a wider range of audit procedure choices because they have to assess a wider range of risks tested by the audit step. Generally, identifying more specific risks relevant to the step should lead to a greater amount of assigned audit time to allow for obtaining the necessary evidence. Thus, some low-verifiability steps may require a greater amount of audit time. In addition, since a negative frame leads to more thorough processing of the risks, the frame should have a greater effect when audit steps have lower verifiability.
For example, consider the following audit step: testing the reasonableness of the client's assumption about the level of growth in annual rental revenue. This step is likely difficult to verify because it is complex—the auditor has a wide range of procedure choices to address a wide range of risks. Such risks may include that the management's assumption is based on inappropriate comparisons with other rental properties, or that the current state of the rental market is too unstable to allow a reasonable prediction of annual growth rate. Depending on the identification and assessment of risks, the auditor may choose to undertake only one, or a number of, audit procedures to complete the audit step. The more risks the auditor decides to examine, the more procedures the auditor may have to perform. In this instance, the auditor may choose to examine the relevant clause in the contract, verify the customer's existence and financial standing, review a history of comparable rental agreements, obtain and review local rental market reports, etc. The more procedures the auditor chooses in response to risks, the higher the time budget should be. A negative frame should lead auditors to process risks more thoroughly and critically; so when audit steps have lower verifiability, and thus likely more risks for auditors to consider addressing with procedures, a negative frame should have a greater impact on the time budget.
Conversely, consider the audit step of mathematically recalculating a terminal value of a rental building based on the management's method. This step is likely relatively easy to verify because it is relatively simple—the auditor has a very narrow range of procedure choices to address a very narrow range of risks. There is essentially only one risk that is relevant to this step: that the arithmetic calculation is incorrect. If this calculation was provided in an Excel spreadsheet, then the procedure to address this step would generally involve verifying that the Excel formula uses the right cells and the right arithmetic to calculate the terminal value. Auditors should identify this risk regardless of how thoroughly they are processing risks. Since frame affects thoroughness of risk processing, when audit steps have higher verifiability and, therefore, likely fewer risks for auditors to consider addressing with procedures, a negative frame should have a smaller effect on the time budget.
This combined reasoning suggests that the framing effect's strength may depend on the auditor's perception of the step's verifiability. More specifically, lower verifiability may lead to greater framing effects on budgeted audit time. Therefore, we predict that a negative frame will increase budgeted time more for steps that auditors view as less verifiable:
H2: The extent to which an auditor assigns more audit hours when steps are framed negatively is greater for steps perceived as less verifiable.
Overview of Experiment and Participants
Practicing auditors make audit planning judgments with respect to 15 audit steps in an experiment in which the audit step frame (positive/negative) is manipulated between participants who budget time for each step. Participants budget hours under the assumption that there is relatively low budget pressure (low-efficiency pressure). Then, participants assess the probability that no material misstatement exists (achieved audit assurance), assuming that the work was done within this time budget and no material misstatement was found. Participants then budget for each of the 15 steps again, assuming relatively high pressure to budget few hours (high-efficiency pressure) and, assuming that the work was done within this time budget and no material misstatement was found, assess the achieved audit assurance again.6 Participants then rate the extent to which the audit work quality for each of the 15 steps can be verified in the review process. These verifiability ratings are the basis for our low- or high-verifiability step classification for testing H2. Participants finish the experiment by answering supplemental questions and providing demographic data. Figure 1 outlines our experiment.
The participants include 50 audit managers and senior managers from two Big 4 firms, with an average of 10.7 years of experience.7 Participants had worked on an average of 10.3 audits where they audited the valuation model underlying an asset or liability fair value, as well as an average of two audits of real estate investment companies. Participants were recruited by a senior representative of their firm and completed the experiment by accessing the experimental materials online.
For each of the 15 steps, participants assign a number of planned audit hours in 15-minute increments, first under low-efficiency pressure and then under high-efficiency pressure. Measuring the dependent variable in this way allows us to examine our hypotheses at high and low levels of efficiency pressure. We focus our results and perform our tests and analyses using the first measure of audit hours: budgeted hours given low-efficiency pressure. However, we also report whether the results hold under high-efficiency pressure, which is common to audit environments (see Knechel et al.  for a review).
Step Frame is manipulated between participants by using a positive or negative frame in step wording. An example of a positively framed step is, “Assess whether management's forecasts and projections have been accurate historically.” An example of the same step under a negative frame is, “Assess whether management's forecasts and projections have not been accurate historically.” This frame manipulation is designed to hold constant the participants' views about the level of assurance provided by the audit, and to produce alternative frames that participants view as complementary and equivalent in terms of the audit task's implications (e.g., assumptions that are “not reasonable” are not “reasonable,” such that P[assumptions are reasonable] = 1 − P[assumptions are not reasonable]).
Perceived Step Verifiability ratings for each of the steps included in Appendix A are measured for each participant. Each participant indicates for each step the extent to which the quality with which the senior performed the step could be verified in the audit review process. A rating of 1 (7) indicates no (complete) ability to verify that the step was performed well.
Task and Step
The task is adapted from training materials used at a Big 4 accounting firm. First, experimental materials were pilot tested with two senior partners specializing in auditing fair values, and were modified based on their comments. Then the materials were pilot tested with 12 audit managers from the two audit firms providing participants, and then again modified. The two audit partners reviewed the final instrument and confirmed that the instrument is realistic and consistent with their firm's audit approach and policies.
The task invites participants to assume the role of an audit manager responsible for planning the audit of a rental property that is the largest asset on ABC Investment Corporation's balance sheet. The property is classified as Level-3 in the FASB's fair-value hierarchy, and participants are provided with the client's discounted cash flow model used to value the property (calculating a present value of $10.6 million), and a list of audit steps used by their firm for similar audits.8 Participants are informed that audits of comparable properties average 30 to 40 preparer hours (excluding manager and partner review) but vary considerably between clients; that inherent and control risks are assessed as sufficiently high to not allow significant modifications to subsequent substantive testing; that the partner concluded no specialist was needed for this work; and that the person performing the steps and reporting to the participant would be an audit senior experienced on this audit. Participants are also told that misstatements totaling $250,000 would be considered material.
After reviewing the client's schedule of prospective cash flows and the discounted cash flow calculation, participants receive the low-efficiency pressure manipulation and budget audit effort for the 15 audit steps, with the steps framed positively or negatively, depending on the assigned framing condition. All participants receive steps in the same order that is used in the audit firm's materials from which the case is adapted. In order to preserve the natural order of steps that auditors follow, we do not counterbalance it. We note that while the order could potentially have an effect on the overall level of the assigned audit hours (or on the average verifiability), this impact cannot explain the effects of frame or the interaction of frame and verifiability on audit hours. Participants estimate achieved assurance while assuming that the steps were performed within the time budget and no material misstatement was found. Participants then receive the high-efficiency pressure manipulation, and repeat the process of budgeting audit hours (adjusting the previous hours budgeted under low-efficiency pressure) and estimating achieved audit assurance.
Next, each participant rates the extent to which they believe that the quality with which the senior performed each of the 15 steps could be verified in the audit review process. Data collection concludes with the participants answering some supplemental and demographic questions.
To measure whether participants attended to the framing manipulation, a supplemental question asked them to indicate whether audit steps were written to assess whether a particular condition had been met (e.g., “Assess whether management's forecasts and projections have been accurate historically”), had not been met (e.g., “Assess whether management's forecasts and projections have not been accurate historically”), or whether they did not recall. Forty-two of the 50 participants (84 percent; 86 percent of the 49 who answered the question) answered the manipulation check correctly, indicating that the frame manipulation was successful. Three participants indicated that they could not remember whether the frame was positive or negative, and one participant did not answer the question. Performance on the manipulation check did not differ between framing conditions (Pearson Chi-squared value = 0.27; p < 0.60). Removing the eight participants who failed the manipulation check or did not answer the question does not affect the conclusions. Thus, we base our analyses on all 50 participants.
Tests of Hypotheses
Effect of Frame
Table 1, Panel A includes descriptive statistics of auditors' planning judgments by step and step frame. Consistent with H1, when efficiency pressure is low, participants budget significantly more audit hours given a negative frame than they do given a positive frame (F = 5.059; p-value < 0.013, Table 1, Panel B).9 Under the negative frame, mean effort is higher for 12 of 15 steps (Table 1, Panel A). For the three steps with higher mean effort under the positive frame, the difference is insignificant. At the audit program level, auditors budgeted 31 hours on average for all 15 steps when these steps were framed positively, and 38 hours on average when these steps were framed negatively. Untabulated ANOVA and nonparametric (median, Mann-Whitney, and Kruskal-Wallis) tests show that this difference is significant at a 5 percent level. Untabulated results show that the framing effect is also significant when efficiency pressure is high (F = 9.377; p-value < 0.01), and when the two efficiency pressure conditions are combined (F = 13.105; p-value < 0.01). These results support H1, indicating that auditors plan more audit hours when steps are framed negatively.
Combined Effect of Frame and Perceived Verifiability
H2 predicts that the effect of frame on the allocation of audit hours is more pronounced for steps perceived as less verifiable. Table 2 shows the average verifiability rating for each step. Since H2 predicts an ordinal interaction with a specific pattern (greater effect of frame in lower verifiability), we use planned contrast tests as suggested by Buckless and Ravenscroft (1990) to test this hypothesis. This test requires the independent variables to be categorical. Therefore, we transform the continuous verifiability variable into a categorical variable with two values (high or low verifiability) by comparing the step's average verifiability rating with that of all steps combined.10
This approach yields classifications that appear intuitive and reasonable. Specifically, our approach classifies Steps 3, 4, 5, 6, 8, 10, 11, 12, and 13 as low verifiability because their mean verifiability is below the overall mean of 5.54 (see Table 2). These steps address the model's assumptions (Steps 3, 4, and 5), ensuring that relevant contractual terms have been incorporated into the model (Step 6), ensuring sufficiency and consistency of evidence (Steps 8, 10, and 11), and making high-level judgments about fair value (Steps 12 and 13). Our approach classifies Steps 1, 2, 7, 9, 14, and 15 as having high verifiability because their mean verifiability is above the overall mean of 5.54. These steps address reviewing the historical accuracy of management's forecasts (Step 1), ensuring accuracy of reconciliation and calculations (Steps 2, 7, and 9), and assessing completeness and conformity of disclosures with GAAP (Steps 14 and 15).
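The mean-split rule described above can be sketched in a few lines. This is a minimal illustration only: the step labels and ratings below are hypothetical placeholders rather than the Table 2 values, the function name `classify_by_mean` is ours, and the sketch assumes the overall mean is the simple average of the step means (which holds when every participant rates every step).

```python
from statistics import mean

def classify_by_mean(step_means):
    """Label each step 'low' or 'high' verifiability relative to the overall mean."""
    overall = mean(step_means.values())
    labels = {
        step: ("low" if rating < overall else "high")
        for step, rating in step_means.items()
    }
    return overall, labels

# Hypothetical mean ratings on the 1-7 verifiability scale (NOT the study's data).
hypothetical_means = {"A": 6.2, "B": 5.1, "C": 4.8, "D": 6.0}

overall, labels = classify_by_mean(hypothetical_means)
print(round(overall, 3), labels)
```

A split at the overall mean keeps the classification data-driven, at the cost of treating steps just above and just below the cutoff as categorically different.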
Table 3 shows descriptive statistics by frame and verifiability. Table 3, Panel A shows that, in the low-efficiency pressure condition, the effect of frame was slightly higher in high verifiability compared to low verifiability (0.46 hours versus 0.45 hours), contrary to H2.
As we suggest in our theory for H2, the ability to verify the quality of an audit step's performance may reflect the extent to which identifying more risks (by way of the negative frame) would lead to more audit hours. When audit budgets can be set without constraints, it is likely that auditors simply assign hours based on the number of risks they identify for each audit step, while being less sensitive to the effectiveness of increasing audit hours in response to identifying more risks. Thus, verifiability is more likely to moderate the effect of frame when auditors experience efficiency pressure, and are thus more sensitive about whether increasing audit hours in response to more risks will actually be effective. Consistent with this reasoning and with H2, Table 3, Panel B shows that, in the high-efficiency pressure condition, the frame effect was greater for low- than for high-verifiability steps (0.59 hours > 0.41 hours). The comparison between the frame effects is similar when hours are averaged across the low- and high-efficiency pressure conditions (0.52 hours > 0.43 hours).
We test H2 using a contrast test suggested by Guggenmos, Piercey, and Agoglia (2018) for testing whether an effect (i.e., frame) is greater at a different level of a variable (i.e., verifiability) with an incorporated main effect of that variable (i.e., verifiability). This contrast uses the following weights: +1 (low verifiability, positive frame), +3 (low verifiability, negative frame), −2 (high verifiability, positive frame), and −2 (high verifiability, negative frame). This pattern of weights incorporates a greater difference between frames at low verifiability, and a smaller difference at high verifiability, consistent with H2. In addition, this pattern reflects an intuitive prediction that, on average, lower-verifiability steps would require more audit hours. This is in line with our theory for H2 that suggests these steps tend to incorporate more audit work to address a greater number of risks. However, we do not separately hypothesize this main effect prediction because it reflects the nature of Level-3 fair-value auditing, and may or may not be relevant to other auditing areas. Future research could examine whether lower-verifiability steps require more audit hours in other auditing contexts as well. However, this question is beyond the scope of this study.
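To make the weight pattern concrete, the sketch below checks that the weights sum to zero and computes the contrast estimate (the weighted sum of cell means). The cell means are the hours per step quoted in the text for the high-efficiency pressure condition (Table 4, Panel B). The names `WEIGHTS`, `HIGH_PRESSURE_MEANS`, `frame_gap`, and `contrast_estimate` are illustrative assumptions. The sketch computes only the numerator of the contrast test; the t-statistic also requires the error variance, which is not reproduced here.

```python
# Contrast weights from the text, keyed by (verifiability, frame).
WEIGHTS = {
    ("low", "positive"): 1,
    ("low", "negative"): 3,
    ("high", "positive"): -2,
    ("high", "negative"): -2,
}

# Mean hours per step quoted in the text for high-efficiency pressure.
HIGH_PRESSURE_MEANS = {
    ("low", "positive"): 2.21,
    ("low", "negative"): 2.80,
    ("high", "positive"): 1.10,
    ("high", "negative"): 1.51,
}


def frame_gap(values, verifiability):
    """Negative-frame value minus positive-frame value at one verifiability level."""
    return values[(verifiability, "negative")] - values[(verifiability, "positive")]


def contrast_estimate(cell_means, weights=WEIGHTS):
    """Weighted sum of cell means (the numerator of the contrast test)."""
    return sum(weights[cell] * cell_means[cell] for cell in weights)


# A valid contrast has weights that sum to zero.
assert sum(WEIGHTS.values()) == 0

# The weights encode a frame gap of 2 at low verifiability and 0 at high.
print(frame_gap(WEIGHTS, "low"), frame_gap(WEIGHTS, "high"))

# Observed frame gaps and the contrast estimate for the quoted cell means.
print(round(frame_gap(HIGH_PRESSURE_MEANS, "low"), 2))
print(round(frame_gap(HIGH_PRESSURE_MEANS, "high"), 2))
print(round(contrast_estimate(HIGH_PRESSURE_MEANS), 2))
```

The low-verifiability weights average +2 and the high-verifiability weights average −2, which is how the pattern builds in the main effect of verifiability alongside the larger frame gap at low verifiability.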
As recommended by Guggenmos et al. (2018), we perform contrast tests only when descriptive statistics match the hypothesized pattern. At low-efficiency pressure, the descriptive statistics do not match the hypothesized pattern; the effect of frame is somewhat smaller when the verifiability of steps is low compared to when it is high (0.45 < 0.46, Table 3, Panel A). In addition, an ANOVA indicates that the Frame × Verifiability interaction is insignificant (F = 0.52; p = 0.236, Table 4, Panel A). Thus, H2 is not supported at low-efficiency pressure.
We also analyze whether H2 is supported at high-efficiency pressure. The insights from this analysis are somewhat limited, because in our experiment the auditors assessed the time budget at high-efficiency pressure after they assessed it at low-efficiency pressure. Thus, auditors' responses at low-efficiency pressure potentially affected their responses at high-efficiency pressure. However, this analysis can be useful because high-efficiency pressure is the setting in which auditors typically operate (e.g., see Houston 1999; Asare et al. 2000). In addition, since the Frame × Verifiability interaction at low-efficiency pressure is insignificant, any carryover effect from budgeting at low-efficiency pressure is unlikely to explain a significant interaction under high-efficiency pressure.
Table 4, Panel B shows that the contrast is significant (0.59 > 0.41, t = 11.207; p-value < 0.001). As recommended by Guggenmos et al. (2018), we also calculate a residual between-cells variance test to assess whether the pattern of results tested by this contrast completely explains the observations in our data. The residual variance test is insignificant (F = 0.278; p-value = 0.841, Table 4, Panel B), indicating that the data fit the pattern tested by the contrast [+1, +3, −2, −2]. We calculate the residual variance test using the following formula outlined by Rosenthal, Rosnow, and Rubin (2000): $F_{residual} = (F_{between} \times DF_{between} - F_{contrast}) / DF_{residual}$, where $F_{contrast}$ is the square of the contrast t-statistic.11 Consistent with this contrast pattern, Table 4, Panel B also shows that the effect of frame is significant when verifiability is low (2.21 < 2.80, t = 3.111; p-value < 0.001), but not when verifiability is high (1.10 < 1.51, t = 1.073; p-value < 0.284). ANOVA results in Table 4, Panel C show that the effect of the Frame × Verifiability interaction is marginally significant (F = 2.155; p-value < 0.072). Overall, these results provide limited support for H2.12
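As a worked check, the sketch below assumes the residual test takes the form F_residual = (F_between × DF_between − F_contrast) / DF_residual, with F_contrast equal to the squared contrast t-statistic and DF_residual = DF_between − 1. Under that assumption, the inputs reported in the text and footnotes (F_between = 42.051, DF_between = 3, t = 11.207) reproduce the reported F = 0.278. The function name `residual_f` is illustrative.

```python
def residual_f(f_between, df_between, t_contrast, df_residual=None):
    """Residual between-cells F after removing a one-degree-of-freedom contrast.

    Assumes F_residual = (F_between * DF_between - t_contrast**2) / DF_residual,
    with DF_residual defaulting to DF_between - 1.
    """
    if df_residual is None:
        df_residual = df_between - 1
    f_contrast = t_contrast ** 2  # a 1-df contrast F equals the squared t
    return (f_between * df_between - f_contrast) / df_residual


# Inputs reported in the text and its footnotes.
f_res = residual_f(f_between=42.051, df_between=3, t_contrast=11.207)
print(round(f_res, 3))  # 0.278
```

A residual F this small means that very little between-cells variation is left over once the hypothesized pattern is removed.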
Effect of Frame on the Verifiability Measure
Since verifiability is a measured variable, there is a potential concern that frame may affect verifiability and that this may impact our conclusions. In order to examine this concern, we first test whether frame has a significant effect on verifiability ratings. We find that verifiability is lower when frame is negative (5.44 out of 7.00), compared to when frame is positive (5.65 out of 7.00) and that this difference is marginally significant (p-value < 0.094). This finding provides limited support for the idea that participants may view steps as being less verifiable under a negative frame.
However, this effect does not impact our analyses or conclusions. First, we examine whether the effect of frame on verifiability affects our analyses and conclusions for H1, which predicts that a negative frame leads to more audit hours. We include the raw verifiability score in the model with frame as the predictor, audit hours as the response, and steps as the repeated measure. Consistent with H1, the effect of frame on audit hours (after including verifiability in the model) continues to be significant at the 5 percent level. Thus, any effect of frame on verifiability has no impact on H1.
Second, we examine whether frame's effect on verifiability may impact our analyses and conclusions for H2, which predicts that the frame's effect on audit hours will be greater when verifiability is lower. We remove the effect of frame on verifiability by saving the residuals from the model where frame is the predictor, verifiability raw score is the response, and step is the repeated measure. The residuals from this model represent verifiability stripped of any effect of the frame. We then classify steps into high or low verifiability based on how the mean residual for each step compares to the mean residual for all steps combined. Classifying steps in this way results in exactly the same steps being assigned to the high- or low-verifiability group as when using the raw verifiability score in the main analysis. Thus, any effect of frame on verifiability has no impact on H2. In sum, we conclude that although frame has a marginally significant effect on verifiability, this effect does not impact our analyses or conclusions for H1 and H2.
Achieved Audit Assurance
During the experiment, participants estimated achieved audit assurance (AAA) under both low- and high-efficiency pressure. They did so by assessing the probability that no material misstatement exists within ABC's fair values, assuming that the audit was performed within their time budget and no material misstatement was found. Average estimated AAA was 68.21 and 76.88 under a positive and negative frame, respectively. A one-way ANOVA indicates no significant main effect of frame (F = 1.766; p-value < 0.190) on AAA. Thus, although participants modify their audit programs significantly in response to the frame, their judgments of AAA, assuming that the budgeted audit was completed as planned, are not significantly different. This lack of effect of the frame on AAA, despite a higher time budget, potentially indicates that auditors identify more risks under a negative frame. If auditors believe that under a negative frame they need to address more risks, then higher budgeted time may not lead to an increase in AAA. Future research is needed to examine whether this suggestion is supported.
This paper reports results of an experiment in which 50 audit managers, with an average of over ten years of audit experience, plan the audit of an investment classified as Level-3 of the FASB's and IASB's fair-value hierarchy. The audit step frame (positive, negative) is manipulated between participants, and audit step verifiability varies across 15 steps, with each step's verifiability calculated as the mean of all participants' assessments of that step's verifiability. Results indicate that more hours are planned under a negative step frame, particularly with respect to steps that auditors perceive as less verifiable—such as examining management's assumptions (e.g., Step 3), which recent research reports is a challenging area for auditors (see Bell and Griffin 2012; Griffith et al. 2015; Cannon and Bedard 2017).
Our findings of a main effect of the frame and an interaction between the frame and step verifiability add to prior findings about the effect of the frame in other contexts (e.g., Kida 1984; Trotman and Sng 1989; Asare 1992; Emby 1994; Emby and Finley 1997; Fukukawa and Mock 2011; Mock and Fukukawa 2016). Most notably, we find that the frame has a larger effect for steps in which auditors believe it is more difficult to verify performance quality. From a practice perspective, our results suggest that framing effects potentially could be used by audit firms, as well as by auditing standard setters, to enhance audit effectiveness. This is similar to decision aids that employ “choice architecture” interventions recommended in other contexts (Thaler and Sunstein 2008). Firms that understand auditors' tendency to be influenced by a step frame could design their Level-3 fair-value audit programs to frame steps negatively, thus “nudging” auditors to plan more audit hours, which the PCAOB (2015a) identified as a measure of audit quality. Our results indicate that, under budget pressure, negatively framing audit steps will be particularly effective in increasing planned audit effort for the less verifiable steps that likely involve significant judgment. Our findings, in combination with Bedard and Graham (2002), who report that a negative frame increases risk identification, also suggest that the insufficient auditing of fair values reported by the PCAOB may in part be due to incomplete identification of risks during audit planning. Thus, audit costs are likely to increase due to higher audit time budgets, mostly for the areas identified as challenging by PCAOB inspections (see PCAOB 2015b, 2015c, 2015d, 2015e). However, higher audit time budgets may help audit firms address fair-value audit deficiencies identified by the PCAOB in these inspections, such as the failure to perform sufficient procedures and the omission of key procedures.
The interventions suggested by our results would only be useful if framing effects do not dissipate over time, as auditors become accustomed to a change in the step frame. Extant research in psychology reports that repeated exposure to framing, as well as making participants aware of the framing effect, does not eliminate it—and may even increase it (Levin et al. 1986; Levin, Jasper, and Gaeth 1996; LeBoeuf and Shafir 2003; Chong and Druckman 2007; Druckman, Fein, and Leeper 2012; Baden and Lecheler 2012; Lecheler and de Vreese 2011). This is presumably because repeated consideration increases focus on the aspects highlighted by the frame (Chong and Druckman 2007).
Our research is subject to some additional limitations. First, we use a measured rather than manipulated variable when testing the effect of step verifiability. While our results indicate it is unlikely that carryover or selection biases could account for our evidence of the interactive effect of frame and verifiability on audit planning judgments, we cannot preclude that possibility. Second, although the PCAOB views audit hours as an important input into audit quality, auditors tend to believe that the fair-value risks may not be reducible. Thus, it is unclear whether budgeting additional audit time due to a negative frame would actually improve audit quality. We call for future research to examine this important question. Third, in our experiment, the low-efficiency pressure condition always preceded the high-efficiency pressure condition. We have no reason to expect results to differ under an alternative order, but cannot test that assertion with our data. Finally, our study does not provide insight into how the effects we report would vary with variations in other important factors common in the fair-value audit environment, such as complexity, risk of misstatement, materiality of the account, and others. We call for future research into examining these important questions.
The Institutional Review Board at Cornell University, where the data were collected using Qualtrics software, granted approval for this experiment.
For each of the Big 4, PCAOB inspections identify insufficiency of the extent of testing as one of the deficiencies in testing fair values. For example, the PCAOB mentions “failure to sufficiently test significant assumptions … used in developing an estimate” (PCAOB 2015b), failure “to perform any procedures to test valuation of … securities, for which fair value was determined by the issuer through an internal valuation model” (PCAOB 2015d), etc.
Notably, Levin et al. (1998) do not use prospect theory proposed in Tversky and Kahneman (1981; hereafter, TK) to explain the mechanism underlying attribute framing. This is because the risky choice framing discussed in TK involved framing each option in an option set, whereas attribute framing involves manipulating only one attribute of the same choice (see Levin et al.  for further discussion).
Both studies manipulate frame as follows (emphasis added [as italic for form and underline for frame]): Positive frame, “The valuation of accounts receivable on the 2004 balance sheet is proper.” Negative frame, “The accounts receivable on the 2004 balance sheet is not properly valued.”
Kappa coefficients assessing the agreement of the two raters were 0.45 (audit manual of Firm 1; p-value < 0.01) and 0.54 (audit manual of Firm 2; p-value < 0.01), which exceed the threshold of 0.40 generally viewed as indicating moderate reliability (Landis and Koch 1977). Disagreements occurred primarily because one rater initially used the “not framed” category more often than the other rater used it. The raters resolved these disagreements most often by concluding that if in doubt, a procedure should be labeled “not framed.”
In the low-efficiency pressure condition, participants are asked to assume a relatively low time pressure. In the high-efficiency pressure condition, participants are told that the partner asked them to determine the minimum amount of time that could be allocated to each procedure while still providing appropriate assurance that audit objectives have been met. Based on consultations with two audit partners, this order of manipulations matches how auditors might approach the budget: first, consider the work that needs to be done (low-efficiency pressure) and then consider the lowest acceptable amount of time to perform the required work (high-efficiency pressure). Consistent with prior research in accounting (e.g., Houston 1999), efficiency pressure had a significant effect on audit hours: auditors budgeted less time when efficiency pressure was higher (p-value < 0.01). However, we do not find support for our prediction that efficiency pressure will interact with frame. We also find no interaction of efficiency pressure with verifiability. For conciseness, we remove the hypothesis related to efficiency pressure and omit further discussion of the main effect of efficiency pressure.
One participant did not respond to demographic questions. Results of analyses are similar with that participant excluded, but the main analyses are based on all 50 participants to reflect all data obtained. Twenty-one participants are from one firm, while 29 are from the other. Firm effects are insignificant and do not affect any of our results. Of the 49 participants who responded to demographic questions, 29 are managers and 20 are senior managers. Participants are assigned randomly to experimental manipulations.
We used an asset classified as Level-3 in the FASB's and IASB's fair-value hierarchy because we wanted to make sure the audit plan had to address a fair-value estimate that included relatively high uncertainty.
All models include the Procedure variable as repeated measures. Including Procedure as the fixed effects variable produces very similar results. All p-values examining directional predictions are one-sided.
Classifying verifiability based on medians produces very similar results: descriptive statistics and analyses support H2 at the same levels of significance as classifying verifiability based on means. However, classifying based on medians results in significant loss of observations that are equal to medians because these observations do not fall under either high or low verifiability. Thus, we focus on means and do not discuss median-based classifications further.
Where Fbetween is the F-value of the test examining whether the means of any of the four cells defined by our design differ (Fbetween = 42.051), DFbetween is the numerator degrees of freedom for the Fbetween test (DFbetween = 3), and DFresidual is the numerator degrees of freedom for the Fresidual test (DFresidual = DFbetween − 1 = 2). The denominator degrees of freedom are the same for both the Fbetween and Fresidual tests.
We also assessed whether our theory applies to participants' perceptions of verifiability relative to their own average verifiability rating for all 15 audit steps. For each participant, we classified an audit step as “high” (“low”) when the participant's rating of that procedure was higher (lower) than that participant's average rating of all 15 steps. The results using this method support H2 (contrast test p-value < 0.01 and residual variance test p-value > 0.1). The effect of frame is significant only when verifiability is low (p-value < 0.01), but not when verifiability is high (p-value > 0.1). The ANOVA interaction of Frame × Verifiability is also significant (p-value < 0.05).
We are grateful to Richard C. Hatfield (editor) and the anonymous referees. We also thank Lindsay Andiola, Jeremy Bentley, Rob Bloomfield, Scott Emett, Devon Erickson, Jackie Hammersley, Max Hewitt, Pat Hopkins, Susan Krische, Christian Leuz, Bob Libby, Dave Ricchiute, Hun Tong Tan, and workshop participants at American University, Brigham Young University, Case Western Reserve University, The George Washington University, Indiana University, Iowa State University, Northwestern University, University of Notre Dame, and the 2016 International Symposium on Auditing Research for comments on this paper. We also thank Jeremy Bentley and Scott Emett for research assistance.
Funding for the research project described in this article was provided by the Center for Audit Quality. However, the views expressed in this article and its content are those of the authors alone and not those of the Center for Audit Quality.
Editor's note: Accepted by Richard C. Hatfield.