## ABSTRACT

Using a large sample of nonprofit organizations in the United States reporting governance information on their IRS Forms 990, we develop and evaluate several different composite measures of nonprofit governance. These measures can be used to control for governance broadly in a variety of settings, including research that examines nonprofit funding, reporting quality, and executive compensation. Our results suggest that relatively basic indices perform as well as, and in some cases better than, more complex indices. In fact, when controlling for governance in the standard donations model, the collective evidence indicates that an index computed using the simple sum of five binary indicators (audit committee, majority independent board, no outsourcing, CEO salary review, and information available on the organization's website) performs best.

Data Availability: All data are publicly available.

## I. INTRODUCTION

This paper evaluates the relative effectiveness of different measurement approaches to control for corporate governance in nonprofit research. In 2008, the Internal Revenue Service (IRS) instituted a significant change in reporting by requiring nonprofit organizations to answer a comprehensive list of governance questions on the IRS Form 990, Return of Organizations Exempt from Income Tax (Form 990). The broad array of governance policies included on the Form 990 illustrates that nonprofit governance is not a single attribute, but rather has multiple dimensions. While researchers have begun to use these data to answer a variety of research questions, little work has been done to guide researchers on how to aggregate the wealth of governance information now available. We aim to begin the methodological discussion by comparing how different measures from the Form 990 disclosures reflect the complex construct of governance, which may be useful to researchers seeking to control for governance in their empirical models.

Research in the nonprofit arena examines a variety of topics including, but not limited to, funder decisions (e.g., donations received), board decisions (e.g., compensation structure), managerial behavior (e.g., financial reporting choices), and mission performance measures (e.g., program efficiency). Governance likely affects many, if not all, key aspects of a nonprofit organization's operations. In fact, research suggests that good governance is associated with higher contributions (Olson 2000; Kitching 2009; Saxton, Neely, and Guo 2014; Harris, Petrovits, and Yetman 2015), improved pay-for-performance (Newton 2015), more accurate expense reporting (M. Yetman and R. Yetman 2012), fewer executive perquisites (Balsam, Harris, and Saxton 2020), higher reported mission-related spending (Callen, Klein, and Tinkelman 2003; Desai and Yetman 2015), and a reduced likelihood of fraud (Harris, Petrovits, and Yetman 2017b).

For many questions investigated by nonprofit researchers, governance may be a determinant of the dependent variable and be correlated with the independent variable of interest. For example, if a researcher is examining whether donors respond to executive compensation levels, prior research provides evidence that both donations and compensation are associated with governance (Harris et al. 2015; Newton 2015). Consequently, excluding measures of governance may result in biased inferences. Thus, it is important for researchers to include a control for governance.

The availability of governance data on the Form 990 makes controlling for governance in large-scale empirical studies feasible. Yet, the question of how to measure governance still exists. Governance in the nonprofit sector is a complex, multifaceted construct. Moreover, while the governance disclosures on the Form 990 represent different policies, most of them, not surprisingly, are significantly correlated with each other. From a practical standpoint, researchers will benefit from an understanding of how various measurement approaches compare in effectiveness in controlling for governance.

Approaches for measuring governance can be evaluated on multiple criteria including validity, consistency, replicability, and comprehensiveness. A measure is valid if it represents what it purports to represent, in this case, the key aspects of nonprofit governance. A measure is consistent if it can be created in the same manner across different organizations and over time. A measure is replicable if other researchers can recreate it without the process being burdensome. Finally, a measure is comprehensive if it embodies all of the important dimensions of governance. An ideal measure of governance would embody all of these attributes, but researchers often make tradeoffs across these dimensions because of data or methodological constraints. Moreover, the most theoretically sound measure is specific to the research question and, thus, no one universal proxy for governance exists.1 Nevertheless, it is useful to document how different measures compare to each other in different models.

Building on prior literature, we develop measures of nonprofit governance based on information contained in the Form 990 using four approaches.2 The first approach utilizes all of the raw governance disclosures individually, and then simultaneously. In the second approach, we create composite measures using factor analysis. For the third and fourth approaches, we construct indices using the sum of binary indicators from the raw governance disclosures. In the third approach, we weight index components based on variability (using their respective standard deviations), whereas in the fourth approach we weight index components equally.

We next assess the relative effectiveness of these measures by including them in five different models of interest to nonprofit researchers, namely charitable contributions, government support, reporting quality, executive compensation, and program efficiency. Specifically, we examine five response variables: direct donations, government grants, zero reported fundraising, pay ratios, and program ratios. These models are selected based on prior research studies that have established a theoretical and empirical relationship between the response variable and governance. We check whether the coefficients on the governance measures are significant in the predicted direction and eliminate approaches where multicollinearity problems are evident. Because we are interested in model selection, we document and compare the adjusted coefficient of determination (adjusted R²) and the Akaike (1983) Information Criterion (AIC) for each model.

We use a sample of 16,824 nonprofit organizations in the United States reporting on governance from 2008–2012. We find that several governance indices are very similar in their ability to serve as a control for nonprofit governance. In fact, relatively basic governance indices perform as well as, or better than, more complex measures. When estimating the charitable contributions model, Governance Index5, an index computed as the sum of five binary indicators (presence of an audit committee, a majority independent board, no outsourcing of management functions, CEO salary review policy, and key information being available on an organization's website), provides the best fit of all composite measures examined while also not exhibiting multicollinearity issues. Governance Index5 also performs well for the government support and reporting quality models. For the executive compensation model, a simple index of 12 governance policies performs best, although Governance Index5 produces fit statistics that are very close in value to Governance Index12. As we detail later in the paper, our results are inconclusive regarding the best measure to control for governance in our program efficiency model.3 To help ensure the within-sample analysis does not drive our results, we replicate our analysis separately for 2013–2014 and reach similar conclusions.

Our results indicate that researchers are not sacrificing explanatory power by using a simple sum of indicators that reflect key governance policies when controlling for governance in models of funding, reporting quality, and compensation. This evidence is consistent with simple indices reflecting predictive and convergent validity. Simple indices are also easy to replicate consistently over time and across organizations. Our results and recommendation can be used to assist researchers in developing empirical models that address a wide range of nonprofit research questions. While the governance indices examined in this paper perform similarly to one another, we recommend Governance Index5 when controlling for governance in nonprofit research.

In Section II, we discuss governance reporting on the Form 990 and briefly review research examining the effects of nonprofit governance. Following that, in Section III, we create composite measures for the multidimensional construct of governance using four different approaches. In Section IV, we describe our empirical methodology and present our results. In Section V, we note some caveats and discuss the implications of our findings for future research.

## II. BACKGROUND ON NONPROFIT GOVERNANCE

Nonprofit corporate governance refers broadly to the set of internal and external mechanisms designed to ensure that managers are working to advance their organization's charitable mission and meet their fiduciary responsibilities. Effective governance reduces the likelihood of misuse of charitable assets and better aligns nonprofit managers' personal objectives with those of their organization and the public they serve.

Nonprofit organizations are established to fulfill a charitable purpose and, unlike for-profit organizations, have no shareholders for board members and managers to appease. In the absence of owners, regulatory authorities play a key role in nonprofit accountability. Specifically, the IRS requires most nonprofit organizations above a specified size to file a Form 990.4 Because organizations are required to make these forms publicly available, the Form 990 serves as the primary source of information on a nonprofit organization's activities and financial metrics.

In 2008, the IRS updated the Form 990 to include a comprehensive list of governance questions. Nonprofit organizations now report on a range of governance mechanisms, such as board independence, conflict of interest policies, the existence of an audit committee, board practices for reviewing the Form 990, and executive compensation policies. It is important to note that the IRS does not mandate the adoption of any governance policies. Instead, the IRS utilizes a disclosure approach whereby organizations must indicate whether they have adopted a given policy. Evidence suggests that, in response to these required disclosures, nonprofit organizations have more frequently adopted some but not all of the governance policies on the form (Boland, Hogan, and Johnson 2018). The nonprofit sector provides an interesting contrast to the for-profit sector, where the Securities and Exchange Commission and stock exchanges mandate the adoption of specific governance policies. Because nonprofit organizations differ in their adoption of policies, there is cross-sectional variability in governance, which increases the importance of controlling for governance in nonprofit research.

Other governance mechanisms exist outside those reported in the Form 990. For example, BoardSource (2016), a leading provider of support and training for nonprofit executives, emphasizes organizational culture and recommends several policies beyond those reported on the Form 990, such as new member orientation, board term limits and attendance policies, as well as diversity and inclusion initiatives. We examine the Form 990 policies because they are available for most nonprofit organizations in a machine-readable format, whereas other governance mechanisms are challenging to observe. It is worth noting that the Form 990 policies do represent the most significant suggestions from both BoardSource (2016) and the Independent Sector (2016).

The academic literature on nonprofit governance has grown in recent years, likely in part due to the availability of the Form 990 governance data. Prior research provides evidence that the adoption of governance policies is associated with more donations, more accurate financial reporting, improved pay-for-performance, higher reported mission-related spending, and fewer instances of asset misappropriation (Callen et al. 2003; Kitching 2009; Yetman and Yetman 2012; Saxton et al. 2014; Harris et al. 2015; Desai and Yetman 2015; Newton 2015; Harris et al. 2017b). Overall, this literature suggests that governance influences many different aspects of a nonprofit organization. Researchers must consider whether it is necessary to include a measure of governance in their empirical specifications to control for this influence, even if the research question does not specifically relate to governance.

Prior research operationalizes governance using a variety of empirical measures, depending on the specific research question. For example, Kitching (2009) and Aggarwal, Evans, and Nanda (2012) include one specific indicator of governance; Yetman and Yetman (2012) and Harris et al. (2017b) include multiple indicators simultaneously; Harris et al. (2015) develop governance measures using a factor analysis; Saxton et al. (2014) develop an index using a simple summation of components; and Newton (2015) creates an index weighted by the components' variability.5 While each of these measurement approaches is appropriate for the given research question and research design, no standard method for measuring governance has emerged. Researchers seeking to control for governance in their studies may appreciate guidance on the effectiveness of the various approaches, and thus, in the next section, we build on the prior research to develop and compare several measurement approaches to control for governance.

## III. DATA AND CONSTRUCTION OF GOVERNANCE MEASURES

### Sample Selection

Table 1, Panels A–C, provides information on our sample selection process. We obtain governance disclosures from the Form 990 for all organizations in the annual IRS Statistics of Income (SOI) files from 2008 to 2012.6 We use SOI data because they are the most common source in nonprofit research and, thus, our results provide the most relevance to a wide range of studies. We reduce our sample by the 511 observations that are missing necessary data for the models. Our final sample consists of 51,904 firm-year observations for 16,824 unique organizations.7 In terms of industry composition, 24 percent of our sample comes from human services, followed by 18 percent from hospitals, 14 percent from other health (excluding hospitals), 8 percent from universities, 14 percent from other education (excluding universities), 6 percent from arts, and the remaining 16 percent from various other industries.

TABLE 1

Sample Information

### Development of Governance Measures: Four Approaches

The governance systems in most nonprofit organizations are multifaceted, comprising a number of policies and procedures related to different aspects of monitoring and directing managerial decisions. Our objective is to develop and evaluate measures that can serve as proxies for these governance systems in nonprofit research that uses archival data. There is no universally accepted process for developing these governance measures. We create measures using four approaches to represent the information from the Form 990 disclosures. In the first approach, we use the raw governance variables directly from the Form 990. Under the remaining three approaches, we transform the raw data into composite measures. For every approach, we create our measures such that higher values reflect stronger governance. As discussed in Section IV, we run a “horse race” to determine which approach provides the best model fit; we have no a priori expectations that one approach will perform better than the others. All governance measures and the other variables used in our analysis are defined in Appendix A.

#### APPROACH 1: Raw Form 990 Governance Variables

We identify 17 governance policies reported on the Form 990, containing most of the items in Part VI, as well as audit information from Part XI. Except for board independence, the Form 990 provides a “yes” or “no” response to the question of whether an organization has adopted a given policy, which we convert into indicator variables increasing in the strength of governance. The indicator variable for board independence is set to 1 if the organization's board has at least five voting members and the majority is independent, which is based on the criteria defining a strong governing body by Charity Navigator (2019), a leading provider of charity ratings.8
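The coding rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, "majority" is read as strictly more than half of the voting members, and the `yes_is_stronger` flag is our own device for items where a "no" answer signals stronger governance (e.g., no outsourcing of management functions).

```python
def indep_board(voting_members: int, independent_members: int) -> int:
    """Return 1 if the board has at least five voting members and a
    majority of them are independent, else 0."""
    if voting_members < 5:
        return 0
    # Assumption: "majority" means strictly more than half.
    return 1 if 2 * independent_members > voting_members else 0


def policy_indicator(answer: str, yes_is_stronger: bool = True) -> int:
    """Convert a "yes"/"no" Form 990 response into a 0/1 indicator that
    increases in the strength of governance.

    yes_is_stronger=False handles items where "no" is the stronger
    answer (hypothetical flag, not part of the Form 990 itself).
    """
    said_yes = answer.strip().lower() == "yes"
    return int(said_yes == yes_is_stronger)
```

For instance, `indep_board(7, 4)` evaluates to 1, whereas a four-member board evaluates to 0 regardless of how many members are independent.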

Following prior literature that includes individual measures of governance (Kitching 2009; Aggarwal et al. 2012), as well as multiple indicators simultaneously (Yetman and Yetman 2012; Harris et al. 2017b), we measure governance first by including each of the 17 governance variables in separate empirical models and then by including them all simultaneously in one model. The benefit of using any one variable is that it is easy to implement, whereas the cost is that one variable may not fully represent the entire governance system. Including all 17 governance variables simultaneously captures different aspects of governance, but the loss in the degrees of freedom may make this approach impractical for some samples. Moreover, multicollinearity among the governance variables may be a concern.

Panel A of Table 2 provides descriptive statistics for these variables. On average, organizations are more likely than not to adopt a given governance policy, although there is variation in the rate of adoption. For example, 89 percent of firm-year observations report having a conflict of interest policy, while 77 percent of firm-year observations report having a policy to review the CEO's salary. Providing key information on the organization's website is the least common policy, with only 11 percent of firm-year observations doing so.

TABLE 2

Descriptive Statistics

Panels B and C of Table 2 provide the correlations between these 17 governance variables. Nearly all of the variables are significantly correlated with each other, and, except for correlations between various variables with No Relations and/or No Outsourcing, the significant correlations are all positive. We report some very high coefficients of correlation (e.g., the correlation between Conflict Policy and Officers Conflict is 0.825), suggesting that some of the governance variables provide similar information regarding governance.9 Our next approach takes advantage of the correlations between the variables to create more parsimonious measures of governance.

#### APPROACH 2: Factor Analysis

In our second approach to measuring governance, we follow Harris et al. (2015) and implement a factor analysis. This technique extracts the common variance in the observable Form 990 variables to identify governance factors, which presumably represent underlying governance dimensions with less error than the observable variables. We include the 17 variables in a principal component factor analysis with a promax rotation and report the results in Appendix B. From this analysis, we identify six governance factors with eigenvalues greater than 1, which together explain 86 percent of the variance in the raw data. We interpret and label the six factors by examining the underlying Form 990 governance variables that have a substantive association with them, which we evaluate as a factor loading greater than 0.4. The six factors are Policies Factor, Compensation Factor, Audit Factor, Minutes Factor, Transparency Factor, and Management Factor.10 When using this approach to control for governance, we include all factors concurrently in a model because, by construction, each represents a different governance dimension. Descriptive statistics are reported in Panel A of Table 2.

The benefit of this approach is that it is a methodologically sound way to aggregate information from a large number of variables. In particular, this approach takes advantage of correlations to identify underlying governance dimensions. The downside is that it is more challenging to implement; separate factors must be computed for each sample because it is not appropriate to apply factor loadings from one sample to another.

Additionally, there is debate among econometricians over whether binary variables, such as the raw governance variables, should be used in exploratory factor analysis because factor analysis assumes continuous, normally distributed input variables. The concern with using binary measures in factor analysis is that the factors will be based on the structural variables with similar distributions rather than similar attributes, which can reduce the meaningfulness of the factors. Despite this debate, exploratory factor analysis is commonly used with discrete measures because alternative techniques have limitations as well (Comrey and Levonian 1958; Percy 1976). The resulting factors from our factor analysis in Appendix B “make sense” in that the underlying governance variables load together in an intuitively reasonable way (e.g., different compensation policies load in the same factor; different audit practices load in the same factor), which alleviates the concerns regarding the use of binary variables. Moreover, in our study, factor analysis simply provides one approach to controlling for governance. It is valuable for researchers to know whether the governance factors are meaningful, and our results will speak to this. If this approach does not produce a plausible control for governance, then the factors will perform poorly relative to other approaches.

#### APPROACH 3: Weighted Governance Index

In our third approach, we create a weighted index in the same vein as Newton (2015), who uses a weighted governance index to examine the link between governance and pay-for-performance.11 Following Newton (2015), we sum the governance indicators but weight each by its standard deviation. Indicators with more variation, therefore, have more weight in the formation of the index. Specifically, we create All 990 Weighted Index, which is the sum of the 17 Form 990 variables, each weighted by its standard deviation. These weights (i.e., the standard deviations) are reported in the last column of Panel A, Table 2.

The benefit of a weighted index approach is that it takes advantage of the fact that the variability in responses to governance questions contains information. Questions that virtually all organizations answer the same way are less informative and, therefore, less useful to researchers. For example, 98 percent of organizations report that they keep minutes of board meetings and, thus, Minutes Gov has the lowest weight in the All 990 Weighted Index. Moreover, this approach results in a single governance measure rather than the multiple measures produced by factor analysis. The downside of this approach is that weighted indices are sample specific and are more complicated to compute than simpler indices.
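As a rough illustration of the weighting scheme, the sketch below computes each indicator's standard deviation across a sample and uses it as that indicator's weight. It is a minimal example rather than the procedure used in the paper: the column names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which variant is used.

```python
from statistics import stdev


def weighted_index(rows, names):
    """Return (weights, scores) for a standard-deviation-weighted index.

    rows  : list of dicts mapping indicator name -> 0/1 value
    names : indicator names to include (hypothetical labels)
    """
    # Assumption: sample standard deviation (n - 1 denominator).
    weights = {n: stdev([r[n] for r in rows]) for n in names}
    # Each observation's score is the weighted sum of its indicators.
    scores = [sum(weights[n] * r[n] for n in names) for r in rows]
    return weights, scores


# Toy sample: nearly every organization keeps minutes, so that
# indicator varies little and receives the smaller weight.
sample = [
    {"minutes_gov": 1, "own_web": 1},
    {"minutes_gov": 1, "own_web": 0},
    {"minutes_gov": 1, "own_web": 1},
    {"minutes_gov": 0, "own_web": 0},
]
weights, scores = weighted_index(sample, ["minutes_gov", "own_web"])
```

Because the weights are estimated from the data, they change whenever the sample changes, which is the sample-specificity cost noted above.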

#### APPROACH 4: Simple Governance Indices

Finally, we create indices by summing (i.e., equally weighting) various governance indicators similar to Gompers, Ishii, and Metrick (2003) and Saxton et al. (2014). Specifically, we develop six different indices using different combinations of the 17 Form 990 variables. While our approach to creating these indices is admittedly ad hoc, we aim to develop an easy-to-implement measure that reflects a range of governance policies.12

Governance Index17 is the sum of all 17 indicator variables. Governance Index12 takes into account that some of the 17 variables reflect similar information. Specifically, to compute Governance Index12, we remove governance variables that have a correlation greater than 0.65 with another governance variable. As seen in Panels B and C of Table 2, Officers Conflict and Enforce Conflict are highly correlated both with each other and with Conflict Policy; Doc Destruction is highly correlated with Whistleblower; Officer Salary is highly correlated with CEO Salary; and Audit Comm is highly correlated with Review or Audit. Thus, Governance Index12 includes all of the governance indicators except Officers Conflict, Enforce Conflict, Doc Destruction, Officer Salary, and Audit Comm because these five indicators likely reflect “repeat” information.13
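The correlation screen behind Governance Index12 can be illustrated as follows. The sketch only flags pairs of indicators whose Pearson correlation exceeds the 0.65 cutoff; deciding which member of a flagged pair to drop remains a judgment call, as in the text. The helper names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes both sequences vary; a constant column would make the
    denominator zero.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def high_corr_pairs(columns, threshold=0.65):
    """List (name_a, name_b, r) for every pair with r above threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if r > threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Toy data with hypothetical column names (not the paper's sample).
toy = {
    "conflict_policy":   [1, 1, 1, 0, 0, 1, 1, 0],
    "officers_conflict": [1, 1, 1, 0, 0, 1, 0, 0],
    "own_web":           [0, 1, 0, 0, 1, 0, 0, 1],
}
flagged = high_corr_pairs(toy)
```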

We base Governance Index7 on the results of our factor analysis described in APPROACH 2 and reported in Appendix B. Specifically, we create this index by selecting the governance variable with the highest loading for each of the six factors (e.g., Officers Conflict has the highest loading for the Policies Factor). One variable, Indep Board, does not substantively load with any factor, so we also include this variable in this index. Thus, Governance Index7 equals the sum of seven governance indicators: Officers Conflict, Officer Salary, Review or Audit, Minutes Gov, Own Web, Reachable, and Indep Board.

Finally, we develop Governance Index5, Index4, and Index3 to be succinct indices, that is, indices based on relatively few variables (five, four, and three, respectively). We select which governance indicators to include based on the results from stepwise regressions.14 Governance Index5 equals the sum of Audit Comm, Indep Board, CEO Salary, No Outsource, and Own Web. Governance Index4 equals the sum of Audit Comm, Indep Board, CEO Salary, and No Outsource. Governance Index3 equals the sum of Audit Comm, Indep Board, and CEO Salary. Descriptive statistics for these simple indices are provided in Table 2, Panel A.

The benefit of simple governance indices is that they are easy to implement across a variety of samples and reflect a range of governance policies. The costs include that all governance policies have an equal effect on the index and that the development of these indices is ad hoc rather than grounded in particular theory. In addition, we acknowledge that the development of these simple indices was based on examining the sample itself. We ensure that this approach does not drive our results by conducting an out-of-sample robustness test.
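To show how little is needed to implement the succinct indices, the following sketch sums the component indicators for a single firm-year. The component lists follow the definitions of Governance Index5, Index4, and Index3 given above; the snake_case keys and the function name are hypothetical stand-ins for the variables defined in Appendix A.

```python
# Component lists follow the text; the key spellings are hypothetical.
INDEX5 = ("audit_comm", "indep_board", "ceo_salary", "no_outsource", "own_web")
INDEX4 = INDEX5[:4]  # drops own_web
INDEX3 = INDEX5[:3]  # drops own_web and no_outsource


def simple_index(row, components):
    """Equal-weighted index: the plain sum of 0/1 indicators."""
    for name in components:
        if row[name] not in (0, 1):
            raise ValueError(f"{name} must be coded 0 or 1")
    return sum(row[name] for name in components)


# One hypothetical firm-year observation.
obs = {
    "audit_comm": 1,
    "indep_board": 1,
    "ceo_salary": 1,
    "no_outsource": 1,
    "own_web": 0,
}
index5 = simple_index(obs, INDEX5)
index4 = simple_index(obs, INDEX4)
index3 = simple_index(obs, INDEX3)
```

Unlike the weighted index, nothing here is estimated from the data, so the same code applies unchanged to any sample.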

## IV. EMPIRICAL METHODOLOGY AND RESULTS

### Evaluating Nonprofit Governance Measures

Our objective is to provide evidence on the effectiveness of different composite measures in controlling for governance, and, in this section, we discuss the methodology that we use to evaluate effectiveness. Specifically, we assess the performance of each governance measure by documenting how well it predicts commonly used response variables in five empirical models of interest in nonprofit research. The selected models are based on prior nonprofit research articles that have established a theoretical and empirical relationship between the response variable and governance; the models are charitable support (Donations), government support (Gov Grants), reporting quality (Zero Fundraising), executive compensation (Pay Ratio), and program efficiency (Prog Ratio).15 This study is methodological in nature, rather than hypothesis driven, and our relative assessment of the measures is based on how the data “speak.”

No one definitive way to assess the adequacy of a given empirical model exists, and in practice, researchers use a subjective combination of criteria (Griffiths, Hill, and Judge 1993). Griffiths et al. (1993) emphasize that logic and compatibility with prior expectations should play a prominent role in the selection of a variable. Signs on coefficients that are flipped from expectation are a common symptom of multicollinearity, which exists when two or more regressors in the same model are moderately or highly correlated. To that end, we examine whether the coefficient on each governance measure is statistically significant in the predicted direction and dismiss models where a coefficient on a governance measure is statistically significant in the opposite of the predicted direction.

We examine two frequently cited goodness of fit statistics that result from the estimations: adjusted R² and AIC. Adjusted R² reports the proportion of variation in the dependent variable explained by the regressor(s).16 While the adjusted R² penalizes the inclusion of additional regressors, some statisticians question whether the penalty is sufficient (Greene 2003). The AIC is an alternative measure of fit that places a premium on parsimony and reports a relative estimate of the amount of information lost for a given model.17 As such, a lower AIC represents a higher-quality model when comparing models. AIC values may be positive or negative; it is not the absolute size of the AIC for a given model that matters, but rather the relative AIC values over a set of models that is important (Burnham and Anderson 1998). This approach allows us to assess the relative explanatory power of each governance measure after controlling for other common drivers of the dependent variables; governance measures that result in lower AIC and higher adjusted R² values are considered better governance controls for the purposes of our analysis.
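The comparison logic can be made concrete with a short sketch. The formulas below are standard textbook forms and not necessarily the exact expressions used in the paper: adjusted R² is computed as 1 − (1 − R²)(n − 1)/(n − k − 1), and the AIC uses one common least-squares form, n·ln(RSS/n) + 2(k + 1), which omits an additive constant that is identical across models fitted to the same data. All function names and example figures are hypothetical.

```python
from math import log


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors
    (intercept not counted in k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def aic_least_squares(rss, n, k):
    """One common least-squares form of the AIC (an assumption here):
    n * ln(RSS / n) + 2 * (k + 1), where k + 1 counts the intercept."""
    return n * log(rss / n) + 2 * (k + 1)


def rank_by_aic(fits, n):
    """Sort model names from lowest (preferred) to highest AIC.

    fits maps a model name to (rss, k); every model must be estimated
    on the same n observations for the comparison to be meaningful.
    """
    return sorted(fits, key=lambda m: aic_least_squares(fits[m][0], n, fits[m][1]))


# Hypothetical fits: adding many regressors barely lowers the RSS,
# so the parsimonious model is ranked first.
example = {"index5": (90.0, 7), "all_17_raw": (89.9, 23)}
ranking = rank_by_aic(example, n=1000)
```

Only differences in AIC across models are informative, which is why dropping the shared constant does not affect the ranking.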

### Models for Response Variables

We evaluate the governance measures by including each of them in five different models, representing different settings of interest to nonprofit researchers. Panel D of Table 2 reports the descriptive statistics for each of the five response variables, as well as the control variables.

#### Charitable Support Model: Donations

We first examine charitable support using the standard donations model developed by Weisbrod and Dominguez (1986) and commonly adopted by subsequent researchers:
$$\tag{1} \ln \mathit{Donations}_{i,t+1} = \beta_0 + \beta_1 \mathit{Governance}_{i,t} + \beta_2 \ln \mathit{Assets}_{i,t} + \beta_3 \mathit{Prog\,Ratio}_{i,t} + \beta_4 \ln \mathit{Fundraising}_{i,t} + \beta_5 \ln \mathit{Program\,Rev}_{i,t} + \beta_6 \ln \mathit{Gov\,Grants}_{i,t} + \sum \gamma_i \mathit{Industry}_i + \sum \delta_i \mathit{Year}_t + \varepsilon_i$$
where Donations represents contributions received from individuals, corporations, and private foundations. Governance represents the different measures detailed previously in this paper and outlined in Panel B of Appendix A. Consistent with prior research, we measure the donor response in year t+1 to governance information reported in year t. We expect β1 to be significantly positive, because prior research indicates that donors reward nonprofit organizations with better governance (Harris et al. 2015).

We include standard controls following prior literature (Weisbrod and Dominguez 1986; Petrovits, Shakespeare, and Shih 2011). Assets controls for scale effects. Prog Ratio is the ratio of program expenses to total expenses, is intended to control for the efficiency with which an organization supports its mission, and is expected to be positively associated with donations.18 Fundraising is the reported dollar amount of fundraising expenses, is included as a proxy for fundraising effort, and is expected to be positively associated with donations. Program service revenue (Program Rev) and government grants (Gov Grants) represent other sources of revenue and control for any crowding-out or crowding-in effects. We also include industry and year controls.

We estimate model (1), as well as all subsequent models in this study, using robust regression with standard errors clustered by organization.
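The timing convention in model (1), in which governance and controls observed in year t are paired with donations in year t+1, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the `(org_id, year)` keying, and the decision to drop organization-years with missing or non-positive next-year donations are assumptions, not details taken from the paper.

```python
import math
from typing import Dict, Tuple

OrgYear = Tuple[str, int]  # hypothetical key: (organization id, fiscal year)


def lead_log_donations(donations: Dict[OrgYear, float]) -> Dict[OrgYear, float]:
    """Map each (org, t) to ln(donations) of the same organization in t + 1."""
    paired = {}
    for org_id, year in donations:
        next_year_value = donations.get((org_id, year + 1))
        # Assumption: skip organization-years with no usable t + 1 observation.
        if next_year_value is not None and next_year_value > 0:
            paired[(org_id, year)] = math.log(next_year_value)
    return paired
```

The returned values would serve as the dependent variable for the year-t regressors; the final year of each organization's series drops out because no t+1 observation exists.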

#### Government Support Model: Government Grants

Next, we evaluate our governance measures using a government grants model. Specifically, we estimate the model developed by Petrovits et al. (2011):
$$\tag{2}{\ln Gov\,Grant{s_{i,t + 1}} = {\beta _0} + {\beta _1}Governanc{e_{i,t}} + {\beta _2}\ln Asset{s_{i,t}} + {\beta _3}Prog\,Rati{o_{i,t}} + {\beta _4}\ln Fundraisin{g_{i,t}} + {\beta _5}\ln Program\,Re{v_{i,t}} + {\beta _6}\ln Donation{s_{i,t}} + {\beta _7}GD{P_t} + {\beta _8}Lobb{y_{i,t}} + \sum {{\gamma _i}Industr{y_i}} + \sum {{\delta _i}Stat{e_i} + {\varepsilon _i}} }$$
where Gov Grants represents contributions from local, state, and federal government agencies. Similar to donations, we expect β1 to be positive if government grantors provide more funding to organizations with better governance. As in model (1), Assets controls for scale effects, and Prog Ratio controls for efficiency. Because prior research finds that nonprofits reduce fundraising when government funding is received, we predict a negative coefficient on Fundraising (Andreoni and Payne 2003; Petrovits et al. 2011). Program Rev and Donations control for any crowding-out or crowding-in effects from other revenue sources. GDP is annual gross domestic product, is intended to control for economic conditions, and may positively affect the supply and/or negatively affect the demand for grants. Lobby is an indicator variable that captures whether the organization is politically savvy and is expected to have a positive effect on government grants (Harris, Leece, and Neely 2017a). We use state indicator variables to serve as proxies for demand for government funding. Because we include GDP on an annual basis, we cannot concurrently include a year control.

#### Reporting Quality Model: Zero Fundraising

In our third analysis, we examine our governance measures using a reporting quality model based on Krishnan, M. Yetman, and R. Yetman (2006):
$$\tag{3}{Zero\,Fundraisin{g_{i,t}} = {\beta _0} + {\beta _1}Governanc{e_{i,t}} + {\beta _2}\ln Asset{s_{i,t}} + {\beta _3}\ln Donations\,Intensit{y_{i,t}} + \sum {{\gamma _i}} Industr{y_i} + \sum {{\delta _i}} Yea{r_t} + {\varepsilon _i}}$$

For reporting quality, we examine Zero Fundraising, which is equal to 1 if reported fundraising expenses are equal to 0 when reported donations are greater than \$10,000, and 0 otherwise. Zero Fundraising represents instances when an organization forgoes reporting fundraising expense when it is likely such costs have been incurred. Consistent with previous studies, we measure governance in the same year as reporting quality. Prior research provides evidence that strong governance is negatively associated with expense misreporting (Yetman and Yetman 2012). Thus, we expect β1 to be negative. We also include Donations Intensity, measured as the ratio of total donations to total revenue, to control for the organization's reliance on donations as opposed to service and investment revenues. Similar to the previous models, we also include controls for size, industry, and year.
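The two variable definitions above translate directly into code. The following is a minimal sketch; the function and argument names are illustrative assumptions, and the guard against non-positive revenue is added here for safety rather than described in the text.

```python
def zero_fundraising(fundraising_expense: float, donations: float,
                     threshold: float = 10_000.0) -> int:
    """Return 1 if zero fundraising expense is reported while donations exceed the threshold."""
    return int(fundraising_expense == 0 and donations > threshold)


def donations_intensity(donations: float, total_revenue: float) -> float:
    """Ratio of total donations to total revenue."""
    if total_revenue <= 0:
        # Assumption: the ratio is treated as undefined without positive revenue.
        raise ValueError("total_revenue must be positive")
    return donations / total_revenue
```

Note that the threshold comparison is strict, matching "greater than \$10,000" in the definition.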

#### Executive Compensation Model: Pay Ratio

Next, we assess our governance measures using a model of executive compensation:
$$\tag{4}{\ln Pay\,Rati{o_{i,t}} = {\beta _0} + {\beta _1}Governanc{e_{i,t}} + {\beta _2}\ln Asset{s_{i,t}} + {\beta _3}Prog\,Rati{o_{i,t}} + \sum {{\delta _i}} Yea{r_t} + {\varepsilon _i}}$$

We measure Pay Ratio as the ratio of CEO compensation to average non-CEO employee pay, adjusted for the industry mean. We measure governance and pay contemporaneously. We expect that good governance curbs abnormally high executive compensation, consistent with Desai and Yetman (2015) and Newton (2015), and therefore expect β1 to be negative. Unlike for donations, there is no standard model of nonprofit compensation. We include size and program efficiency, which serve as proxies for organizational performance, as other determinants of pay.

#### Program Efficiency Model: Program Ratio

The last response variable we examine is program efficiency. Callen et al. (2003) report that the presence of major donors on an organization's board is positively associated with the relative amount of program spending, suggesting better governance increases the amount of spending on mission-related activities. In the same vein, Desai and Yetman (2015) use state laws and enforcement as a proxy for good governance and find that governance is positively associated with the reported program ratio. Based on these prior studies, we use a program efficiency model as the fifth and final way to assess our governance measures:
$$\tag{5}{Prog\,Rati{o_{i,t}} = {\beta _0} + {\beta _1}Governanc{e_{i,t}} + {\beta _2}\ln Asset{s_{i,t}} + {\beta _3}Zero\,Fundraisin{g_{i,t}} + {\beta _4}Zero\,Admi{n_{i,t}} + \sum {{\gamma _i}Industr{y_i}} + \sum {{\delta _i}} Yea{r_t} + {\varepsilon _i}}$$

As in the previous models, Prog Ratio is the ratio of program expenses to total expenses. We expect β1 to be positive if better governance results in more mission-related spending. We control for possible misreporting of the program ratio by including indicator variables for organizations that report zero fundraising expense or zero administrative expense. We also include controls for size, industry, and year.

To the best of our knowledge, no research has documented an association between the governance policies reported in Part VI of the Form 990 and the program ratio. This is notable because Callen et al. (2003) and Desai and Yetman (2015) study sample periods before 2008 (i.e., before Form 990 changed). Over the past decade, thinking about the nonprofit program ratio has evolved such that the nonprofit watchdogs do not advocate that organizations minimize overhead but rather that organizations spend an optimal amount on overhead to support their missions (Gregory and Howard 2009; Taylor, Harold, and Berger 2013; Mitchell and Calabrese 2018). In fact, the IRS intentionally redesigned the Form 990 in 2008 to include narrative disclosures on key programs and outcomes at the beginning of the form (IRS 2008). As a result, the existence of a linear relationship between governance and the program ratio in our sample period has not been established in the same way that the relationships between governance and the other response variables have been established by prior research.

### Main Results

The main results for the five models are presented in Table 3, Panels A through E respectively. For each of the five models, we use the governance measures from each of the four approaches previously described. For the first approach, we include all 17 variables simultaneously, as reported in the first column of each panel. We then estimate the models including each of the 17 governance variables separately. For space considerations, we opt only to report the best individual measure in the second column rather than all 17 individual measures. Results for the second approach (i.e., the governance factors), the third approach (i.e., the weighted governance index), and the fourth approach (the simple indices created by summing specified governance variables) are reported in the subsequent columns in each panel. In each panel, we report the AIC and its likelihood (which measures the likelihood of the model being the best model within the set of alternative models based on the AIC), as well as the adjusted R2. Values in bold indicate the top three governance measures with respect to fit (after excluding models where symptoms of multicollinearity are present) for each panel. It is also worth noting that the signs on the control variables in all of our models are consistent with theoretical predictions (untabulated).
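As a rough illustration of the likelihood metric, the sketch below uses the standard information-theoretic relative likelihood, exp((AIC_min − AIC_i)/2), described by Burnham and Anderson (1998). Whether the tabulated values are computed in exactly this way is an assumption, and the function name is hypothetical.

```python
import math
from typing import Dict


def aic_relative_likelihoods(aic_by_model: Dict[str, float]) -> Dict[str, float]:
    """Relative likelihood of each model versus the lowest-AIC model in the set."""
    if not aic_by_model:
        raise ValueError("need at least one model")
    best = min(aic_by_model.values())
    # exp(-delta / 2), where delta is the AIC gap to the best model;
    # the best model itself receives a value of exactly 1.0.
    return {name: math.exp((best - aic) / 2.0) for name, aic in aic_by_model.items()}
```

Under this formula, a model whose AIC exceeds the minimum by only a small amount has a relative likelihood close to 1, which is how a model can be described as nearly as likely as the best model in the set.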

TABLE 3

Comparing Governance Measures

The results for model (1) are reported in Panel A. Theory predicts and prior research finds a positive effect of governance on donations. Contrary to this prediction, in our simultaneous model we report negative coefficients on Conflict Policy, Doc Destruction, Officer Salary, No Relations, Minutes Gov, Review 990, and Review or Audit. These coefficients are suggestive of multicollinearity, which exists when regressors in the same model are highly correlated as we observe in Panel B of Table 2, and thus we do not consider including all variables simultaneously to be an effective control for governance. However, in line with predictions, the best individual measure (Indep Board), as well as each of the composite governance measures (i.e., those in APPROACHES 2–4), have a positive and significant association with donations.

A comparison of the AIC and adjusted R2 across APPROACHES 2–4 indicates that the best models with respect to fit are those that use Governance Index5 and Governance Index4. In terms of the AIC fit metric, all of the models are relatively close as shown by the likelihood metric. The Governance Index4 model is 99.8 percent as likely to be the best model within the set and has an adjusted R2 of 39.00 percent, compared to an adjusted R2 of 39.20 percent for Governance Index5.

Overall, the evidence in Panel A shows that the composite measures from APPROACHES 2, 3, and 4 do not suffer from multicollinearity and yield fit statistics close to the best model.19 These results indicate it is appropriate for researchers to use a composite measure as a control for governance in multivariate models of donations. Within the composite measures, Governance Index5 (reflecting the existence of an audit committee, majority independent board, no outsourcing, CEO salary review, and information available on the organization's website) serves as the best control among the alternatives we explore.

Next, we assess our governance measures using model (2) and report the results in Panel B. Again, the approach of including all Form 990 governance variables simultaneously appears to suffer from multicollinearity, as five of the governance variables have negative coefficients when we expect government grants to increase with good governance. We do find a positive and significant coefficient for the best individual measure, CEO Salary, consistent with our predictions. For APPROACH 2, five factors have significantly positive coefficients, while one factor has an insignificant coefficient.20 In APPROACHES 3 and 4, all of the governance indices have significantly positive coefficients as predicted. Examining the fit statistics for APPROACHES 2, 3, and 4, we find that they generate fit statistics that are very close.

The best model as defined by a low AIC and high adjusted R2 uses Governance Factors, followed very closely by the model that uses Governance Index5. All of our composite models are 99.8 percent or more as likely to be the best model within the set of models we analyze, and the adjusted R2s range from 21.02 percent to 21.26 percent. Overall, Panel B suggests that simple indices, such as Governance Index5, produce similar results to the other governance measures in our analysis of government grants and may be preferable to researchers over measures that are more challenging to replicate, such as the governance factors.

Panel C reports the results from model (3), which resemble the results in Panels A and B. In this case, we expect the coefficient on governance to be negative, as good governance likely results in less misreporting. Again, including all variables at the same time results in symptoms of multicollinearity, so we dismiss that approach. For the remaining approaches, coefficients on the best individual measure, as well as all of the composite indices, are as predicted and fit statistics are close across all of the models. Again, the performance of models with simple governance indices, such as Governance Index5, is comparable to that of models with more complex governance measures.

Panel D presents the results of tests of executive compensation from model (4). If good governance curbs excessive compensation, we predict the coefficients on the governance measures will be negative. As before, including all Form 990 variables simultaneously does not appear to be a suitable approach due to multicollinearity concerns. The fit statistics across the remaining models are within a very narrow range. Governance Index12 and Governance Index3 perform the best among the remaining models, but all of our composite models are 99.9 percent or more as likely to be the best model within the set of models we analyze.

Finally, Panel E examines the governance measures in the context of program efficiency. We predict organizations with better governance will report higher program ratios. Overall, our results provide mixed evidence. The coefficients on the governance measures are not always significantly positive (and, in fact, the coefficient on Management Factor is significantly negative). The best individual measure, Own Web, performs well. While Callen et al. (2003) and Desai and Yetman (2015) document a positive relationship between reported program ratios and governance, these studies examined a different sample period. As noted previously, in the last decade, there has been less of a myopic focus on minimizing overhead spending. Perhaps good governance is associated with optimizing the program ratio rather than maximizing it. To shed more light on this possibility, in a later section, we examine the link between governance and the program ratio in more recent years, specifically 2013–2014.

### Robustness Tests

We test the robustness of our donations and government grants models by adding two additional control variables occasionally included in these models in prior research. Specifically, we estimate the following specifications:
$$\tag{6}\ln Donation{s_{i,t + 1}} = {\beta _0} + {\beta _1}Governanc{e_{i,t}} + {\beta _2}\ln Asset{s_{i,t}} + {\beta _3}Prog\,Rati{o_{i,t}} + {\beta _4}\ln Fundraisin{g_{i,t}} + {\beta _5}\ln Program\,Re{v_{i,t}} + {\beta _6}\ln Gov\,Grant{s_{i,t}} + {\beta _7}\ln Ag{e_{i,t}} + {\beta _8}\ln Donation{s_{i,t}} + \sum {{\gamma _i}Industr{y_i}} + \sum {{\delta _i}Yea{r_t}} + {\varepsilon _i}$$
$$\tag{7}\ln Gov\,Grant{s_{i,t + 1}} = {\beta _0} + {\beta _1}Governanc{e_{i,t}} + {\beta _2}\ln Asset{s_{i,t}} + {\beta _3}Prog\,Rati{o_{i,t}} + {\beta _4}\ln Fundraisin{g_{i,t}} + {\beta _5}\ln Program\,Re{v_{i,t}} + {\beta _6}\ln Donation{s_{i,t}} + {\beta _7}GD{P_t} + {\beta _8}Lobb{y_{i,t}} + {\beta _9}\ln Ag{e_{i,t}} + {\beta _{10}}Gov\,Grant{s_{i,t}} + \sum {{\gamma _i}Industr{y_i}} + \sum {{\delta _i}} Stat{e_i} + {\varepsilon _i}$$

The first additional variable is organizational age. Age serves as a proxy for the reputation of the organization (Trussel and Parsons 2007) and equals the number of years since an organization received tax-exempt status. We do not include age in our main analyses given that prior literature finds it is highly correlated with organization size (Petrovits et al. 2011). The second additional control variable is the lagged dependent variable. Harris et al. (2015) include lagged donations to aid in ruling out concerns of correlated omitted variables given that lagged donations control for time-invariant organizational characteristics, such as fundraising aptitude.

Results from estimating models (6) and (7) are presented in Table 4.21 For donations in Panel A, we continue to find that the three most predictive models include the governance factors, Governance Index5, and Governance Index4. We do report small differences between these results and the main results in Table 3—namely, the best individual measure is now CEO Salary, and, in our governance factors model, we lose significance on the Audit and Minutes Factors. Overall, Panel A continues to suggest that Governance Index5 is a suitable approach to control for governance in donations models. In addition, it is worth noting that the coefficients on all of the governance indices are still significantly positive. In other words, while governance is sticky over time for an organization, lagged donations do not fully control for governance.

TABLE 4

Comparing Governance Measures with Additional Control Variables

With respect to our government grants model in Panel B of Table 4, Governance Index5 has the lowest AIC. Almost all of the models have the same adjusted R2. We report significantly positive coefficients on all of the governance indices after including the lagged dependent variable, indicating that governance is a valuable control. Again, we observe inconsequential differences between Table 3 and Table 4 Panel B—Officer Salary is now the best individual measure and, in our governance factors model, we lose significance on the Audit, Transparency, and Management Factors. Nevertheless, our main inference from Table 3, Panel B (i.e., Governance Index5 works well as a control) remains unchanged.

#### Out-of-Sample Data

Thus far, our empirical analysis indicates that simple indices control for governance in a similar manner to more complex measures for donations, government grants, reporting quality, and executive compensation. It is important to note that the construction of the simple indices is motivated by a within-sample analysis of the data. In other words, in order to narrow down the number of Form 990 governance variables to develop more parsimonious measures of nonprofit governance, we examine the correlations between the 17 variables (to create Governance Index12 and Governance Index7), as well as the associations between the 17 variables and total contributions in a stepwise regression (to create Governance Index5, Governance Index4, and Governance Index3). To help ensure that the within-sample analysis does not drive our results, we replicate our analysis for a period outside our sample window.

We obtain governance disclosures from the Form 990 for all organizations in the annual SOI files in 2013 and 2014 and apply the same selection criteria as our primary sample.22 This sample has an industry composition similar to our primary sample. We replicate our analysis by re-estimating models (1) through (5) on the 2013–2014 sample.

For space considerations, we do not report the coefficients for every model but instead simply report the resulting fit statistics in Table 5 (excluding the model where all variables are included simultaneously because of multicollinearity concerns). Untabulated results indicate that the coefficients on our governance indices are significant in the predicted direction for the Donations, Gov Grants, Zero Fundraising, and Pay Ratio models. However, the untabulated results provide no evidence of a positive relationship between governance and Prog Ratio in the more recent sample period. Collectively, the evidence from Table 3 and Table 5 indicates that further work needs to be done establishing a causal link between the Form 990 governance policies and program efficiency before we can recommend any particular governance measure as a control in a model attempting to explain drivers of mission-related spending. As a result, we do not conduct any further analysis of the program efficiency model.

TABLE 5

Robustness Tests Using Out-of-Sample Data

With respect to the fit statistics for models (1) through (4), the key takeaways from Table 5 are consistent with those from Tables 3 and 4. The simple indices perform well, notably Governance Index5 in the donations, government grants, and reporting quality models. Governance Index3 and CEO Salary by itself are best out of the simple approaches for the compensation model in this sample period, but Governance Index4 and Governance Index5 also appear to be reasonable alternatives. In summary, the out-of-sample tests provide further evidence that, among the measures we study, simple indices are sufficient controls for governance in models of nonprofit public support, executive compensation, and reporting quality.

#### Model-Specific Governance Measures

The evidence in Table 3 suggests that simple composite indices perform well relative to more complex governance measures when modeling donations, government grants, reporting quality, and executive compensation. All of our measures, however, capture several aspects of governance broadly. It is possible that specific Form 990 items relate more directly to financial reporting and executive pay, and a composite index of these specific items may perform better as a control than the composite measures in Table 3.23 We address this possibility in Table 6.

TABLE 6

Robustness Tests Examining Alternative Governance Measures

Table 6, Panel A reports on specific Form 990 items pertaining to reporting quality (measured by Zero Fundraising). The first column reports the results from controlling for governance with Governance Index5, which is the best simple index from Table 3, Panel C and serves as a benchmark here. Yetman and Yetman (2012) report that five governance items (Review 990, Review or Audit, Audit Comm, No Outsource, and Indep Board) reduce the likelihood of expense misreporting, so we focus on these items. In the second column, we include these five variables simultaneously and, in the third column, we include Audit Index, which is the sum of these five variables. Finally, in the fourth column, we include just Audit Factor. The results in Panel A indicate that Governance Index5 produces the lowest AIC and highest ROC values. In other words, Governance Index5, which represents a wider set of policies, performs better than the more specific measures, although there are not large differences across the measures.

We next examine the Pay Ratio in Panel B of Table 6. In this case, Governance Index12 is the simple index that performed best in Table 3 and serves as the benchmark. We then specifically examine the review and approval of CEO and other key personnel compensation by including CEO Salary and Officer Salary simultaneously (second column) and as an index (third column). In the last column, we include just the Salary Factor from the factor analysis. Including Governance Index12 results in the lowest AIC and highest adjusted R2. In fact, all of the composite measures from Table 3, Panel D perform better than the specific measures in this panel. Once again, the results suggest that a simple index representing a broader set of governance policies can serve as an effective control in nonprofit compensation models.

We do not include board size, significant changes to organizational documents, and asset diversions in the development of our main governance measures for reasons discussed in footnote 8. These three items are disclosed in Part VI of the Form 990, but their role in governance is less clear than the policies we use in the main analysis. In an untabulated robustness test, we replicate the analyses from Table 3, Panels A–D but also include variables for board size, document changes, and asset diversions.24 When examining the charitable support, government support, and executive compensation models, our inferences do not change with the inclusion of these three variables. Nor does the inclusion of these three variables result in any improvement in fit statistics. When we include the three additional variables in the reporting quality model, our second approach, which uses factor analysis, yields the lowest AIC and highest ROC values. However, these new factors are only a slight improvement over the simple indices previously discussed.25 Overall, the results from this analysis suggest that the 17 Form 990 items that we consider in our main analysis are adequate.

### Evaluation Criteria and Recommendation

Overall, we find that relatively simple governance indices (i.e., those from APPROACH 4) perform as well as, or better than, more complex measures. An ideal empirical measure of nonprofit governance exhibits validity, consistency, replicability, and comprehensiveness. With respect to construct validity, three important sub-dimensions are substantive validity, convergent validity, and predictive validity. Substantive validity refers to a theoretical link between the unobservable construct of nonprofit governance and the information contained in the simple indices. Because the simple indices represent governance policies (1) identified by the IRS as important, and (2) recommended by nonprofit governance practitioners such as BoardSource, we believe the simple indices reflect substantive validity. Convergent validity refers to the extent to which our various composite indices correlate with each other. In Table 7, we report that the correlations between our indices are quite high and statistically significant. Thus, the simple indices reflect convergent validity within the Form 990 governance information. Finally, predictive validity refers to the extent to which a simple governance index predicts or co-varies with other variables that the governance index is expected to predict. We present strong evidence of associations between the governance indices and Donations, Gov Grants, Zero Fundraising, and Pay Ratio as expected, which is indicative of predictive validity.26

TABLE 7

Convergent Validity within Form 990 Measures

Furthermore, these indices can be measured consistently across time and organizations. In addition, the simple governance indices are relatively easy for researchers to replicate using free, machine-readable IRS SOI data. Finally, as Table 7 suggests, even the indices with few components (e.g., Governance Index5, Governance Index4, and Governance Index3) appear to exhibit comprehensiveness as they are highly correlated with both All 990 Weighted Index and Governance Index17.

We acknowledge that there is very little difference across the measures of Form 990 governance in terms of explanatory power in the models considered in this study. Considering our results as a whole, we recommend Governance Index5, the simple sum of five binary indicators for audit committee, majority independent board, no outsourcing, CEO salary review, and information available on the organization's website.
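A minimal sketch of how such an index could be computed from the five indicators follows. The dictionary keys reuse the variable labels that appear in the text (Audit Comm, Indep Board, No Outsource, CEO Salary, Own Web) and are assumed to correspond to the five policies listed above; the record layout, constant name, and function name are hypothetical, and the mapping from Form 990 line items to each indicator is not reproduced here.

```python
# Labels assumed to correspond to the five policies in the recommended index.
INDEX5_COMPONENTS = ("Audit Comm", "Indep Board", "No Outsource", "CEO Salary", "Own Web")


def governance_index5(indicators: dict) -> int:
    """Sum the five binary indicators; the result ranges from 0 to 5."""
    total = 0
    for name in INDEX5_COMPONENTS:
        value = indicators[name]  # raises KeyError if a component is missing
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {value!r}")
        total += int(value)
    return total
```

Because every component carries equal weight, the index needs no estimation step, which is what makes it straightforward to replicate across samples.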

## V. DISCUSSION AND CONCLUSION

We aim to shed light on measures that can be used by nonprofit researchers to control for corporate governance in the nonprofit sector. Ideally, such measures reflect all of the important dimensions of governance in a parsimonious form. Overall, our empirical analysis suggests that governance indices created as the sum of indicators of key policies have predictive value similar to that of more complex governance measures for some key nonprofit response variables. We acknowledge there are several important caveats.

First, Form 990 responses may be biased because nonprofit managers know the “correct” answer and check the box without actually implementing good governance. If so, our governance indices will suffer from measurement bias. Additionally, the Form 990 reports on internal governance mechanisms and does not capture external governance mechanisms such as state regulations and creditor monitoring. Our choice of governance policies to consider is driven by a practical issue: the availability of data. We opt to focus our analysis on policies disclosed on the Form 990 because all major nonprofit organizations in the United States disclose this information.

Second, our analysis is sample specific. We use the SOI data, which include a broad cross-section of nonprofit organizations. The period of our two samples covers 2009 through 2014, which includes the years immediately following the IRS change. It is possible that this sample period is not representative of governance structures in effect today. Thus, future research should document how nonprofit governance evolves in the long run. Relatedly, our analysis suggests a need for further research on the link between governance and program efficiency, particularly in the last decade as watchdogs move away from a myopic focus on higher program ratios and understand that an optimal program ratio may exist.

Third, this paper focuses on developing and recommending a measure that can be used to control for governance in different contexts. We do not recommend how to measure governance when governance is the key variable of interest. In these cases, the specific empirical measure should be driven by the nature of the research question, economic theory, and logic.

Bhagat, Bolton, and Romano (2008) note that, for public companies, there is no one “best” measure of governance. Similarly, we recognize that there is no one-size-fits-all definition of good governance in the nonprofit sector, but rather good governance depends on organization-specific characteristics and decision context. Interestingly, Bhagat and Bolton (2008) find that governance indices are not superior to a single governance variable (stock ownership by board members) in predicting accounting performance and CEO turnover in for-profit settings. More complex measures are not always more suitable.

With these caveats in mind, our results are consistent with the idea that an index computed as the simple sum of indicators for a few key policies is a valid control for governance in many cases. Such an index has three chief benefits. It embodies several different governance dimensions. It can be consistently computed across time and organizations. Finally, other researchers can easily replicate a simple index.

## REFERENCES

Aggarwal, R. K., M. E. Evans, and D. Nanda. 2012. Nonprofit boards: Size, performance and managerial incentives. Journal of Accounting and Economics 53 (1/2): 466–487.

Akaike, H. 1983. Information measures and model selection. Bulletin of the International Statistical Institute 50 (1): 277–291.

Andreoni, J., and A. A. Payne. 2003. Do government grants to private charities crowd out giving or fund-raising? The American Economic Review 93 (3): 792–812.

Balsam, S., E. E. Harris, and G. Saxton. 2020. The use and consequences of perquisites in nonprofit organizations. Journal of Accounting and Public Policy 39 (4): 106737.

Bhagat, S., and B. Bolton. 2008. Corporate governance and firm performance. Journal of Corporate Finance 14 (3): 257–273.

Bhagat, S., B. Bolton, and R. Romano. 2008. The promise and peril of corporate governance indices. Columbia Law Review 108 (8): 1803–1882.

BoardSource. 2016. Recommended governance practices.

Boland, C. M., C. E. Hogan, and M. F. Johnson. 2018. Motivating compliance: Firm response to mandatory existence disclosure policies. Accounting Horizons 32 (2): 103–119.

Burnham, K. P., and D. R. Anderson. 1998. Model Selection and Inference: A Practical Information-Theoretic Approach. New York, NY: Springer-Verlag.

Callen, J., A. Klein, and D. Tinkelman. 2003. Board composition, committees, and organizational efficiency: The case of nonprofits. Nonprofit and Voluntary Sector Quarterly 32 (4): 493–520.

Carroll, C. W. 1961. The created response surface technique for optimizing nonlinear, restrained systems. Operations Research 9 (2): 169–184.

Cattell, R. B. 1952. The three basic factor-analytic research designs: Their interrelations and derivatives. Psychological Bulletin 49 (5): 499–520.

Charity Navigator. 2019. How do we rate charities' accountability and transparency? Available at: https://www.charitynavigator.org/index.cfm?bay=content.view&cpid=1093

Comrey, A. L., and E. Levonian. 1958. A comparison of three point coefficients in factor analysis of MMPI. Educational and Psychological Measurement 18 (4): 739–755.

Desai, M. A., and R. J. Yetman. 2015. Constraining managers without owners: Governance of the not-for-profit enterprise. Journal of Governmental & Nonprofit Accounting 4 (1): 53–72.

Gompers, P., J. Ishii, and A. Metrick. 2003. Corporate governance and equity prices. The Quarterly Journal of Economics 118 (1): 107–156.

Greene, W. H. 2003. Econometric Analysis. 5th edition. Prentice Hall.

Gregory, A. G., and D. Howard. 2009. The nonprofit starvation cycle. Stanford Social Innovation Review 7 (4): 49–53.

Griffiths, W. E., R. C. Hill, and G. G. Judge. 1993. Learning and Practicing Econometrics. Hoboken, NJ: John Wiley & Sons.

Harris, E., R. Leece, and D. Neely. 2017a. Nonprofit lobby expense reporting. Journal of Public Budgeting, Accounting & Financial Management 29 (4): 522–554.

Harris, E., C. Petrovits, and M. H. Yetman. 2015. The effect of nonprofit governance on donations: Evidence from the revised Form 990. The Accounting Review 90 (2): 579–610.

Harris, E., C. Petrovits, and M. H. Yetman. 2017b. Why bad things happen to good organizations: The link between governance and asset diversions in public charities. 146 (1): 149–166.

Hosmer, D. W., and S. Lemeshow. 2000. Applied Logistic Regression. 2nd edition. New York, NY: Wiley-Interscience.

Independent Sector. 2016. Principles for Good Governance and Ethical Practice.

Internal Revenue Service (IRS). 2008. Background paper: Summary of Form 990 redesign process.

Kitching, K. 2009. Audit value and charitable organizations. Journal of Accounting and Public Policy 28 (6): 510–524.

Krishnan, R., M. H. Yetman, and R. J. Yetman. 2006. Expense misreporting in nonprofit organizations. The Accounting Review 81 (2): 399–420.

Mitchell, G. E., and T. D. Calabrese. 2018. Proverbs of nonprofit financial management. The American Review of Public Administration 49 (6): 649–661.

NCCS Data Archive. 2016.

Newton, A. N. 2015. Executive compensation, organizational performance, and governance quality in the absence of owners. Journal of Corporate Finance 30: 195–222.

Olson, D. E. 2000. Agency theory in the not-for-profit sector: Its role at independent colleges. Nonprofit and Voluntary Sector Quarterly 29 (2): 280–296.

Percy, L. 1976. An argument in support of ordinary factor analysis of dichotomous variables. 3 (1): 143–148.

Petrovits, C., C. Shakespeare, and A. Shih. 2011. The causes and consequences of internal control problems in nonprofit organizations. The Accounting Review 86 (1): 325–357.

Saxton, G. D., D. G. Neely, and C. Guo. 2014. Web disclosure and the market for charitable contributions. Journal of Accounting and Public Policy 33 (2): 127–144.

Taylor, A., J. Harold, and K. Berger. 2013.

Trussel, J. M., and L. M. Parsons. 2007. Financial reporting factors affecting donations to charitable organizations. 23: 263–285.

Weisbrod, B., and N. Dominguez. 1986. Demand for collective goods in private nonprofit markets, can fund-raising expenditures help overcome free-rider behavior? Journal of Public Economics 30 (1): 83–96.

White, H. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48 (4): 817–838.

Yetman, M. H., and R. J. Yetman. 2012. The effects of governance on the accuracy of charitable expenses reported by nonprofit organizations. Contemporary Accounting Research 29 (3): 738–767.

### APPENDIX A

Variable Definitions

### APPENDIX B

Appendix B discusses the use of binary variables in factor analysis and presents the factor loading results from conducting an exploratory factor analysis of the 17 Form 990 governance variables defined in Panel A of Appendix A.

#### Factor Analysis and Binary Variables

There is debate among researchers over whether it is appropriate to use binary structural measures in a traditional exploratory factor analysis. Exploratory factor analysis uses a Pearson correlation matrix, which assumes continuous, normally distributed variables. The primary issue under debate is the effect on the factor model when this assumption is violated through the use of binary measures. Opponents argue that, when binary structural measures are used in factor analysis, factors will be based on structural measures with similar distributions rather than similar attributes, thus reducing the meaningfulness of the factors. Alternative correlation methods include the tetrachoric (favored by Carroll [1961]) and phi-over-phi-max (favored by Cattell [1952]), which create normally distributed factors from binary and ordered-category variables. Comparing these three correlation methods (Pearson, tetrachoric, and phi-over-phi-max), Comrey and Levonian (1958) find that all three produce the same factors. This finding dispels the concern about using non-normally distributed, binary structural measures as inputs in exploratory factor analysis, and the authors conclude that Pearson correlations are the most reasonable choice.

Beyond the issue of structural measure distribution and choice of correlation matrix is the perceived “loss of information” from binary versus continuous measures. Percy (1976) addresses this issue by testing variations in response intervals from a seven-point scale to dichotomous variables, finding that the rotated factor loadings were consistent across all scale intervals. That is, factors were identical whether variables were defined using a continuous seven-point scale, a six-point scale, a five-point scale, or simple yes/no binary measures, alleviating the concern of information loss from dichotomous structural measures.

In sum, while questions have been raised regarding the suitability of using binary structural measures to perform factor analysis, it appears that the outcome of such analysis is not significantly different from employing continuous, normally distributed measures. Moreover, the resulting factors from our factor analysis “make sense” in that the underlying governance variables load together in an intuitively reasonable way (e.g., different compensation policies load in the same factor; different audit practices load in the same factor), which alleviates the concerns regarding the use of binary variables.
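As a small illustration of what a Pearson correlation matrix contains when the inputs are binary, the sketch below shows that the Pearson correlation of two 0/1 variables equals the phi coefficient computed from their 2×2 contingency table. The two vectors are made-up data, not Form 990 variables, and the sketch does not implement the tetrachoric or phi-over-phi-max alternatives.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def phi(x, y):
    """Phi coefficient from the 2x2 table of two 0/1 sequences."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return numerator / denominator


# Made-up binary policy indicators for eight hypothetical organizations.
policy_x = [1, 1, 1, 0, 0, 1, 0, 1]
policy_y = [1, 0, 1, 0, 0, 1, 1, 1]

print(round(pearson(policy_x, policy_y), 4))  # 0.4667
print(round(phi(policy_x, policy_y), 4))      # 0.4667
```

The two functions agree because, for dichotomous data, the phi coefficient is algebraically the Pearson correlation.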

#### Factor Analysis

Following Harris et al. (2015), we use principal component factor analysis with promax rotation and identify factors with eigenvalues greater than 1. The bold values represent the highest factor loading for each variable that is greater than 0.4. We assign a name to each factor that reflects the underlying Form 990 governance variables with the highest factor loadings.

1

As discussed later in the paper, we focus on controlling for governance and do not make any recommendations on the choice of a governance measure when governance is a primary variable of interest because, in such cases, the choice should be driven by the specific research question and motivating theory. For example, when examining the extent to which strong governance improves financial reporting quality, Yetman and Yetman (2012) focus only on specific governance constructs that would theoretically be expected to affect financial reporting quality (e.g., board review of the Form 990 and existence of an audit). However, as part of our analysis, we do compare broad composite measures of governance to specific governance policies relevant to a specific question and find that the broad, general governance indices perform as well as model-specific governance policies.

2

We discuss the construction of our measures in a Section III subsection, “Development of Governance Measures: Four Approaches.”

3

We do not consistently document a significantly positive association between governance and the program ratio, suggesting there is a need for further research examining the link between the relative amount of charitable spending and governance.

4

Houses of worship are exempt from filing the Form 990.

5

Specifically, Kitching (2009) uses an indicator variable signifying whether the organization uses a Big 5 audit firm, and Aggarwal et al. (2012) use board size. Yetman and Yetman (2012) simultaneously include over ten governance indicators (e.g., board independence, audit committee, board size). Harris et al. (2015, 2017b) use the set of governance policies and procedures from the revised Form 990, which are listed in Appendix A, Panel A of this paper. Saxton et al. (2014) measure transparency using the total number of financial and performance items voluntarily disclosed by the organization on its website. Finally, Newton (2015) creates a weighted index of 16 components most related to executive compensation (e.g., board independence, conflict of interest policy, executive reimbursement policy).

6

We begin our analyses in 2008 with the Form 990 redesign. We end our primary analysis in 2012 and use subsequent years with data available (2013–2014) for an out-of-sample robustness test.

7

The “final sample” refers to the observations used to examine empirically the public and government support models. To examine reporting quality, the sample size drops to 51,884 because of missing Form 990 data for variables in that model. To examine program ratio efficiency, we exclude observations with program ratios greater than one as they represent obvious errors in the data, resulting in a sample of 51,817. Finally, for executive compensation analysis, the sample size drops to 30,056 observations because the IRS does not require all organizations to report Schedule J compensation data.

8

There are four items in Part VI of the Form 990 that we do not include in our main analysis: board size, significant changes to organizational documents (i.e., articles of incorporation and bylaws), asset diversions, and contact information. These items are not, in and of themselves, governance policies. They lack substantive validity (i.e., a theoretical link between the Form 990 items and good governance). First, the effect of board size on governance quality is ambiguous. Larger boards are likely to possess necessary expertise, but very large boards may be detached and less effective monitors. For example, Aggarwal et al. (2012) document that nonprofit board size is inversely related to pay-for-performance incentives. Further, Harris et al. (2015) report a concave relation between donations and board size. Second, the effect of document changes is also ambiguous; changes in bylaws may increase or decrease governance depending on the change. Third, an asset diversion is not a governance policy but instead is an outcome of poor governance (Harris et al. 2017b). Additionally, document changes and asset diversions are nonrecurring events, not ongoing governance mechanisms. Finally, contact information is not a replicable variable as it is not included in SOI data. Harris et al. (2015) use a custom dataset and include contact information as part of their “kitchen sink” approach but, to our knowledge, no practitioners describe providing contact information as an important governance mechanism. In a robustness test, we extend our analysis to include the three items available in the SOI data (board size, document changes, and asset diversions) and find that our inferences are the same as in our main analysis.

9

We include all relevant governance variables from the Form 990 in our analysis in order to be complete and to shed light on how to operationalize the various governance variables. It is not necessarily surprising that Conflict Policy and Officers Conflict are highly correlated, as Officers Conflict is a sub-question under Conflict Policy on the Form 990.

10

Harris et al. (2015) include 21 variables in their factor analysis, resulting in seven governance factors. Our factors are similar but not identical because we include four fewer underlying Form 990 governance variables as discussed in footnote 8. Specifically, we exclude board size, document changes, asset diversions, and contact information in order to focus on key governance policies. Because we exclude board size, our analysis does not produce a board factor, which is the seventh factor in Harris et al. (2015). Relatedly, one of our governance variables—Indep Board—does not substantively load with any particular factor but it does contribute some information to each of the other factors, notably the Audit Factor and the Minutes Factor.

11

Because of the specific research question, Newton (2015) focuses on policies likely to influence compensation (e.g., executive reimbursement policies), and her sample includes only organizations that complete Schedule J of the Form 990. Because we are interested in creating a control measure that captures governance more broadly and is applicable to a wider range of nonprofit organizations, we use the more comprehensive set of governance disclosures from the main Form 990 in the calculation of our weighted index.

12

Over 130,000 different combinations of the 17 Form 990 variables exist, and we acknowledge that we do not examine every possible combination. Instead, we deliberately select approaches that create combinations whereby the resulting index likely reflects the most governance information.

13

Our inferences are robust to using different correlation cutoffs (from 0.45 to 0.75) to create a simple governance index in this way.

14

We estimate regressions of total contributions (i.e., Form 990, Part VIII, Line 1h) on the 17 governance variables using a stepwise method and limiting the number of predictors to five, four, and three, respectively. This approach identifies the variables that provide the best fit. We then create indices by summing the variables identified by the stepwise regressions. There is no theoretical reason why we use total contributions to develop these simple indices instead of other dependent variables of interest. Practically, total contributions includes two of the dependent variables we analyze (donations and government grants), and the empirical specification for the contributions model is well established and frequently used in the accounting and economics literature.

15

While there is a model linking asset diversions to governance (Harris et al. 2017b), we exclude the model in this paper because, from a practical standpoint, asset diversions are rare in the SOI data. In our sample, less than 1 percent (0.33 percent to be precise) of the observations reported a diversion.

16

For one model, we report the area under the ROC curve instead of adjusted R² because the dependent variable is binary.

17

An alternative to AIC is the Bayesian Information Criterion (BIC). Inferences using BIC in this study are equivalent and, thus, we report only AIC.

18

Some nonprofit donations models use “price” instead of the program ratio (e.g., Petrovits et al. 2011). In these models, “price” is simply defined as the inverse of the program ratio.

19

For APPROACHES 2–4 in Panel A of Table 3, we compute the variance inflation factors (VIFs) for every model. The largest VIF we observe is 1.62, which is well below the threshold for concern about multicollinearity. We further compute the VIFs for all panels of Table 3 and Table 4, APPROACHES 2–4, and likewise observe no evidence of multicollinearity.
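For readers unfamiliar with the diagnostic, the sketch below computes VIFs in plain Python. It relies on the standard result that the VIF of predictor j, 1/(1 − R_j²), equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The function names and the two made-up predictor columns are illustrative assumptions and do not reproduce the models in Table 3.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """Return one VIF per predictor column (diagonal of the inverse correlation matrix)."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(k)]


# Made-up predictor columns for eight hypothetical observations.
predictor_1 = [1, 1, 1, 0, 0, 1, 0, 1]
predictor_2 = [1, 0, 1, 0, 0, 1, 1, 1]
vifs = variance_inflation_factors([predictor_1, predictor_2])
print([round(v, 3) for v in vifs])
```

With only two predictors, both VIFs reduce to 1/(1 − r²), where r is their pairwise correlation, which provides a quick sanity check on the implementation.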

20

This result is consistent with the results in Harris et al. (2015), who also report an insignificant coefficient on the Minutes Factor in this model. The factors are by construction not correlated with each other and, thus, the insignificant coefficient is not due to multicollinearity but is likely due to the lack of a relationship between Minutes Factor and Gov Grants.

21

Note that our sample size drops to 47,334 firm-year observations because Age is unavailable for our full sample.

22

The sample size starts with 13,696 observations. For the charitable contributions, government grants, reporting quality, and program efficiency models, the sample size drops slightly to 12,742 because of missing Form 990 data for variables in those models. For the analysis of the executive compensation model, the sample size drops to 10,949 observations because these tests require Schedule J data, which are not required to be completed by all organizations.

23

Because Harris et al. (2015) provide evidence that donors and government grantors value many dimensions of governance, we do not identify specific governance policies for the charitable giving and government grant models.

24

Following a past recommendation from BoardSource (2016), we define good governance for board size as a board with between five and 22 members. Smaller boards may not have the appropriate expertise or manpower, and larger boards may be detached, ineffective monitors. Further, we define good governance as the absence of changes to organizational documents and the absence of asset diversions.
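The recoding described in this footnote can be sketched as follows. The function and key names are hypothetical, and treating the five-to-22 range as inclusive at both ends is an assumption made for illustration.

```python
def robustness_indicators(board_size, documents_changed, asset_diversion):
    """Recode three Form 990 items so that 1 denotes the 'good governance' state."""
    return {
        # Assumption: the 5-22 board size range is inclusive at both ends.
        "board_size_ok": 1 if 5 <= board_size <= 22 else 0,
        "no_document_changes": 0 if documents_changed else 1,
        "no_asset_diversion": 0 if asset_diversion else 1,
    }


# Made-up organizations for illustration only.
print(robustness_indicators(12, documents_changed=False, asset_diversion=False))
print(robustness_indicators(30, documents_changed=True, asset_diversion=False))
```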

25

For example, Governance Index5 is 99.8 percent as likely to be the best model as the factors formed with these variables.
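Statements of this kind are commonly based on the relative likelihood of two models under AIC, exp((AIC_min − AIC_i)/2), as described in Burnham and Anderson (1998). The sketch below assumes that definition. The AIC values are made up to show how a ratio near 0.998 can arise and are not results from this study.

```python
import math


def relative_likelihood(aic_model, aic_best):
    """Relative likelihood of a model versus the minimum-AIC model in the comparison."""
    if aic_model < aic_best:
        raise ValueError("aic_best must be the smallest AIC in the comparison")
    return math.exp((aic_best - aic_model) / 2.0)


# Made-up AIC values for illustration only.
aic_factors = 1000.000
aic_index5 = 1000.004
ratio = relative_likelihood(aic_index5, aic_factors)
print(f"{ratio:.3f}")  # 0.998
```

A ratio close to 1 indicates that the two models are nearly indistinguishable in terms of AIC, so the simpler index loses almost nothing relative to the factor-based specification.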

26

While we admittedly find no evidence that our simple indices co-vary with reported program ratios, we are less certain that they should in our sample period. More research is needed to better understand the link between governance and the optimal amount of overhead spending.

## Author notes

We thank Vaughan S. Radcliffe (editor), two anonymous reviewers, Dana Forgione, Linda Parsons, Chih-Ling Tsai, and workshop participants at the 2018 American Accounting Association Government and Nonprofit Section Midyear Meeting and the 2018 American Accounting Association Annual Meeting for their thoughtful comments and suggestions.

Colleen M. Boland, University of Wisconsin–Milwaukee, Lubar School of Business, Milwaukee, WI, USA; Erica E. Harris, Florida International University, College of Business, School of Accounting, Miami, FL, USA; Christine M. Petrovits, The College of William & Mary, Raymond A. Mason School of Business, Williamsburg, VA, USA; Michelle H. Yetman, University of California, Davis, Graduate School of Management, Davis, CA, USA.

Editor's note: Accepted by Vaughan S. Radcliffe.