Policy evaluation focuses on the assessment of policy-related personal, family, and societal changes or benefits that follow as a result of the interventions, services, and supports provided to those persons to whom the policy is directed. This article describes a systematic approach to policy evaluation based on an evaluation framework and an evaluation process that combine the use of logic models and systems thinking. The article also includes an example of how the framework and process have recently been used in policy development and evaluation in Flanders (Belgium), as well as four policy evaluation guidelines based on relevant published literature.
Introduction and Overview
Policy evaluation focuses on the assessment of policy-related personal, family, and societal changes or benefits that follow as a result of the interventions and supports provided to those persons to whom the policy is directed. Policy evaluation logically follows policy development and implementation. As discussed in preceding articles, policy development involves the decision process by which individuals, groups, or institutions establish policies that align basic concepts, principles, procedures, or protocols, and policy-specific goals and associated outcomes. In contrast, policy implementation is based on a contextual analysis, employs a value-based approach, aligns the service delivery system both horizontally and vertically, and is implemented through a partnership.
Policy evaluation is a complex process that is influenced by numerous contextual issues and challenges associated with operationalizing measurable outcome indicators, deciding on what constitutes credible evidence, developing the approach taken to outcome evaluation, enhancing the capability of organizations and systems to assess policy-related outcomes, and using the evaluation results for multiple purposes. The intent of this article is to address these issues and challenges by describing a policy evaluation framework and a policy evaluation process based on the use of logic models and systems thinking. In addition, the article presents an example of how the framework and process have recently been used in policy development and evaluation in Flanders (Belgium), and discusses four policy evaluation guidelines based on relevant published literature.
Policy Evaluation Framework
Logic models are used widely in policy evaluation because of their utility in articulating the operative relations among policy goals, program services, and desired outcomes; enabling policy makers and provider organizations to understand what must be done to achieve policy outcomes; identifying critical factors that can influence policy outcomes; and clarifying for policy implementers the sequence of policy-related inputs, throughput, outputs, and outcomes (Donaldson, 2007; Funnell & Rogers, 2011; Schalock & Verdugo, 2012; Schalock, Verdugo, & Gomez, 2011; van Loon et al., 2013). Figure 1 summarizes the four components of a logic model applied to policy evaluation.
The input component involves a value-based policy that leads to the development and implementation of interventions, services, and supports to enhance personal, family, and/or societal valued outcomes. Values are characterized by their ideological origin, resistance to change over time, goal-oriented nature, ability to affect one's choice and interest, and subjectivity (Shams, Akbari Sari, & Yazdani, 2016).
The throughput component involves a system of supports that encompasses interventions, services, and individualized support strategies that aim to promote the development, independence, interests, and well-being of a person, and to enhance the individual's functioning, participation within society, and engagement in life activities. A system of supports is the planned and integrated use of an array of strategies and resources that include professionally based interventions, agency-provided services, and individually focused support strategies. These support strategies encompass natural supports, technology, prosthetics, education across the lifespan, reasonable accommodations, dignity and respect, personal strengths/assets, and professional services (Chiu, Lombardi, Claes, & Schalock, 2017). A system of supports provides a structure to enhance elements of human performance that are interdependent and cumulative and built around the individual's needs and aspirations.
The output component of the evaluation framework includes the structures and environments that provide opportunities and support a person's participation, involvement, and development, and enhance personal, family, or societal well-being. The outcome component involves personal, family, or societal changes or benefits that follow as a result or consequence of some activity, intervention, support, or service. These outcomes are reflected in measures of personal well-being such as enhanced quality of life and socio-economic status and are in line with the basic principles and articles of the United Nations Convention on the Rights of Persons With Disabilities (UNCRPD; United Nations, 2006).
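The four components described above can be pictured as a simple ordered structure. The following is a minimal sketch, not part of the published framework: the class name, field names, and example entries are hypothetical and chosen only to show how an evaluator might check that each component of a logic model has been specified before evaluation begins.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LogicModel:
    """Hypothetical container for the four logic model components."""
    inputs: List[str]       # value-based policy
    throughputs: List[str]  # system of supports
    outputs: List[str]      # structures and environments
    outcomes: List[str]     # personal, family, or societal changes

    def sequence(self) -> List[Tuple[str, List[str]]]:
        """Return the components in input-to-outcome order."""
        return [
            ("input", self.inputs),
            ("throughput", self.throughputs),
            ("output", self.outputs),
            ("outcome", self.outcomes),
        ]

    def missing_components(self) -> List[str]:
        """Name every component that has no entries yet."""
        return [name for name, entries in self.sequence() if not entries]


# Illustrative draft model (entries are assumptions, not study data):
# the outcome component has not been specified yet.
draft = LogicModel(
    inputs=["value-based policy on personal budgets"],
    throughputs=[
        "professionally based interventions",
        "agency-provided services",
        "individually focused support strategies",
    ],
    outputs=["environments that support participation"],
    outcomes=[],
)

print(draft.missing_components())  # ['outcome']
```

A check of this kind only flags an unspecified component; judging whether the entries are well aligned with one another remains a substantive task for the evaluators.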
Policy Evaluation Process
The described policy evaluation framework is a way of integrating theoretical components of a logic model applied to policy evaluation. This section of the article discusses the six steps that are involved in a systematic approach to policy evaluation. These six steps are summarized in Figure 2.
Step 1: Identify Policy-Related Goals and/or Objectives
The first step in the policy evaluation process involves identifying policy-related goals and/or objectives. In this step, policy rules and regulations are analyzed according to their intended value-based outcomes. This step is important because it indicates the extent to which the policy focuses on long-term, sustainable quality of life improvement (Costanza et al., 2008). The role of the government is not “to make people happier,” but to create the conditions for meeting basic human needs related to a valued life of quality (Nussbaum, 2015). Improvement of quality of life is the result of the extent to which basic needs are met (objective) in relation to personal or group perceptions (subjective; Costanza et al., 2008; Hagerty et al., 2001).
Step 2: Operationalize Goals/Objectives Into Outcome Areas
The second step involves operationalizing goals and objectives into outcome areas associated with personal, family, or societal changes. In this phase, the alignment between value-based goals and outcome areas is made explicit (Leichsenring, 2004). Table 1 lists common outcome areas associated with these changes.
Step 3: Select Measurable Indicators
Step three involves selecting measurable outcome indicators per outcome area. The selection of measurable indicators is not an easy exercise. Indicators should be valid (actually measure what they are intended to), reliable (provide the same information if measured by different persons), sensitive (able to measure change), and specific (reflect changes only in the situation concerned; Bowen & Kreidler, 2008). The biggest challenge is to find indicators that ask the right questions, rather than defaulting to indicators that are already available. Therefore, indicator selection and development should be a collaborative process, including important contextual information and expertise of different stakeholders. Commonly used categories of indicators are structure, process, and outcome (Hung & Jerng, 2014). Structure indicators reflect capacities available for interventions, whereas process indicators provide information on how well the intervention has been established. Outcome indicators are essential in policy evaluation because they allow one to assess the effect(s) of the policy. They also represent the validity of the process as defined, and the adequacy of the structure as put forward (Deerberg-Wittram, Guth, & Porter, 2013).
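The four selection criteria and three indicator categories named above lend themselves to a simple screening checklist. The sketch below is illustrative only: the indicator names are hypothetical, and the yes/no judgments stand in for what would in practice be the outcome of a collaborative stakeholder review rather than a mechanical test.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Indicator:
    """Hypothetical record of a stakeholder panel's judgment on one indicator."""
    name: str
    category: str    # "structure", "process", or "outcome"
    valid: bool      # measures what it is intended to measure
    reliable: bool   # same information when measured by different persons
    sensitive: bool  # able to measure change
    specific: bool   # reflects changes only in the situation concerned

    def unmet_criteria(self) -> List[str]:
        """List the selection criteria this indicator fails."""
        criteria = {
            "valid": self.valid,
            "reliable": self.reliable,
            "sensitive": self.sensitive,
            "specific": self.specific,
        }
        return [name for name, met in criteria.items() if not met]


# Illustrative candidates (assumed for the example, not drawn from the study).
candidates = [
    Indicator("self-reported quality of life score", "outcome",
              True, True, True, True),
    Indicator("support staff per client", "structure",
              True, True, False, True),
    Indicator("readily available administrative count", "outcome",
              False, True, True, False),
]

# Keep only outcome indicators that satisfy all four criteria.
selected = [
    c.name for c in candidates
    if c.category == "outcome" and not c.unmet_criteria()
]
print(selected)  # ['self-reported quality of life score']
```

The third candidate shows the risk noted above: an indicator can be convenient because it is already available and still fail the validity and specificity criteria.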
Step 4: Gather Evidence
In previous work, we elaborated on evidence-gathering strategies that can be organized into two broad measurement approaches: quantitative or qualitative. Quantitative research designs include experimental-control designs (e.g., equivalent groups, randomized control trials, repeated measures, multivariate), quasi-experimental designs (e.g., time series designs, multiple baseline designs, pre-post comparisons, nonequivalent control group, counterbalanced), and nonexperimental designs (e.g., descriptive research, meta-analysis, consumer surveys; Claes, van Loon, Vandevelde, & Schalock, 2015). Qualitative research designs include grounded theory, ethnography, participation research, and case studies. A detailed description of these designs and their use is published in Neutens and Rubinson (2010) and Norwood (2010).
The specific evidence-gathering strategy employed is influenced primarily by the perspective on evidence taken, the practice(s) being evaluated, the statutory/regulatory environment, the constituents involved in the evidence-gathering strategy, the expertise of the researchers, and the receptivity of the consumers to the information provided (Schalock, Gomez, Verdugo, & Claes, in press). Regardless of the evidence-gathering strategy employed, establishing the relation between specific practices and measured outcomes (i.e., an evidence-based practice) requires demonstrating application fidelity of the practice(s) in question. As discussed by Hogue and Dauber (2013), fidelity consists of three related factors: adherence, competence, and differentiation. Adherence is the extent to which the practice is implemented using current best practices. Competence is the quality of the evidence-gathering process. Differentiation is the degree to which the practice employed is clearly differentiated from a potentially related practice (e.g., focusing on quality of life vs. emphasizing quality of care).
Step 5: Establish the Credibility of the Evidence
Establishing the credibility of the evidence involves being sensitive to three different perspectives on the credibility of evidence: the empirical-analytical, the phenomenological-existential, and the post-structural (Broekaert, Autrique, Vanderplasschen, & Colpaert, 2010; Claes et al., 2015). These three perspectives relate to different approaches and, thereby, how disability-related policy is evaluated. The empirical-analytical perspective focuses on experimental or scientific evidence (Blayney, Kalyuga, & Sweller, 2010; Brailsford & Williams, 2001; Cohen, Stavri, & Hersh, 2004). In distinction, the phenomenological-existential perspective emphasizes evidence based on the reported experiences of well-being (Kinash & Hoffman, 2009; Mesibov & Shea, 2010; Parker, 2005). From a post-structural perspective, the credibility of evidence is based on public policy principles such as inclusion, self-determination, participation, and empowerment (Broekaert, Van Hove, Bayliss, & D'Oosterlinck, 2004; Goldman & Azrin, 2003; Shogren & Turnbull, 2010).
Regardless of the perspective taken, establishing the credibility of evidence is based on its quality, its robustness, and its relevance (Claes et al., 2015). The quality of evidence is related to the methodology or type of research design. Based on the methodology used, the quality of evidence can be ranked from high to low as follows (Sackett, Richardson, Rosenberg, & Haynes, 2005): randomized trials and experimental/control designs, quasi-experimental designs, pre-post comparisons, correlational studies, case studies, and surveys. The robustness of evidence refers to the magnitude of the observed effect. The magnitude of the observed effect(s) can be determined from: (a) probability statements (e.g., the probability that the results are due to chance is less than 1 time in 100, p < .01); (b) the percent of variance explained in the dependent variable by variation in the independent variable; and/or (c) the statistically derived effect size. When qualitative research methods are used, other standards can be employed to evaluate the robustness of the evidence (cf. Brantlinger, Jimenez, Klingner, Pugach, & Richardson, 2005; Claes et al., 2015). The relevance of evidence is related to purpose. Major purposes involve clinical, managerial, and policy decision making. Evaluating the relevance of evidence needs to be done within the context of the questions being asked, what is best for whom, and what is best for what (Biesta, 2010; Brantlinger et al., 2005; Bouffard & Reid, 2012).
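Two of the robustness measures mentioned above, the percent of variance explained and the effect size, can be illustrated with a short calculation. The sketch below assumes a simple two-group comparison and uses Cohen's d (with a pooled standard deviation) and eta squared as the specific statistics; the article does not prescribe these particular measures, and the scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def variance_explained(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Eta squared: between-group sum of squares over total sum of squares."""
    scores = list(group_a) + list(group_b)
    grand_mean = mean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    ss_between = (
        len(group_a) * (mean(group_a) - grand_mean) ** 2
        + len(group_b) * (mean(group_b) - grand_mean) ** 2
    )
    return ss_between / ss_total


# Hypothetical quality-of-life scores (not data from the Flemish study).
budget_group = [4, 5, 6]
comparison_group = [2, 3, 4]

d = cohens_d(budget_group, comparison_group)
eta_sq = variance_explained(budget_group, comparison_group)
print(round(d, 2), round(eta_sq, 2))  # 2.0 0.6
```

With samples this small the figures say nothing about a real policy; the point is only that effect size and variance explained describe the magnitude of a difference, whereas a probability statement describes how likely the result is to have arisen by chance.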
Step 6: Use the Evidence/Outcomes for Multiple Purposes
Policy-related evaluation is defined as assessing personal, family, or societal changes or benefits that follow as a result or consequence of some activity, intervention, service, or support. These outcomes can be used for multiple purposes, including summative evaluation, formative evaluation, and research. Table 2 provides examples of each of these uses. The material presented in Table 2 is based on the published work of Azzam and Levine (2015), Claes et al. (2015), Cullen et al. (2016), Deerberg-Wittram et al. (2013), and Gugiu & Rodriguez-Campos (2007).
Example From Flanders
The law on personal budgets was approved by the Flemish government in 2014. The purpose of this law is to give people with a disability more control over their lives. As part of a new system of support, the use of personal budgets is seen as a vehicle for change. This change aims to empower people with disabilities and give them more control. The implementation of personal budgets is one part of a social policy that is outcome-driven, and one that strives for the enhancement of quality of life in line with the UNCRPD (Claes, Vandenbussche, & Lombardi, 2016; Schalock & Keith, 2016; Vlaams Parlement, 2013-2014).
In terms of evidence-based policy, the Flemish government seeks an answer to one main question: “What is the impact of personal budgets on the quality of life of persons with disabilities?” We used the 6-step policy evaluation process depicted in Figure 2 to determine potential outcomes for each policy subgoal. Table 3 summarizes these potential outcomes based on document analyses, case studies, expert panels, and an international Delphi study.
Policy Evaluation Guidelines
Policy evaluation is not done in a vacuum. In addition to the structured approach reflected in Figures 1 and 2 regarding a policy evaluation framework and process, there are at least four factors that significantly influence policy evaluation and the use of policy evaluation results. These four factors involve: (a) contextual variables that influence disability policy at the micro-, meso-, and macro-system levels; (b) different perspectives on evidence; (c) the fidelity of the policy's implementation; and (d) the evaluation capability of the organization or system involved in the implementation and evaluation of policy.
Be Sensitive to Contextual Variables
Contextual variables can influence policy evaluation at the micro-, meso-, and macro-system level. At the microsystem level, for example, consumer empowerment, self-advocacy, and personal and family-centered planning have brought about changes in the focus of interventions, services, and supports; self-directed funding and personal budgets; and the criteria by which policy outcomes are evaluated (Shogren, Luckasson, & Schalock, 2015; Shogren, Schalock, & Luckasson, in press).
At the mesosystem level, organizations and systems are changing their policies and practices to conform to the transformation era, whose characteristics include being more person/family centered, streamlined and horizontally structured, and performance based (Schalock & Verdugo, 2014). Concurrently, we are seeing the emergence of new public management that views the market as the prime regulatory instrument in the public domain, with an associated emphasis on decentralization, quality control, effectiveness, and efficiency (DiRita, Parmenter, & Stancliffe, 2008; Schalock & Verdugo, 2012).
At the macrosystem level, both human service organizations and larger service delivery systems are being challenged by changes in the social-political-fiscal environments within which people with disabilities and their families live and service/support delivery systems operate. These challenges and changes are reflected in an increased emphasis on continuous quality improvement, demonstrated policy accountability, a focus on organization and system sustainability, and multiple performance-based perspectives (Schalock, Verdugo, & Lee, 2016).
Agree on Perspective on Evidence
One of the major results of these contextual variables has been the emergence of different perspectives on evidence. The perspective one takes on evidence will influence not only how one evaluates the credibility of policy-related outcome data, but also its potential use (Archibald, 2015; Biesta, 2010; Mertens, 2016; Morrow & Nkwake, 2016). As discussed previously, the primary focus of the empirical-analytical perspective is on experimental or scientific results obtained from data-gathering strategies such as random trials, experimental/control designs, quasi-experimental designs, multiple baseline designs, and/or multivariate designs. The primary focus of the phenomenological-existential perspective is on reported experiences and enhanced human functioning, social participation, and/or personal well-being, with associated data-gathering strategies such as self-reports, case studies, ethnographies, participatory action research, multivariate designs, and/or grounded theory. The primary focus of the poststructural perspective is on desired public policy outcomes assessed via mixed methods designs, multivariate designs, population surveys, meta-analyses, and/or data registers.
These different perspectives reflect a number of philosophical assumptions on the nature of knowledge, practice, and reality; frame one's approach to data collection, analysis, and interpretation; determine one's sensitivity to different world views; shape one's thinking; and represent the intersection of evaluation and application (Schalock et al., in press). As an important policy evaluation guideline, stakeholders need to be familiar with the different perspectives on evidence and frame policy evaluation to be aligned with the emphasized perspective.
Ensure Application Fidelity
The effectiveness of a given policy is related in large part to whether it is implemented in reference to three application fidelity criteria: adherence, competence, and differentiation. As discussed by Claes et al. (2015) and Hogue and Dauber (2013), adherence refers to the quality or extent to which the policy is actually implemented within the organization or system's policies and practices. Competence refers to the quality of skill delivery and whether the policy was implemented by organization and systems-level personnel who have those attitudes, skills, and knowledge required for knowledge transfer and effective implementation. Differentiation refers to the degree to which organization- and systems-level policies and practices reflect the logic model parameters depicted in Figure 1, rather than previous service/support delivery approaches. As an important policy evaluation guideline, unless a policy is implemented consistent with its stated parameters and meets these three application fidelity criteria, there is no way to accurately evaluate its intended outcome.
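The guideline above amounts to a gate: outcomes can be attributed to the policy only when all three fidelity criteria are met. A minimal sketch of that logic follows; the function names and the yes/no assessment format are assumptions made for illustration, since in practice each criterion would be judged on richer evidence than a single flag.

```python
from typing import Dict, List

# The three application fidelity criteria named in the text.
FIDELITY_CRITERIA = ("adherence", "competence", "differentiation")


def fidelity_gaps(assessment: Dict[str, bool]) -> List[str]:
    """Return the criteria that are unmet or were never assessed."""
    return [c for c in FIDELITY_CRITERIA if not assessment.get(c, False)]


def outcomes_interpretable(assessment: Dict[str, bool]) -> bool:
    """Outcomes can be tied to the policy only if no fidelity gap remains."""
    return not fidelity_gaps(assessment)


# Hypothetical assessment of one provider organization.
site_assessment = {
    "adherence": True,
    "competence": True,
    "differentiation": False,  # practice still mirrors the previous approach
}

print(fidelity_gaps(site_assessment))           # ['differentiation']
print(outcomes_interpretable(site_assessment))  # False
```

Treating an unassessed criterion as unmet is a deliberately conservative choice in this sketch: missing fidelity information is handled the same way as a known gap.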
Build Evaluation Capacity
Disability policy is implemented largely through service/support provider organizations and the large systems that provide statutory rules, regulations, and funding. With the increasing focus on outcomes-driven policy formulation and outcomes evaluation, a critical issue that emerges is the level of evaluation capability (i.e., capacity) of those organizations and systems that are expected to provide outcome information. The term “evaluation capacity” refers to developing in organizations and systems the necessary skills to conduct ongoing, rigorous evaluation (Cousins, Goh, Elliott, Aubry, & Gilbert, 2014). A recent analysis (Norton, Milat, Edwards, & Giffin, 2016) identified those factors associated with successful capacity building. These factors were: training and professional development as an element of evaluation capacity building, participatory approaches to evaluation, linking training with practical application, partnerships among evaluators and key stakeholders, embedding evaluation into routine practices, and tailoring the evaluation capacity building strategy to the organization or system's context.
The strong connection between successful capacity building and practical application underscores the distinction between capacity to do (i.e., building) vs. capacity to use (i.e., utilization). As discussed by Bourgeois, Whynot, and Theriault (2015), Cousins et al. (2014), and Schalock et al. (2016), integrating the results of policy evaluation into organization and system routines and cultures is associated closely with a commitment to quality improvement that involves a continuous process of enhancing valued outcomes through a quality improvement loop consisting of assessing, planning, doing, and evaluating.
These four policy evaluation guidelines will help overcome many of the barriers to policy evaluation reported in the literature (cf. Flitcroft, Gillespie, Salkeid, Carter, & Trevena, 2011; Trochim, 2009). In a recent analysis of specific policy evaluation barriers from a systems perspective, Schneider, Milat, & Moore (2016) reported that: (a) at the macrosystem level, barriers involve political influence/sensitivity, limited funding, and time constraints; (b) at the mesosystem level, barriers involve staff retention/turnover, approval process, culture of evaluation, tools, training, intellectual property regulation, and changing liaisons; and (c) at the microsystem level, barriers involve the skill level of staff, confidence, staff trust, career priorities, and motivation.
This article has stressed the need to use a structured approach to policy evaluation that is based on a clearly described and operationalized evaluation framework (Figure 1) and an evaluation process (Figure 2). Logic models provide the framework to design theoretical relations among input, throughput, output, and outcome components of public policy. This framework incorporates values; policy-related interventions, services, and supports; structures and environments that facilitate growth, development, and enhancement; and personal, family, and societal changes resulting from these inputs, throughputs, and outputs. The six-step policy evaluation process and evaluation capacity are important factors that evaluators need to be sensitive to, as are the previously discussed policy evaluation guidelines. A structured approach such as the one described in this article provides a policy evaluation framework and also brings together the necessary triad of policy, practice, and research.
The authors gratefully acknowledge the collaboration with the Flemish Government, more specifically the Flemish Agency for Persons with a Disability (VAPH).