## Abstract

Traditional measures of variation in natural resource systems, such as the standard deviation or the coefficient of variation, count both upside and downside deviations as contributing to variability. Here we introduce three risk measures from investment theory that quantify variability in natural resource systems by analyzing upside or downside outcomes, and typical or extreme outcomes, separately: semideviation, conditional value-at-risk, and probability of ruin. Risk measures can be tailored to frame variability as a performance measure in terms directly meaningful to specific management objectives, such as presenting risk as the harvest expected in an extremely bad year or as the probability of fishery escapement falling below a prescribed threshold. In this paper, we present formulae, empirical examples from commercial fisheries, and R code to calculate the three risk measures. In addition, we evaluated risk measure performance with simulated data and found that risk measures can provide unbiased estimates at small sample sizes. By decomposing complex variability into quantitative metrics, we envision risk measures being useful across a range of wildlife management scenarios, including policy decision analyses, comparative analyses across systems, and tracking the state of natural resource systems through time.

## Introduction

Variability is an important performance measure in fisheries and natural resource management (Perrings 1998 and articles therein; Landres et al. 1999). In commercial fishing, where harvest is a key objective, the fishing industry would prefer a stable income flow from year to year, and fisheries managers would prefer stable populations from year to year. Accordingly, when examining the harvest implications of policies, fisheries analysts represent variability as a negative performance measure (e.g., Punt et al. 2002), and they typically summarize it as the standard deviation or the coefficient of variation (the standard deviation of the data divided by the mean) of a time series of catches (e.g., Essington 2010). In other systems, managers may be focused on extreme events, regardless of whether they are upside or downside outcomes, and seek to understand the nature of variability in an ecosystem in detail (e.g., when managing riverine systems for natural hydrological flow; Poff et al. 1997). In these and many other natural-resource management scenarios, traditional metrics for variability, such as the standard deviation and the coefficient of variation, do not provide sufficient resolution to characterize and manage the dynamics of ecosystems. In this paper, we introduce risk measures from investment theory, which can be used to decompose variability in natural resource systems into upside or downside measures and which can focus on either extreme or typical outcomes.

The standard deviation and coefficient of variation are defensible measures of variability because their properties are well-known and they are familiar metrics to most quantitative analysts; however, they have several shortcomings. First, these measures are symmetric insofar as they equally weight both downside (e.g., a catch below the mean) and upside (e.g., a catch above the mean) deviations. For example, a year where harvest was 50% above the long-term mean would contribute equally to a standard deviation measure of variability as would a harvest 50% below the long-term mean. Second, it is not straightforward to associate biological or socioeconomic meaning to these variability measures because they frame outcomes purely in statistical measures; for instance, “As indicated by the coefficient of variation, policy *A* is expected to result in a standard deviation of harvest that is *X*% of the long-term mean harvest.” Quantitative risk measures from investment theory provide metrics for variability that can avoid these problems by focusing on downside or upside outcomes separately and by framing results in terms relevant to management decisions, such as expected population size in the worst-case scenario or the chance that harvest falls below a critical threshold.
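The symmetry described above can be checked numerically. The following is a minimal Python sketch using hypothetical harvest values (not data from this study): mirroring a series about its mean turns one bad year into an equally large good year, yet the standard deviation and the coefficient of variation are unchanged.

```python
import math
import statistics

# Hypothetical annual harvests (arbitrary units); not data from this study.
harvest_with_bad_year = [100, 100, 100, 100, 50]

# Mirror each value about the series mean: the single bad year becomes an
# equally large good year, and the mean of the series is preserved.
series_mean = statistics.mean(harvest_with_bad_year)
harvest_with_good_year = [2 * series_mean - h for h in harvest_with_bad_year]

sd_bad = statistics.stdev(harvest_with_bad_year)
sd_good = statistics.stdev(harvest_with_good_year)
cv_bad = sd_bad / statistics.mean(harvest_with_bad_year)
cv_good = sd_good / statistics.mean(harvest_with_good_year)

print(f"SD with one bad year:  {sd_bad:.2f}")
print(f"SD with one good year: {sd_good:.2f}")
print(f"CVs are equal: {math.isclose(cv_bad, cv_good)}")
```

Both series receive the same variability score even though only the first contains a downside event, which is the behavior the one-sided risk measures are meant to avoid.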

Here we outline three risk metrics for natural-resource performance data (e.g., catch, abundance, or revenues) that can be tailored to focus either on upside or downside and typical or extreme outcomes, and which frame variability in terms that make it possible to intuit biological or socioeconomic meaning (although their technical names would not suggest so): “semideviation,” “conditional value-at-risk,” and “probability of ruin.” Semideviation and conditional value-at-risk quantify risk in terms of the magnitude of a downside outcome, and probability of ruin gives the probability of a specified downside outcome. All three measures can also be formulated to measure upside outcomes (see below). Quantitative risk measures have been applied to natural resource systems in a few cases (Webby et al. 2007; Jones et al. 2011; Sethi et al. 2012, in press); however, their use is not yet widespread. In this paper, we synthesize information on risk measures and examine their performance with simulated data. The “Theory” section of this article describes the three risk measures and provides mathematical formulae and brief empirical examples from commercial fisheries. R code (R Development Core Team 2010) to calculate each risk measure is provided in Text S1 (*Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1). The “Simulation Testing” section provides performance testing of risk measure bias and stability using data modeled after fisheries catches (i.e., small samples and potentially containing zeros for fishery closures). While risk measures are presented here in terms of downside deviations in commercial fisheries harvest, they can be readily extended to other wildlife resource applications.

## Theory

In the following discussion, we present the concept of risk in terms of natural resource management as the chance of something “undesired” happening, focusing on downside outcomes (e.g., an outcome below the long-term mean). This is a natural application of risk measures consistent with the commercial fishery data examples we provide below, where fishermen view harvest as a primary performance measure of interest and undesirable outcomes can be measured as catch below some performance benchmark (e.g., long-term mean harvest). In other natural resource management or wildlife applications, judgment of variability as desirable or undesirable may not be relevant. In such cases, risk measures can be parameterized to focus on either upside or downside measures (Text S1, *Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1), and the concept of a “risk” could be viewed more generically as the chance of any defined event happening (e.g., the chance of an extreme event).

To quantify a risk, a risk measure needs to incorporate two components: a measure of the “probability” and the “severity” of an event; for example, the probability of realizing annual harvest that is <70% of the long-term mean. Many risk measures have been proposed (Brachinger and Weber 1997; Szego 2002), with some focusing on the probability of an event and others focusing on the severity of an event. By characterizing only downside (or upside) outcomes, risk measures are well-suited to accommodate skewed performance data where the occurrences of upside and downside deviations are not symmetric.

In the following description of risk measures, let *R* represent a random variable for system performance, *r* an individual draw from *R*, and *r*_{n} a sample of *n* draws from *R*. For example, in terms of catch risk, *R* could represent annual harvest as a random variable (e.g., pink salmon *Oncorhynchus gorbuscha* harvest in Prince William Sound, Alaska), *r* would be the realization of a single annual harvest (e.g., 2010 Prince William Sound pink salmon harvest), and *r*_{n} a time series of annual harvests (e.g., 2000–2010 Prince William Sound pink salmon harvest). Risk could be computed on fishery-wide data, or could be calculated for a single fisherman's performance time series to calculate individual-level risk. In computing performance risk from time-series data, it is assumed that the unobserved process generating outcomes does not change over the time period for which risk is characterized. Conceptually, each return datum is considered to be a random draw from the distribution characterizing the true, time-invariant fishery performance behavior. This assumption is necessary so that a characterization of risk using historical data applies to future management scenarios, although it could be relaxed for purely retrospective analyses where the goal is to characterize historical variability. If necessary, analysts could detrend time series to achieve stationarity, in which case risk measures would be in reference to deviations about an estimated time trend. Natural resource managers could also use risk measures to characterize the variability in outcomes across space.
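One simple way to detrend a series, as mentioned above, is to fit a straight line in time by ordinary least squares and work with the residuals. The Python sketch below illustrates this under the assumption that a linear trend is adequate; the function name `linear_detrend` and the catch values are hypothetical, and other trend models may suit a given system better.

```python
def linear_detrend(series):
    """Return residuals of `series` about an ordinary least-squares line in time.

    Time is indexed 0, 1, ..., n - 1; at least two observations are required.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    times = range(n)
    t_mean = sum(times) / n
    y_mean = sum(series) / n
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in zip(times, series))
        / sum((t - t_mean) ** 2 for t in times)
    )
    intercept = y_mean - slope * t_mean
    # Residual = observed value minus the fitted trend value at that time step.
    return [y - (intercept + slope * t) for t, y in zip(times, series)]


# Hypothetical catches with an upward trend (arbitrary units).
catches = [10.0, 13.0, 12.0, 15.0, 20.0]
residuals = linear_detrend(catches)
print([round(e, 2) for e in residuals])
```

Risk measures computed on `residuals` then describe deviations about the fitted trend rather than about a constant mean.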

### Semideviation

Semideviation (Markowitz 1959; Porter 1974; Estrada 2007), $\sigma_{-}$, focuses on the magnitude of a “typical” downside event. Semideviation is similar to standard deviation; however, only downside deviations below some specified threshold called the “minimum acceptable return” (*MAR*) are characterized. In continuous form, semideviation is the square root of a lower partial second moment of a random variable:

$$\sigma_{-} = \sqrt{\int_{-\infty}^{MAR} (MAR - r)^{2}\, f(r)\, dr} \tag{1}$$

where *f* is the probability density function describing the system performance random variable distribution (e.g., log normal). Often, *MAR* is the mean return so that semideviation characterizes the typical loss relative to mean performance. Semideviation risk can be calculated by fitting a distribution to performance data and then characterizing partial moments. Alternatively, the empirical formula for sample semideviation can be employed (as in the R code for the *SEMIDEV* function in Text S1, *Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1):

$$\hat{\sigma}_{-} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[\min(r_{i} - MAR,\ 0)\right]^{2}} \tag{2}$$

where *n* is the total number of returns considered and not only those <*MAR*. Note that by dividing by the total number of returns considered, the empirical formula for semideviation down-weights risk for systems that only occasionally experience downside (relative to *MAR*) outcomes.
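The empirical calculation described above can be sketched in a few lines of Python. This is an illustrative stand-in, not the *SEMIDEV* function from Text S1; the function name `semideviation`, its arguments, and the catch values are hypothetical.

```python
import math


def semideviation(returns, mar=None):
    """Empirical downside semideviation about a minimum acceptable return.

    Squared shortfalls below `mar` are summed and divided by the total
    number of returns (not only those below `mar`), then square-rooted.
    """
    values = list(returns)
    if not values:
        raise ValueError("returns must contain at least one value")
    if mar is None:
        mar = sum(values) / len(values)  # default MAR: the sample mean
    squared_shortfalls = [(mar - r) ** 2 for r in values if r < mar]
    return math.sqrt(sum(squared_shortfalls) / len(values))


# Hypothetical catches (arbitrary units); not the Bristol Bay data.
catches = [60, 40, 50, 70, 30]
mean_catch = sum(catches) / len(catches)
sd_minus = semideviation(catches)        # in absolute catch units
sd_minus_scaled = sd_minus / mean_catch  # as a fraction of the mean
print(sd_minus, sd_minus_scaled)  # 10.0 0.2
```

Only the two below-mean years (40 and 30) contribute squared shortfalls, but the sum is divided by all five years, which is the down-weighting noted above.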

Semideviation can be presented in terms of absolute performance units or scaled to some meaningful reference level, such as the mean of a data set. To give an example, consider fishery-wide harvest data for the Bristol Bay drift-gillnet sockeye salmon *Oncorhynchus nerka* fishery 1985–2005 (Figure 1; data publicly available from the Alaska Department of Fish and Game). These data have a mean catch of 56,947 metric tons (mt)/year. Semideviation presented in absolute catch units with *MAR* set to the long-term mean is 14,961 mt for the Bristol Bay example data (calculated using the R function *SEMIDEV*(performance = 2, M.A.R. = mean(data, na.rm = T), denominator = "full") provided in Text S1, *Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1), or in words, the typical below-average year results in bay-wide catch of 14,961 mt below the long-term mean (or a catch of only 41,986 mt). Alternatively, semideviation can be scaled to the long-term mean, which indicates that, for the example data, the typical bad year will have catch that is 26% below the mean (14,961/56,947 = 0.263).

### Conditional value-at-risk (CVaR)

Whereas semideviation characterizes typical downside events, CVaR measures the magnitude of extreme downside events (Rockafellar and Uryasev 2000, 2002). It is the expected return conditional on being in the α% worst-case scenarios (i.e., in terms of downside outcomes, the α% least-positive realizations of a random entity performance):

$$\mathrm{CVaR}_{\alpha} = E\left[R \mid R \le F^{-1}(\alpha)\right] = \frac{1}{\alpha} \int_{-\infty}^{F^{-1}(\alpha)} r\, f(r)\, dr \tag{3}$$

where *F* is the cumulative probability and *F*^{−1} the inverse cumulative probability function for the distribution describing the performance behavior of a system. Tail probabilities are difficult to measure, so an analyst must balance the choice of α value (the smaller the α, the more extreme the downside outcome) against precision in the estimate of CVaR. In our experience in applying CVaR to fisheries data (Sethi et al. 2012, in press), α values on the order of 10–25% provide a reasonable balance between how extreme the characterized outcomes are and the ability to estimate tail probabilities, though no hard and fast rules for the choice of α are available.

Conditional value-at-risk gives an indication of the expected worst-case scenario, where the α-level defines the degree of “worst.” Similar to semideviation, CVaR can be presented in terms of absolute performance units or scaled relative to some reference level, such as the long-term mean. For the Bristol Bay example data, 25% catch CVaR in absolute catch units is 25,891 mt below the long-term mean catch (expected catch of 31,056 mt; calculated using the R function *CVAR*(performance = 2, q.tile = .25, iterations = 10) provided in Text S1, *Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1). In words, across the worst quarter of years, the expected catch is 25,891 mt below the mean catch. Scaled to the mean catch, 25% CVaR for the example data indicates that in the worst quarter of years, the expected bay-wide harvest is 45.5% less than the long-term mean (25,891/56,947 = 0.455), a substantial downside deviation given that Bristol Bay is regarded as one of the most productive salmon fisheries in the world (e.g., Hilborn et al. 2003).
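A simple empirical version of this tail expectation averages the worst α share of the observations. The Python sketch below is one of several possible estimators and is not the resampling routine used by the *CVAR* function in Text S1; the function name `empirical_cvar`, the rounding rule for the tail size, and the catch values are assumptions made for illustration.

```python
import math


def empirical_cvar(returns, alpha=0.25):
    """Mean of the worst `alpha` share of observations (lower tail).

    The tail holds ceil(alpha * n) observations, and always at least one.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(returns)
    if not ordered:
        raise ValueError("returns must contain at least one value")
    tail_size = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:tail_size]
    return sum(tail) / tail_size


# Hypothetical catches (arbitrary units); not the Bristol Bay data.
catches = [60, 40, 50, 70, 30, 80, 20, 90]
mean_catch = sum(catches) / len(catches)
cvar_25 = empirical_cvar(catches, alpha=0.25)  # mean of the two worst years
shortfall = mean_catch - cvar_25               # absolute units below the mean
shortfall_scaled = shortfall / mean_catch      # fraction of the mean
print(cvar_25, shortfall, round(shortfall_scaled, 3))  # 25.0 30.0 0.545
```

With eight observations and α = 0.25, the tail is the two lowest catches (20 and 30), so the estimate depends on very few data points; this is the precision trade-off in the choice of α discussed above.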

### Probability of ruin

Instead of framing risk in terms of the magnitude of a downside event, the probability of ruin, *ψ* (e.g., Brachinger and Weber 1997), focuses on the “chance” component of the risk of a catastrophic downside outcome, measured as the probability of realizing an outcome below a specified ruinous threshold:

$$\psi = P(R < k) = \int_{-\infty}^{k} f(r)\, dr \tag{4}$$

where *k* is the reference ruinous performance level. In the context of a fishery, this could be measured as the probability that a stock will fall below some threshold size such that inbreeding effects could become acute or the population could go extinct (also referred to as “population viability analysis”; e.g., Shaffer 1981). Alternatively, an analyst could provide risk information useful to fishermen by estimating the probability of realizing annual gross revenues or series of annual gross revenues in a particular fishery below the minimum level required to attain a reasonable return on their efforts (e.g., revenues net of crew, maintenance, and capital costs).

Setting the ruin threshold to 35,000 mt (about 60% of the long-term mean harvest), the estimated probability of ruin for the Bristol Bay example data is 0.334 (calculated using the R function *PRUIN*(performance = 2, threshold = 35000, iterations = 10) provided in Text S1, *Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1). That is, in any given year, there is about a 33% chance that bay-wide catch will be <35,000 mt.
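The simplest empirical counterpart of the probability of ruin is the share of observations falling below the threshold. The Python sketch below shows that calculation; it is not the *PRUIN* function from Text S1, and the function name and catch values are hypothetical.

```python
def empirical_probability_of_ruin(returns, threshold):
    """Share of observations strictly below the ruin threshold."""
    values = list(returns)
    if not values:
        raise ValueError("returns must contain at least one value")
    return sum(1 for r in values if r < threshold) / len(values)


# Hypothetical catches (arbitrary units); not the Bristol Bay data.
catches = [60, 40, 50, 70, 30, 80, 20, 90]
p_ruin = empirical_probability_of_ruin(catches, threshold=35)
print(p_ruin)  # 0.25
```

With short series, this proportion can only take values in steps of 1/*n*, which is one reason a fitted distribution or a resampling scheme may be preferred in practice.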

The risk measures presented here focus on downside outcomes, as would be appropriate for a wide range of fisheries and other natural-resource management scenarios where harvest or abundance are key performance goals; however, in some cases, large upside deviations could also have undesirable effects on natural resource systems (e.g., if they lead to an increase in fleet size or power and subsequent increased fishing pressure in following years; Botsford et al. 1997). In these cases, the quantitative risk measures presented in Equations 1–4 can be modified to measure upside outcomes, such as measuring typical upside deviations (cf. semideviation), upper-tail conditional expectations (cf. conditional value-at-risk), or the probability of observing outcomes above some performance threshold (cf. probability of ruin). Text S1 (*Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1) presents versions of Equations 1–4 parameterized in terms of upside outcomes.
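As a sketch of what such upside analogues might look like, the downside definitions can be mirrored about the threshold. The forms below are an illustration under that assumption and are not necessarily the exact parameterizations given in Text S1; the symbols $\sigma_{+}$, $\mathrm{CVaR}^{+}_{\alpha}$, and $\psi^{+}$ are notation introduced here, with *f*, *F*, *MAR*, α, and *k* as defined for the downside measures.

$$
\sigma_{+} = \sqrt{\int_{MAR}^{\infty} (r - MAR)^{2}\, f(r)\, dr}, \qquad
\mathrm{CVaR}^{+}_{\alpha} = E\left[R \mid R \ge F^{-1}(1 - \alpha)\right], \qquad
\psi^{+} = P(R > k) = \int_{k}^{\infty} f(r)\, dr
$$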

## Simulation Testing

We expect that typical applications of the aforementioned risk measures in natural resource systems will involve short time series of annual data, such as catches. In this section, we examine the bias and stability of risk measures by simulating data and comparing sample-derived risk calculations against truth over a range of small sample sizes. Simulated data are modeled after Alaskan commercial fisheries (see below).

In addition to examining the performance of risk measures across sample sizes, we also examine problems with data containing zeros. Typically, to estimate the probability of ruin and CVaR, a probability distribution is fit to performance data. In a fishery context, where catches or gross revenues are all positive and often right-skewed (with many small to medium values and a few very large values), the Gamma distribution provides a natural choice, accommodating varying degrees of right-skewness in weakly positive (i.e., ≥0) data. Fitting a Gamma distribution, however, breaks down when the data contain zeros, as might occur if a fishery were closed for multiple seasons. To accommodate this possibility, the R functions provided in Text S1 (*Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1) for probability of ruin and CVaR contain a method to compute resampling-based empirical estimates of risk, in addition to fitting a parametric distribution to data. Briefly, the resampling routine within the R functions generates a resampled data set of size *n* = 10,000 from the observed data set, and empirical histogram densities (probability of ruin) or empirical histogram tail expectations (CVaR) are calculated (details provided in Text S1, *Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1). We tested the relative performance of resampling-based risk calculations over a parametric fitted-distribution estimate by calculating semideviation and CVaR on simulated data containing zeros. All simulations are carried out in the R statistical programming environment.
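The general idea of a resampling-based estimate can be sketched as follows in Python. This is a simplified illustration that resamples the observed data with replacement and reads the risk measures directly off the resampled set; it does not reproduce the histogram-based calculations in Text S1, and the function name `resampled_risk`, the seed, and the catch values (including the two zero-catch seasons) are hypothetical.

```python
import math
import random


def resampled_risk(observed, threshold, alpha=0.25, size=10_000, seed=1):
    """Resample `observed` with replacement and return the empirical
    (probability of ruin, lower-tail CVaR) of the resampled data set."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    resampled = sorted(rng.choices(observed, k=size))
    # Probability of ruin: share of resampled values below the threshold.
    p_ruin = sum(1 for r in resampled if r < threshold) / size
    # CVaR: mean of the worst alpha share of resampled values.
    tail_size = max(1, math.ceil(alpha * size))
    cvar = sum(resampled[:tail_size]) / tail_size
    return p_ruin, cvar


# Hypothetical catches including two closed seasons (zeros).
observed = [0, 0, 12, 18, 25, 31, 40, 55]
p_ruin, cvar_25 = resampled_risk(observed, threshold=15)
print(round(p_ruin, 3), round(cvar_25, 3))
```

Because no distribution is fitted, zeros in the data cause no difficulty here; they simply appear in the resampled set in proportion to their frequency in the observed data.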

### Simulation trials

Fishery performance data are modeled after fishery-wide annual catches in Alaskan commercial fisheries. For simulations on data that do not contain zeros, we defined the true underlying distribution for catch behavior (i.e., the data-generating process) as a Gamma distribution with shape and scale parameters chosen to match the average variance and mean of annual catches (thousand mt) across 90 different Alaskan commercial fisheries from 1985 to 2005 (data set from Sethi et al. 2012) and provide some right-skewness to the data:

$$R \sim \mathrm{Gamma}(shape = 2.75,\ scale = 9) \tag{5}$$

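For simulations on data that contain zeros, we simulated a two-part data-generating process:

$$R = X \cdot Y, \qquad X \sim \mathrm{Bernoulli}(p = 0.9), \qquad Y \sim \mathrm{Gamma}(shape = 2.75,\ scale = 9) \tag{6}$$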

where *p* is the probability of a success (*X* = 1) for a Bernoulli trial. The data-generating process in Equation 6 results in a “catch” data distribution with about 10% of the density at zero and 90% of the density in catches distributed *Gamma*(*shape* = 2.75, *scale* = 9) (Figure 2). True risk measures associated with the specified data-generating process were calculated by using exact properties of the underlying data distribution or by computing empirical quantities based on very large random samples from the underlying data distribution, which asymptotically approach the true population quantities (Efron and Tibshirani 1993).
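The two data-generating processes can be sketched in Python as follows. The Gamma parameters and the roughly 10% share of zeros come from the text above; the success probability of 0.9, the function name `simulate_catches`, the seed, and the sample size are assumptions made for illustration.

```python
import random

SHAPE, SCALE = 2.75, 9.0  # Gamma parameters quoted in the text
P_OPEN = 0.9              # assumed Bernoulli success probability (~10% zeros)


def simulate_catches(n, with_zeros, rng):
    """Draw `n` simulated catches; a failed Bernoulli trial gives a zero catch."""
    catches = []
    for _ in range(n):
        gamma_draw = rng.gammavariate(SHAPE, SCALE)  # (shape, scale) order
        is_open = (not with_zeros) or (rng.random() < P_OPEN)
        catches.append(gamma_draw if is_open else 0.0)
    return catches


rng = random.Random(2012)  # fixed seed so the sketch is reproducible
no_zero_sample = simulate_catches(20_000, with_zeros=False, rng=rng)
zero_sample = simulate_catches(20_000, with_zeros=True, rng=rng)

mean_no_zero = sum(no_zero_sample) / len(no_zero_sample)
share_zero = sum(1 for c in zero_sample if c == 0.0) / len(zero_sample)
print(f"mean catch without zeros: {mean_no_zero:.2f} (theory: {SHAPE * SCALE})")
print(f"share of zero catches:    {share_zero:.3f}")
```

Large samples such as these are also what the empirical approach to the "true" risk values relies on: with enough draws, sample quantities settle close to the population quantities.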

For each simulation trial, a data set was generated by sampling under the data-generating processes from Equations 5 and 6 and the following risk measures were calculated: 1) semideviation with the minimum acceptable return set to the mean of the data and presenting magnitude of downside outcome in mean catch units; 2) 25% CVaR presenting the magnitude of downside outcome in mean catch units; and 3) probability of ruin with the ruin threshold set at a catch that is 35% below the mean of the data. These quantities were generated with the following calls to the R risk functions detailed in Text S1 (*Supplemental Materials*, http://dx.doi.org/10.3996/122011-JFWM-072.S1): *SEMIDEV*(performance = 3, M.A.R. = 1, denominator = "full"), *CVAR*(performance = 3, q.tile = 0.25, iterations = 5), and *PRUIN*(performance = 3, threshold = 0.35, iterations = 5).

Simulation trials constructed 750 data sets not containing zeros (Equation 5) and 750 data sets containing zeros (Equation 6) at each sample size from *n* = 5 to 50. For each sample size, the mean ± 1 SD of bias (sample-derived quantity – true quantity) across the 750 trials was calculated. For the purposes of these simulations, we consider an estimator to exhibit unbiased performance if mean bias across the 750 trials at a given sample size (*n* = 5 to 50) is within ±5% of the true quantity. We also calculated the proportion of estimated risk measures across simulation iterations that were within ±5% or 10% of the true quantity at a selection of sample sizes. For comparison, we calculated the coefficient of variation (sample standard deviation/sample mean) for simulated data sets and calculated the proportion of estimates within ±5% or 10% of the true coefficient of variation value (calculated based upon a random sample of size *n* = 10^{6} from the underlying data-generating processes outlined in Equations 5 and 6).
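The bias calculation can be illustrated with a small Python sketch for one measure, semideviation scaled to the sample mean, on data without zeros. It follows the design described above (750 trials per sample size, "truth" from a large sample) but is not the authors' simulation code: the function names, the seed, the chosen sample sizes, and the 200,000-draw reference sample are assumptions, so the numbers it prints will not match the published results exactly.

```python
import math
import random


def scaled_semideviation(values):
    """Downside semideviation about the sample mean, in units of the mean."""
    mean = sum(values) / len(values)
    squared_shortfalls = sum((mean - v) ** 2 for v in values if v < mean)
    return math.sqrt(squared_shortfalls / len(values)) / mean


def draw(n, rng):
    """Sample n catches from Gamma(shape = 2.75, scale = 9)."""
    return [rng.gammavariate(2.75, 9.0) for _ in range(n)]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
true_value = scaled_semideviation(draw(200_000, rng))  # large-sample "truth"


def bias_summary(n, trials=750):
    """Mean percent bias and share of trials within +/-10% of the truth."""
    pct_errors = [
        100 * (scaled_semideviation(draw(n, rng)) - true_value) / true_value
        for _ in range(trials)
    ]
    mean_bias = sum(pct_errors) / trials
    share_within_10 = sum(abs(e) <= 10 for e in pct_errors) / trials
    return mean_bias, share_within_10


results = {n: bias_summary(n) for n in (10, 30, 50)}
for n, (mean_bias, share_within_10) in results.items():
    print(f"n = {n:2d}: mean bias = {mean_bias:+.1f}%, "
          f"within 10% of truth = {share_within_10:.2f}")
```

Mean bias summarizes how the estimator behaves on average across many data sets, whereas the share of trials near the truth indicates how much a single small data set can be trusted; the two can differ markedly at small *n*.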

### Simulation results

#### Data without zeros

For the shape, location, and spread of the data modeled after Alaskan commercial fisheries catches, resampling-based risk measures were unbiased for sample sizes of ≥10 when data do not contain zeros, as evidenced by mean bias that was within ±5% of the true value across 750 simulation trials (Figure 3a–c, black lines). Estimates of the probability of ruin and 25% CVaR based upon parametric fitted distributions had mean bias across simulation trials within ±5% of the true value with sample sizes of ≥15 and ≥25, respectively (Figure 3b–c, red lines). Probability of ruin and CVaR measures based on parametric fitted distributions resulted in more precise distributions of risk measures at each sample size (Figure 3b–c). For example, the interquartile range of the percent difference between sample-based probability of ruin and true probability of ruin at a sample size of 20 was −51.0% to 44.3% when computed using the empirical estimate, compared to −39.1% to 19.2% when using the parametric fitted distribution estimate. Similarly, for 25% CVaR, the interquartile range for the percent difference at a sample size of 20 for the empirical estimate was −13.5% to 16.0%, compared to −7.6% to 17.2% for the parametric fitted distribution estimate. Measures based on parametric fitted distributions also stabilized more quickly than the respective resampling-based quantities.

Semideviation and 25% CVaR performed comparably to the coefficient of variation in terms of the proportion of simulation trials at a range of sample sizes that resulted in estimated risk measures within ±5% or 10% of the true value, whereas probability of ruin resulted in a wider spread of risk estimates and a lower rate of estimates within 5% or 10% of the true quantity (Table 1). Metric performance generally increased with larger sample sizes, although all measures, including the coefficient of variation, had relatively low rates of estimates within 5% or 10% of the true quantity at sample sizes below 30.

#### Data with ∼10% zeros

The semideviation risk-measure function, which does not fit a distribution to data but only uses a resampling routine, performed nearly as well in terms of mean bias when sample data contained 10% zeros as with no-zero data, in both cases exhibiting mean bias within ±5% of the true value with sample sizes of ≥10 (Figure 3a–d). Semideviation measures were less precise when data contained zeros, although the difference in precision was not great. For example, the interquartile range of the percent difference between sample-based semideviation and true semideviation at a sample size of 20 was −11.8% to 8.1% when data contained zeros, compared to −10.6% to 7.3% when data did not contain zeros. In contrast, the probability of ruin and CVaR risk measures based on parametric fitted distributions were biased at all sample sizes tested (Figure 3e–f, red lines). The resampling-based estimates performed better: probability of ruin had mean bias within ±5% of the true value at sample sizes of ≥10 (Figure 3e, black lines). The resampling-based 25% CVaR measure had mean bias that was more variable than semideviation or probability of ruin measures across the 750 simulation trials, but mean bias stabilized to be within ±5% of the true value with sample sizes of ≥15 (Figure 3f, black lines). Similar to semideviation, the resampling-based probability of ruin and 25% CVaR measures were less precise when data contained zeros.

Results of the proportion of risk estimates within 5% or 10% of the true quantities across simulation iterations when data contain zeros demonstrate that semideviation and the coefficient of variation performed comparably, and both outperformed the probability of ruin and 25% CVaR (Table 1). Resampling-based quantities for probability of ruin and 25% CVaR outperformed estimates based upon parametric fitted distributions.

## Discussion

The R functions for semideviation, CVaR, and probability of ruin performed well under simulation testing, at least when using data modeled after Alaska commercial fishery catches (Equations 5–6). Resampling-based estimates of risk for all three measures had expected bias within ±5% of the true value at small sample sizes when zeros were not present (i.e., *n* ≈ 10 or greater), and resampling-based estimates were also unbiased at small sample sizes when zeros were present (i.e., mean bias within ±5% of the true value when *n* ≈ 10 or greater for semideviation and probability of ruin and *n* ≈ 15 or greater for 25% CVaR). Furthermore, risk-measure precision stabilized at sample sizes on the order of ≥20. Semideviation performed as well as a traditional measure of variability, the coefficient of variation, in terms of the percentage of simulation trials that resulted in estimates within 5% or 10% of the true value, although CVaR and the probability of ruin, which attempt to characterize extreme events, did not perform as strongly. Taken together, simulation results suggest that risk measures can perform reasonably well with small data sets, although the likelihood that a risk estimate from any one realization of a data-generating process is within 5% or 10% of the true quantity may be low unless data sets are relatively large (e.g., with data modeled after Alaskan commercial fisheries catches, *n* larger than ≈ 30). Typically, natural resource managers deal with a single replicate of a small sample size; we caution that risk estimates (or any other statistics) generated from such data likely contain error and some bias.

Risk measures such as those outlined in this article allow analysts to break down the variability observed in socioecological systems into finer scale components than do symmetric measures of variability such as standard deviation or coefficient of variation. Risk measures quantify upside or downside outcome variability and typical or extreme outcome variability, providing data to make comparative analyses of risk-profiles across natural resource systems (e.g., fisheries), to monitor changes in risks in a system through time, and to model the effects of proposed regulations (e.g., through management strategy evaluation; Smith et al. 1999; Rademeyer et al. 2008).

An important feature of risk measures, such as those presented in this article, is that they can be custom-tailored to evaluate policy performance in light of specific management objectives in decision analysis applications. Decision analysis in natural resource management is a generic process that involves digesting quantitative and qualitative data to select policy options that satisfy multiple objectives in complex problems involving multiple stakeholders. Decision analysis techniques range widely (e.g., Sethi 2010), including management strategy evaluation (Smith et al. 1999; Rademeyer et al. 2008), multiobjective optimization (Mardle et al. 2000), and scenario planning (Peterson et al. 2003); however, they all include a statement of management objectives agreed upon by the relevant stakeholders and performance criteria with which to evaluate policy selections in light of the problem objectives (Lahdelma et al. 2000). Risk measures as outlined here provide quantitative criteria to directly evaluate management objectives. In one example, management strategy evaluation by Jones et al. (2011) included a risk metric analogous to the probability of ruin to examine the probability that changes to salmon escapement regulations in a system modeled after the Kuskokwim River, United States, would result in a population failing to achieve escapement goals in 4 of 5 years, which would trigger an undesirable official “stock of concern” designation by the State of Alaska. Webby et al. (2007) evaluated the performance of policies to restrict water levels on the Mekong River, Cambodia, in light of economic development objectives that use conditional value-at-risk to estimate potential fishery harvest losses.

## Supplemental Materials

Please note: The *Journal of Fish and Wildlife Management* is not responsible for the content or functionality of any supplemental material. Queries should be directed to the corresponding author for the article.

**Text S1.** R code to calculate semideviation, conditional value-at-risk, and probability of ruin; formulae for Equations 1–4 parameterized in terms of upside deviations.

Found at DOI: http://dx.doi.org/10.3996/122011-JFWM-072.S1 (50 KB DOCX).

## Acknowledgments

We thank the Alaska Fisheries Science Center and the Region 7 U.S. Fish and Wildlife Service for support during this work. We thank three anonymous reviewers and the *JFWM* Subject Editors for their thoughtful comments, which improved this manuscript. S.A.S. was partially supported under a National Science Foundation Graduate Fellowship during this work.

The use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

## Author notes

Sethi SA, Dalton M. 2012. Risk measures for natural resource management: description, simulation testing, and R code with fisheries examples. *Journal of Fish and Wildlife Management* 3(1):150-157; e1944-687X. doi: 10.3996/122011-JFWM-072

The findings and conclusions in this article are those of the author(s) and do not necessarily represent the views of the U.S. Fish and Wildlife Service.