## ABSTRACT

One useful tool for assessing the potential adverse effects of oil spills on aquatic and marine environments are computational models that calculate the transport, fate and effects of substances. The use of computational models is authorized by 43 CFR 11, and has been codified in the Natural Resources Damage Assessment for Coastal and Marine Environments (NRDAM/CME). The code in the NRDAM/CME is incorporated into modules that compute how a spilled substance moves through a marine environment and affects biological resources. This paper describes an apparent error in one module of the NRDAM code, the one that calculates the mortality rate of exposed organisms to concentrations that are less than the medial lethal concentration (LC50) of a mixture of dissolved petroleum hydrocarbons. Based on a review of the original publication, we have concluded the original mean “hill slopes” range from 4.34 to 24.16, which are much greater than the slope (functions) of 1.1 to 1.7 that are used in a module of the NRDAM/CME code. The use of a probit slope function as a logistic slope has the effect of overestimating the mortality rate for marine species exposed to low hydrocarbon concentrations and underestimating mortality of marine species exposed to concentrations greater than the LC50. The “shallow” slope function also calculates a much lower threshold concentration at which adverse effects are expected, than is supported by slope values indicated by the original source. The error in the slope of this module may also explain, in part, why field personnel have not observed fish kills when the NRDAM/CME has predicted mortalities occurred.

## INTRODUCTION

The procedures for assessing damages to natural resources following an unpermitted release of oil or a hazardous substance include computational methods such as the NRDAM/CME (French et al. 1997) for coastal and marine environments, and a companion model for the Great Lakes environment (NRDAM/GLE). Code from these models have also been incorporated into the Spill Impact Model Application Package (SIMAP) and the Chemical/Oil Spill Impact Module (COSIM) for use in Natural Resource Damage Assessment (NRDA) cases. The biological effects modules of these models include two key sub-modules: 1) equations that calculate the LC50 for a substance or mixture that is based on the properties of the material spilled, the species exposed and conditions in the water body; and 2) a sigmoid, “hill-slope” equation that calculates the proportion of an exposed population that is expected to experience lethality at exposures above or below the LC50. The LC50 sub-module is based on well-established toxicological principles of bioconcentration, narcosis, additive effects, and critical body burdens, (McCarty 1986, McCarty et al. 1991; Di Toro et al. 2000) as described by French-McCay (2002). The hill slope sub-module is described the NRDAM/CME technical documentation (French et al. 1997), and is the topic of this paper. In some past NRDA cases, computational models predicted substantial mortality of marine organisms that were inconsistent with field observations. This inconsistency suggested the model code was overly conservative or possibly erroneous. In this paper we critically review the original source of the hill slope used in the NRDAM code (Rossi and Neff, 1978) and identify what appears to be a misinterpretation that has introduced an error in that sub-module that was propagated in SIMAP and an early version of COSIM. In this paper we examine whether Rossi and Neff reported a slope factor (S) instead of a slope (b) in their original paper.

Dose-response models are used to represent the relationship between exposure (dose or concentration) and response among test organisms under controlled laboratory conditions. While there are different families of models and specialized desktop software that facilitates exploring the statistical goodness-of-fit of alternative models, the selection of a “final” model can also be influenced by historical practice. For example, two of the more commonly used models are the probit (for acute effects) and logit (for chronic effects). The probit and logit models both share the following properties: Sigmoidal shape with an inflection point at 50% response; can be bounded at 0% and 100% response; and can be fully defined by a slope parameter and a point on the curve (e.g., EC_{50,} dose that yields a response in 50% of exposed individuals).

## METHODS

Many dose-response curves share the same format in which the y-axis is given by the cumulative percentage of responses (e.g., deaths) and the x-axis is given by the log of the dose. A fundamental concept underlying the probit model is that the dose that yields a response (i.e., “effective dose”) varies by individual, and that the variability in effective doses can be described by a lognormal distribution. Therefore, if the parameters of the lognormal distribution are defined at any given dose, the probability of response for an individual can be represented, or similarly the frequency of responses for a group of individuals exposed to the same dose. By convention, the specific points on the distribution of effective doses are:

i. EC

_{50}- the dose that yields a response in 50% of the individuals, which is the inflection point of most dose-response models;ii. EC

_{16}- the dose that yields a response in 16% of the individuals, which is 1 standard deviation (SD) below the EC_{50}for data are normal or lognormal; and,iii. EC

_{84}- the dose that yields a response in 84% of the individuals, which is 1 SD above the EC_{50}for normal or lognormal data.

For a probit model, the doses are assumed to be lognormal, and the interval given by log EC_{16} to log EC_{84} encompasses 68% of the dose-response curve, equivalent to ± 1 SD of the mean (log EC_{50}). The y-axis is given by the probit, which is a transformation of the response variable. Specifically, a probit transformation is derived from the normal equivalent deviation (NED), which allows the proportion dying to be expressed in units of standard deviations from the mean of a standard normal distribution. At doses below the log EC_{50}, probit values would be negative; therefore, by convention a constant of +5 is typically added so that the probit values are nonnegative. Adding this constant changes the intercept of the best-fit line, but not the slope.

Probit analysis is performed using the log dose (or log concentration) versus probit and fitting a straight line with the following equation:

where ** p** is the proportion of individuals that respond, Φ

^{−1}is the inverse of the cumulative distribution function (CDF) of the standard normal distribution (If a variable X has a normal distribution N(μ,σ

^{2}) with mean μ and variance σ

^{2}, then the standardized variable has a normal distribution N(0,1) with mean 0 and variance 1), and

**and**

*a***are the intercept and slope parameters of the line, respectively. If log(D) is equal to log EC**

*b*_{50}, then

**= 50%, which is the midpoint of the standard normal distribution, and**

*p***Φ**

^{−1}(

**p**)= 0. Substituting these values into Equation 1 yields:

So, the EC_{50} can be derived from the antilog of the ratio of the intercept divided by the slope. Similarly, the intercept can be calculated from EC_{50} and ** b** (Equation 2):

The slope ** b** is equivalent to the reciprocal of the standard deviation of the effective doses, σ:

Substituting Equation 2 and 3 into Equation 1 yields an alternate expression for the probit model:

In addition to ** b** and σ, a third term called the

*slope function*,

**, has also been introduced in the literature on probit analysis. The term**

*S***is equivalent to the antilog of 1/b, or equivalently the antilog of σ:**

*S*The historical use of alternative and interrelated parameters (S, ** b**, and σ) when describing probit models has led to some confusion and potential misinterpretation of dose-response modeling results reported in the literature.

One original use of the slope function, **S**, can be traced back to a semigraphical method for estimating probit model parameters developed by (Litchfield and Wilcoxon, 1949). This reference is frequently cited and, prior to the ease of calculating probit and log-transformations with personal computers, was a common method for probit dose-response analysis. We contacted the authors regarding the original calculation and learned that computers were not available to Rossi and Neff for their 1978 publication, which launched us on a reconstructive analysis of the graphical options available at the time. With the semigraphical approach, data are graphed first and ** S** is calculated based on ratios of selected points on the best-fit line applied to a log-probit relationship:

Where: ED_{84}, ED_{50}, and ED_{16} are the doses in mg/L that correspond to 84%, 50%, and 16% mortality. The ratios that are in the numerator are individually considered to be estimates of ** S** and represent the difference in log doses that would yield one standard deviation away from the mean (log EC

_{50}), as discussed above. Note that applying a log transformation to the first ratio yields the following expression:

Expressed as the log-transformation, we see how b =1/** log S** and, therefore,

**equals the antilog of 1/b (Equation 5).**

*S*By including two ratios, both of which represent the log dose interval that corresponds with one probit unit, Equation 6 provides the “average” estimate of ** S**, which was considered by Litchfield and Wilcoxon to be more reliable than either individual estimate alone.

The log-logistic (logit) model is used to describe the behavior of organisms to repeated exposures. The logit model is written as:

Where: the parameters are analogous (but not equivalent) to the slope and intercept that define the probit model. In statistical terms, log [p/(1-p)] is referred to as the “log odds metameter” or “logit of p”. Equation 7 is a logistic regression equation in which logit of ** p** is a linear function of the logarithm of the dose. Unlike ordinary regression, logistic regression is used for predicting binary outcomes (Bernoulli trials) rather than continuous outcomes. Therefore, mathematically, the logit model is well suited to acute toxicity studies in which the endpoint is mortality.

The logit transformation is used to provide a scale that facilitates linear regression analysis. Assume the probability distribution for *Y _{i}* given a log dose yields an estimate of the number of “successes” (i.e., deaths) for dose group

*i*comprised of

*n*animals. We can state that the distribution of

*Y*is

_{i}*binomial*with parameters

*n*and

_{i}*p*, which we write as

_{i}*Y*~

_{i}*B*(

*n*,

_{i}*p*), where

_{i}*n*animals in dose group

*i*are receive dose log(D). One problem with this model is that

*p*is confined to the interval [0, 1], which does not allow for linear regression analysis. Transforming

_{i}*p*to the logit scale (Equation 7) allows the outcomes to be defined across the interval [−∞, +∞]. This now enables to define the logistic regression model by assuming that the logit of the probability

_{i}*p*, rather than the probability itself, follows a linear model. The inverse of the logit model allows to go back from logits to probabilities.

_{i}Both probit and logit models are a function of the logarithm of the dose, and on a semi-logarithm scale they yield a sigmoidal curve (Figure 1). A comparison between the two models at four different slopes is provided in Table 1 and Figure 2. The sum Toxic Unit (TU) is a measure of dose (exposure/EC50) for all components in a mixture (Di Toro et al. 2000; French-McCay 2002). By using a sum TU approach, the results in Table 1 are scaled to the EC_{50}, and are independent of the actual EC_{50} concentrations for specific substances or mixtures.

Table 1 illustrates how probit and logit models that share the same parameters will yield different responses. Differences between model predictions of response are most evident at the tails of the curves (Figure 1). Both models have the same inflection point at EC_{50}. Logit yields lower response for the low-end of the curve (R < 50%) and higher response at the high-end of the curve (R > 50%). Using an equivalent EC_{50}, as the slope increases the difference between the two models decreases (Figure 2). The greatest discrepancy between the models is observed at the lowest slope (b=0.5).

The NRDAM/CME code and derivative models use a logistic dose-response equation for the hill slope sub-module that calculates the percent mortality in the exposed population. The equation can be applied in the following form:

where ** R** is the predicted response (e.g., % mortality), C

_{mix}(μmol/L) is the sum of the molar concentrations of all constituents in the mixture, and LC

_{50mix}is the molar weighted average of the individual LC

_{50}for each of the constituents in the mixture, and σ is the standard deviation of the responses.

Through a series of transformations, Equation 8 can be converted to an equivalent logistic model expressed as a function of sum TU and a slope (** b** from Equation 7):

Substituting Equation 3 and dividing by C_{mix}^{b} yields:

Solving ratios equal to 1 yields:

The molar weighted average of the individual LC_{50} (i.e. LC_{50, mix}) is expressed with the following equation:

Where: C_{w,i} is the molar concentration of analyte ** i**, C

_{w,mix}is the sum of the molar concentrations of all chemicals in the mixture, and LC

_{50,i}is the molar concentration of the i

^{th}analyte that yields 50% mortality.

Then the ratio of C_{mix} and LC_{50, mix} in the denominator of Equation 10 can be expressed as:

So Equation 10 becomes:

Using the expression for sum, Equation 13 becomes:

The ‘b’ value is attributed to Rossi and Neff (1978), who applied a probit analysis to data from 96-hr toxicity studies of individual polycyclic aromatic hydrocarbons (PAHs), which are the most potent components of hydrocarbon mixtures, administered to immature young adult polychaetes (*Neanthes arenaceodentata*). Specifically, the probit parameter value ** S** reflects the single study with 1-methylphenanthrene that yielded both the lowest LC

_{50}and lowest

**. This and other results from Table 2 from Rossi and Neff (1978) are reproduced in Table 2 below. The last two columns for σ and**

*S***are added here to facilitate comparisons with**

*b***(they were not included in the original table).**

*S*While the title of Table 2 clearly implies that values presented in the table are ** S** (the slope function) the field name that is used is “slope” rather than “slope function”. This introduces an ambiguity and raises a question - did Rossi and Neff convert

**to**

*S***as discussed above (Equation 5), or did they simply report**

*b***and just abbreviate the name for the column header?**

*S*While the original dose-response data are not reported by Rossi and Neff (1978) that would allow for a recalculation of the parameter values, the presentation of confidence limits on the LC_{50} does provide an additional clue and suggests that the values in Table 2 are in fact ** S** and not

**. Litchfield and Wilcoxon (1949) provide the following equations for the upper and lower confidence interval.**

*b*Upper 95% CI = ED

_{50}x f_{ED50}Lower 95% CI = ED

_{50}/ f_{ED50}

Where:

And, N′ is the total number of animals tested in three dose groups (ED_{84}, ED_{50}, and ED_{16}). Rearranging Equation 15 to solve for N′ and the expressions for the confidence limits yields:

For example, if Rossi and Neff are presenting ** S** in Table 2, the estimate of the sample size for naphthalene using Equation 16 is:

Whereas, if Rossi and Neff are presenting ** b** in Table 2, the estimate of the sample size for naphthalene using Equation 16 is:

## RESULTs

The assumption that Rossi and Neff report ** S** in Table 2 would suggest that the sample size for the three dose groups is 219, or approximately 70 animals per dose group. This is probably an overestimate, but within a plausible range given the uncertainty introduced by the rounding error since the results in Table 2 are only reported to one decimal. By contrast, if we assume that Rossi and Neff report

**in Table 2, this would suggest an implausible study design with more than 1,000 animals per dose group. The complete set of estimated sample sizes for**

*b***and**

*S***are given in Table 3 below.**

*b*The more plausible estimates of total sample size are given by the slope function, ** S** . Finally, the following language is used by Litchfield and Wilcoxon (1948), which strongly suggests that Table 2 of Rossi and Neff is populated with

**rather than**

*S***:**

*b*Results from this study were originally cited by authors of the 1996 NRDAM model (French et al., 1997). From page I.4–11 of that document:

Neff and Anderson (1981), in a review of toxicity tests on oils and polycyclic aromatic hydrocarbons, report slope values ranging from 1.0 to 2.0, with a median value of about 1.2. Since a majority of spills involve petroleum products or constituent chemicals…and slope values for other chemicals are not available, 1/σ = 1.2 is assumed to apply to all oils and chemicals in the model database.

## DISCUSSION

The misinterpretation of the parameters reported by Rossi and Neff coupled with the application of the probit model slope to the logit model yields a final dose-response relationship in the NRDAM code that is uncertain and appears to be erroneous. The net effect of these substitutions on the hill slope is further complicated by the fact that the dose-response relationship is non-linear. Errors will vary depending on where the exposure dose falls on the dose-response curve. For this same reason, it is difficult to *a priori* rank the individual studies on the basis of a reported ** S** values alone. In cases where the hill equation is important for identifying a conservative threshold for PAH exposure, it would be appropriate to select a study that will yield a high response for an exposure dose that is below the ED

_{50}, or equivalently a sum TU less than 1.0 (Equation 14), we would focus on dose-response curves that have shallower slopes (i.e., lower values of

**) and, therefore, higher values of**

*b***.**

*S*It may be preferable to calculate a summary statistic of the set of values reported by Rossi and Neff, such as the geometric mean or median. French et al., 1997 adopted this approach and observed that the values range between 1.0 and 2.0 “, *with a median of about 1.2*”. Incidentally, the seven ** S** values are presented in Table 2; the median is 1.4 (not 1.2) and the geometric mean is 1.34. When we examine the equivalent logistic slopes from Neff and Rossi (1978) as the values that are applicable to the (logistic) NRDAM hill equation, the median value is 6.84 and the geometric mean is 8.91. (Table 2).

## CONCLUSIONS

The NRDAM report cites (Neff and Anderson, 1981), which in turn cites the original work by (Rossi and Neff, 1978) as summarized above. It is now clear that 1997 NRDAM report misinterpreted the article by Rossi and Neff. We conclude from this analysis that the authors of NRDAM assumed Rossi and Neff were reporting the slope, ** b** (or 1/σ), rather than the slope function,

**.**

*S*The difference in interpretation of the resulting hill equation in the NRDAM/CME family of models is striking (Figure 3). Application of the median logistic slope (** b** = 6.84) from the Rossi studies (Table 4) yields an exposure-response curve in which the 10 percent mortality occurs at 0.7 summed toxic units (70% of the LC50). In contrast, when the probit slope function (

**= 1.4) is substituted for the logistic slope b, the exposure-response curve predicts 10% mortality at 0.2 summed toxic units. The current NRDAM hill slope sub-module in which a slope factor predicts 10% mortality in a population of aquatic species that is exposed to a concentration equal to 20% of the LC50; however, when the correct logistic slope is used, no mortality is predicted at that same exposure. Errors are observed at exposures greater than the LC50 as well. The current NRDAM code predicts 90% mortality at exposures 5 times greater than the LC50, while the correct slope estimates the same level of mortality at approximately 1.4 times the LC50. We recommend the use of a central value from the Rossi and Neff (1978) data set, such as the median of 6.84 or the geometric mean o 8.91 for the logistic slope in the NRDAM hill slope sub-module. This practice would align the assessment method with the original studies used for its development and likely generate results that are more consistent with field observations at exposures less than the LC50 value for a spilled substance.**

*S*