Abstract
The Sierra Nevada red fox Vulpes vulpes necator is a native subspecies associated with subalpine regions in the Sierra Nevada and Cascade mountain ranges of California and Oregon. In the past century, the Sierra Nevada red fox experienced a major range contraction and decline in California. However, the number, size, and connectivity of populations extant in Oregon remain unclear. This knowledge gap impedes efficient monitoring and hinders development of a cohesive conservation strategy for the subspecies. The historical range is large and includes rugged terrain with low accessibility; therefore, a predictive model is needed to facilitate more comprehensive and systematic surveys in the future. We initiated a multiagency collaborative effort to survey portions of the range in the Oregon Cascades during 2011–2016 (verified genetic and photographic detections) and to assemble existing sighting reports dating back to 1985 (unverified), which we used to create Maxent models to predict the potential distribution of Sierra Nevada red fox within Oregon. To identify optimal levels of model complexity, we compared cross-validation accuracy of models that varied in levels of protection against overfitting (regularization). The highest-performing models utilized intermediate regularization, and included minimum January temperature and land-cover type. Regardless of regularization or data set (verified detections, all putative detections), all models agreed in predictions of a high-probability region covering approximately 3,470 km2 or 6% of the Cascade region, corresponding to the high-elevation portion of the crest. With the exception of a gap between Mount Hood and Mt. Jefferson, this core area of predicted presence was continuous along the north–south extent of the crest, suggesting a capacity for high connectivity among observed clusters of occurrence. Use of modeled potential distributions in future survey design will improve efficiency of field data collection, facilitating more precise evaluations of the distribution, abundance, and genetic integrity and connectivity of Sierra Nevada red fox in Oregon.
Introduction
The Sierra Nevada red fox Vulpes vulpes necator is one of three montane red fox subspecies endemic to the mountain ranges of western North America. During the Last Glacial Maximum (18,000–20,000 y ago) the ancestral montane lineage was widespread throughout the western United States, only becoming isolated in disjunct mountain ranges as glaciers retreated during the recent Holocene (Aubry et al. 2009). The most significant distributional gaps arising from this fragmentation led to recognition of three contemporary montane subspecies (Merriam 1900; Sacks et al. 2010): the Cascade red fox V. v. cascadensis, restricted to the Washington portion of the Cascades; the Rocky Mountain red fox V. v. macroura, occupying the largest of the montane subspecies' ranges; and the Sierra Nevada red fox, which historically occurred throughout the high-elevation areas of the Pacific Crest mountain ranges in California and Oregon. All three subspecies have declined in the past century, but none more precipitously than the Sierra Nevada subspecies (Perrine et al. 2010; Sacks et al. 2010). As a consequence, the distribution of the Sierra Nevada red fox has become increasingly fragmented over the past century.
At present, California has only two known populations of Sierra Nevada red fox, with a total abundance estimate of <50 individuals (Perrine et al. 2010; Quinn and Sacks 2014); whereas in Oregon, the status and distribution is poorly understood (Hiller et al. 2015; U.S. Fish and Wildlife Service 2015a). In 2015, the U.S. Fish and Wildlife Service determined that the remnant population in the Sierra Nevada of California warranted listing as a distinct population segment under the U.S. Endangered Species Act (ESA 1973, as amended), although listing was precluded by higher priorities. However, for the populations in northern California and Oregon, which were grouped together into a single Southern Cascade distinct population segment, the U.S. Fish and Wildlife Service declined to recommend listing (U.S. Fish and Wildlife Service 2015b). This decision not to recommend listing for the Southern Cascade distinct population segment was based largely on the absence of information regarding the number, size, and connectivity of Sierra Nevada red fox populations in Oregon.
At the outset of our study, knowledge of Sierra Nevada red fox distribution in Oregon was primarily based on descriptive, historical accounts. The greatest challenge to developing a more precise model of its distribution and abundance in Oregon for conservation planning has been the sheer paucity of field data. Surveys for Sierra Nevada red fox are challenging because of several factors common to rare carnivores: few individuals spread over large spatial extents, imperfect and heterogeneous detection probabilities, uneven access to remote terrain, and budgetary constraints (Gu and Swihart 2004; Harmsen et al. 2011; Gerber et al. 2014). Annual trapping records are available for a broader timescale (1946–present) but have been recorded at the county level in Oregon, precluding habitat-specific mapping. As a result, distributional knowledge has been limited to anecdotal accounts of Sierra Nevada red fox in association with park-like woodlands and meadows that characterize the subalpine and alpine regions (elevation >1,300 m) of the Cascade Range (Bailey 1936; Ingles 1965; Aubry 1983; Verts and Carraway 1998).
A predictive modeling approach that objectively stratified the landscape in terms of relative probabilities of occurrence would help to improve future survey efficiency for Sierra Nevada red fox in Oregon (e.g., Guisan et al. 2006), promoting collection of larger sample sizes of genetic and demographic data for more direct connectivity analyses and hypothesis testing. Such approaches have been used to design surveys for the Sierra Nevada red fox in California (Cleve et al. 2011), the Cascade red fox in Washington (Akins 2017), as well as the low-elevation Sacramento Valley red fox V. v. patwin (Sacks et al. 2017). Ultimately, the development of an accurate predictive model depends on the availability of reliable and representative occurrence data. The ideal source of such data is occupancy surveys consisting of repeated visits to randomly selected locations throughout the possible range that document absence as well as presence data (MacKenzie et al. 2002; MacKenzie and Royle 2005). In the absence of systematic presence–absence data, however, presence-only data can be used as a cost-effective alternative or preliminary means of producing hypothetical distribution maps, which can later be tested and refined in more localized, smaller scale occupancy studies (e.g., Peterman et al. 2013). As with presence–absence modeling, the reliability of presence-only distribution models depends on accurate and representative data, as well as on inclusion of sufficient numbers of influential environmental covariates so as to capture ecological relationships, but not so many variables as to overfit models (Warren and Seifert 2011; Merow et al. 2013; Yackulic et al. 2013; Radosavljevic and Anderson 2014).
In the present study, our objective was to generate a predictive map of the potential distribution of Sierra Nevada red fox throughout the Cascades, to facilitate monitoring and future data collection. We collaborated with a large group of wildlife biologists for several years to obtain photographic and genetic samples from wherever we could throughout the Oregon Cascades. Although some of these data came from small-scale occupancy (i.e., presence–absence) surveys, the vast majority of our data were presence-only, necessitating a presence-only modeling approach, for which we used Maxent. We also obtained putative (unverified) sighting records that enabled us to broaden the geographic scope of our data set, albeit with the likely cost of reduced species accuracy. To assess robustness of models to potential hidden geographic bias, false-positive reports, and under- or overfitting, we employed multiple partitions of the data over a range of regularization parameterizations.
A caveat of our study was that although our objective was to model the distribution of the Sierra Nevada red fox subspecies in Oregon, we did so including any red fox detected within the Cascade study area. Although the genetic integrity of Sierra Nevada red fox is an important consideration in conservation planning, thorough investigation of the extent and impacts of introgression constitutes a major study in its own right, well beyond the scope of this one. Additionally, our methodological decision to treat all red fox detections as equivalent was both necessary to incorporate photographic detections, as well as appropriate for our objective of developing a tool to guide genetic surveys for the Sierra Nevada subspecies. Genetic surveys include genetic characterization of introgression and, therefore, should encompass all red foxes occurring in the putative range of the Sierra Nevada subspecies. Nevertheless, we report haplotypes of the genetic samples collected during this study and consider their consistency with Sierra Nevada red foxes.
Study Area
Our study area comprised the West and East Cascade ecoregions of Oregon, with the majority of survey effort concentrated along the crest of the range (56,600 km2; Oregon Department of Fish and Wildlife 2016; Figure 1). The West Cascade ecoregion encompasses the steep, densely forested ridges and valleys of the western flank as well as the high-elevation peaks of the volcanic crest. This ecoregion experiences abundant precipitation from the maritime climate, with annual totals averaging approximately 130 cm, increasing upslope to >300 cm where the majority of precipitation falls as snow between November and March (Parameter-elevation Regressions on Independent Slopes Model; PRISM; Daly et al. 1994). The western flank is dominated by dense coniferous forests of Douglas fir Pseudotsuga menziesii and western hemlock Tsuga heterophylla, grading at higher elevations to forests of mountain hemlock T. mertensiana, noble fir Abies procera, and Pacific silver fir A. amabilis. The crest of the Cascades is defined by prominent volcanic peaks (2660–3426 m) that rise >1 km above the surrounding highlands (1500–2000 m). These conical peaks of bare rock and permanent snowfields are typically ringed around their bases by open parklands of alpine meadows, dwarf shrubs, and sparse stands of mountain hemlock, subalpine fir A. lasiocarpa, and whitebark pine Pinus albicaulis. In contrast to the West Cascade ecoregion, rainshadows cause the East Cascade ecoregion to receive lower amounts of annual mean precipitation (∼110 cm) and greater extremes in temperature (average minimum to maximum annual temperature range: −6.5–25.3°C; PRISM). Forests on the east side are primarily composed of lodgepole Pinus contorta and ponderosa P. ponderosa pines that are adapted to a drier climate. Volcanic cones and buttes are scattered across the eastern slope, but are more isolated and lower in elevation (1,800–2,100 m) relative to the high peaks of the Cascade Crest.
Study area in the Oregon Cascades, USA, and historical ranges of Sierra Nevada Vulpes vulpes necator and Rocky Mountain Vulpes vulpes macroura red fox subspecies adapted from Bailey (1936).
Study area in the Oregon Cascades, USA, and historical ranges of Sierra Nevada Vulpes vulpes necator and Rocky Mountain Vulpes vulpes macroura red fox subspecies adapted from Bailey (1936).
Methods
We modeled Sierra Nevada red fox distribution in Maxent version 3.2.1 (Phillips et al. 2006; Phillips and Dudík 2008), which we chose for its flexibility in use of presence-only data and robustness to small sample sizes (Hernandez et al. 2006; Wisz et al. 2008). As with other approaches to modeling species distributions, Maxent uses statistical associations between occurrence localities and environmental correlates to project regions in environmental space similar to detection sites. Additionally, the maximum entropy framework dictates that the predicted distribution be as spatially diffuse as possible while meeting the constraint that the environmental properties of the prediction match those of the occurrence data (Elith et al. 2011; Merow et al. 2013).
Field collection and DNA verification
In collaboration with biologists from multiple state and federal agencies and nongovernmental organizations, we collected red fox observations during 2011–2016 on the Deschutes, Mount Hood, Umpqua, and Willamette national forests, and Crater Lake National Park (Figure 1). Detections included digital images from ad hoc and systematic camera surveys, and genetic samples from scat searches, baited hair snares, and a small number of vehicle-struck red foxes near roads. Only a fraction of detections (see Results) originated from remote camera surveys (e.g., Hiller et al. 2015), and these surveys were conducted at local scales using differing protocols; therefore, the quantity and quality of presence–absence data were insufficient for modeling range-wide distribution in an occupancy framework. In addition to recent field detections, we incorporated opportunistic sightings reported to the U.S. Forest Service Northwest Research Station during 1985–2013.
For genetic samples, we conducted deoxyribonucleic acid (DNA) extraction, polymerase chain reaction (PCR) amplification, and sequencing at the Mammalian Ecology and Conservation Unit of the Veterinary Genetics Laboratory at the University of California, Davis. Upon collection, we stored fecal samples in 95% ethanol, and hair and tissue in desiccant. To prevent and detect contamination, we extracted DNA from feces and hair with negative controls on a bench dedicated to low-quantity DNA. We extracted DNA from feces (n = 362) using QIAamp® Stool Kit, and from hair (n = 24) and tissue (n = 3) using DNeasy Blood and Tissue Kits (Qiagen Inc, Valencia, CA). For fecal and tissue samples, we followed manufacturer's instructions for the corresponding kit, except that we eluted fecal DNA samples in 50 μL of buffer. For hair samples, we digested a 0.5–1.0-cm length of hair, including the follicle, from 1 to 20 hairs to 300 μL Buffer ATL, 20 μL proteinase K, and 20 μL 1M DTT; subsequent steps followed manufacturer's protocol.
To confirm that noninvasive samples were from red fox, we sequenced a conserved portion of the mammalian mitochondrial cytochrome b (cyt b) gene, which readily distinguished red fox from other sympatric mesocarnivores (e.g., Miles et al. 2015). We sequenced samples using previously published primers and laboratory protocols (RF14724, 5′-CAACTATAAGAACATTAATGACC-3′; RF15149, 5′-CTCAGAATGATATTTGTCCTC-3′; Perrine et al. 2007). Briefly, we used both primers for PCR amplification, after which we sequenced the product from primer RF14724 using BigDye Terminator v3.1 (Applied Biosystems, Foster City, CA), and reagents and manufacturer's instructions for cycling conditions (Perrine et al. 2007). We trimmed all sequences to 354 bp to obtain sequences directly comparable to published V. vulpes haplotypes world-wide. We then used the Basic Local Alignment Search Tool (Altschul et al. 1990) to search for sequences homologous to red fox in Genbank (98% identity). Finally, we matched each verified red fox sequence to red fox haplotypes previously accessioned in Genbank (100% identity).
Data partitions
The various types of occurrence records likely entailed different biases. In particular, restricting the data set to the most reliable presence data (digital images, genetic detections) entailed greater spatial bias because surveys collectively reflected a relatively small portion of the available landscape. Conversely, sightings, which represented opportunistic sampling of a comparably large fraction of the landscape, were more vulnerable to false positives (Aubry et al. 2017). We therefore built models based on two data sets and considered the extent of model agreement to be an indicator of model robustness and, to the extent that they disagreed, considered each to reflect an opposite end of a range of potential distributions. The occurrence data set of verified detections included recently collected (2011–2016) digital images and genetic samples, whereas the inclusive data set was composed of these records plus additional records of unverified sightings available from the Pacific Northwest database and spanning a wider interval of time (1985–2013). To reduce pseudoreplication owing to spatially nonrandom sampling, we filtered occurrence records to retain only one record in a 4-km grid cell (Boria et al. 2014), a resolution that approximates the home range size of montane red foxes (Perrine et al. 2010; Quinn and Sacks 2014) and provided a reasonable balance between spatial independence and sample size.
Spatial extent and background points
Maxent models are sensitive to the spatial extent of available habitat used for model training (Lobo et al. 2008; Elith et al. 2011). To limit model training to areas more likely to have been surveyed, we restricted background extent to the vicinity of occurrence localities. Specifically, for each of the data sets (i.e., verified, inclusive) we buffered occurrence points using a 20-km radius and randomly sampled 10,000 points within the cumulative buffered area for use as background points in model development. The distribution of genetic samples from nontarget species (not used in modeling) provided an approximate sense of the survey area beyond our red fox detections, which corresponded well to the 20-km buffer distance (Figure 2). We then projected distribution models to the entirety of both East and West Oregon Cascades ecoregions (hereafter, the “Cascades ecoregion”).
Locations of verified red fox Vulpes vulpes detections in the East and West Cascade ecoregions prior to spatial filtering (genetic n = 72; digital images n = 52), Oregon, USA, during 2011–2016. Genetic samples that failed to amplify or were identified as nontarget species (n = 232) are shown as an index of survey effort for genetic scat searches.
Locations of verified red fox Vulpes vulpes detections in the East and West Cascade ecoregions prior to spatial filtering (genetic n = 72; digital images n = 52), Oregon, USA, during 2011–2016. Genetic samples that failed to amplify or were identified as nontarget species (n = 232) are shown as an index of survey effort for genetic scat searches.
Environmental variables
We initially selected 29 environmental layers to characterize aspects of climate, topography, and land cover across the available landscape (Table S1, Supplemental Material). We referenced all raster layers using the WGS84 geographic coordinate system and resampled them to a resolution of 30 arc-seconds (∼1 km). We extracted 14 bioclimatic variables from the WorldClim database (Hijmans et al. 2005) derived from temperature and precipitation. We substituted 5 WorldClim values (BIO1, BIO5, BIO6, BIO12, BIO 19) for corresponding climate data from PRISM. PRISM is another climate model that uses point data and digital elevation models to generate gridded estimates, and has performed well in mountainous terrain (Daly et al. 1994). Specifically, we used annual precipitation totals and temperature extremes, as well as monthly estimates for winter (December–February) and summer (June–August) seasons, calculated from conditions averaged over the most recent three full decades (30-y normals). We derived three topographic layers from the National Elevation Dataset (U.S. Geological Survey 2009). We used the terrain function in Program R package raster (Hijmans 2016) to calculate slope, roughness (the difference between the maximum and minimum elevation value of a cell and its surrounding neighbors), and aspect. We transformed aspect to a continuous variable between zero and one by taking the absolute value of degrees after normalization (McCune et al. 2002; Kalle et al. 2013). This transformed aspect depicts incident radiation, but can be interpreted as “southness” because the value approaches zero at northerly aspects and one at southerly aspects. We used two variable sets to characterize land cover and vertical vegetation structure. We obtained land-cover data from Northwest ReGAP (2010), a national project that uses 30-m Landsat satellite data to map ecological vegetation classes (http://gapanalysis.usgs.gov). We adjusted existing land-cover categories to consolidate agriculture and development into a single land-disturbance category, and grouped Siskiyou Mixed Conifer Forests and Woodlands with Mixed Conifer Forests (Table S2, Supplemental Material). Additionally, we used Gradient Nearest Neighbor (GNN) structure maps obtained from the Landscape Ecology Mapping, Modeling, and Analysis group (LEMMA 2009) to obtain a continuous percent canopy cover layer that characterized vegetation density at a finer scale within forest types.
After assembling and resampling all layers to a resolution of 30 arc-seconds (∼0.83 km2), we reduced the full suite of variables to the least correlated subset to support simpler, more interpretable models (Dormann et al. 2013). Using the vif.cor function in Program R package usdm (Naimi 2015), we identified the pair of variables with the highest absolute linear correlation, and excluded the variable with the highest variance inflation factor, which is an index of collinearity derived from regressing the predictor variable against all other variables. We iterated this process until all remaining variables had a pairwise correlation below an absolute threshold of 0.7 (Table S3, Supplemental Material), a level demonstrated to sufficiently detect detrimental levels of collinearity in simulated data sets (Dormann et al. 2013). After pruning for collinearity, we conducted no further selection of variables prior to modeling, but allowed the internal regularization function within the Maxent algorithm to select the most informative variable features.
Finally, as a heuristic check on modeled relationships, we investigated the relationships between the raw occurrence data and range of environmental conditions available on the study area. We created selection indices by log-transforming the ratio of observed to expected proportion of occurrence records (+1 to avoid division by zero), and subtracting log(2) so that positive and negative values corresponded to greater and less use relative to expectations, respectively (e.g., Neale and Sacks 2001). We visually inspected selection indices for evidence of strong relationships (e.g., linear, quadratic) between fox occurrence and environmental conditions, and compared these relationships to response curves modeled in Maxent. We also compared the identity of the variables with the strongest relationships to their relative importance to model predictions. To assess variable importance to model predictions, we considered percent contribution, a path-dependent proxy for variable importance calculated during the fitting process in Maxent, and changes in training gain calculated by jackknifing the predictor variables. For jackknife tests, high gain scores calculated for models built from a single variable indicated that a variable sufficiently captured a majority of the distributional signal, and a large reduction in gain when a variable was excluded indicated that it possessed unique information not contained in other variables (Phillips 2005).
Modeling
Maxent selects variable features and protects against overfitting internally through regularization (see Text S1, Supplemental Material, for details). To identify optimal levels of model complexity, we built models in the R package ENMeval (Muscarella et al. 2014) for each data set using a range of values for the regularization multiplier (β = 0.5–6 in increments of 0.5), and considered only linear, quadratic, and product feature classes. We selected the optimal regularization setting based on the ability of the model to predict data withheld from model training. Rather than randomly withholding data points, we manually partitioned presence and background points into spatially independent folds to minimize the spatial correlation between training and testing points. Points close in geographic space are typically close in environmental space, and as a result random partitioning of clustered data tends to inflate estimates of model performance (Hijmans 2012; Radosavljevic and Anderson 2014). We thus used latitude (decimal degrees) to split points into three bins (i.e., spatial folds) of equal sample size (latitudes for inclusive data set: Bin A, <43.150; Bin B, 43.150–43.987; Bin C, >43.987; latitudes for verified data set: Bin A, <43.600; Bin B, 43.600–44.194; Bin C, >44.194). We trained each modeling iteration using two of three partitions and withheld the third partition for model evaluation.
We assessed relative overfitting using the average area under the receiver operating characteristic curve (AUC) values. The AUC indicates how well the model correctly ranks test presence points higher than background points (Phillips et al. 2006), and serves as a threshold-independent indicator of the ability of a model to discriminate between low and high probabilities of occurrence. With presence-only data, AUC treats background points as absences, which can create misleading evaluations when used as an absolute indication of model quality for taxa, like the Sierra Nevada red fox, with low prevalence and low sampling effort (Lobo et al. 2008; Peterson et al. 2008; Bean et al. 2012; Merow et al. 2013). Here, we used AUC to rank models built from the same data sets with differing levels of complexity. As the penalty against complexity increased toward its optimum, we expected AUC calculated using training data (AUCTRAIN) to decline and AUC calculated using spatially independent test data (AUCTEST) to increase, until eventually the difference between the training and test AUC values (AUCDIFF) no longer exhibited meaningful changes. We thus selected optimal regularization settings by maximizing gains in average AUCTEST (predictive power) and minimizing the difference in AUCDIFF (overfitting). To allow comparison between spatial folds, we trained models using the spatially binned background points described above, but calculated AUC values using the full set of background points.
We applied optimal regularization settings to the final models, which we then parameterized using the entire data set. We transformed raw predictions using the logistic transformation for visualization purposes and comparison among models, but treated these values as ordinal indices of the probability of occurrence as opposed to true probabilities of occurrence (Yackulic et al. 2013). Finally, we used two levels of model thresholds for the purposes of qualitatively comparing models and stratifying the landscape for future surveys and initial management and conservation efforts. Thresholding decisions always entail some degree of arbitrariness, and some of the most commonly used approaches (e.g., lowest presence threshold) are sensitive to small sample size (Loiselle et al. 2003; Bean et al. 2012). We determined thresholds for the most restrictive model using an approach that attempts to balance sensitivity and the total expanse of predicted area (Engler et al. 2004), and then applied these criteria uniformly to the other models to facilitate comparison. Specifically, we chose threshold values that maximized the difference between the proportion of presence sites versus the proportion of the study area where we predicted presence (i.e., maximizing the discriminatory power of the model).
Results
Presence data
In total, we obtained 124 recent (2011–2016) red fox detections, including 52 digital images and 72 genetically verified scat samples (Figure 2; Data S1, Supplemental Material). All genetically detected red foxes carried previously observed (e.g., Sacks et al. 2010) cytochrome b haplotypes. Most of these were haplotype “A,” which indicates Sierra Nevada red fox or Rocky Mountain red fox matrilines; we also found haplotype “G,” a fur-farm haplotype, on Mount Hood, indicative of nonnative introgression in that locality (Sacks et al. 2010). Additionally, we obtained 45 records of unverified sightings available from the Pacific Northwest database and spanning a wider interval of time (1985–2013). The filtered data set resulted in 33 occurrences for the high-reliability data set consisting of only verified detections (mean distance between nearest neighboring points = 4.5 km, range = 0.7–17.6 km) and 62 occurrences for the high-inclusivity data set that incorporated sightings (mean distance between nearest neighboring points = 5.0 km, range = 0.3–28.8 km).
Environmental variables and associations
After filtering correlated variables based on variance inflation factor, we employed the following nine variables in our models: minimum January temperature, total December precipitation, precipitation seasonality (coefficient of variation of precipitation), temperature seasonality (standard deviation of temperature), isothermality (ratio of mean diurnal to mean annual temperature range × 100), aspect (transformed), roughness, percentage of canopy cover, and land-cover type. Based on either the spatially filtered subset of verified or inclusive data set of detections, red fox occurrence was nonrandomly associated with land-cover type and minimum January temperature (Figure 3; Figure S1, Supplemental Material). Montane meadows, silver fir and mountain hemlock forests, subalpine woodlands, alpine habitats, and lava all had more observed red fox detections than expected based on the composition of available habitat. In contrast, although disturbed land-cover types, mixed conifer forest, and ponderosa pine together accounted for 40% of red fox presence points, they were observed less frequently in these land-cover types than expected. Red foxes occurred more frequently than expected in locations with intermediate minimum January temperatures, specifically in the range of −7.5 to −4.5°C. No strong selection trends were apparent relative to other environmental gradients.
Comparisons of observed and modeled occurrence of red foxes Vulpes vulpes relative to environmental covariates in the Oregon Cascades, USA, for 33 spatially independent verified records (2011–2016). Selection (gray bars, right ordinal axis) indicates observed use relative to expectations based on composition of the available background. The relative probability of presence (red points, left ordinal access) indicates the modeled influence of a given land-cover type on the relative probability of presence when all other variables are held at their respective average values (shown for optimal regularization settings).
Comparisons of observed and modeled occurrence of red foxes Vulpes vulpes relative to environmental covariates in the Oregon Cascades, USA, for 33 spatially independent verified records (2011–2016). Selection (gray bars, right ordinal axis) indicates observed use relative to expectations based on composition of the available background. The relative probability of presence (red points, left ordinal access) indicates the modeled influence of a given land-cover type on the relative probability of presence when all other variables are held at their respective average values (shown for optimal regularization settings).
Maxent models
Land-cover type and minimum January temperature were the most informative predictors in all Maxent models according to both jackknife tests and path-dependent contribution scores, together accounting for 70–90% contribution across final models (Figure 4; Figure S2, Supplemental Material). In all Maxent iterations, montane meadows had large positive responses, whereas ponderosa forest consistently negatively influenced the probability of red fox occurrence (Figure 3; Figure S1, Supplemental Material). Higher minimum January temperatures exhibited a strong negative influence on the probability of occurrence in all models (Figure 3; Figure S1, Supplemental Material). Other environmental variables had weak interactive effects, particularly in models built from the inclusive data set, but contained little unique information (Figures 3, 4; Figures S1, S2, Supplemental Material).
Importance of environmental predictor variables to Maxent distribution modeling for red foxes Vulpes vulpes in the Oregon Cascades, USA, using optimal regularization settings for 33 spatially independent verified records (2011–2016). Variable importance is estimated by percent contribution (right ordinate axis), training gain (left ordinate axis) when a variable is removed from the full model, and training gain when a variable is modeled on its own.
Importance of environmental predictor variables to Maxent distribution modeling for red foxes Vulpes vulpes in the Oregon Cascades, USA, using optimal regularization settings for 33 spatially independent verified records (2011–2016). Variable importance is estimated by percent contribution (right ordinate axis), training gain (left ordinate axis) when a variable is removed from the full model, and training gain when a variable is modeled on its own.
In both data sets, intermediate regularization settings improved predictions of spatially independent data (Table 1; Figure 5; Figure S3, Supplemental Material). We selected optimal regularization values at the point where increasing regularization no longer exhibited significant rates of change in AUCTEST and AUCDIFF (β = 3.0 for verified; β = 2.5 for inclusive). Optimizing regularization effectively reduced the number of parameters from 15 to 7 in the verified data set, and 19 to 11 in the inclusive data set. Across all regularization settings, the verified data set showed higher variability of performance among spatial folds, as well as larger AUCDIFF values, suggesting an overall greater statistical vulnerability to overfitting relative to the inclusive data set. Visual inspection of model projections over a range of regularization settings consistently revealed a narrow longitudinal strip along the Cascade crest as the region with the highest relative probabilities of occurrence (Figure 6; Figure S4, Supplemental Material). In contrast, occurrence probabilities for the eastern slope of the Cascades were highly sensitive to regularization, but, in general, rarely exceeded intermediate probability values. Predictions developed from the full inclusive data set were broadly similar to those from the verified data set, although use of sighting data marginally extended the longitudinal width of the high-probability core region.
Average area under the receiver operating characteristic curve (AUC) values estimated from spatial cross-validation of Maxent predictions for locations of red foxes Vulpes vulpes in the Oregon Cascades, USA. Performance based on the full or training data (AUCFULL, AUCTRAIN) was greatest for models using default regularization (β), but models using increased regularization were better able to predict spatially independent test data (AUCTEST), and showed a smaller difference between test and training performance (AUCDIFF). Occurrence data sets used in modeling included 33 spatially independent verified records (2011–2016) and 62 inclusive records that additionally incorporated unverified sightings (1985–2016).

Average area under the receiver operating characteristic curve (AUC) of spatially binned training (solid line) and test (dotted line) data across a range of regularization values (β) for Maxent distribution models of red fox Vulpes vulpes in the Oregon Cascades, USA. The AUC values were calculated from models developed from a data set of 33 spatially independent verified records (2011–2016). We chose optimal values (*) where gains in AUCTEST plateaued and the difference between test and training scores were minimal.
Average area under the receiver operating characteristic curve (AUC) of spatially binned training (solid line) and test (dotted line) data across a range of regularization values (β) for Maxent distribution models of red fox Vulpes vulpes in the Oregon Cascades, USA. The AUC values were calculated from models developed from a data set of 33 spatially independent verified records (2011–2016). We chose optimal values (*) where gains in AUCTEST plateaued and the difference between test and training scores were minimal.
Relative probabilities of occurrence for red fox Vulpes vulpes in the Oregon Cascades, USA, based on 33 spatially independent verified records (•; 2011–2016). Predictive surfaces are shown for Maxent models that used (A) default regularization (β = 1) and (B) optimal regularization (β = 3).
Relative probabilities of occurrence for red fox Vulpes vulpes in the Oregon Cascades, USA, based on 33 spatially independent verified records (•; 2011–2016). Predictive surfaces are shown for Maxent models that used (A) default regularization (β = 1) and (B) optimal regularization (β = 3).
We chose threshold probability values from the verified data set with default regularization to dichotomize the model because it predicted the most restricted distribution of the final four models. We identified two clear maxima in the difference between the proportion of study area and the proportion of verified occurrence locations included in the predicted area (Figure S5, Supplemental Material). The lower, more permissive threshold (0.27) predicted 12% or 6,914 km2 of the Cascades ecoregion (35% of buffered background points) as probable presence and correctly assigned 91% of verified red fox points (Table 2; Figure 7; Figure S6, Supplemental Material). The more restrictive probability threshold (0.50) assigned 4% or 2,439 km2 of the Cascade ecoregion (16% of buffered background points) as probable presence and resulted in 73% of the verified occurrence locations correctly predicted. When we applied the more restrictive threshold across all models, we predicted on average 6% (3,470 km2; SD = 846 km2) of the Cascades ecoregion as high probability of presence.
Percentages of red fox Vulpes vulpes occurrence records and the area of the Cascades ecoregion, Oregon, USA, predicted as presence when threshold decisions were applied to Maxent distribution models. Models varied by regularization setting (β) and occurrence records, including a verified-only data set (n = 33 digital images and genetic samples; 2011–2016) and an inclusive data set (n = 62 digital images, genetic samples, and unverified sightings; 1985–2016).

Predicted presence of red fox Vulpes vulpes in the Oregon Cascades, USA, after applying a restrictive threshold (>0.50; dark gray) and relaxed threshold (>0.27; light gray) to continuous probabilities generated by Maxent based on 33 spatially independent verified records (•; 2011–2016). Predictive presence is shown for models that used (A) default regularization (β = 1) and (B) optimal regularization (β = 3).
Predicted presence of red fox Vulpes vulpes in the Oregon Cascades, USA, after applying a restrictive threshold (>0.50; dark gray) and relaxed threshold (>0.27; light gray) to continuous probabilities generated by Maxent based on 33 spatially independent verified records (•; 2011–2016). Predictive presence is shown for models that used (A) default regularization (β = 1) and (B) optimal regularization (β = 3).
Discussion
This study marks a significant step in Sierra Nevada red fox conservation. Previously, the only indications of their historical and contemporary range in Oregon were scattered specimen records or reports and a general understanding that they primarily occurred in the subalpine zone. The distribution models developed in this study explicitly reflect habitat variables, providing clear predictions that can be tested and refined through subsequent surveys. Spatially explicit predictions also facilitate more efficient survey efforts and aid in preliminary aspects of conservation planning. Going forward, we suggest the narrow band of high-probability presence corresponding to the Cascade crest that was consistently predicted by our models serve as a working model of the potential core distribution of Sierra Nevada red fox in Oregon. Below, we elaborate on the sensitivities of our modeling to various assumptions and discuss the most robust conclusions about potential and realized distributions.
Taxonomic assumption
The decision to develop models based on red fox detections of unverified subspecies was unavoidable and warrants some discussion. As mentioned above, photographs and the evolutionarily conserved cytochrome b sequences did not provide sufficient information to diagnose subspecies. Although we detected nonnative haplotypes localized to Mount Hood, their occurrence only indicated introgression of nonnative alleles, and not necessarily that the red foxes carrying those haplotypes were primarily of nonnative ancestry. A companion genetic study, including nuclear markers (currently in progress) will be necessary to differentiate Sierra Nevada from Rocky Mountain red foxes and to assess the degree of nonnative ancestry on Mount Hood. Nevertheless, for the purposes of the present study—to produce a model to guide surveys for red foxes in the historical Sierra Nevada red fox range—the problem of red fox ancestry and genetic integrity is tangential. Future surveys will need to target genetic material from all red foxes, regardless of ancestry, within the historical range of the Sierra Nevada red fox.
Technical aspects of modeling
Understanding the assumptions and potential vulnerabilities of our modeling process is important to future survey design and model refinement. The principle challenge we confronted in distribution modeling was that records were clustered in a few geographic locations along the north–south axis of the range. Such geographic clustering can pose problems for presence-only modeling, particularly when caused by uneven sampling. One consequence of uneven sampling is an elevated risk of overfitting as a result of pseudoreplication (Phillips et al. 2006; Anderson and Gonzalez 2011). On the other hand, if the clustering of records reflected the true distribution of red fox populations in the Oregon Cascades today (i.e., a fragmented distribution), it would not equate to pseudoreplication, and a prediction inferred from clustered data may effectively represent the potential distribution today. That is, if populations have declined since historical times and contracted to regions of the highest environmental suitability, a model inferred from these locations would be expected to illuminate the best, next-best-, etc., habitat potentially accommodating a recovering population (Bean et al. 2012). In the present case, we know sampling was uneven and therefore could have contributed to overfitting of the models. However, as evidenced by the greater dispersion of nontarget samples detected in our surveys, some of this clustering also likely reflected a truly fragmented distribution of red foxes.
We could not be certain of the extent to which uneven sampling influenced our predictions, so we developed several models that likely bracketed the potential distribution of Sierra Nevada red fox in the Oregon Cascades (i.e., one of which could have been too restrictive, others, too expansive). We accomplished this in two ways. In our first approach, we used regularization to vary how faithfully the environmental space of predicted presence reflected the environmental space of clustered locations. By increasing the level of protection against overfitting (regularization), our modeling framework supported models with fewer parameters and greater generality, presumably better able to predict spatially independent occurrence. In a second effort to promote a more permissive distribution, we widened the temporal span of occurrence data to include the past three decades, during the earlier part of which red foxes may have been more widespread across the landscape. By including older anecdotal data, we expanded the geographic extent of occurrence records to include locations in the lower elevation mountains south of Crater Lake, and regions farther east of the crest. However, this approach necessitated relaxing standards of data quality to include unverified sightings, introducing potential contamination of the data set by false reports (Aubry et al. 2017). The results of these modeling decisions were four predictive surfaces that collectively agreed in projections of a longitudinal band encompassing the crest, but varied substantively in predictions over the eastern portion of the Cascades. We attributed high and low levels of confidence to these regions accordingly, discussed in detail below.
Implications for Present and Potential Distributions and Future Surveys
The high-elevation zone along the Cascade crest (i.e., “core” potential distribution) exhibited high probabilities of presence according to all models and was therefore our most robust inference about the potential distribution of Sierra Nevada red fox. One explanation for the stability of the core band of high-probability across models was the dominance of land-cover type and minimum winter temperatures in all model variations. Both variables were correlated with demarcation of the subalpine–alpine zones and therefore represent logical predictor variables given previously described habitat associations of Sierra Nevada red fox (Bailey 1936; Grinnell et al. 1937; Aubry 1983; Perrine 2005). Previous distribution models for montane red fox populations in California and Washington also have attributed the greatest influence to climatic variables reflecting extreme winter conditions characteristic of alpine regions, including minimum winter temperatures, winter precipitation, and spring snow cover (Cleve et al. 2011; Akins 2017). In addition to climate, both our unmodeled selection indices and our model coefficients supported an affinity of Sierra Nevada red fox to open cover-types typical of higher elevations, particularly montane meadows and subalpine woodlands, and a negative association with densely canopied forests common to the midelevation western slope. The core region of high probability occurrence was thus internally consistent irrespective of modeling choices, as well as consistent with available information about montane populations in the historical Oregon Cascades and contemporary western mountain ranges.
In contrast to the Cascade crest, the suitability of the east Cascades was unclear. On one hand, the large portions of the east Cascades over which the more permissive models predicted presence with moderate probability did indeed appear to support Sierra Nevada red foxes historically (Bailey 1936; Aubry 1983). On the other hand, predicted probabilities of these models never exceeded intermediate values, disagreed with one another in terms of specific locations on the east Cascades, and failed to predict the eastern-most sighting records, which potentially represented false reports, nonnative red foxes, or dispersal events. Ultimately, additional surveys that explicitly target the east Cascades and incorporate genetic confirmation are needed to test and refine this portion of the model.
As with any habitat-based correlative model, an important caveat of our predictions is that they reflect a potential, rather than realized distribution. Thus, although models predicted high habitat connectivity along the north–south extent of the Cascade crest, direct survey data are needed to evaluate the actual continuity of the Sierra Nevada red fox distribution. Several nonenvironmental factors potentially influence the realized distribution of Sierra Nevada red foxes within this area, including demographic factors (e.g., inbreeding depression), competition with coyotes Canis latrans, anthropogenic barriers, and historical contingency. However, our models also predicted a gap in the potential distribution between Mount Hood and Mount Jefferson, which correspondingly predicts a gap in the realized distribution. Determining whether such a distributional gap corresponds to a barrier to connectivity requires empirical testing, such as with genetic data. With accumulation of larger sample sizes and use of higher resolution genetic markers, future analyses can more directly inform the structure of populations in the Oregon Cascades, incorporating estimates of relatedness, diversity, effective population size, and assignment of ancestry. Altogether, the spatial hypotheses of potential connectivity represented by our models facilitate the design of future surveys and population studies needed to characterize genetic connectivity directly.
Resolving the population status in Oregon is essential to management, conservation, and planning efforts for the Sierra Nevada red fox. Improving our understanding of the potential distribution is a fundamental step in this progression because the extent and arrangement of potential habitat is an underlying constraint for abundance and connectivity. More directly, concentrating future survey efforts on areas where red foxes are most likely to occur facilitates the collection of demographic data. The predictive surfaces generated by our presence-only modeling confirm the high-elevation crest as the highest priority for concentrating survey efforts, especially in regions where predicted occurrence is high and previous sampling effort was low. We consider our models to represent the first generation of models to be refined in an adaptive process, whereby future detections can be iteratively integrated to test environmental correlates and improve accuracy and generality (e.g., Guisan et al. 2006).
Supplemental Material
Please note: The Journal of Fish and Wildlife Management is not responsible for the content or functionality of any supplemental material. Queries should be directed to the corresponding author for the article.
Data S1. Occurrence records of red fox Vulpes vulpes (n = 169) in the Oregon Cascades, USA, collected during 1985–2016. To model species distributions in Maxent, we considered a verified data set consisting of genetic samples (n = 72) and digital images (n = 52), and an inclusive data set that additionally incorporated visual sightings (n = 45). Prior to modeling, we filtered data sets to retain 1 record/4-km grid cell, resulting in spatially independent verified (n = 33) and inclusive (n = 62) subsets. Locations of occurrence records are truncated to two decimal points because of the rarity and potential vulnerability of populations. Cytochrome b haplotypes “A” and “G” corresponds to GenBank accessions EF064207.1 and EF064212.1 respectively.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S1 (18 KB XLSX).
Table S1. Environmental layers considered for Maxent modeling of red fox Vulpes vulpes distribution in the Oregon Cascades, USA, 1985–2016, prior to using variance inflation factor (VIF) to exclude variables with pairwise correlations >|0.7|.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S2 (11 KB XLSX).
Table S2. Reclassification of ReGAP land-cover types cover type for use in Maxent modeling of red fox Vulpes vulpes distribution in the Oregon Cascades, USA. Proportion of training background points (randomly drawn from 20-km buffered region around occurrence points; n = 10,000) and red fox occurrence locations (inclusive data set, n = 62; 1985–2016) are shown for each Maxent class.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S3 (11 KB XLSX).
Table S3. Pairwise correlation of environmental variables considered for Maxent modeling of red fox Vulpes vulpes distribution in the Oregon Cascades, USA, 1985–2016. Variables with pairwise correlations >|0.7| were pruned for collinearity using variance inflation factor (VIF).
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S4 (23 KB XLSX).
Text S1. Explanation of how regularization performs variable selection and limits model complexity in Maxent.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S5 (342 KB PDF).
Figure S1. Comparisons of observed and modeled occurrence of red foxes Vulpes vulpes relative to environmental covariates in the Oregon Cascades, USA, for 62 spatially independent inclusive records (1985–2016) that incorporated unverified sightings in addition to verified records. Selection (gray bars, right ordinal axis) indicates observed use relative to expectations based on composition of the available background. The relative probability of presence (red points, left ordinal access) indicates the modeled influence of a given land-cover type on the relative probability of presence when all other variables are held at their respective average values (shown for optimal regularization settings).
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S6 (8.11 MB PDF).
Figure S2. Importance of environmental predictor variables to Maxent distribution modeling for red foxes Vulpes vulpes in the Oregon Cascades, USA, using optimal regularization settings, for 62 spatially independent inclusive records (1985–2016) that incorporated unverified sightings in addition to verified records. Variable importance is estimated by percent contribution (right ordinate axis), training gain (left ordinate axis) when a variable is removed from the full model, and training gain when a variable is modeled on its own.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S6 (8.11 MB PDF).
Figure S3. Average area under the receiver operating characteristic curve (AUC) of spatially binned training (solid line) and test (dotted line) data across a range of regularization values (β) for Maxent distribution models of red fox Vulpes vulpes in the Oregon Cascades, USA. The AUC values are calculated from models developed from a data set of 62 spatially independent inclusive records (1985–2016) that incorporated unverified sightings in addition to verified records. We chose optimal values (*) where gains in AUCTEST plateaued and the difference between test and training scores were minimal.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S6 (8.11 MB PDF).
Figure S4. Relative probabilities of occurrence for red fox Vulpes vulpes in the Oregon Cascades, USA, based on 62 spatially independent inclusive records (1985–2016) that incorporated unverified sightings (+) in addition to verified records (•). Predictive surfaces are shown for Maxent models that used (A) default regularization (β = 1) and (B) optimal regularization (β = 2.5).
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S6 (8.11 MB PDF).
Figure S5. Decision criteria for dichotomizing species distribution models of red fox Vulpes vulpes in the Oregon Cascades, USA, based on the (A) proportions of verified occurrence (n = 33) and background points (n = 10,000) predicted as presence relative to the probability values used as a threshold for predicted presence and predicted absence. Optimal thresholds (black filled circles) were chosen to promote discriminatory power, as determined by (B) maximizing the difference between the proportions of occurrence and background points included in the predicted presence surface. In principle, optimal thresholds should balance the extent of area predicted as presence (related to specificity) and the proportion of known occurrence records omitted from predicted area (1 − sensitivity).
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S6 (8.11 MB PDF).
Figure S6. Predicted presence of red fox Vulpes vulpes in the Oregon Cascades, USA, after applying a restrictive threshold (>0.50; dark gray) and relaxed threshold (>0.27; light gray) to continuous probabilities generated by Maxent based on 62 spatially independent inclusive records (1985–2016) that incorporated unverified sightings (+) in addition to verified records (•). Predictive presence is shown for models that used (A) default regularization (β = 1) and (B) optimal regularization (β = 2.5).
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S6 (8.11 MB PDF).
Reference S1. Perrine JD, Campbell LA, Green GA. 2010. Sierra Nevada red fox (Vulpes vulpes necator): a conservation assessment. Report R5-FR-010. U.S. Department of Agriculture.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S7 (4.39 MB KB PDF).
Reference S2. Quinn CB, Sacks BN. 2014. Ecology, distribution, and genetics of Sierra Nevada red fox. Draft Report to California Department of Fish and Wildlife.
Found at DOI: https://doi.org/10.3996/082017-JFWM-067.S8 (5.31 MB PDF).
Acknowledgments
We thank our many collaborators for contributing occurrence data and collecting genetic samples: J. Doerr, C. Ferland, and R. Seitz (Willamette National Forest); L. Turner and M. Gregg (Deschutes National Forest); S. Colyer, J. VonKienast, and D. Clayton (Rogue River–Siskiyou National Forest); A. Dyck (Mt. Hood National Forest); J. Chapman (Region 6 U.S. Department of Agriculture [USDA] Forest Service); J. Nelson (High Desert Museum); T. Lysak (Cascadia Wild); S. Mohren and M. Immel (Crater Lake National Park); C. Heath, L. Erickson, and J. Vaughn (Oregon Department of Fish and Wildlife); C. Gumtow-Farrior and D. Gumtow-Farrior (Oregon State University—Cascades); G. Green; P. Alden; and J. McFadden. We thank Z. Lounsberry for laboratory assistance. Funding for this research was provided by Oregon Department of Fish and Wildlife (Agreement 379-15) through a Pittman Robertson grant and the USDA Forest Service (Agreement No. 15-CR-11060120-029), along with additional funding and support from the USDA Forest Service, National Park Service, Cascade Carnivore Project, Cascadia Wild, and the Mammalian Ecology and Conservation Unit of the Veterinary Genetics Laboratory in the University of California, Davis School of Veterinary Medicine. We thank an Associate Editor and two anonymous reviewers for comments that helped improve and clarify this manuscript.
Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
References
Author notes
Citation: Quinn CB, Akins JR, Hiller TL, Sacks BN. 2018. Predicting the potential distribution of the Sierra Nevada red fox in the Oregon Cascades. Journal of Fish and Wildlife Management 9(2):351–366; e1944-687X. doi:10.3996/082017-JFWM-067