Understanding the distribution of pathogens across landscapes and their prevalence within host populations is a common aim of wildlife managers. Despite the need for unbiased estimates of pathogen occurrence and prevalence for planning effective management interventions, many researchers fail to account for imperfect pathogen detection. Instead raw data are often reported, which may lead to ineffective, or even detrimental, management actions. We illustrate the utility of occupancy models for generating unbiased estimates of disease parameters by 1) providing a written tutorial describing how to fit these models in Program PRESENCE and 2) presenting a case study with the pathogen ranavirus. We analyzed ranavirus detection data from a wildlife refuge (Maryland, US) using occupancy modeling, which yields unbiased estimates of pathogen occurrence and prevalence. We found ranavirus prevalence was underestimated by up to 30% if imperfect pathogen detection was ignored. The unbiased estimate of ranavirus prevalence in larval wood frog (Lithobates sylvaticus; 0.73) populations was higher than in larval spotted salamander (Ambystoma maculatum; 0.56) populations. In addition, the odds of detecting ranavirus in tail samples were 6.7 times higher than detecting ranavirus in liver samples. Therefore, tail samples presented a nonlethal sampling method for ranavirus that may be able to detect early (nonsystemic) infections.
Emerging infectious diseases (EIDs) in wildlife populations are an increasingly common threat to biodiversity (Daszak et al. 2000; Fisher et al. 2012). Developing appropriate management responses to EIDs is an important goal for many natural resource managers and researchers (Grant et al. 2017; Russell et al. 2017; Gerber et al. 2018). Assessing disease risks and implementing management interventions to improve a population's resilience to disease requires reliable understanding of a pathogen's distribution at multiple spatial and temporal scales (McClintock et al. 2010).
The coarsest scale of interest to resource managers is often where a pathogen occurs on the landscape and which factors shape that distribution through space and time (Adams et al. 2010; Lorch et al. 2013). The focus is on estimating the proportion of sites (i.e., discrete habitat patches) occupied by infected host populations, or the probability that a randomly selected site with certain characteristics is occupied by an infected host population. It is possible to estimate changes in the distribution of infected host populations through time, using multiyear studies, giving rise to estimates of the pathogen's local transmission and extinction rates. Investigating shifts in a pathogen's spatial or temporal distribution in hosts is useful for understanding the pathogen's ecology and origins (Rachowicz et al. 2005), assessing host species extinction risk (Rödder et al. 2009), and predicting host-pathogen dynamics under future climate and management scenarios (Rogers and Randolph 2000; Gerber et al. 2018).
Researchers and managers may also seek to understand within-population disease dynamics. In these cases, the focus is on variation in the proportion of individuals infected in a population (i.e., prevalence), and in the characteristics (e.g., stage or age, sex, size) that may influence infection probability. Comparing prevalence estimates for a single host population through time can provide insights about host impacts, the transmission process, and the spread of disease (Caley and Ramsey 2001). Alternatively, comparing estimates of prevalence across populations and species yields information about habitat-mediated disease dynamics, differential susceptibility (Scheele et al. 2017), and dynamics of the disease and its transmission in multi-host systems (Keesing et al. 2006). Monitoring changes in prevalence can help identify enzootic from epizootic infections, which often call for different management actions (Langwig et al. 2015).
Variables such as site occurrence and prevalence cannot be observed without error; failure to detect a pathogen can occur at several stages in the observation process (Fig. 1; McClintock et al. 2010). At the landscape and site levels, a pathogen might go undetected if an occupied habitat or infected individuals are not sampled, a concern when prevalence or population size are low, or when prevalence is low and population size is large (Green et al. 2009; MacKenzie et al. 2018). The problem is amplified if infected animals are more difficult to detect than uninfected individuals (Jennelle et al. 2007; Valenzuela-Sánchez et al. 2017), or more likely to permanently emigrate (Schmidt 2010). Pathogen occurrence may go undetected at the individual level either because the sample collected (e.g., a skin swab, tail clip) did not contain the pathogen or because the diagnostic test used was imperfect (i.e., low sensitivity; Thompson 2007). For instance, molecular assays can fail to amplify pathogens due to low quantities of DNA or the presence of assay inhibitors (McKee et al. 2015; Goldberg et al. 2016). All detection methods are imperfect to some extent (Lachish et al. 2012; Miller et al. 2012), and this must be considered when inferences on occurrence and prevalence are to be used for management strategies or for increased ecologic understanding.
Underestimating pathogen distributions could result in management actions such as reintroductions occurring in suboptimal locations (Muths et al. 2014) and makes predicting pathogen spread difficult, leaving managers to deal with more expensive reactive control measures rather than proactive prevention measures (Johnson et al. 2015). Underestimates of prevalence, stemming from a failure to diagnose individuals as infected, might delay the initiation of management actions like culling, visitor-use restrictions, or chemical treatments (Wasserberg et al. 2009; Woodhams et al. 2011; Langwig et al. 2015). Despite the need for unbiased estimates of host-pathogen disease dynamics, naïve detection data (that do not account for imperfect and variable detection probabilities) are often the only metric reported in pathogen distribution studies.
Occupancy models allow for estimation of pathogen occurrence while explicitly accounting for imperfect and variable detection probabilities (MacKenzie et al. 2002; Tyre et al. 2003). The models account for false negative detection errors by relying on multiple surveys over space or time during a window where the true pathogen occurrence status of the study unit (a site or an individual) is closed to changes (MacKenzie et al. 2017).
Our goal was to illustrate an analytical framework for estimation of unbiased pathogen parameters of a wildlife disease using a standard sampling design and a case study on amphibian ranavirus in the northeastern US. We used three occupancy analyses of pathogen detection data to demonstrate the estimation of site-level occurrence of infected populations, individual-level occurrence (prevalence) of pathogens, or both quantities simultaneously. We discuss how our results inform ranavirus monitoring and surveillance programs. We focused on a ranavirus case study but these methods are readily transferrable to other pathogen systems, and we provide a tutorial (see Supplementary Materials) to help readers fit these models to data from other systems.
Ranaviruses (Family Iridoviridae) are generalist pathogens that have been linked to amphibian die-offs and population declines (Daszak et al. 2003; Duffus et al. 2015). Larval amphibians are especially susceptible to ranavirus infections (Hoverman et al. 2011), although adults can be negatively affected as well (Price et al. 2014; Sutton et al. 2014). Ranaviruses have infection affinity for vascular epithelial cells, and cell death results in widespread organ necrosis, failure of blood vessels, and internal hemorrhaging (Chinchar 2002; Miller et al. 2015). Ranavirus-induced declines can be severe; mortality approaches 100% at some locations (Green et al. 2002; Price et al. 2014; Wheelwright et al. 2014). Host death can occur within only a few days of infection and die-offs may occur rapidly (Hoverman et al. 2011; Wheelwright et al. 2014). Because mortality occurs around the same time as metamorphosis and amphibians decompose or are scavenged rapidly (Brunner et al. 2015), it can be difficult to determine whether nondetection of larval amphibians on a late-season survey is due to a ranavirus die-off or successful metamorphosis.
Although a gold standard tissue type has not been identified by the World Organization for Animal Health for ranavirus testing in amphibians (Miller et al. 2015), liver, spleen, and kidney tissues are most commonly used. Ranavirus detection in the liver generally indicates systemic infection (Miller et al. 2015), but tail clips are a common nonlethal sample assumed to have a lower detection probability than liver samples (Gray et al. 2012; Sutton et al. 2015).
MATERIALS AND METHODS
Ranavirus survey field methods
Our study was conducted at the US Fish and Wildlife Service's 5,200-ha Patuxent Research Refuge (PRR) in central Maryland, US (39°1′N, 76°47′W). The refuge is covered in bottomland hardwood and mixed forest, and is estimated to contain around 2,250 vernal pool breeding sites (Van Meter et al. 2008). We sampled 22 randomly selected wetlands for larval amphibians about 60 d after egg mass deposition when we expected to find late-stage tadpoles and larvae. Each wetland was sampled on a single day from 24 May 2011 to 26 May 2011. We used dip nets to capture wood frog (Lithobates sylvaticus) tadpoles and spotted salamander (Ambystoma maculatum) larvae. Each animal was isolated and handled with a new pair of sterile nitrile gloves to reduce the chance of cross-contamination of ranavirus. The pH of each wetland was also measured using a PC Testr 35 Multiparameter meter (Oakton Instruments, Vernon Hills, Illinois, USA).
Captured animals were immediately humanely euthanized via transdermal exposure to benzocaine hydrochloride, and placed into individual sterile plastic bags containing 95% ethyl alcohol (Green et al. 2009). Animals were sent to the University of Tennessee Center for Wildlife Health (Knoxville, Tennessee, USA) where they were necropsied; three samples were collected from each specimen: one from the liver and two from the tail.
To detect ranavirus, we followed the protocol of Sutton et al. (2014) targeting a 70-base pair region of the virus's major capsid protein and running the quantitative PCR for 40 cycles. This method detects the presence of ranavirus DNA within the samples collected, and does not detect the presence of disease. Only samples with a threshold cycle <33 were considered positive based on a standard curve of known ranavirus quantities (Miller et al. 2015). Each quantitative PCR assay included two positive controls (cultured virus and genomic DNA from a known positive tadpole) and two negative controls (DNA-free water and genomic DNA from a known negative tadpole).
Statistical methods and terminology
Researchers may be interested in the probability of pathogen occurrence in a site (ψsite), in an individual (ψind or θ), or both. These parameters are binomial random variables. We first illustrate how these quantities can be estimated in separate analyses using standard occupancy models (Mac-Kenzie et al. 2002). Then, we demonstrate how the data can be analyzed in a multiscale occupancy model (Nichols et al. 2008) to estimate both occurrence and prevalence simultaneously. These methods account for false negative detection errors, and assume that false positive detections do not occur.
Site occurrence estimation
The naïve estimate of pathogen site occurrence (ψnaive) is the proportion of wetlands where a pathogen was detected. This estimate will be biased low if detection (p) is imperfect. Alternatively, one can use replicate pathogen surveys to account for imperfect detection (ψsite). Replicate pathogen surveys may correspond to multiple environmental (e.g., environmental DNA [eDNA]; Mosher et al. 2017), host-based (e.g., skin swab; DiRenzo et al. 2018), or sub–host-based (e.g., tissue; Elmore et al. 2014) samples and must be collected over a time period when the site is assumed to be closed to changes in pathogen occurrence (Table 1).
The probability of obtaining detection history 00110 over five surveys (where 0s and 1s denote pathogen nondetection and detection, respectively) is as follows:
This history indicates that the pathogen was present (ψsite), was undetected on surveys 1 and 2 (1-psite)2, was detected on surveys 3 and 4 (psite2), and was again undetected on survey 5 (1-psite). Sites with all zero detection histories (i.e., 00000) are ambiguous because observations could arise in two ways; either because the pathogen is truly absent (1-ψsite) or because the pathogen was present but went undetected on all visits (ψsite[1-psite]5). Information from sites where both detections (value=1) and nondetections (value=0) occur help to resolve these ambiguities (MacKenzie et al. 2002).
When multiple individuals are sampled for a pathogen at a single site, the resulting data can be used to investigate the prevalence (ψind) of the pathogen in that population using the same occupancy model described above (Table 2). In this case, prevalence is the probability that a pathogen occurs in an individual rather than in a site.
Pathogen occurrence (at sites), prevalence (within populations), and detection (within samples) compose a hierarchical research question (Fig. 1). A multiscale extension of the standard occupancy model (Nichols et al. 2008) allows for estimation of all three parameters simultaneously and avoids nonindependence issues that may be present in other formulations. Traditional singleseason occupancy models assume closure across all observations at a site during the survey period (in this case, that 100% of amphibians at a site are infected or uninfected). The multiscale model alleviates this assumption by allowing some animals to be infected while others are not. This model was designed to account for variation in occurrence across space or time, but the concept is also readily applicable to sites and individuals in a pathogen occurrence framework (Schmidt et al. 2013; Elmore et al. 2016; Mosher et al. 2017). Conditional on pathogen occurrence at the site (ψsite), some proportion of individuals may also be infected (θ; prevalence). The probability of detecting the pathogen in a single sample (pmult) is conditional on pathogen presence in both the site and individual. Using the multiscale model, the probability of obtaining a detection history of 010000 for a site with two sampled individuals and three tissue samples per individual is as follows:
This history indicates that the pathogen was present at the site (ψsite) and in the first sampled individual (θ), despite the fact that it was only detected on one tissue sample. The second individual's history (000) is ambiguous, and could have arisen because the individual was uninfected (1-θ), or because the individual was infected (θ) but the laboratory methods failed to detect the pathogen on all three trials ([1-pmult]3).
Each analysis (site occurrence, prevalence, multiscale) includes an estimation of detection probability, p. The interpretation of these detection probabilities varies depending on the study design and analysis, which we note by subscripting p based on the analysis being discussed. In the site occurrence analysis, the detection probability estimated (psite) is in fact a product of the true prevalence and detection probability, also called the positive predictive value (Nichols et al. 2008), rather than the diagnostic sensitivity of the assay. In the prevalence analysis, the replicates within a sample unit come from only one individual, so the detection probability (pind) is no longer confounded with prevalence, but in our example, inference is restricted to the one site examined. Finally, in the multiscale occupancy model, prevalence and detection are no longer confounded, and pmult is the diagnostic sensitivity of the assay.
Software and multi-model inference
We used all of the ranavirus data (22 wetlands) for the site occurrence and multiscale analyses, and used data from one wetland (with 24 sampled individuals) for the prevalence analysis. We generated a set of biologic hypotheses for each analysis (Table 3), and evaluated the strength of those hypotheses by fitting competing models. Models were fit using single-season standard and multiscale occupancy models in Program PRESENCE (Hines 2006). We evaluated support for competing hypotheses using Akaike's information criterion adjusted for small sample sizes (AICc) and associated model weights (Burnham and Anderson 2003).
We collected up to 12 individuals of each species per site (Fig. 1). This sample size was optimal (based on Monte Carlo simulations) given our joint objectives of maximizing the precision of both prevalence and site occurrence estimates simultaneously. We collected a total of 321 larval amphibians (117 spotted salamander larvae and 204 wood frog tadpoles). Two of the collected animals were recently deceased, while all others were alive. Without accounting for imperfect detection, the naïve estimate of ranavirus prevalence was 42% for each species (85 of 204 wood frogs and 49 of 117 spotted salamanders). Naïve ranavirus site occurrence was 50% (detections at 11 of 22 sites).
The best-supported model (Table 4; weight=0.81) suggested that ranavirus occurrence was constant across the study area (site=0.54, 95% confidence interval [CI]=0.34, 0.74) and was comparable to the naïve estimate (site=0.50). We did not find strong support for an effect of pH on the occurrence of ranavirus (Table 4; weight=0.17, =0.36, 95% CI=-0.90, 1.62). Ranavirus detection probability was imperfect, and varied by both species and tissue type (Fig. 2), potentially reflecting both prevalence and detection differences that could not be separated without using the multiscale model.
At the single focal site, naïve estimates of ranavirus prevalence were 0.75 for wood frogs and 0.17 for spotted salamanders. Model selection results from the prevalence analysis supported the idea that prevalence differed by species (Table 4), but estimates of prevalence were considerably higher than the naïve values. At this single site, ranavirus was estimated to be ubiquitous in wood frogs (ind,frog=1, SE=0) and was less prevalent in spotted salamanders (ind,salamander=0.26, 95% CI=0.06,0.68). In addition, the best-supported model suggested that detection probability was constant, and did not vary by tissue type or species at this site (Table 4; ind=0.29, 95% CI=0.17, 0.44).
Multiscale: occupancy and prevalence
Without accounting for detection probability, the naïve estimates of prevalence across the study area were identical for both species (42%; Fig. 3). However, the best-supported multiscale occupancy model produced prevalence (θ) estimates that differed by species and were much higher than naïve estimates (Table 4 and Fig. 3). In accordance with the site occurrence analysis presented, the multiscale analysis produced an occurrence estimate that was very similar to that estimated using the standard occupancy model (site=0.55, 95% CI=0.35, 0.74) and found no effect of pH on ranavirus occurrence (Table 4). Ranavirus detection probability in tail samples was higher than in liver samples (mult,liver=0.60, 95% CI=0.52, 0.68, mult,tail= 0.91, 95% CI=0.87, 0.94; Fig. 2). We provide a tutorial (Supplementary Materials) that readers can use to perform the analyses we present in Program PRESENCE.
Detecting pathogens in the environment and understanding among-site differences in the proportion of infected hosts are primary goals in surveillance studies (Richgels et al. 2016; Gray et al. 2017), and the consequences of underestimating pathogen occurrence can be large. While studies frequently acknowledge that pathogen detection is imperfect, probabilistic approaches to dealing with this uncertainty are rarely used. We demonstrated the utility of occupancy models for addressing pathogen occurrence questions using readily available data from an easily employed sampling design.
Liver is the diagnostic tissue of choice for detecting ranavirus infection, especially in larval amphibians (Miller et al. 2015). Gray et al. (2012) reported a higher detection rate for ranavirus in liver samples compared to swabs and tail clips, but assumed that liver tissue was a perfect diagnostic sample (p=1) and that any detection in a tail sample when the liver tested negative was a false positive. False positives could occur due to external virions from the environment adhering to the amphibian's skin (Gray et al. 2012). However, it is also possible that early stages of infection can be detected in vascularized tail samples as the animal becomes viremic (Miller et al. 2015). When we detected ranavirus in a liver sample, there was a 94% probability of subsequent detection in at least one tail sample, suggesting that tail samples may be good indicators of host infection (i.e., diagnostic of true disease state, and not simply pathogen presence). Differing detection probabilities between sample types are likely to converge as ranavirus infections become systemic (Greer and Collins 2007), but given the value of detecting ranavirus before population declines occur, early detection at a site may be desirable even if not directly linked to host infections. Nonlethal tail samples are a reasonable alternative to lethal sampling for ranavirus, especially when lethal means are prohibited or when site-level occurrence dynamics are the focal question. Using a more sensitive detection method also has important study design benefits in that fewer animals need to be detected, captured, and sampled for robust inferences.
Detection probabilities were imperfect (<1) across all models, and the best-supported models suggested detection variation by tissue type. Thus, pathogen prevalence was underestimated when imperfect detection was not accounted for, as expected (Greer and Collins 2007; McClintock et al. 2010; Lachish et al. 2012). The detection probability estimated by the site occurrence model set was lower, and had a different best-supported structure (i.e., lowest AICc), than that estimated using the same data in the multiscale model. This difference occurred because the estimate of p from the basic occupancy model was the product of prevalence and detection probability (i.e., the positive predictive value). This was because the standard site analysis of occurrence violated the closure assumption and assumed that there was an opportunity for pathogen detection in every tissue sample (though some individuals may not have been infected, eliminating the possibility of detection). The multiscale model accounted for this violation by allowing the subunits sampled (individuals) to have different disease states. Without this accommodation, detection estimates will be biased low and occurrence estimates will be biased high (Kendall and White 2009).
Identifying the variables that shape ranavirus disease dynamics is critical for developing effective in situ management actions to cope with amphibian die-offs. While we found no relationship between pH and ranavirus occurrence, there was a narrow range of pH values at the PRR (range=4.43–6.96). Future studies should explore additional habitat characteristics predicted to influence pathogen occurrence and disease dynamics, including host community composition, abiotic characteristics such as contaminants (including endocrine-disrupting compounds), wetland permanence, or the effect of habitat manipulations. The occupancy modeling framework we present yielded unbiased estimates of covariate effects on pathogen occurrence and prevalence, and can be used to identify important drivers of disease dynamics.
Within ranavirus-positive wetlands, the ranavirus prevalence differed by species and was similar to the naïve prevalence estimates in controlled experiments for these species (Hoverman et al. 2011). Therefore, susceptibility trials might provide reasonable insights into field epidemiology (Gray et al. 2017) if imperfect detection is accounted for. Prevalence alone is not necessarily an indication of the degree of morbidity expected, because species may be differentially susceptible to infection (Hoverman et al. 2011), and may have variable infection loads (Hall et al. 2016). However, because wood frogs are known to have a strong correlation between infection and mortality (Hoverman et al. 2011), pathogen prevalence might be a reasonable estimate of disease impacts for this species. Spotted salamander larvae are more tolerant to ranavirus infection (Hoverman et al. 2011), meaning that they suffer fewer fitness consequences from ranavirus infection than other tested amphibian species. Prolonged ranavirus infections could make spotted salamanders biologic reservoirs that contribute to the emergence (and reemergence) of ranavirus in coexisting wood frog tadpole populations. Our estimates of ranavirus prevalence were higher than those in some field studies (Crespi et al. 2015), likely because other studies have not accounted for detection probability. While failing to account for heterogeneity in detection makes comparisons across landscapes and among studies difficult, the adoption of tools presented in this manuscript will facilitate future comparisons, which can be accomplished by accounting for study-specific sampling error.
An occupancy-based study design frees the researcher from the need to sample during disease-induced die-offs, as is recommended in some other designs (Simon and Schill 1984; Lavilla-Pitogo and de la Pen˜ a 2004). While pathogen detection may be maximized during a die-off, resulting data offer few insights about enzootic dynamics or spatiotemporal variation in prevalence (Greer et al. 2009; Hoverman et al. 2012). In addition, there is a risk of missing a die-off completely, especially in systems where mortality can occur rapidly, such as with ranavirus (Wheelwright et al. 2014; Hall et al. 2016), leaving uncertainty about the presence of a population or the influence of disease on persistence. Instead, an occupancy framework allows for a wider sampling window because detection probability is explicitly modeled and may be allowed to vary with time and sampling method. Estimates of detection probability are likely related to infection intensity (Miller et al. 2012) and therefore may differ based on sampling conditions like time until metamorphosis, time of year, or time since ranavirus invasion. However, the occupancy framework allows researchers to investigate the influences of these conditions, provided the study design is adequate (i.e., with a longitudinal study).
One should consider the tradeoffs between sampling effort, cost, and timing when designing studies to address questions about pathogen occurrence. Occupancy-based findings from a study in a single year can be used as pilot data to inform the timing, location, and sample sizes for future sampling efforts, potentially conserving or optimizing resource expenditures. Given the prevalence of ranavirus in woods frogs at the PRR and the tail-clip detection probability, we can be 95% confident in identifying ranavirus at least once at a site by collecting single tail clip samples from just three individuals rather than the 12 that were part of our initial study design (confidence =1-[1-θp]samples).
In our study we focused on estimating pathogen occurrence at a single point in time. However, static snapshots of dynamic systems (i.e., cross-sectional studies) can sometimes be misleading (Yackulic et al. 2015). We recommend that the sampling designs presented here be incorporated into monitoring efforts (i.e., longitudinal studies) so that pathogen dynamics (i.e., annual colonization and extinction of sites, changes in infection prevalence or incidence) can be estimated and used in conservation decision-making. The occupancy models we presented can be extended to include multiple seasons (MacKenzie et al. 2003), false positive errors (Miller et al. 2011), and multiple host-pathogen states (Mosher et al. 2018).
When hosts are scarce, detection of shed ranavirus DNA is possible using eDNA methods (Hall et al. 2016). Standard and multiscale occupancy models are able to accommodate eDNA data (Schmidt et al. 2013; Mosher et al. 2017), and the framework and tutorial we present here can be used as a guide for making unbiased inference from these kinds of data.
The study was supported by the US Geological Survey's Amphibian Research and Monitoring Initiative (ARMI), the US Fish and Wildlife Service Region 5, and the US Department of Agriculture National Institute of Food and Agriculture (Hatch project 1012932). This is contribution number 669 of the ARMI program. We thank the Northeast Amphibian Research and Monitoring Initiative field crew for collecting field data. The research was approved by the Patuxent Research Refuge Animal Care and Use Committee. Any use of trade, firm, or product name is for descriptive purposes only and does not imply endorsement by the US Government.
Supplementary material for this article is online at http://dx.doi.org/10.7589/2018-02-042.