The ability to describe the length distribution of a fish population requires sampling an adequate number of individuals, but collecting more fish than needed is inefficient. While fisheries managers have assessed sample size requirements for many sport fishes, these requirements are not routinely described for small-bodied fishes (i.e., maximum length ≤200 mm), particularly larval lampreys. To improve the efficiency of data collection for these fishes, we used resampling analyses to assess sample size requirements for accurately describing length distributions of larval (freshwater-dwelling) Pacific lamprey Entosphenus tridentatus, an anadromous fish native to western North America (total length 60–156 mm). We found that the largest increases in accuracy occurred with sample sizes <50, and that we needed sample sizes of 40 to 130 to describe length frequency with 95% confidence, depending on the length interval used for performing length-frequency analyses. From these results, we recommend collecting 100 individuals if using 5-mm length intervals to examine length frequency of larval lamprey. These findings can also be used to estimate the relative accuracy of sample sizes in existing assessments and to develop and refine monitoring programs for larval lampreys and other small-bodied fishes.
Length-frequency information (size structure) is generally easily attainable by fisheries biologists for most fishes (Neumann et al. 2012) and can provide insights into the dynamics of a fish population (Neumann and Allen 2007). A lack of smaller size classes of fish can indicate recruitment deficiencies, while infrequency of larger size classes might suggest high mortality of adult fish (Neumann and Allen 2007). Researchers can further combine length-frequency data with other assessment tools (abundance or catch per unit effort, body condition, recruitment) to better resolve patterns of year-class strength, growth, or mortality (Miranda and Bettoli 2007). Understanding these dynamics is critical to understanding the status of a fishery (Schaefer 1957), and how populations respond to environmental stressors (Adams et al. 1992) or conservation actions (Quist et al. 2005). Therefore, accurately describing the size structure of a population is important, but it requires capturing an adequate sample size of individuals to reliably characterize its length-frequency distribution (Vokoun et al. 2001). Collecting more individuals than is necessary is inefficient (Hansen and Jones 2008) and adds undue handling stress.
The sample size requirements for accurately describing length frequency of several North American sport fishes (>250 mm maximum length) have been evaluated by researchers using statistical resampling of simulated and empirical data sets (Vokoun et al. 2001; Miranda 2007). Vokoun et al. (2001) suggested sample sizes of 300–400 individuals were suitable to describe population length frequencies, and Miranda (2007) suggested that length-frequency distributions can be described with smaller sample sizes for smaller sport fishes (i.e., those with narrower length-frequency distributions, maximum length 200–300 mm). For small-bodied nongame fishes (i.e., maximum length ≤200 mm), assessments typically aim to collect a specific number of individuals. For example, researchers assessing larval sea lamprey Petromyzon marinus attempt to collect 100 individuals (Slade et al. 2003). However, specific assessments of sample size requirements for describing length frequency in small-bodied fishes are virtually absent in the published literature.
Our purpose in this study was to assess the relative accuracy of different sample sizes to describe length frequency for larval Pacific lamprey Entosphenus tridentatus using resampling procedures from empirical length data sets in the Willamette River basin, Oregon. Pacific lamprey is a native anadromous fish that has declined considerably across its range in western North America, including extirpation from multiple drainages (Moyle et al. 2009; Luzier et al. 2011). Conservation of Pacific lamprey across its range will require the development of monitoring efforts that effectively characterize the status of individual populations (Luzier et al. 2011). Larval lampreys are also of interest to researchers because they have a unique life history with a relatively long larval phase (i.e., 3–7+ y; Potter 1980), typically exhibit slow growth relative to other small-bodied fishes (Meeuwig and Bayer 2005), and rarely exceed 160 mm total length (TL; Schultz et al. 2016). Our analyses will aid in the development of these monitoring programs and have further utility for assessments of other lamprey species (e.g., sea lamprey) and small-bodied fishes that attain similar maximum lengths.
Larval lamprey data set
We sampled larval lampreys from wadeable locations in 14 tributary subbasins of the Willamette River, Oregon, from July to October of 2011–2013 (Schultz et al. 2016; Figure 1). Each of our sample reaches consisted of two entire pool and riffle channel units to capture fish from all available habitat. We also sampled off-channel habitats if adjacent to any of the channel units. We sampled three tributary subbasins (Clear and Thomas creeks and the Marys River) in all 3 y, two of the subbasins (Crabtree and Deep creeks) in 2011 and 2012, and the remainder at least once in 2012 or 2013 (Data S1, Supplemental Material). At each reach, we characterized abundance of larval lamprey with backpack electrofishing. We moved upstream through each reach using a single electrofishing pass with an AbP-2 backpack electrofishing unit (Engineering Technical Services, Madison, WI). The electrofisher was designed specifically to sample larval lampreys, and it applied a pulsed burst train (3 on : 1 off) with a 25% duty cycle, and a fast pulse at 30 pulses/s to temporarily immobilize individual larvae and facilitate capture. Following completion of electrofishing, we enumerated, measured (TL, mm), and identified individual captured lamprey as Pacific lamprey or brook lamprey (Lampetra spp.) using caudal pigmentation characteristics. We identified only individuals >60 mm to species because accurate keys for species identification have not yet been developed for lamprey <60 mm (Goodman et al. 2009).
We used a resampling approach to evaluate sample sizes required to obtain a representative length-frequency distribution of larval Pacific lamprey populations. We defined accuracy as how much resampled length-frequency distributions deviated from a known length-frequency distribution. We included all reaches in which we collected >200 individual larval Pacific lamprey in a single year as individual samples for consideration within this analysis (total reaches = 13). To account for any large-scale variability that might have occurred across the entire Willamette River basin, we also included a sample consisting of all captured Pacific lamprey pooled across all 3 y. From each sample, we randomly selected, with replacement, subsamples of varying size. This procedure involved a random draw of some number of individual lengths (i.e., the subsample) from the entire original sample data set; subsample sizes ranged from 10 to 700 fish, at 10-fish intervals (i.e., 10, 20, 30, …, 700). We chose to use 700 as our upper subsample size because the most individual Pacific lamprey we captured at a single location in all years was 691 (in Thomas Creek, 2013), which should represent the approximate upper limit of the number of larval lamprey that could be captured and processed at a single location with reasonable effort.
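The subsampling step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R code from Text S1: the function name `draw_subsample`, the random seed, and the simulated gamma-distributed lengths standing in for one reach's measurements are all hypothetical.

```python
import random

def draw_subsample(lengths, n, rng):
    """Draw n lengths at random, with replacement, from the original sample."""
    return rng.choices(lengths, k=n)

rng = random.Random(1)

# Hypothetical stand-in for the total lengths (mm) measured in one reach.
# Real values would come from the field data (Data S1); these are simulated.
original = [min(156, 60 + int(rng.gammavariate(3.0, 10.0))) for _ in range(400)]

# Subsample sizes of 10, 20, 30, ..., 700 fish, as described in the text.
subsample_sizes = list(range(10, 701, 10))

# One subsample per size; the analysis repeats this draw many times per size.
subsamples = {n: draw_subsample(original, n, rng) for n in subsample_sizes}
```

Because the draw is with replacement, a subsample can be larger than the original sample, which is what allows subsample sizes up to 700 to be evaluated for reaches with fewer fish.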
To evaluate the accuracy of characterizing a length-frequency distribution using different sample sizes, we compared the length frequencies of resampled data sets (i.e., subsamples) to the original samples to estimate mean absolute difference (MAD) in relative frequency. We used simulation to calculate the sampling distribution of MAD for each subsample by subtracting the relative frequency of the subsample from the observed relative frequency of the original sample in each length interval,

$$\mathrm{MAD} = \frac{1}{r\,b}\sum_{i=1}^{r}\sum_{j=1}^{b}\left|\,p(y_j) - p(\bar{y}_{ij})\,\right|$$

where $p(y_j)$ was the proportion of individuals in the jth interval from the original sample, $p(\bar{y}_{ij})$ was the proportion of individuals in the jth interval and the ith subsample, b was the number of length intervals, and r was the number of iterations. Because the width of individual length intervals (i.e., “length bins”) can change the interpretation of length-frequency distributions (e.g., Vokoun et al. 2001), we examined the effect of length interval widths with our resampling. For each subsample size, we generated 10,000 subsamples, estimated MAD, and computed 95% confidence intervals across these simulated data sets for each of three length intervals: 2, 5, and 10 mm (see example code in Text S1, Supplemental Material).
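The MAD calculation described above can be illustrated with a short Python sketch. It is not the authors' R function from Text S1. The function names, the simulated lengths, the reduced number of iterations (1,000 rather than 10,000), and the choice to define bins from the minimum to the maximum length of the original sample (including any empty bins between them) are all assumptions made for illustration. Values are returned as proportions; multiplying by 100 gives percentage units.

```python
import math
import random
from statistics import mean

def bin_proportions(lengths, width, lower, n_bins):
    """Relative frequency of lengths in each bin of the given width (mm)."""
    counts = [0] * n_bins
    for x in lengths:
        counts[int((x - lower) // width)] += 1
    return [c / len(lengths) for c in counts]

def mad_resamples(original, n, width, reps, rng):
    """MAD between the original sample and each of `reps` resamples of size n."""
    # Assumption: bins span the observed range of the original sample.
    lower = (min(original) // width) * width
    n_bins = int((max(original) - lower) // width) + 1
    p_orig = bin_proportions(original, width, lower, n_bins)
    mads = []
    for _ in range(reps):
        resample = rng.choices(original, k=n)  # sampling with replacement
        p_sub = bin_proportions(resample, width, lower, n_bins)
        mads.append(sum(abs(a - b) for a, b in zip(p_orig, p_sub)) / n_bins)
    return mads

def upper_percentile(values, q=0.95):
    """Nearest-rank percentile of the resampled MAD values."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(q * len(ordered)) - 1)]

rng = random.Random(42)
# Hypothetical simulated lengths (mm), not field data.
original = [min(156, 60 + int(rng.gammavariate(3.0, 10.0))) for _ in range(500)]

for width in (2, 5, 10):
    mads = mad_resamples(original, n=100, width=width, reps=1000, rng=rng)
    print(width, round(mean(mads), 4), round(upper_percentile(mads), 4))
```

Averaging the per-resample values returned by `mad_resamples` gives the double sum over iterations and length intervals divided by r times b, and the upper percentile gives the limit that is compared against an accuracy threshold.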
The MAD values indicate the degree of deviation between a resampled data set and the original sample length-frequency distribution, and can be interpreted as the average deviation in percentage units across length bins. We plotted both the estimated MAD and the upper 95th percentile of resamples for each subsample size. This plot can be used to assess the relative accuracy of a given sample size and provide guidelines for future monitoring. For example, the sample size at which the 95% confidence interval declines to below 5% MAD would be interpreted to be the level of sampling effort needed to yield sample length-frequency distributions across all length bins within 5% of the “true” length-frequency distribution 95% of the time.
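The threshold rule in the example above can be expressed as a small helper. The Python sketch below is an illustration only: the function name `min_sample_size` is hypothetical, and the curve it is applied to is an arbitrary decreasing function chosen for demonstration, not a result from this study. As a design choice, the helper returns the smallest sample size for which the upper limit stays below the threshold at that size and at every larger size evaluated, which guards against a single noisy dip in simulated values.

```python
import math

def min_sample_size(upper_limits, threshold):
    """Smallest n whose upper limit, and that of every larger n, is below threshold.

    upper_limits maps sample size -> upper 95% limit of MAD (as a proportion).
    Returns None if the largest evaluated size is not below the threshold.
    """
    answer = None
    for n in sorted(upper_limits, reverse=True):
        if upper_limits[n] < threshold:
            answer = n
        else:
            break
    return answer

# Purely illustrative curve (NOT study results): an upper limit that shrinks
# in proportion to 1 / sqrt(n), evaluated at n = 10, 20, ..., 700.
illustrative = {n: 0.6 / math.sqrt(n) for n in range(10, 701, 10)}

print(min_sample_size(illustrative, threshold=0.05))
```

In practice, the dictionary would hold the upper limits computed from the resampled MAD values for each subsample size and length interval width.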
In addition to the MAD metric, we examined the performance of resamples in relatively small length intervals. To do so, we calculated the percentage of resamples from each subsample size that contained at least one individual >110 mm. This length has meaning specific to Pacific lamprey monitoring because it is the approximate length at which larvae are likely to begin metamorphosis (Schultz et al. 2016). In this case, designers of monitoring plans would be interested in the sample size needed to document the presence (or quantify the abundance) of individuals in this life stage. For other species or research questions, researchers could modify the provided code (Text S2, Supplemental Material) to address other monitoring goals.
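The size-class calculation described above can be sketched in Python as follows. This is not the authors' code from Text S2: the function names and the toy sample (90 fish of 70 mm and 10 fish of 120 mm) are hypothetical. Because subsamples are drawn with replacement, the resampled percentage can be checked against the closed-form probability 1 − (1 − p)^n, where p is the proportion of the original sample longer than the cutoff.

```python
import random

def prob_at_least_one(original, n, cutoff, reps, rng):
    """Share of resamples of size n containing at least one length > cutoff."""
    hits = 0
    for _ in range(reps):
        resample = rng.choices(original, k=n)  # sampling with replacement
        if any(x > cutoff for x in resample):
            hits += 1
    return hits / reps

def exact_prob(original, n, cutoff):
    """Closed-form equivalent for sampling with replacement: 1 - (1 - p)^n."""
    p = sum(x > cutoff for x in original) / len(original)
    return 1 - (1 - p) ** n

rng = random.Random(7)
# Hypothetical toy sample: 10% of individuals exceed the 110-mm cutoff.
original = [70] * 90 + [120] * 10

for n in (10, 20, 50):
    simulated = prob_at_least_one(original, n, cutoff=110, reps=5000, rng=rng)
    print(n, round(simulated, 3), round(exact_prob(original, n, cutoff=110), 3))
```

Changing the cutoff, or replacing the `x > cutoff` test with a check for a bounded length interval, would adapt the sketch to other size classes or monitoring goals.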
We captured 7,583 larval Pacific lamprey (TL >60 mm) across the 14 tributary subbasins sampled in 2011–2013. Mean TL for these fish was 89 mm (range: 60–156 mm). We captured ≥200 individual Pacific lampreys at 13 locations, including several reaches in more than 1 y of sampling. Results from the MAD accuracy simulations were highly congruent across all of these samples. However, for illustrative purposes we chose to present simulation results from three of these locations that showed contrast in length-frequency distributions: Willamina Creek, the Calapooia River, and Thomas Creek in 2013 (Figure 1). These three samples included a negative skew, a relatively even distribution, and a positive skew, respectively. We also present the sample of all Pacific lampreys in aggregate, which also showed a relatively even distribution (Figures 2 and 3).
From our resampling procedure, we observed a consistent asymptotic decrease in MAD estimates with increasing sample size, regardless of the width of length intervals (Figure 2). Across the three smaller samples (i.e., Willamina Creek, Thomas Creek, and Calapooia River) and the entire sample, the largest gains in accuracy (i.e., the steepest declines in MAD estimates with increasing sample size) occurred in sample sizes <50 individuals. The upper 95% confidence intervals were below 5% MAD for sample sizes of 40–50 individuals for 2-mm length intervals, 90–100 individuals for 5-mm length intervals, and 120–130 individuals for 10-mm length intervals. Accuracy with sample sizes >150 individuals continued to improve, but appeared to reach an asymptote with quickly diminishing returns in accuracy. Length interval width (2, 5, and 10 mm) influenced interpretation of length-frequency analyses (Figure 3), and larger length intervals generally yielded higher MAD values for a given sample size. The probability of capturing at least one individual >110 mm exceeded 95% for all sample sizes with >20 individuals, and was 100% in all samples with >70 individuals.
We illustrated how accuracy of length distribution changed with sample size for larval Pacific lamprey relative to an empirical data set using statistical resampling. When averaging across all length intervals, we observed only slight increases in accuracy when sample sizes exceeded 130 individuals relative to increases with sample sizes <50, and sample size requirements appeared fairly robust to the shape (skew) of length-frequency distributions. From these analyses, we recommend that lamprey monitoring programs aim to collect 50, 100, or 130 individuals for length-frequency analyses using 2-, 5-, and 10-mm length intervals, respectively, to be 95% confident that samples reflect the true length distribution. Sampling of >20 individuals was also sufficient to detect individuals large enough to undergo metamorphosis and to infer the presence of outmigrants (>110 mm). This work is one of the few to apply these analyses for sampling small-bodied fishes, and, to our knowledge, the first in the published literature. These findings will be directly applicable to the monitoring and evaluation of larval lamprey populations and are potentially applicable to other small-bodied fishes.
These sample size requirements are much smaller than those researchers have suggested for larger-bodied fishes (e.g., Kritzer et al. 2001; Vokoun et al. 2001; Miranda 2007). Miranda (2007) found that the sample size necessary to accurately describe length-frequency distributions of populations was 225–1,200 for largemouth bass Micropterus salmoides (TL 100–700 mm), 200–650 for white crappie Pomoxis annularis (TL 90–500 mm), and 150–425 for bluegill Lepomis macrochirus (TL 90–350 mm), and Vokoun et al. (2001) recommended sample sizes of 300–400 individuals for bluegill and channel catfish Ictalurus punctatus (TL 225–650 mm). Our work provides quantifiable support for the hypothesis of Miranda (2007) that sample size requirements decrease with narrower length-frequency distributions in fishes, and suggests that length frequency for larval Pacific lamprey can be described accurately with 130 individuals or fewer. Although our analyses do not permit a clear recommendation for appropriate length intervals for larval lampreys, we caution against using 10-mm length intervals because “peaks” and “valleys” appear to be lost when examining size structure data (e.g., Willamina Creek and Calapooia River in Figure 3).
Our results provide sample size recommendations that have utility in developing and evaluating monitoring plans for larval Pacific lamprey, and this methodology may be applied for monitoring other lampreys and small-bodied fishes. Several authors (e.g., Anderson and Neumann 1996; Kritzer et al. 2001; Gerritsen and McGrath 2007) have suggested general rules of thumb for the number of individuals needed to characterize a population's length frequency based on intended length intervals. However, our work here can also be used to retrospectively evaluate existing data sets and sampling approaches. For instance, researchers monitoring sea lamprey in the Great Lakes region typically sampled preferred habitats (“type I”) until 100 individuals were captured to assess reaches for potential lampricide treatments (Slade et al. 2003). Our data suggest that this approach adequately characterizes length frequency for these populations. Pacific lamprey conservation efforts can immediately incorporate these findings into monitoring plans as well (Luzier et al. 2011), and sample size recommendations might have applications for evaluating animal handling proposals (e.g., scientific collection permits).
Clearly considering the goals and objectives of a monitoring program is a necessary first step to developing a sampling design (Moser et al. 2007), but we have illustrated how resampling techniques can provide a means to refine monitoring programs further (e.g., Wanner et al. 2007). In many scenarios, an absence of individuals of a particular size from sampling data may reflect imperfect detection due to sampling bias rather than true absence (“the absence of presence does not constitute the presence of absence”). Heterogeneous capture probabilities are common for many fishes due to habitat affinities (e.g., Wanner et al. 2007; Price and Peterson 2010), sampling gear (e.g., Dunham et al. 2013), and size/age classes utilizing different habitats (e.g., Young et al. 1990; Almeida and Quintella 2002; Sugiyama and Goto 2002), and can bias observed length-frequency distributions. Future work would benefit from considering these systematic biases and from attempting to account for heterogeneity due to these potential sources of error when interpreting length-frequency data (e.g., Breton et al. 2013). Our approach helps initially inform sample size requirements, and provides tools that can aid further evaluations to refine sample size requirements for particular applications.
Please note: The Journal of Fish and Wildlife Management is not responsible for the content or functionality of any supplemental material. Queries should be directed to the corresponding authors for the article.
Data S1. Spreadsheet containing capture date and location, length (mm), and metamorphosis status (“Transformer”) for Pacific lamprey Entosphenus tridentatus collected from the Willamette River tributaries, 2011–2013. For the Transformer column, TRUE = metamorphosed individual, FALSE = nonmetamorphosed individual.
Found at: DOI: http://dx.doi.org/10.3996/112015-JFWM-112.S1 (327 KB PDF).
Text S1 and Text S2. Both texts are contained in one supplemental file. Text S1 (page 1) contains annotated R code for performing the resampling techniques described in this paper to determine the sample sizes needed to characterize Pacific lamprey Entosphenus tridentatus length frequency; sampling was performed between 2011 and 2013. This function calculates mean absolute deviance, mean squared differences, and associated 95% confidence intervals across a range of sample sizes. Text S2 (page 2) contains annotated R code for estimating the relative performance of specific size classes with resampling. This function calculates the probability of sampling a fish in a specific length interval from a length data set; this data set is simulated from a gamma distribution in this example.
Found at: DOI: http://dx.doi.org/10.3996/112015-JFWM-112.S2 (30 KB PDF).
Reference S1. Luzier CW, Schaller HA, Bostrom JK, Cook-Tabor C, Goodman DH, Nelle RD, Ostrand K, Strief B. 2011. Pacific lamprey (Entosphenus tridentatus) assessment and template for conservation measures. U.S. Fish and Wildlife Service, Portland, Oregon.
Found at: DOI: http://dx.doi.org/10.3996/112015-JFWM-112.S3; also available at http://www.fws.gov/columbiariver/publications/USFWS_Pacific_Lamprey_Assessment_and%20_Template_for_Conservation_Measures_2011.pdf (5329 KB PDF).
Fieldwork assistance was generously provided by B. Clemens, J. Doyle, B. Gregoire, K. Kuhn, R. McCoun, B. McIlraith, G. Sheoships, and L. Wyss. We thank M. Colvin and J. Peterson for statistical discussions, and B. Morris for administrative support. Discussions and comments from C. Schreck, M. Heck, and L. Arnold greatly helped to develop and refine this paper, and a critical review by B. Compton and four anonymous reviewers greatly improved the quality of this manuscript. Funding for this study was provided by the Columbia River Inter-Tribal Fish Commission through the Columbia Basin Fish Accords partnership with the Bonneville Power Administration under project 2008-524-00, B. McIlraith, project manager.
Any use of trade, product, website, or firm names in this publication is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Citation: Schultz LD, Mayfield MP, Whitlock SL. 2016. Sample sizes needed to describe length-frequency of small-bodied fishes: an example using larval Pacific lamprey. Journal of Fish and Wildlife Management 7(2):315–322; e1944-687X. doi: 10.3996/112015-JFWM-112
The findings and conclusions in this article are those of the author(s) and do not necessarily represent the views of the U.S. Fish and Wildlife Service.