Fish monitoring programs often rely on the collection, species identification, and counting of individual fish over time to inform natural resource management decisions. Thus, the utility of the data used to inform these decisions can be negatively affected by species misidentification. Fish species misidentification bias can be minimized by confirming identification using genetic techniques, training observers, or adjusting monitoring data using estimates of incomplete detection and false-positive misidentification. Despite the existence of well-established fish identification training and quality control programs, there is considerable uncertainty about fish species false-positive misidentification rates and the effectiveness of fish identification training programs within the San Francisco Estuary. We evaluated the misidentification of fish species among Delta Juvenile Fish Monitoring Program observers by conducting five fish identification exams under controlled conditions at the Lodi Fish and Wildlife Office in Lodi, California, between 2012 and 2014. To assess the variability in false-positive misidentification, we fitted hierarchical logistic regression models relating misidentification to species and observer characteristics. We found that fish species misidentification was fairly common, averaging 17% among 155 test specimens and 32 observers. False-positive misidentification varied considerably among species and was negatively related to fish size, the abundance of the species within monitoring samples, and observer experience. In addition, observers who were not formally trained full-time observers were, on average, 6.0 times more likely to falsely identify a species. However, false-positive misidentification rates still varied considerably among observers after controlling for experience and training, and among specimens after controlling for species and size.
Our results could be used to improve fish identification training and testing, increase the accuracy of fish occupancy or abundance estimation, and justify the allocation of resources to continually use and formally train full-time observers within long-term monitoring programs operating in the system.
Fish monitoring programs often rely on the collection, species identification, and counting of individual fish over time to determine changes in distribution and abundance, and inform natural-resource management decisions. Thus, the utility of the data collected can be greatly affected by species misidentification (Elphick 2008; Fitzpatrick et al. 2009; Vink et al. 2012). The misidentification of an individual within a sample simultaneously introduces two forms of error into a data set: false-positive and false-negative errors. False-negative error occurs when a species is not recorded in a sample unit where it is present and can cause the underrepresentation of a species or type of individual within a sample. Conversely, false-positive error occurs when a species is recorded in a sample unit where it is absent, which can overrepresent a species or type of individual within a sample. Failure to account for these errors can introduce systematic bias into data (Peterson and Paukert 2009; McClintock et al. 2010), obfuscate our understanding of the true status or distribution of species (Metcalf et al. 2007; Beerkircher et al. 2009; Costa et al. 2015), and negatively influence managers' ability to make effective resource-management decisions (Garcia-Vazquez et al. 2012).
Fish species misidentification bias can be minimized by adjusting monitoring data using estimates of incomplete detection and false-positive misidentification (Royle and Link 2006; Miller et al. 2011; Shea et al. 2013). Traditional sampling methods and statistical estimators designed to account for incomplete detection frequently account for false-negative identification error (Bayley and Peterson 2001; Williams et al. 2002; Miller et al. 2011), and these techniques have already been widely adopted by many monitoring programs. Researchers have only recently developed statistical estimators to account for false-positive identification error (Royle and Link 2006; Miller et al. 2011, 2015; Ferguson et al. 2015). Unfortunately, many long-term monitoring programs may find it difficult to adopt the sample design(s) needed to accommodate the assumptions of these estimators based on data continuity concerns or additional costs associated with increased effort from collecting replicate samples or deploying additional sampling gears (see Clement 2016). Alternatively, Shea et al. (2013) demonstrated that false-positive misidentification rates derived from supplemental investigations (e.g., Shea et al. 2011) can be used in place of adopting a new sampling design, which is applicable to any fish monitoring program. However, it is our understanding that no estimates of false-positive identification error exist for expert or professional field observers responsible for collecting and identifying juvenile or adult freshwater fish within the United States.
Fish monitoring programs typically have observers identify fish species using external morphological features in the field or laboratory (Strauss and Bond 1990). Thus, the accuracy of fish species identification can be influenced by the distinctiveness of visible morphological traits (e.g., shape, color; Hillman et al. 1992; Moyle 2002; Beerkircher et al. 2008, 2009) coupled with observer bias (e.g., level of experience and training; Fitzpatrick et al. 2009; Shea et al. 2011). Consequently, species misidentification has long been recognized as a potential issue when identifying small and inconspicuous fish (Ko et al. 2013), new invaders (Kirsch et al. 2018), fish underwater or by video (Griffith 1981; Hatch et al. 1994), or when data are collected by resource users or volunteers (Schill and Lamansky 1999; Marko et al. 2004; Garcia-Vazquez et al. 2012). Although genetic techniques have been developed to improve fish species identification accuracy (Teletchea 2009), these techniques can be costly and difficult for large long-term monitoring programs to utilize, especially when large numbers of fish are being collected, fish are not being handled, or listed species are being collected and mortality, injury, or harassment is of concern. As a result, fish monitoring programs often assume perfect species identification accuracy and resort to allocating limited resources to recruit and train experienced observers.
The U.S. Fish and Wildlife Service's Delta Juvenile Fish Monitoring Program (DJFMP) has monitored fishes within the San Francisco Estuary, California, since 1976 primarily to inform water operation decisions and assess salmonid population status over time (Dekar et al. 2013). Approximately 112 fish species are encountered as part of this work. Recognizing that species misidentification could reduce the integrity of the data collected by the DJFMP, numerous control measures have been implemented throughout the program's history to minimize bias from field identification errors. Since 1995, field observers have brought unidentifiable fish back to the laboratory for identification and have not identified collected fish smaller than 20–25 mm in fork length. Further, the DJFMP developed a formal fish identification training and quality control program in 2001 when the program formally expanded its objectives to include monitoring nonsalmonid species. The training program has generally been staffed by one full-time fish identification biologist tasked with creating formal training curricula for both the field and laboratory, training inexperienced observers using that curriculum, developing accurate and effective field fish-identification keys, establishing and maintaining voucher and reference collections, and confirming species identification using genetics as needed. Although the control measures and training program have undoubtedly improved the identification accuracy of fishes among DJFMP observers, the effectiveness of the fish identification training program was unknown prior to our study and the assumption of perfect identification among observers remains unsupported. The objectives of this study were to 1) estimate the species misidentification rates among DJFMP observers, 2) assess the factors that influence an observer's likelihood of falsely identifying fish occurring in the San Francisco Estuary, and 3) evaluate the effectiveness of the DJFMP identification training program.
We evaluated the rate of fish species misidentification among DJFMP observers that monitored fish between 2012 and 2014 within the San Francisco Estuary and lower Sacramento and San Joaquin rivers located in the Central Valley of California. The San Francisco Estuary, hereafter referred to as the Estuary, consists of three distinct segments: the Sacramento–San Joaquin Delta, Suisun Bay, and San Francisco Bay (Moyle 2002). The Estuary is notably the largest estuary in California; contains 40 native freshwater, estuarine, euryhaline marine, and anadromous fish species including 17 endemic species; and is recognized as one of the most invaded systems in North America (Wang 1986; Cohen and Carlton 1998; Moyle 2002). The climate is classified as Mediterranean and consists of cool wet winters and hot dry summers (Nichols et al. 1986). Historically, the Estuary was maintained by natural runoff from an estimated 40% of California's surface area (Nichols et al. 1986). However, increases in agriculture and urbanization throughout California over the past century have necessitated intense water management. Currently, the Estuary and its watershed are managed to, in part, provide native fish habitat, prevent flooding, and supply freshwater for export through local irrigation and the Central Valley Project and State Water Project pumping plants located in the Delta (Kimmerer 2004; Brown and Michniuk 2007). As a result, the Estuary and lower rivers have been substantially altered by levees, dams, land reclamation, pollution, dredging, intrabasin water conveyance, and out-of-basin water export (Nichols et al. 1986). These alterations can have profound impacts on aquatic habitats and organisms (Stevens and Miller 1983; Nichols et al. 1986; Brandes and McLain 2001; Bunn and Arthington 2002; Kimmerer 2002; Feyrer and Healey 2003).
Fish species of management concern have been monitored within the Estuary and lower Sacramento and San Joaquin rivers by the DJFMP and other monitoring programs to inform management decisions by assessing their distribution and abundance over time. Although the DJFMP was originally designed to monitor juvenile Chinook Salmon Oncorhynchus tshawytscha, the program does provide data on other fish species (e.g., osmerids) to managers and researchers. In general, the DJFMP has conducted long-term fish monitoring at 58 fixed sites using beach seines and 3 fixed sites using surface trawls located within the lower Sacramento and San Joaquin rivers, at and between the entry and exit points of the Sacramento–San Joaquin Delta, and within the San Francisco Bay (Figure 1). Between 2012 and 2014, the program generally sampled each beach seine site once every wk or once every 2 wk and sampled each surface trawl site for approximately 20 min 10 times/d, 3 d/wk. For each sample, observers recorded the count of each species captured that had a fork length >25 mm and recorded the volume sampled. See Dekar et al. (2013) for more information concerning the DJFMP, and its seine and trawl site locations and sampling methodology.
Our evaluation of fish species misidentification among DJFMP observers occurred under controlled conditions modified from Shea et al. (2011). We conducted five fish identification exams at the Lodi Fish and Wildlife Office located in Lodi, California. The examinations occurred in the spring (8 March), summer (18 June), and autumn (29 November) of 2012 and in the summers of 2013 (22 May) and 2014 (28 July). One month prior to each exam, the DJFMP fish identification biologist collected approximately 100 fish specimens >25 mm in fork length at long-term monitoring locations to obtain representative samples among species and sizes encountered by observers within the San Francisco Estuary and lower Sacramento and San Joaquin rivers (Figure 1). The DJFMP fish identification biologist initially identified each potential exam specimen collected in the field and immediately preserved it by freezing in an attempt to maintain natural appearance (i.e., color). We did not collect fishes listed under the state or federal Endangered Species Act (CESA 1970, as amended; ESA 1973, as amended) for this study, but obtained listed specimens by preserving indirect mortalities from regular DJFMP monitoring.
We selected approximately 20–40 fish specimens for each exam. The DJFMP fish identification biologist initially selected exam specimens after defrosting them during the morning of each exam. To ensure the species identification of each exam specimen was accurate, three fish biologists considered as local experts from the California Department of Fish and Wildlife verified the identification of each exam specimen and their possible replacements (if available). If the local experts and the DJFMP fish identification biologist did not come to a consensus regarding the species identification of an exam specimen or its replacement, we eliminated the specimen from the exam. We assumed that each test specimen identified as the same species by all the local experts and the DJFMP fish identification biologist was identified accurately. The staff member(s) who selected specimens or those with foreknowledge of the exam specimen's species identification did not take the exam and instead helped proctor the exams. We randomly assigned exam specimens to stations numbered in sequential order. The examiner documented both the species and fork length (measured to the nearest mm using a measure board) for each exam specimen at each station at the beginning of each exam. Exam specimens that were damaged during the exams as a result of extensive or rough handling were replaced, if possible, by another individual of the same species, size (±5 mm in fork length), and condition or presentation (e.g., color, pliability, containing intact fins, etc.) during each exam.
We limited examined observers to individuals that had recently (e.g., within the past 60 d) identified fish for the DJFMP either as an employee, interagency partner, or volunteer. Observers identified specimens independently and were allowed to move freely among each exam station to identify exam specimens with no time constraints during the spring and summer exams of 2012. For the other three exams, we gave observers between 2 and 3 min to identify each exam specimen before moving, in sequential order, to another station. The additional structure added to these exams was an attempt to limit confounding results from observers having unequal time to identify test specimens. We allowed observers to identify exam specimens using their field keys to simulate conditions within the field. In general, field keys were derived through the compilation of field notes, Miller and Lea (1972), Wang (1986), and Moyle (2002). To prevent nomenclature errors during the exams, we asked observers to identify exam specimens by common names and provided sheets or keys that contained common names, scientific names, and monitoring species codes for all species detected by DJFMP staff within the San Francisco Estuary or its watershed within the Central Valley.
We asked each observer to record their experience (in months) identifying juvenile or adult fishes within California's Central Valley prior to each exam. After the exams, we documented whether the observer was a trained full-time DJFMP observer, which we defined as conducting monitoring and thereby identifying fish, on average, >2 d/wk after receiving training from the DJFMP fish identification biologist. In addition, we assigned a morphological group (e.g., bass-like, shad-like, salmon-like, pupfish-like, and minnow-like) to the species observed during the exams based on general fish shape, coloration, and life history similarities (Table 1). We also recorded whether the species possessed distinct morphological traits including barbels, an adipose fin, dorsal spines or hardened rays, and markings (i.e., spots, parr marks, mottling, or barring; Table 1). We determined species' morphological groupings and traits for each species using Moyle (2002) and Miller and Lea (1972). Lastly, we calculated the mean number of species identifications-per-sample (hereafter, monitoring identifications) made by all DJFMP observers in the field among all monitoring samples collected within 30 d prior to each exam for each species observed during the exam.
We first calculated and compared false-negative and false-positive identification errors among observations (i.e., individual identifications of an exam specimen made by an observer during an exam) to evaluate if they were similar or different for each species (Table 1). We defined false-negative identification errors as an instance when an observer did not correctly identify a specimen as the true (correct) species. We defined false-positive identification errors as the identification of a species by the observer given that it was not the true (correct) species. For example, a specimen of species A might be incorrectly identified (false-negative error) and identified as species B (false-positive error).
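The two error types can be tallied directly from paired (true species, observed species) records. The sketch below is a minimal Python illustration of this bookkeeping; the function and variable names are our own, not part of the DJFMP workflow:

```python
from collections import defaultdict

def error_rates(observations):
    """Per-species false-negative and false-positive error rates.

    observations: iterable of (true_species, observed_species) pairs, one
    per identification attempt.  A false-negative error is charged to the
    true species when it is missed; a false-positive error is charged to
    the observed species when it is reported but incorrect.
    """
    fn_err, fn_tot = defaultdict(int), defaultdict(int)  # keyed by true species
    fp_err, fp_tot = defaultdict(int), defaultdict(int)  # keyed by observed species
    for true_sp, obs_sp in observations:
        fn_tot[true_sp] += 1
        fp_tot[obs_sp] += 1
        if obs_sp != true_sp:
            fn_err[true_sp] += 1  # true species not recorded
            fp_err[obs_sp] += 1   # observed species falsely recorded
    fn_rate = {sp: fn_err[sp] / fn_tot[sp] for sp in fn_tot}
    fp_rate = {sp: fp_err[sp] / fp_tot[sp] for sp in fp_tot}
    return fn_rate, fp_rate

# The worked example from the text: one specimen of species A identified as
# species B, plus one correct identification of each species.
fn, fp = error_rates([("A", "B"), ("A", "A"), ("B", "B")])
# fn["A"] = 0.5 (A missed in one of two attempts); fp["B"] = 0.5 (B reported
# twice, once falsely)
```

Note that the two rates use different denominators: false-negative rates divide by the number of times a species was truly present, whereas false-positive rates divide by the number of times it was reported.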
Our primary interest was to estimate the probability of false-positive species misidentification and evaluate how it related to species and observer characteristics. Species characteristics used represented those for the observed (i.e., predicted) species. Secondarily, we were interested in evaluating the effect of different exam structures (i.e., the addition of time limitations) on species misidentification. We evaluated the relations between false-positive misidentification and exam structure, species, and observer characteristics using hierarchical logistic regression models (Data S1, Supplemental Material; Williams et al. 2002; Royle and Dorazio 2008). Individual observations served as the response variable and we coded them as 1 when the species was falsely identified and as 0 when the species was correctly identified. Hierarchical models allowed us to account for potential dependence among specimens, species, or observers. We fit all models in Program R version 3.4.2 (R Core Team 2017) using the lme4 package (Bates et al. 2015, 2017).
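The hierarchical structure can be made concrete by writing out the data-generating process that randomly varying intercepts imply. The Python sketch below simulates misidentification outcomes under that structure; all parameter values are illustrative assumptions, not the study's fitted estimates:

```python
import math
import random

def inv_logit(x):
    """Inverse-logit (logistic) link: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_misid(n_obs=20, n_species=10, n_specimens=40, beta0=-1.6,
                   beta_len=-0.8, sd_obs=0.6, sd_sp=1.7, sd_spec=1.0, seed=1):
    """Simulate false-positive outcomes under the hierarchical form
    logit(p) = beta0 + beta_len*length_z + observer + species + specimen,
    where the three random intercepts are independent Normal(0, sd) draws.
    Parameter values here are hypothetical, chosen only for illustration."""
    rng = random.Random(seed)
    obs_re = [rng.gauss(0, sd_obs) for _ in range(n_obs)]
    sp_re = [rng.gauss(0, sd_sp) for _ in range(n_species)]
    spec_re = [rng.gauss(0, sd_spec) for _ in range(n_specimens)]
    rows = []
    for spec in range(n_specimens):
        sp = spec % n_species              # each specimen belongs to a species
        length_z = rng.gauss(0, 1)         # standardized fork length
        for o in range(n_obs):
            p = inv_logit(beta0 + beta_len * length_z
                          + obs_re[o] + sp_re[sp] + spec_re[spec])
            y = 1 if rng.random() < p else 0   # 1 = falsely identified
            rows.append((o, sp, spec, length_z, y))
    return rows

exam = simulate_misid()   # 40 specimens x 20 observers = 800 observations
```

In R's lme4, a model of this form would be fit with a formula along the lines of `y ~ length_z + (1 | observer) + (1 | species) + (1 | specimen)` with `family = binomial`.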
Prior to model-fitting, we calculated Pearson correlation coefficients for all pairs of potential continuous predictor variables and eliminated highly correlated variables (r2 > 0.5) from our analysis to avoid multicollinearity (Dormann et al. 2013). We also standardized all continuous data to a mean of zero and a standard deviation of one to facilitate model-fitting and simplify the interpretation of parameter estimates. Lastly, we created binary indicator variables (0 for condition absent or 1 for condition present) for the categorical predictor variables: additional exam structure, untrained part-time observer, species special status, each morphological group (bass-like, shad-like, salmon-like, pupfish-like, minnow-like), and each morphological trait (barbels, an adipose fin, dorsal spines or hardened rays, markings).
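Both preprocessing steps have simple closed forms; a minimal Python sketch (the predictor values below are hypothetical):

```python
import math

def standardize(xs):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [(x - mean) / sd for x in xs]

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Screening a pair of candidate predictors: drop one if r^2 > 0.5.
length = [30.0, 45.0, 60.0, 80.0, 120.0]   # hypothetical fork lengths (mm)
weight = [2.0, 5.0, 9.0, 18.0, 45.0]       # hypothetical, strongly related
r = pearson_r(length, weight)
collinear = r ** 2 > 0.5                   # True here: keep only one predictor
```

After standardizing, each regression coefficient describes the change in log-odds per one standard deviation of the predictor, which is the scale on which the odds ratios below are interpreted.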
We used an information theoretic approach (Burnham and Anderson 2002) to evaluate our primary a priori hypotheses of interest regarding the relative influence of exam structure, and species and observer characteristics, on false-positive misidentification (Table 2). We initially developed a global model that contained all predictor variables to determine the best variance structure. We considered seven variance structures representing all possible combinations of observer, species, and specimen random effects. We incorporated all random effects as randomly varying intercepts (Zuur et al. 2009) that represented unique effects unexplained by the species and observer characteristics. We identified the best approximating variance structure as that with the lowest value of Akaike's Information Criterion with small-sample bias adjustment (AICc; Hurvich and Tsai 1989). The number of parameters used to calculate AICc included both fixed and random effects. We used the best approximating variance structure during the evaluation of the relative plausibility of the candidate models.
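The small-sample adjustment is a simple closed form; a sketch with hypothetical log-likelihoods and parameter counts:

```python
# AICc (Hurvich and Tsai 1989): AICc = -2*logL + 2k + 2k(k+1)/(n - k - 1),
# where k counts all estimated parameters (fixed effects plus random-effect
# variances, as in the text) and n is the number of observations.

def aicc(log_lik, k, n):
    aic = -2.0 * log_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

# Comparing two hypothetical variance structures fit to the same data;
# n matches the number of observations in this study, the log-likelihoods
# are invented for illustration.  The structure with the lower AICc wins.
n = 2465
better = aicc(log_lik=-900.0, k=8, n=n)
worse = aicc(log_lik=-905.0, k=10, n=n)
```

As n grows large relative to k, the correction term shrinks toward zero and AICc converges to ordinary AIC.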
We developed 15 candidate models representing each of our hypotheses (Table 3). We determined the relative fit of candidate models by calculating Akaike weights (w) and calculated the strength of evidence using the ratios of Akaike weights (Burnham and Anderson 2002). We considered candidate models with Akaike weights within 12% of the best-approximating candidate model's Akaike weight plausible (Royall 1997) and included them within the confidence set of candidate models. We based all inferences on the individual models within the confidence model set. We assessed goodness-of-fit for each model in the confidence set using residual plots.
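Akaike weights and evidence ratios follow directly from AICc differences. In the sketch below the AICc values are hypothetical; they are chosen so that the first two models differ by about 1.28 units, which yields an evidence ratio near 1.90:

```python
import math

def akaike_weights(aicc_values):
    """Akaike weights (Burnham and Anderson 2002):
    w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2),
    where delta_i is each model's AICc difference from the best model."""
    best = min(aicc_values)
    rel = [math.exp(-(a - best) / 2.0) for a in aicc_values]
    total = sum(rel)
    return [r / total for r in rel]

# Three hypothetical candidate models; the evidence ratio between the top
# two models is the ratio of their Akaike weights.
w = akaike_weights([100.0, 101.28, 110.0])
ratio = w[0] / w[1]    # = exp(1.28 / 2), roughly 1.90
```

Because the weights are normalized, they can also be read as the relative plausibility of each candidate model given the data and the model set.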
We assessed the precision of fixed-effect parameter estimates by calculating 90% confidence intervals. We considered confidence intervals that contained zero to be imprecise and representing a weak or inconclusive relationship. To allow for ease of interpretation, we estimated an odds ratio (OR) for each fixed-effect species and observer characteristic parameter in the confidence set of candidate models (Hosmer and Lemeshow 2000). An OR <1 indicated that the response variable was less likely to occur and an OR >1 indicated that the response variable was more likely to occur. Continuous predictor variables were standardized, so ORs for these parameters should be interpreted as the change associated with an increase of one standard deviation in the predictor variable. We also estimated a median odds ratio (MOR) for each random effect to allow interpretation of the relative magnitude of random effects (Larsen et al. 2000). We interpreted median odds ratios as the OR between a randomly chosen observer, specimen, or species with the lowest probability of misidentification and a randomly chosen observer, specimen, or species with the highest probability of misidentification while holding all other predictors constant.
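Both quantities have closed forms from the fitted coefficients and variances; a Python sketch with hypothetical values chosen only to be of realistic magnitude:

```python
import math
from statistics import NormalDist

def odds_ratio(beta):
    """OR for a one-unit (here: one-SD) increase in a standardized
    predictor in a logistic regression."""
    return math.exp(beta)

def median_odds_ratio(var):
    """Median odds ratio (Larsen et al. 2000) for a random-effect
    variance: MOR = exp( sqrt(2 * var) * Phi^-1(0.75) )."""
    return math.exp(math.sqrt(2.0 * var) * NormalDist().inv_cdf(0.75))

# Hypothetical values: a coefficient of -0.83 on standardized fork length
# gives OR ~ 0.44 (misidentification less likely for larger fish), and a
# species random-intercept variance of 9.9 gives MOR ~ 20.
or_len = odds_ratio(-0.83)
mor_sp = median_odds_ratio(9.9)
```

An MOR of 1 would indicate that the random effect contributes no heterogeneity; larger values indicate greater unexplained variation among observers, species, or specimens.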
We used 155 specimens representing 44 species in the examinations (Table 1; Figure 2). Thirty-two observers participated in at least one of the fish identification examinations and 68.8% (n = 22) of the observers were trained full-time DJFMP observers. Observers made 2,465 observations and identified 63 species during our fish identification examinations. The false-negative error rate among species averaged 16.8% and ranged from 0% to 44%, and the false-positive misidentification error rate averaged 16.8% and ranged from 0% to 100%. Only one species, Bigscale Logperch Percina macrolepida, was never misidentified (Table 1). Four species (Threadfin Shad Dorosoma petenense, Common Carp Cyprinus carpio, White Catfish Ameiurus catus, Bigscale Logperch) had a false-negative error rate of zero and three species (Brown Bullhead Ameiurus nebulosus, Channel Catfish Ictalurus punctatus, Bigscale Logperch) had a false-positive error rate of zero (Table 1). None of these species are considered species of management concern or native to the San Francisco Estuary. Only two species (Pacific Herring Clupea pallasii, Sacramento Splittail Pogonichthys macrolepidotus) had false-negative and false-positive error rates that were greater than zero and nearly identical (differed by <1%; Figure 2).
We included two candidate models in our confidence set of models predicting the probability of false-positive species misidentification (Table 3). The best approximating model contained fork length; observer experience and untrained part-time observers; fork length × observer experience interaction; fixed-effect intercept; and observer, species, and specimen random-effect intercepts (Table 3). Akaike weights indicated that this model was 1.90 times more supported than the second-best candidate model that contained monitoring identifications; observer experience and untrained part-time observers; monitoring identifications × observer experience interaction; fixed-effect intercept; and observer, species, and specimen random-effect intercepts (Table 3). There was little or no support for the other candidate models, including those containing exam structure as a predictor.
The fixed-effect parameter estimates based on the confidence set of models indicated that false-positive misidentification was greatest among smaller and less encountered fish species identified by less experienced and untrained part-time observers (Figure 3; Table 4). The ORs suggested that untrained part-time observers were, on average, 6.0 times more likely to falsely identify a species. Observers were, on average, 1.3 times less likely to falsely identify a species for every 5-y increase in experience identifying fish in the Central Valley based on the best approximating model. The ORs also suggested that species were, on average, 2.3 times less likely to be falsely identified for every 60-mm increase in fork length and the effect of body size decreased with observer experience (Figure 3; Table 4). Further, false-positive misidentification was, on average, 2.8 times less likely when the abundance of a species increased by two individuals within prior monitoring samples and the effect of prior monitoring identifications increased with observer experience (Figure 3; Table 4).
The random-effect estimates indicated that considerable variation remained among species, observers, and specimens after accounting for species and observer characteristics (Table 4). The MORs for the best approximating model suggested that for two species with identical fork lengths, the more difficult-to-identify species was, on average, 20.1 times more likely to be falsely identified than the less difficult species. Similarly, for two observers with identical experience, the observer with less ability was, on average, 3.0 times more likely to falsely identify a species than the observer with greater ability. Lastly, for two specimens with identical species characteristics identified by observers with identical experience, the more difficult-to-identify specimen was, on average, 6.8 times more likely to be falsely identified than the less difficult specimen.
Accurate identification of fish species is a common assumption and fundamental aspect of most field research projects and monitoring programs (Elphick 2008). However, we are not aware of any studies that have assessed species misidentification rates for freshwater fish among trained and untrained professional observers (but see Hatch et al. 1994; Brewer and Ellersieck 2011). We found that fish species identification accuracy was imperfect for 63 species and perfect for only 1 morphologically unique and nonnative species (Bigscale Logperch). Conceptually, species misidentification may be of little consequence to management if false-positive and -negative error rates are either 1) low or 2) the same for a species and remain constant assuming fish species are equally distributed among samples, which is an assumption rarely met. Our results indicated that species misidentification rates varied considerably among species and observer characteristics, and we found that only two species (Pacific Herring, Sacramento Splittail) had false-negative and -positive error rates that were nearly identical. Previous research has demonstrated that occupancy can be substantially underestimated when false-negative identification error rates exceed 20% (Tyre et al. 2003) and overestimated when false-positive identification error rates are as low as 5% (Royle and Link 2006). Further, McClintock et al. (2010) demonstrated that false-positive error affecting <1% of detections can cause severe overestimation of site occupancy, colonization, and local extinction probabilities, which suggests false-positive identification error may have greater implications for rare species (Fitzpatrick et al. 2009). We found that fish species misidentification was, on average, 16.8% among all observations; and false-positive identification error was higher among less common species and exceeded 5% for 56 (89%) species. 
Therefore, we contend that perfect species identification may be a tenuous assumption and misidentification may be a substantial source of sampling bias within fish monitoring data collected within the San Francisco Estuary and lower Sacramento and San Joaquin rivers.
Species misidentification was related to species characteristics including size and their relative abundance within monitoring samples. As expected, our modeling results demonstrated that false-positive identification error decreased as fish became longer and more common within the monitoring samples. However, considerable variability remained among species after accounting for size and their relative abundance, which demonstrates the utility in using species-specific misidentification estimates to calibrate monitoring data (Shea et al. 2011, 2013). In general, many species share similar morphological traits in their early life stages and some morphological characteristics (e.g., color, fin shapes, etc.) can change as fish grow, become older, or occupy different environments (Wang 1986; Strauss and Bond 1990; Ko et al. 2013). As a result, the external morphological differences among species often become more apparent to field observers as fish increase in size (Wang 1986; Moyle 2002). Observers may become more aware of the distinct morphological species traits when they encounter the species more often in the field prior to the examinations based on the philosophy that “practice makes perfect” (see Jonides 2004). As such, observers may improve their ability to accurately identify a species through repeated identification attempts resulting in increased automaticity, working memory, and experience levels with the species (Jonides 2004; Olesen et al. 2004; Chow et al. 2016).
Fish researchers and managers have generally presumed that species misidentification is greatly reduced or eliminated among observers with more experience and training (Thurow 1994; Rosenzweig and Bennett 1995; Pattengill-Semmens and Semmens 2003). We found that false-positive species misidentification was lower among experienced and trained observers, which is consistent with recent investigations concerning the identification of other taxa (e.g., Fitzpatrick et al. 2009; Shea et al. 2011; Schaefer et al. 2015). Although observer experience reduced the probability of false-positive identification error, the effect of observer experience on species misidentification was relatively weak and varied among species characteristics. For example, our results indicated that observer experience was more effective in reducing the misidentification of smaller and more common species relative to larger and less common species. We speculate that smaller species were simply more difficult for inexperienced observers to identify accurately because those observers may not have been accustomed to paying close attention to the less distinct morphological traits of smaller species (Wang 1986). In addition, observers with more experience may have increased awareness of morphological differences among species at smaller sizes (Jonides 2004).
Observer training was the most influential predictor of species misidentification and our modeling estimates suggested that observers trained by the DJFMP were 6 times less likely to misidentify a species relative to untrained observers. Over the study period, DJFMP training generally consisted of observers obtaining occasional one-on-one training with the fish identification biologist in the field and attending two to four workshops each year. During the workshops, observers were presented information on both the external morphological traits and ecology of species that were currently being collected by DJFMP observers. Workshop attendees were also given preserved specimens from, in part, the program's voucher and reference collections to practice identifying more problematic species. Our results suggest that the formal training program utilized by the DJFMP is effective and a sound investment for reducing the uncertainty or bias within data. However, false-positive species misidentification rates among the trained observers often exceeded the 5% misidentification rate known to bias data (Royle and Link 2006) regardless of observer experience. We found that, on average, trained observers with up to 18 y of Central Valley fish identification experience had misidentification rates >5% for most species. This provides evidence that monitoring programs with experienced staff and effective observer training programs would be further enhanced by assessing and accounting for species misidentification among their observers.
Similar to other investigations (e.g., Shea et al. 2011; Miller et al. 2012, 2015), considerable variability remained among observers after accounting for years of Central Valley fish-identification experience and DJFMP training. We surmise that other observer-based characteristics, in addition to training and experience, influenced an observer's ability to accurately identify fish. For example, Miller et al. (2012) found that observer ability, as measured by online tests and self-assessments, was a better predictor of anuran misidentification from calls than years of experience conducting call surveys. An observer's ability to accurately identify fish species is likely influenced by many contributing factors including, but not limited to, innate ability to learn and retain information; motivation; training; experience identifying species at relevant sizes under similar conditions; confidence in themselves and in colleagues or trainers; availability of identification resources (e.g., keys and photographs); and stress levels during identification (Shea et al. 2011; Miller et al. 2012). This information is often unknown and difficult to obtain or accurately quantify. We therefore recommend that large monitoring programs, especially those with relatively high staff turnover, obtain observer-specific misidentification estimates along with a variety of observer ability data over time, both to calibrate monitoring data and to improve our understanding of the factors that influence observers' ability to accurately identify fish.
Several potential sources of uncertainty may have affected our evaluation of fish-species identification accuracy, stemming from differences between our sampling design and the data-collection and identification protocols followed by DJFMP observers. We derived our results from controlled laboratory examinations that may have caused stress or anxiety among observers (Stowell 2003 and references therein), which may have influenced their ability to accurately identify fish. In the field, DJFMP observers often work in pairs and may consult one another when identifying ambiguous specimens, whereas consultation between observers was not allowed during our examinations. Subsequent pilot examinations conducted by the DJFMP during interagency workshops have shown variation in species identification accuracy when observers were allowed to consult with each other compared with identifying specimens independently (J.L. Day, U.S. Fish and Wildlife Service, unpublished data). We also did not provide spatial catch information for any of the specimens beyond the fact that they were collected at DJFMP monitoring sites. Moyle (2002) reported that the location where specimens were collected can be useful for identifying fish when observers are knowledgeable about the ecology of species within an assemblage. Although our modeling results indicated that differences in the structure of our examinations were not an important factor in estimating misidentification, considerable variability remained among specimens after accounting for observers, species, and their characteristics. The use of preserved specimens rather than live fish may explain this residual variability, given that the condition or appearance of specimens may have been altered to varying degrees during freezing and defrosting or through repeated handling by multiple observers.
Lastly, we did not account for several of the DJFMP's fish-identification quality control measures, including bringing unidentifiable fish back to the laboratory for identification and confirming species identifications using genetic methods, which can greatly reduce or eliminate species misidentification (Metcalf et al. 2007; Teletchea 2009; Kirsch et al. 2018). However, DJFMP observers rarely brought juvenile or adult fish from the field to the laboratory for species identification during our study period, except for fishes of management concern collected outside their known distributions. For all of these reasons, we believe our study provides a general and improved understanding of fish species misidentification for observers currently identifying juvenile or adult fish solely from external morphological traits within the San Francisco Estuary and its watershed or similar systems.
Successful fisheries management ultimately requires accurate and precise information to effectively inform decisions and efficiently guide actions (Conroy and Peterson 2013). Our study demonstrates that species misidentification may be an important source of sampling variation within fish monitoring data collected in the San Francisco Estuary and lower Sacramento and San Joaquin rivers. Based on our findings, we suspect that species misidentification may be fairly ubiquitous among fish monitoring programs or large research projects identifying fish in the field using solely external morphological traits within systems containing diverse fish assemblages. In particular, fish sampling that has observers identify fish underwater using sonar, infrared scans, or video can be especially prone to species misidentification because some or all key morphological traits can be obscured when fish are in high densities, improperly oriented, occurring in obstructed or turbid habitats, or moving at high velocities (Hillman et al. 1992; Hatch et al. 1994; Maxwell and Gove 2007; Feyrer et al. 2013). Although our study demonstrated that misidentification can be significantly reduced by training observers, species misidentification was common among trained observers and often exceeded rates known to cause bias in occupancy and abundance estimation (Tyre et al. 2003; Royle and Link 2006). Thus, fishery management may benefit by allocating resources to both train observers and continually investigate their identification accuracy. The information collected from these investigations can be used to improve observer hiring and training practices, develop useful modeling tools to properly calibrate data and reduce bias (see Shea et al. 2013), and in turn better inform fishery management decisions.
Please note: The Journal of Fish and Wildlife Management is not responsible for the content or functionality of any supplemental material. Queries should be directed to the corresponding author for the article.
Data S1. All fish misidentification data analyzed and presented in this manuscript are contained in the XLS file titled Data_S1_Raw_Identification_Data_USFWS_12Feb2018.
Found at DOI: https://doi.org/10.3996/032018-JFWM-020.S1 (241 KB XLSX).
Reference S1. Dekar M, Brandes P, Kirsch J, Smith L, Speegle J, Cadrett P, Marshall M. 2013. USFWS Delta Juvenile Fish Monitoring Program review. Lodi, California: U.S. Fish and Wildlife Service.
Found at DOI: https://doi.org/10.3996/032018-JFWM-020.S2 (3.3 MB PDF); also available at http://www.water.ca.gov/iep/docs/DJFMP_BACKGROUND_SUBMITTED_SAG_20May13.pdf.
Reference S2. Miller DJ, Lea RN. 1972. Guide to the coastal marine fishes of California. California Department of Fish and Game, Fish Bulletin 157.
Found at DOI: https://doi.org/10.3996/032018-JFWM-020.S3 (6.23 MB PDF); also available at http://www.nativefishlab.net/library/textpdf/15272.pdf.
Reference S3. Schill DJ, Lamansky JA. 1999. The ability of southwest Idaho anglers to identify five species of trout. Idaho Fish and Game Report 00-12.
Found at DOI: https://doi.org/10.3996/032018-JFWM-020.S4 (222 KB PDF).
Reference S4. Thurow RF. 1994. Underwater methods for study of salmonids in the Intermountain West. Ogden, Utah: U.S. Department of Agriculture, Forest Service, Intermountain Research Station. Technical Report INT-GTR-307.
Reference S5. Wang JCS. 1986. Fishes of the Sacramento–San Joaquin Estuary and adjacent waters, California: a guide to the early life histories. Interagency Ecological Program Technical Report No. 9.
The manuscript was improved with suggestions from the Associate Editor, C. Shea, P. Moyle, and two anonymous reviewers. We thank M. Marshall and P. Miklos for specimen collection and for organizing the examinations, C. Johnston for creating the map, B. DeVault for acquiring relevant literature, and D. Contreras, J. Giannetta, and J. Morinaka for assistance in confirming specimen identifications before the examinations. This project was funded by the Interagency Ecological Program and the Metropolitan Water District of Southern California. The Oregon Cooperative Fish and Wildlife Research Unit is jointly sponsored by the U.S. Geological Survey, the U.S. Fish and Wildlife Service, the Oregon Department of Fish and Wildlife, Oregon State University, and the Wildlife Management Institute.
Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Citation: Kirsch JE, Day JL, Peterson JT, Fullerton DK. 2018. Fish misidentification and potential implications to monitoring within the San Francisco Estuary, California. Journal of Fish and Wildlife Management 9(2):605–623; e1944-687X. doi:10.3996/032018-JFWM-020