Foodborne disease outbreak investigations identify foods responsible for illnesses. However, the degree to which foods implicated in outbreaks reflect the distribution of food consumption in the U.S. population or the risk associated with their consumption is not known. We compared the distribution of 24 categories of foods implicated in outbreaks with the distribution of foods consumed by the U.S. population. Beef, chicken, eggs, fish, herbs, mollusks, pork, sprouts, seeded vegetables, and turkey were implicated in outbreaks significantly more often than expected based on the frequency of their consumption by the general population, suggesting a higher risk of contamination or mishandling from foods in these categories than from foods in other categories. In contrast, pasteurized dairy, fruits, grains and beans, oils and sugars, and root and underground vegetables were implicated in outbreaks less frequently than expected based on their frequency of consumption by the general population, suggesting a lower health risk associated with these food categories.
The distributions of foods consumed and of foods implicated in outbreaks differ.
Specific food categories are more or less likely to cause outbreaks.
These findings may assist with food safety interventions and recommendations.
Every year an estimated 9.4 million people in the United States develop foodborne illnesses caused by known pathogens (26). Outbreak surveillance data provide a direct link between illnesses and their sources and are used to estimate the percentages of foodborne illnesses attributable to specific food categories (9). Attribution estimates based on outbreak data can be used to target preventive efforts toward foods that place people at greatest risk. However, the design of targeted interventions may be improved by an understanding of whether the foods most frequently implicated in outbreaks merely reflect the most common food exposures in the population or whether these foods are instead at a higher risk of contamination or of failure to eliminate pathogens during processing and preparation (e.g., cooking to a sufficient temperature to kill pathogens). Although researchers have evaluated the relationship between foods implicated in outbreaks and consumption frequencies in the general population, these studies have primarily focused on single food categories (1–4, 15, 19–24). In foodborne illness attribution studies, data from foodborne disease outbreaks have been used to evaluate the sources of illnesses, and results have suggested that certain food categories are more risky than others (20). However, these studies have not quantified the relative frequency at which those foods are consumed in relation to how frequently they are implicated in outbreaks. Understanding which foods are over- or underrepresented in outbreaks relative to their consumption rate among the U.S. population can help to elucidate the risk from consuming certain categories of foods. To our knowledge, no comprehensive evaluation has been conducted of the relationship between the distribution of foods implicated in outbreaks and the distribution of foods consumed by the general population. 
Our goal was to undertake this evaluation with data from outbreaks reported to the Foodborne Disease Outbreak Surveillance System (FDOSS) and from a national population-based survey of dietary habits in the United States: the National Health and Nutrition Examination Survey (NHANES).
MATERIALS AND METHODS
Data source: FDOSS
We obtained data for outbreaks occurring from 2005 through 2016 from the FDOSS of the Centers for Disease Control and Prevention (CDC). State, local, and territorial health department officials submit reports of outbreaks investigated by their agencies to FDOSS using a standard Internet form. CDC assigns implicated foods to 1 of 24 outbreak food categories (OFCs) (9, 25). We assigned outbreaks attributed to multi-ingredient foods to the multiple ingredient food OFC unless all ingredients belonged to the same food category, in which case we assigned the food to that category (e.g., a fruit platter with three types of fruits was assigned to the fruit category). Thus, all multi-ingredient foods with ingredients belonging to different food categories were assigned to the multiple ingredient food OFC, even when a specific contaminated ingredient had been implicated, to improve comparability with the treatment of multi-ingredient foods in the NHANES. We combined foods assigned to the “other” group (e.g., nondairy beverages, condiments, and sweeteners) and “unknown” group (i.e., those that could not be assigned to a single category due to insufficient details in the outbreak report) into an other-unknown OFC for analysis. We created new categories to capture multiple or unspecified land animals (e.g., mixed meats, unspecified meat, and multiple meats consumed) and multiple or unspecified plants (e.g., guacamole, pickles, and multiple plants consumed). We created two dairy categories to distinguish pasteurized and unpasteurized dairy products, and when the status was unknown we assumed that the dairy products were pasteurized. We excluded outbreaks that occurred in institutional settings (e.g., prisons, nursing homes, and hospitals) for a fair comparison with the NHANES noninstitutionalized U.S. population.
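The assignment rule for multi-ingredient outbreak foods can be illustrated with a minimal Python sketch. The function name, the category labels, and the treatment of a food that mixes an "other" or "unknown" ingredient with a named category are assumptions made for illustration; they are not taken from the FDOSS coding scheme.

```python
MULTI = "multiple ingredient"
OTHER_UNKNOWN = "other-unknown"

def assign_ofc(ingredient_categories):
    """Return one outbreak food category (OFC) for a food, given the
    category of each of its ingredients (hypothetical labels)."""
    # No category information at all: treat the food as unknown.
    if not ingredient_categories:
        return OTHER_UNKNOWN
    normalized = set()
    for category in ingredient_categories:
        if category in ("other", "unknown"):
            # "Other" and "unknown" are combined into one OFC.
            normalized.add(OTHER_UNKNOWN)
        elif category == "dairy (status unknown)":
            # Dairy of unknown status is assumed to be pasteurized.
            normalized.add("pasteurized dairy")
        else:
            normalized.add(category)
    # All ingredients share one category: keep that category.
    if len(normalized) == 1:
        return next(iter(normalized))
    # Ingredients span several categories: multiple ingredient OFC,
    # even if one contaminated ingredient was implicated.
    return MULTI
```

Under these assumptions, a fruit platter coded as ["fruits", "fruits", "fruits"] stays in the fruits category, whereas a food coded as ["chicken", "seeded vegetables"] goes to the multiple ingredient OFC.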
Data source: NHANES
We downloaded publicly available data on population food consumption patterns from the dietary recall component of the NHANES collected over the same period as the FDOSS data. The dietary recall data included foods consumed by the participants during each meal in the previous 24 h (10). A detailed description of the food is often provided, including the method of preparation (e.g., baked, fried, or grilled) and the form in which the food was consumed (e.g., dried, raw, or pickled) (13). We used data from six 2-year survey cycles: from the 2005 and 2006 cycle to the 2015 and 2016 cycle.
We also assigned each NHANES food to an OFC. Initial assignment was facilitated by a key word search and matching algorithm programmed in SAS (v. 9.3, SAS Institute, Cary, NC), which automatically assigned matches based on shared key words. We reviewed the accuracy of each automatic food match and manually assigned unmatched foods based on a standard list of representative foods assigned to each OFC (25). Pasteurization status was not specified in the NHANES; because unpasteurized dairy foods are rarely consumed by the U.S. population (<5%), we assumed that dairy products in the NHANES were pasteurized (5, 6). Using the combination food information in the NHANES, we assigned foods consumed in combination with other foods (e.g., chicken nuggets consumed with sweet and sour sauce) to the multiple ingredient OFC and any combination beverage items (e.g., coffee, made from ground consumed with cream substitute) to the other-unknown OFC.
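The key word matching step was programmed in SAS; the Python sketch below only illustrates the general idea. The key word lists and the rule that sends ambiguous or unmatched descriptions to manual review are assumptions, not the authors' algorithm.

```python
import re

# Hypothetical key word lists; the actual lists are not given in the text.
KEYWORDS = {
    "chicken": ["chicken"],
    "beef": ["beef", "steak"],
    "fruits": ["apple", "banana", "orange"],
}

def match_ofc(description):
    """Return the single OFC whose key words appear in the food
    description, or None when the food needs manual review."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    matches = [ofc for ofc, keys in KEYWORDS.items() if words & set(keys)]
    # Exactly one matching category: accept the automatic assignment.
    if len(matches) == 1:
        return matches[0]
    # Zero or several matches: leave for manual assignment.
    return None
```

Returning None for both unmatched and ambiguous descriptions mirrors the manual review described above, in which every automatic match was checked and unmatched foods were assigned by hand.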
To estimate the proportion of foods represented by each OFC consumed on an average day across the U.S. population, we adapted methods used to determine important sources of nutrients (12). We defined a population-weighted number of foods consumed by creating a new rescaled weight variable, as suggested in the NHANES dietary tutorial (11), considering the number of OFCs consumed in a given day and the multiple survey cycles. For each survey participant, we first determined the number of single-ingredient OFCs consumed on day 1 of the NHANES. Then, we multiplied these OFC counts by the NHANES day 1 dietary weight for that participant and divided by the number of survey cycles in the analysis: (day 1 weight × OFC count)/6 survey cycles. We then tabulated each OFC to estimate the total population-weighted proportion of foods consumed on an average day attributable to each OFC.
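The weighting formula above, (day 1 weight × OFC count)/6 survey cycles, can be sketched in Python as follows. The input layout (one tuple per participant holding the day 1 dietary weight and a dictionary of OFC counts) and the function name are assumptions chosen for illustration.

```python
from collections import defaultdict

def weighted_ofc_proportions(records, n_cycles=6):
    """records: iterable of (day1_weight, {ofc: count}) per participant.
    Returns the population-weighted share of foods in each OFC."""
    totals = defaultdict(float)
    for day1_weight, ofc_counts in records:
        for ofc, count in ofc_counts.items():
            # Rescaled weight: (day 1 weight x OFC count) / survey cycles.
            totals[ofc] += day1_weight * count / n_cycles
    grand_total = sum(totals.values())
    # Share of all weighted foods attributable to each OFC.
    return {ofc: total / grand_total for ofc, total in totals.items()}
```

For example, a participant with weight 6,000 who ate two fruits and one beef item, plus a participant with weight 3,000 who ate one fruit, yield weighted totals of 2,500 for fruits and 1,000 for beef, so fruits account for 2,500/3,500 of foods. Dividing by the number of cycles cancels out of the proportions but keeps the weighted totals on a per-cycle scale.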
Comparing food distributions
We assumed that the foods implicated in outbreaks reflected exposure during a single meal in a 24-h period. To compare foods in FDOSS fairly with foods consumed by the noninstitutionalized U.S. population represented by NHANES participants, we estimated the percentage of single-ingredient foods consumed on an average day by the population that was associated with each OFC. We calculated the associated 95% Korn-Graubard/Clopper-Pearson confidence intervals, accounting for the complex survey design and sampling weights (with the survey package in R, v. 3.6.1), as recommended by the NHANES analytic guidelines for estimating proportions and confidence limits of dichotomous variables (8).
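As a rough illustration of the interval type named above, the following Python sketch computes a plain Clopper-Pearson (exact binomial) interval by bisection. It is not the authors' R code: it assumes simple random sampling and omits the Korn-Graubard adjustment, which replaces the sample size with a design-based effective sample size.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided interval for a proportion with x successes in n trials."""
    def solve(f, target):
        # f is decreasing in p on [0, 1]; bisect for f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # Lower limit: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper
```

The sum in binom_cdf is adequate for small n; for survey-sized samples a beta quantile function from a statistics library would be the more practical route.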
We generated 1,000 bootstrap samples of outbreak foods, weighting our sampling by the corresponding number of reported outbreak illnesses associated with each food. We calculated the percentage of single-ingredient foods implicated in outbreaks attributed to each OFC, with associated 95% credibility intervals. We compared the NHANES confidence intervals and outbreak-based credibility intervals and considered nonoverlapping ones to be an indication of statistical significance.
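One way to read the bootstrap and comparison steps is sketched below in Python. The resampling details are assumptions: foods are drawn with replacement with probability proportional to their illness counts, and the interval is taken as the 2.5th to 97.5th percentile of the resampled shares. The text does not specify these choices, so the authors' implementation may differ.

```python
import random

def bootstrap_ofc_interval(foods, target_ofc, n_boot=1000, seed=0):
    """foods: list of (ofc, illnesses) pairs, one per outbreak food.
    Returns an approximate 95% percentile interval for the share of
    target_ofc among illness-weighted resamples."""
    rng = random.Random(seed)
    ofcs = [ofc for ofc, _ in foods]
    weights = [illnesses for _, illnesses in foods]
    shares = []
    for _ in range(n_boot):
        # Draw foods with replacement, weighted by outbreak illnesses.
        sample = rng.choices(ofcs, weights=weights, k=len(foods))
        shares.append(sample.count(target_ofc) / len(sample))
    shares.sort()
    return shares[int(0.025 * n_boot)], shares[int(0.975 * n_boot) - 1]

def intervals_overlap(a, b):
    """True when closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]
```

A category would then be flagged when intervals_overlap returns False for its outbreak-based interval and its NHANES confidence interval, which is the nonoverlap criterion described above.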
RESULTS
We identified 1,622 foods implicated in 10,969 outbreaks from 2005 to 2016. We excluded 97 foods implicated in 261 institutional outbreaks. The final data set included 1,525 foods implicated in 10,708 noninstitutional outbreaks: 614 (40.3%) were assigned to single food categories, 640 (42.0%) to the multiple ingredient OFC, and 271 (17.8%) to the other-unknown OFC. The top three most frequently implicated single food categories were seeded vegetables (4.2%), beef (3.7%), and fruits (3.6%).
A total of 12,508 foods were consumed during 229,831 meals by a representative sample of 54,042 people from the U.S. population who reported their day 1 24-h dietary recall information in the NHANES. Among the 12,508 foods consumed, 2,373 (29.0%) were assigned to single food categories, 8,501 (58.9%) were assigned to the multiple ingredient OFC, and 1,634 (12.2%) were assigned to the other-unknown OFC. The top three most frequently consumed single food categories were fruits (5.3%), pasteurized dairy (2.3%), and roots-underground (1.8%).
Among the aquatic animal OFCs (Fig. 1), three single-ingredient foods were implicated in outbreaks significantly more frequently than they were consumed by the U.S. population: fish (1.5% implicated versus 0.3% consumed), mollusks (1.1% versus 0.0%), and other aquatic animals (0.1% versus 0.0%). Among land animal food categories (Fig. 2), the major meat and poultry categories had foods implicated in outbreaks significantly more frequently than they were consumed by the U.S. population: beef (2.3% versus 0.6%), chicken (2.1% versus 0.6%), pork (1.7% versus 1.0%), and turkey (1.5% versus 0.2%). The egg category was also implicated in outbreaks significantly more frequently than it was consumed by the U.S. population (1.4% versus 0.2%), as were single-ingredient plant foods in the herbs (0.7% versus 0.0%), seeded vegetables (2.5% versus 0.3%), and sprouts (0.5% versus 0.1%) categories. In contrast, outbreak foods assigned to the pasteurized dairy category (0.3% versus 3.6%) were implicated in outbreaks significantly less frequently than they were consumed by the U.S. population. Similarly, four plant food categories (Fig. 3) had foods implicated in outbreaks significantly less frequently than they were consumed by the U.S. population: fruits (2.2% versus 7.7%), grains-beans (0.4% versus 1.3%), oils-sugars (0.1% versus 1.4%), and root-underground (0.2% versus 2.1%). Although multiple ingredient foods were implicated in outbreaks significantly less frequently than they were consumed by the U.S. population (19.1% versus 44.9%), multiple or unspecified land animals (0.7% versus 0.2%), multiple or unspecified plants (2.0% versus 0.4%), and other-unknown foods (55.8% versus 32.6%) were implicated in outbreaks significantly more frequently than they were consumed.
DISCUSSION
We identified single-ingredient foods in the aquatic animal (fish, mollusks, and other aquatic animals), land animal (beef, chicken, eggs, pork, and turkey), and plant (herbs, seeded vegetables, and sprouts) categories that were implicated in outbreaks significantly more often than expected based on the frequency of their consumption by the U.S. population, suggesting a higher risk of contamination from foods in these categories than from foods in other categories. In contrast, pasteurized dairy, fruits, grains and beans, oils and sugars, and roots and underground vegetables were less frequently implicated in outbreaks than they were consumed by the general population, suggesting a lower risk for these food categories. Our findings provide additional evidence to support food safety recommendations about specific foods suspected to be more likely to cause illnesses and bring new insight into the differences between the distribution of foods associated with outbreaks and the distribution of foods consumed by the general population. The consumption frequencies we report differ from those from other sources (e.g., the U.S. Department of Agriculture [USDA] Economic Research Service food availability per capita data system) because our denominator was foods rather than people, which allowed us to fairly compare consumption data to the way most outbreak exposures occur (i.e., a single contaminated food among all the foods a person consumed). Our food categories also were restricted to single-ingredient foods rather than including multi-ingredient foods.
Differences in how frequently foods are consumed versus implicated in outbreaks may reflect differences in the likelihood of contamination, which could be due to differences in production, processing, and preparation, as found in previous studies. Heiman et al. (18) found that foods belonging to the beef and vegetable row crop categories combined were more important food vehicles for illness caused by Shiga toxin–producing Escherichia coli (STEC) than were foods from other categories. Hsi et al. (20) found that poultry had the highest per serving risk for Salmonella illness, and beef had the highest per serving risk of STEC O157 illness compared with other meat categories. Fecal matter remaining on animal hides and skin during slaughter and processing and improper processing practices can increase the risk of meat contamination. For example, the investigation of a 2013 to 2014 Salmonella Heidelberg infection outbreak that caused 634 illnesses in 29 states and Puerto Rico identified chicken products from three production establishments owned by a single company as the source of the outbreak, suggesting a common upstream source in the production or processing chain (16). With respect to risks introduced during food preparation, a study revealed that improper sanitation of cooking surfaces and a lack of knowledge of appropriate cooking temperature were common among 448 U.S. restaurants surveyed (3). Consumer preferences for certain raw and undercooked foods (e.g., rare steak, runny eggs, and sushi) (5) also may contribute to disproportionate numbers of outbreaks associated with meat, egg, and fish categories.
The approach used in this study is complementary to but does not replace root cause analysis. However, when setting priorities to reduce outbreak-associated illnesses, consideration should be given to food categories that are overrepresented in outbreaks compared with the frequency of consumption of these foods. Although some food categories were not implicated in outbreaks significantly more frequently than they were consumed, these foods may still be important sources of foodborne illnesses, both in outbreaks and in sporadic illnesses. For example, foods in the vegetable row crop category are estimated to be responsible for a high proportion of illnesses from outbreaks (7, 14). Outbreak investigations that consider a broad range of foods remain critical for identifying new foods that can be the source of an outbreak, sometimes causing illness in many people, even when these foods are not frequently implicated in outbreaks.
One limitation of this study was the inability to determine the pasteurization status of dairy foods in the NHANES. Our assumption that all dairy foods in that survey were pasteurized enabled us to demonstrate that pasteurized dairy foods were implicated less often in outbreaks than they were consumed, but because NHANES does not have a separate category for unpasteurized dairy, we were not able to compare the risks due to pasteurized versus unpasteurized dairy. Because the pasteurization status of many products is not listed in outbreak data and we assumed that products of unknown status were pasteurized, the proportion of outbreaks associated with pasteurized dairy may have been overestimated.
Our study had several other limitations. Not all outbreaks are investigated and reported, many outbreak reports do not include an implicated food vehicle, and food category–specific biases are likely. For example, home cross-contamination of produce is less likely to result in a detected outbreak than is contamination by an ill food worker (17). Specific retail settings are more frequently associated with high-risk food preparation practices than others (22), and other settings may be more frequently associated with specific meals (e.g., restaurant meals may be more frequently consumed for lunch and dinner). Limited or incomplete implicated food information may lead to incorrect assignment of OFCs. For multiple ingredient outbreak foods for which the causative ingredient is known, it would be most appropriate to assign the outbreak to the food category of that ingredient. However, for multiple ingredient NHANES foods, there is not a similar “most appropriate” ingredient to use in food category assignment, especially because the recipes of multi-ingredient NHANES foods may be different from those related to outbreaks. For this reason, we grouped all multi-ingredient foods into a multiple ingredient OFC regardless of the implicated ingredients because this approach helped ensure a fair comparison of OFC distributions in outbreaks and as consumed by the general population. As is common in analyses of outbreak data (9), nearly 75% of the foods in our analysis were assigned to the multiple ingredient OFC or other-unknown OFC, and better approaches for handling these categories would permit a more thorough examination of the differences between the foods implicated in outbreaks and those consumed by the general U.S. population.
To our knowledge, this is the first study in which a broad range of foods consumed by the U.S. population was compared with food categories frequently implicated in foodborne illness outbreaks, providing a better understanding of which foods are over- and underrepresented in outbreaks relative to their consumption frequency. These findings could assist with setting priorities for focused interventions used to reduce outbreaks of foodborne illnesses.
ACKNOWLEDGMENTS
The authors thank the members of the Interagency Food Safety Analytics Collaboration, a tri-agency collaboration of the CDC, U.S. Food and Drug Administration, and the USDA Food Safety and Inspection Service for their ideas and feedback on this analysis. The authors also thank the National Outbreak Reporting System team and the National Center for Health Statistics staff for their guidance in the management and interpretation of the data in the FDOSS and NHANES. The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the CDC.