ABSTRACT
Technological advancements, particularly in the field of geographic information systems (GIS), have made it possible to predict the likelihood of foodborne pathogen contamination in produce production environments using geospatial models. Yet, few studies have examined the validity and robustness of such models. This study was performed to test and refine the rules associated with a previously developed geospatial model that predicts the prevalence of Listeria monocytogenes in produce farms in New York State (NYS). Produce fields for each of four enrolled produce farms were categorized into areas of high or low predicted L. monocytogenes prevalence using rules based on a field's available water storage (AWS) and its proximity to water, impervious cover, and pastures. Drag swabs (n = 1,056) were collected from plots assigned to each risk category. Logistic regression, which tested the ability of each rule to accurately predict the prevalence of L. monocytogenes, validated the rules based on water and pasture. Samples collected near water (odds ratio [OR], 3.0) and pasture (OR, 2.9) showed a significantly increased likelihood of L. monocytogenes isolation compared to that for samples collected far from water and pasture. Generalized linear mixed models identified additional land cover factors associated with an increased likelihood of L. monocytogenes isolation, such as proximity to wetlands. These findings validated a subset of previously developed rules that predict L. monocytogenes prevalence in produce production environments. This suggests that GIS and geospatial models can be used to accurately predict L. monocytogenes prevalence on farms and can be used prospectively to minimize the risk of preharvest contamination of produce.
INTRODUCTION
Fresh produce presents a unique food safety challenge due to the absence of a kill step between harvest and consumption. An increase in recalls and reported outbreaks linked to fresh produce over the past decade (1–3) has been associated with consumer avoidance of products linked to outbreaks (4, 5). This trend can negatively affect growers and the produce industry (4–6). For example, following a 2011 listeriosis outbreak in the United States associated with fresh cantaloupe (7), cantaloupe consumption dropped 53% nationwide (6). The prevention of produce contamination in production environments is therefore a concern for growers, the produce industry, and public health professionals. To develop effective prevention strategies, it is important to understand the ecological processes and environmental factors that affect foodborne pathogen prevalence in produce production environments. Technological advancements, such as geographic information systems (GIS), have the potential to drastically improve our ability to examine these processes and to develop novel tools for ensuring the safety of fresh produce.
Numerous studies (8–21) have examined the ecology of foodborne pathogens in agricultural environments, and several (22–27) have used GIS and geospatial analysis. For example, Chapin et al. (26) used GIS to organize and extract remotely sensed data to show that different species of Listeria occupy distinct ecological niches in agricultural and natural environments. Despite a number of studies that have used GIS to extract or visualize remotely sensed data (22–27), only one study (25) has used GIS to predict the distribution and prevalence of a specific foodborne pathogen in produce production environments. This study, by Strawn et al. (25), used classification tree analysis (CART) to develop a geospatial model that predicts the prevalence of Listeria monocytogenes in New York State (NYS) produce fields. This model consisted of a set of hierarchical rules based on, in order, the proximity of the fields to surface water, temperature, the proximity of fields to impervious cover, available water storage (AWS), and the proximity of fields to pasture (25). Studies in other disease systems (e.g., Lyme disease and West Nile virus) have not only developed (28–34) but have also validated (35–40) geospatial predictive risk models. These validation studies (35–40) demonstrate the utility of geospatial risk models, like the model developed by Strawn et al. (25), to accurately and prospectively predict pathogen prevalence. Additionally, these studies (37, 39, 40) used the output of their models to prioritize and identify risk management strategies, suggesting that geospatial models can also be integrated with on-farm food safety plans to develop targeted approaches to disease prevention.
Thus, the purpose of this study was to (i) validate the ability of the model developed by Strawn et al. (25) to predict on-farm areas with a significantly higher or lower prevalence of L. monocytogenes and (ii) identify additional land cover factors that were associated with L. monocytogenes isolation from produce production environments. This research also aimed to increase our understanding of foodborne pathogen ecology and to develop targeted mitigation strategies for risk management in produce production environments (e.g., tailored on-farm food safety approaches). While multiple pathogens can contaminate produce at the production level, we chose L. monocytogenes as a model organism to examine contamination at this level, due to its high prevalence in NYS produce production environments (11, 22, 23, 25). We recognize that the model developed by Strawn et al. (25) predicts the prevalence of L. monocytogenes; however, since the presence of a Listeria sp. is an indicator for L. monocytogenes, we also tested the ability of the model to predict Listeria species prevalence.
MATERIALS AND METHODS
Study design. A cross-sectional study was conducted over a 6-week period in July and August of 2014 on four produce farms in NYS. The farms were located in three regions of NYS: western New York (n = 2), the Hudson Valley (n = 1), and the Capital District (n = 1). The farms were not selected based on geographic location or management practices; each farm was enrolled based on the willingness of the grower to participate.
All fields within a farm were classified into four high-risk categories and one low-risk category (see Fig. 1) based on a set of hierarchical rules that were adapted from the study by Strawn et al. (25). The rules were based on a field's proximity to water, impervious cover, and pasture and a field's AWS (see Fig. S1 in the supplemental material and “Geographic data and prediction of field risk” below for more information). All field areas classified into a given category (e.g., areas within 37.5 m of water) were then divided into 5-by-5-m plots, and a subset of plots was randomly selected from each category for sampling. One drag swab was collected per plot. The methods used in this study were similar to those of Strawn et al. (25) to avoid bias between studies. However, unlike Strawn et al. (25), whose unit of analysis was the field and who collected drag swab, composite soil, water, and fecal samples, we used the plot (i.e., subfield) as the unit of analysis and collected only drag swabs.
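The plot-based sampling described above can be illustrated with a minimal sketch. It assumes a rectangular strip of land in one risk category; the function names, the strip dimensions, and the random seed are hypothetical and are not taken from the study, which performed this step in ArcGIS.

```python
import random

PLOT_SIZE_M = 5.0  # plot edge length stated in the text (5-by-5-m plots)


def plots_in_rectangle(width_m, height_m, size_m=PLOT_SIZE_M):
    """Return the lower-left corners of all whole plots that fit in a rectangle."""
    n_cols = int(width_m // size_m)
    n_rows = int(height_m // size_m)
    return [(col * size_m, row * size_m)
            for row in range(n_rows) for col in range(n_cols)]


def select_plots(plots, n_samples, seed=0):
    """Randomly choose plots without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(plots, min(n_samples, len(plots)))


# Hypothetical 50 m x 20 m strip belonging to one risk category.
category_plots = plots_in_rectangle(50.0, 20.0)
chosen = select_plots(category_plots, n_samples=8)
```

Sampling without replacement keeps each selected plot unique, which matches the one-swab-per-plot design.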
Fig. 1. Map of predicted prevalence of L. monocytogenes on the Homer C. Thompson Vegetable Research Farm at Cornell University; the expected prevalence of L. monocytogenes is listed in parentheses in the key. Note that this map is not based on any of the farms included in this study for confidentiality reasons. Map created using ArcGIS software, and the base map is from ArcGIS (ESRI [all rights reserved]).
Geographic data and prediction of field risk. All manipulations of geographic data were performed in ArcGIS (version 10.2.2; Environmental Systems Research Institute, Redlands, CA [41]). AWS data were obtained from the U.S. Department of Agriculture (http://datagateway.nrcs.usda.gov/GDGOrder.aspx). Land cover data for NYS for 2006 were downloaded and extracted from the National Land Cover Database (NLCD) (http://www.mrlc.gov/nlcd06_data.php). Road data were downloaded from the Cornell University Geographic Information Repository (cugir.mannlib.cornell.edu). Hydrologic data were downloaded from the U.S. Geological Survey National Hydrography Map (http://viewer.nationalmap.gov/viewer/nhd.html?p=nhd). Maps of each farm were obtained from the grower, uploaded into ArcGIS, and georeferenced. If the image could not be accurately georeferenced, a farm map was drawn in ArcGIS by identifying field boundaries in satellite images using the original PDF file of the farm fields as a reference.
The predicted field risk for L. monocytogenes was based on a hierarchical model developed by Strawn et al. (25) using classification tree analysis. Briefly, we adapted that model by removing the meteorological factors so the model included only spatial factors (i.e., proximity to water, proximity to impervious cover, AWS, and proximity to pastures; see Fig. S1 in the supplemental material). This adapted model is referred to as the CART model throughout this article. The CART model had four splits/rules, which, in order, are the water rule, the impervious cover rule, the AWS rule, and the pasture rule (see Fig. S1 in the supplemental material).
Before dividing each farm into areas of high or low predicted L. monocytogenes prevalence, the relevant shapefiles for each farm were generated using ArcGIS. Hydrology shapefiles were buffered to 39.5 m, road shapefiles were buffered to 19.5 m, and pasture shapefiles were buffered to 62.5 m. Roads and waterways were buffered by an additional 10 m and 2 m, respectively, to give these features a realistic width. Additionally, the AWS data were converted from raster to shapefile format. The AWS shapefile was then split into (i) areas with AWS of >4.2 cm and (ii) areas with AWS ≤4.2 cm (i.e., high- and low-AWS areas, respectively). The NLCD raster was also converted to shapefile format and split, creating separate files for each land cover class (e.g., pasture, grasslands, and woody wetlands). The NLCD shapefiles for developed areas were merged with the road map to create an impervious-cover shapefile. Similarly, all NLCD shapefiles corresponding to wetlands and forests were merged to create a single wetlands shapefile and a single forest shapefile, respectively.
After creation of the relevant shapefiles, each farm was categorized into areas of high or low predicted L. monocytogenes prevalence according to the splits in the CART model (see Fig. S1 in the supplemental material). For example, the buffered hydrology shapefile corresponded to all areas with a high predicted L. monocytogenes prevalence according to the water rule. Similarly, all areas that did not have a high predicted prevalence according to the water rule but were included in the impervious-cover shapefile corresponded to areas with a high predicted prevalence according to the impervious cover rule.
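The hierarchical categorization can be sketched as a short rule cascade. This is an illustration under stated assumptions, not the authors' ArcGIS workflow: the function name and the `high_aws_is_high_risk` flag are hypothetical, the 9.5-m impervious-cover distance is inferred by subtracting the 10 m added for road width from the 19.5-m buffer, and the direction of the AWS rule is left as a parameter because this section does not state which side of the 4.2-cm split is high risk.

```python
WATER_M = 37.5      # "areas within 37.5 m of water" (from the text)
IMPERVIOUS_M = 9.5  # assumption: 19.5 m road buffer minus the 10 m added for width
PASTURE_M = 62.5    # pasture buffer distance (from the text)
AWS_CM = 4.2        # AWS split point (from the text)


def classify_plot(dist_water_m, dist_impervious_m, aws_cm, dist_pasture_m,
                  high_aws_is_high_risk=True):
    """Return the first rule, in model order, that flags the plot; else 'low'."""
    if dist_water_m <= WATER_M:
        return "water"
    if dist_impervious_m <= IMPERVIOUS_M:
        return "impervious cover"
    # Assumption: which side of the AWS split is high risk is set by the flag.
    if (aws_cm > AWS_CM) == high_aws_is_high_risk:
        return "AWS"
    if dist_pasture_m <= PASTURE_M:
        return "pasture"
    return "low"
```

Because the rules are evaluated in order, a plot near both water and pasture is assigned to the water category only, which mirrors the way later rules apply only to areas not already flagged by earlier ones.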
To assess additional risk factors, the distance was calculated from the center of each 5-by-5-m sampling plot to land covers of interest (i.e., barren land, grassland, forest, impervious cover, roads, scrubland, water, and wetlands). The split NLCD shapefiles were used to calculate the distance to barren land, grassland, and scrubland. Similarly, the road and hydrology shapefiles were used to calculate the distance to roads and water. Last, the merged forest, wetlands, and impervious-cover shapefiles were used to calculate the distance to those features.
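The distance calculation can be illustrated with a small planar geometry sketch. It assumes projected coordinates in meters and represents each land cover feature by its outline as a list of vertices; these representations and the function names are hypothetical, and the study itself used ArcGIS tools for this step.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_feature(center, outline):
    """Minimum distance from a plot center to a feature outline (vertex list)."""
    return min(point_segment_distance(center, a, b)
               for a, b in zip(outline, outline[1:]))


def plot_center(lower_left, size_m=5.0):
    """Center of a square plot given its lower-left corner."""
    x, y = lower_left
    return (x + size_m / 2.0, y + size_m / 2.0)


# Hypothetical straight stream along the x axis and one plot north of it.
stream = [(0.0, 0.0), (100.0, 0.0)]
d_water = distance_to_feature(plot_center((10.0, 30.0)), stream)
```

This sketch measures distance to the outline only, so a plot center lying inside a polygon feature would need separate handling.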
Sample collection and preparation. The samples were collected and prepared as previously described by Strawn et al. (25). Briefly, latex gloves (Nasco, Fort Atkinson, WI) were worn and changed for each sample collected. For each plot, a premoistened drag swab (30 ml of buffered Listeria enrichment broth [Becton Dickinson, Franklin Lakes, NJ] in a sterile Whirl-Pak bag) was dragged around the perimeter and diagonals of the plot for 3 to 5 min. All samples were transported on ice, stored at 4°C, and processed within 24 h of collection.
Bacterial enrichment and isolation. Listeria species and L. monocytogenes enrichment and isolation were performed as previously described (25). Briefly, each sample was diluted 1:10 with buffered Listeria enrichment broth (Becton Dickinson) and then incubated at 30°C. After 4 h, Listeria selective enrichment supplement (Oxoid, Cambridge, United Kingdom) was added to each enrichment. After being incubated for 24 and 48 h, 50 μl of each enrichment was streaked onto L. monocytogenes plating medium (LMPM) (Biosynth International, Itasca, IL) and modified Oxford agar (MOX) (Becton Dickinson); the plates were then incubated for 48 h at 35 and 30°C, respectively. Following incubation, up to four presumptive Listeria colonies were substreaked from MOX to LMPM and incubated at 35°C for 48 h. From all LMPM plates, up to four presumptive Listeria colonies were then substreaked onto brain heart infusion (BHI) plates (Becton Dickinson) and then incubated at 37°C for 24 h. The species and sigB allelic type of one presumptive Listeria colony per sample were determined by PCR amplification and sequencing of the partial sigB gene, as previously described (42–44).
Positive and negative controls were processed in parallel with the field samples. L. monocytogenes FSL R3-0001 (45) and uninoculated enrichment medium were used as the positive and negative controls, respectively. All isolates were preserved at −80°C, and the isolate information can be found on the Food Microbe Tracker website.
Statistical analysis. All statistical analyses were performed in R (version 3.1; R Core Team, Vienna, Austria). The frequency and prevalence of L. monocytogenes were calculated for each predicted risk area for each rule. Although the outcome of the CART model was the predicted prevalence of L. monocytogenes, all statistical analyses were performed for both (i) L. monocytogenes and (ii) Listeria spp. (including L. monocytogenes), since Listeria spp. are more common than L. monocytogenes in NYS produce production environments and are often used as an index for L. monocytogenes.
In order to test the ability of each rule to accurately predict the prevalence of Listeria spp. and L. monocytogenes in produce fields and to validate the CART model, multivariable logistic regression analyses were performed using the lme4 package (46). The multivariable model originally contained all four rules but was reduced using backward selection. The outcome for the multivariable model was the presence of Listeria spp. or L. monocytogenes. Farm was included as a random effect.
As the multivariable model used to validate the algorithm adapted from Strawn et al. (25) contained only four factors (i.e., AWS and proximity to surface water, impervious cover, and pasture), univariable generalized linear mixed models (GLMM) (46) were developed to examine the effect of additional land covers (i.e., proximity to barren land, forests, grassland, roads, scrubland, and wetlands) on the likelihood of Listeria species and L. monocytogenes isolation. Since the CART model was based on a binary interpretation of AWS and proximity to water, impervious cover, and pasture, univariable GLMMs were also developed to examine the relationship between these four factors, as continuous variables, and Listeria species and L. monocytogenes prevalence. In this and all other GLMMs performed for this study, farm was included as a random effect, and the outcome was the prevalence of Listeria spp. or L. monocytogenes. All factors that were significantly associated with the isolation of Listeria spp. or L. monocytogenes were tested for correlation with all other factors that were found to be significant by univariable analysis.
A multivariable GLMM was also developed de novo (i.e., not based on the rules reported by Strawn et al. [25]) to identify the most important land cover factors associated with Listeria species and L. monocytogenes isolation from drag swab samples. Factors that were not correlated and were significant by univariable analysis were considered candidate factors for inclusion in the multivariable model.
Predictive models based on the GLMMs for L. monocytogenes were then applied in a GIS platform to generate predictive maps of L. monocytogenes prevalence at the subfield level for comparison with the map that was developed using the CART model (Fig. 1). Predictive risk maps were developed by inputting the univariable and multivariable GLMMs into ArcGIS. The Homer C. Thompson Vegetable Research Farm at Cornell University, Ithaca, NY, was used to develop these maps to ensure the confidentiality of the commercial growers enrolled in our study.
RESULTS
The overall prevalence of Listeria spp. and L. monocytogenes for drag swabs collected from NYS produce farms was 20% and 12%, respectively. Overall, Listeria spp. (including L. monocytogenes) were isolated from 20% (208/1,056) of the samples. L. monocytogenes was isolated from 12% (128/1,056) of the samples, Listeria innocua was isolated from 4.0% (42/1,056) of the samples, Listeria seeligeri was isolated from 2.0% (21/1,056) of the samples, and Listeria welshimeri was isolated from 1.6% (17/1,056) of the samples.
Overall, the prevalence of Listeria spp. was greater for all field areas with a high predicted prevalence of L. monocytogenes isolation than that in field areas with a low predicted prevalence of Listeria species (Table 1 and Fig. 2). For example, the prevalence of Listeria species was 26% (51/195) in samples collected from areas with a high predicted prevalence according to the water rule and 18% (157/861) in samples collected from areas with a low predicted prevalence according to the water rule (Table 1 and Fig. 2).
Table 1. Frequency and prevalence of Listeria species-positive and L. monocytogenes-positive samples for farm fields that had either a high or a low predicted risk of L. monocytogenes isolation based on land cover factors
Fig. 2. Frequency and prevalence of positive Listeria species samples for farm fields that had either a high or a low predicted prevalence of L. monocytogenes isolation based on a hierarchical predictive risk model.
The prevalence of L. monocytogenes was greater for all field areas with a high predicted prevalence of L. monocytogenes isolation than that in field areas with a low predicted prevalence according to the water, pasture, and AWS rules (Table 1 and Fig. 3). For example, the prevalence of L. monocytogenes was 22% (43/195) in samples collected from areas with a high predicted prevalence according to the water rule and 10% (85/861) in samples collected from areas with a low predicted prevalence according to the water rule (Table 1 and Fig. 3).
Fig. 3. Frequency and prevalence of positive L. monocytogenes samples for farm fields that had either a high or a low predicted prevalence of L. monocytogenes isolation based on a hierarchical predictive risk model.
Rules based on surface water and pasture proximity accurately predict L. monocytogenes prevalence in environmental samples collected from NYS produce production environments. Logistic regression was performed to test the ability of each rule to accurately predict L. monocytogenes prevalence in NYS produce production environments. Logistic regression analysis showed that only the water and pasture rules accurately predicted the prevalence of L. monocytogenes in NYS produce production environments (Table 2). Samples collected from field areas that had a high predicted prevalence of L. monocytogenes isolation by the water rule had an increased odds of L. monocytogenes isolation (odds ratio [OR], 3.0; 95% confidence interval [CI], 2.0, 4.6) compared to that for samples collected from field areas that had a low predicted prevalence. Samples collected from field areas that had a high predicted prevalence of L. monocytogenes by the pasture rule had an increased odds of L. monocytogenes isolation (OR, 2.9; 95% CI, 1.4, 6.0) compared to that for samples collected from field areas that had a low predicted prevalence.
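For orientation, a crude (unadjusted) odds ratio can be computed directly from the water-rule counts reported above for L. monocytogenes (43/195 in high-prevalence areas and 85/861 in low-prevalence areas). The sketch below uses a Wald confidence interval and a hypothetical function name. It is not the analysis used in the study: the reported OR of 3.0 comes from a multivariable model with farm as a random effect, so the crude value is not expected to match it.

```python
import math


def crude_odds_ratio(pos_high, n_high, pos_low, n_low, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2-by-2 table."""
    a, b = pos_high, n_high - pos_high  # positives / negatives, high-risk areas
    c, d = pos_low, n_low - pos_low     # positives / negatives, low-risk areas
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Water-rule counts for L. monocytogenes as reported in the Results.
water_or, water_lo, water_hi = crude_odds_ratio(43, 195, 85, 861)
```

The crude estimate ignores clustering of samples within farms and the other rules in the model, which is why a mixed model is the more appropriate tool for the validation itself.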
Table 2. Results of multivariable analyses built using backward regression (i.e., only factors with P ≤ 0.05 were retained) that tested previously identified rules to accurately predict the effect of different binary land cover factors (e.g., either far away from or close to water) on the likelihood of Listeria species and L. monocytogenes isolation
While the outcome of the CART model was L. monocytogenes prevalence, the ability of the model to predict Listeria species prevalence was also validated because Listeria spp. are more common than L. monocytogenes alone and, as a result, the findings based on Listeria spp. are more robust. Multivariable logistic regression showed that only the water rule was found to accurately predict the prevalence of Listeria spp. in NYS produce production environments (Table 2). Samples collected from field areas that had a high predicted prevalence of L. monocytogenes by the water rule had an increased odds of Listeria species isolation (OR, 1.6; 95% CI, 1.1, 2.4) compared to that from samples collected from field areas that had a low predicted prevalence of Listeria species.
Proximity to wetlands and scrublands was associated with altered likelihood of L. monocytogenes isolation from produce production environments in NYS. As the multivariable model used to validate the CART model (25) contained only four factors, GLMMs were developed to identify additional land cover factors that were associated with the isolation of L. monocytogenes from NYS produce production environments. Of the nine land cover factors that were evaluated, six features (i.e., proximity to forest, grasslands, pasture, scrublands, water, and wetlands) were significantly associated with L. monocytogenes-positive samples by univariable analysis (Table 3). For example, for a 100-m increase in the distance of a sampling site from forests, the likelihood of L. monocytogenes isolation decreased by 14% (OR, 0.86; 95% CI, 0.74, 1.0). Similarly, for a 100-m increase in the distance of a sampling site from surface water, the likelihood of L. monocytogenes isolation decreased by 23% (OR, 0.77; 95% CI, 0.66, 0.90; Fig. 4).
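The percentages quoted above follow directly from the odds ratios: an OR of 0.77 per 100 m corresponds to a (1 − 0.77) × 100 = 23% decrease. The sketch below, with a hypothetical function name, also rescales an OR to other distance increments, assuming the log odds change linearly with distance as in a logistic model with a continuous covariate. Strictly, the change applies to the odds of isolation rather than to the probability.

```python
def percent_change_in_odds(or_per_100m, distance_m=100.0):
    """Percent change in the odds of isolation for a given increase in distance."""
    # ORs multiply on the odds scale, so rescaling to another distance
    # raises the per-100-m OR to the power (distance / 100).
    scaled_or = or_per_100m ** (distance_m / 100.0)
    return (scaled_or - 1.0) * 100.0


# Univariable OR for distance to surface water reported in the text.
change_100m = percent_change_in_odds(0.77)       # per 100 m
change_50m = percent_change_in_odds(0.77, 50.0)  # per 50 m, same model
```

Note that the change over 50 m is not simply half of the change over 100 m, because the effect is multiplicative on the odds scale.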
Table 3. Results of univariable analyses that tested the effect of different land cover factors, treated as continuous variables, on the likelihood of Listeria species and L. monocytogenes isolation
Fig. 4. True prevalence (bars) and predicted prevalence (line) of Listeria species-positive samples (A) and L. monocytogenes-positive samples (B) based on mixed models that included proximity to water as a risk factor. True prevalence was calculated for 50-m bins (e.g., all samples that were between 0 and 50 m from water went into the first bin); the sample size for each bin is noted at the bottom of each column. Among five samples collected >650 m away from water, two were Listeria species positive and none were L. monocytogenes positive. Prevalence is reported as a decimal.
To identify the most important land cover factors associated with L. monocytogenes isolation from produce production environments, a multivariable GLMM was developed. The six factors that were found to be significant by univariable analysis were included as candidate factors. In the final GLMM, only three land cover features were retained (see Table S1 in the supplemental material), and no significant interactions (i.e., P < 0.05) were observed between any variables in the model. For a 100-m increase in the distance of a sampling site from forests, the likelihood of L. monocytogenes isolation decreased by 13% (OR, 0.87; 95% CI, 0.76, 0.99). For a 100-m increase in the distance of a sampling site from scrubland, the likelihood of L. monocytogenes isolation decreased by 6% (OR, 0.94; 95% CI, 0.88, 1.0). Last, for a 100-m increase in the distance of a sampling site from water, the likelihood of L. monocytogenes isolation decreased by 15% (OR, 0.85; 95% CI, 0.76, 0.95).
Predictive risk maps (Fig. 5) were then developed using the univariable and multivariable GLMMs for L. monocytogenes described above (Table 3; see also Table S1 in the supplemental material). The maps were developed to allow for comparisons with the map based on the CART model (Fig. 1) and as a proof of concept to assess whether the multivariable GLMM for L. monocytogenes could be used to predict L. monocytogenes prevalence at the subfield level. The resulting map shows that a multivariable GLMM can be used to generate a map of L. monocytogenes prevalence and that this map is at a finer scale than that of maps based on CART analyses.
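How a fitted model translates into a map can be sketched as follows: each grid cell's distances to the relevant land covers are combined on the log-odds scale and converted to a predicted prevalence. This is a simplified illustration, not the ArcGIS procedure used in the study. It uses fixed effects only (the farm random effect is set to zero), takes two of the reported odds ratios per 100 m as inputs, and relies on a hypothetical baseline prevalence because the model intercept is not reported in this section; the function and variable names are likewise hypothetical.

```python
import math


def predicted_prevalence(distances_m, odds_ratios_per_100m, baseline_prevalence):
    """Logistic prediction from fixed effects only (random effect set to 0)."""
    log_odds = math.log(baseline_prevalence / (1.0 - baseline_prevalence))
    for factor, or_100m in odds_ratios_per_100m.items():
        log_odds += math.log(or_100m) * distances_m[factor] / 100.0
    return 1.0 / (1.0 + math.exp(-log_odds))


# ORs per 100 m for water and scrubland as reported for the multivariable GLMM.
ORS = {"water": 0.85, "scrubland": 0.94}
BASELINE = 0.30  # hypothetical prevalence at distance 0 from every feature


def prevalence_grid(cells):
    """Apply the prediction to a grid; each cell is a dict of distances in meters."""
    return [[predicted_prevalence(cell, ORS, BASELINE) for cell in row]
            for row in cells]


# Hypothetical 1-by-2 grid: one cell next to both features, one cell far away.
grid = prevalence_grid([[{"water": 0.0, "scrubland": 0.0},
                         {"water": 400.0, "scrubland": 300.0}]])
```

Because distance enters as a continuous variable, predictions vary smoothly from cell to cell, which is what gives a GLMM-based map its finer scale relative to a rule-based map with a few discrete categories.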
Fig. 5. Map of predicted prevalence of L. monocytogenes for the Homer C. Thompson Vegetable Research Farm at Cornell University based on the results of (i) univariable generalized linear mixed models in which proximities to scrubland (A), water (B), and wetlands (C) were included as risk factors and (ii) a multivariable generalized linear mixed model in which proximities to scrubland, water, and wetlands were included as risk factors (D). Note that this map is not based on any of the farms included in this study for confidentiality reasons. Maps were created using ArcGIS software, and base maps are from ArcGIS (ESRI [all rights reserved]).
Proximity to forests and scrublands was associated with an increased likelihood of Listeria species isolation from produce production environments in NYS. As for L. monocytogenes, GLMMs were also developed to identify additional land cover factors that were associated with the isolation of Listeria spp. from NYS produce production environments. Of the nine land cover factors that were evaluated, five features (i.e., proximity to forest, pasture, scrublands, water, and wetlands) were significantly associated with Listeria-positive samples by univariable analysis (Table 3). For example, for a 100-m increase in the distance of a sampling site from forests, the likelihood of Listeria species isolation decreased by 16% (OR, 0.84; 95% CI, 0.74, 0.95). Similarly, for a 100-m increase in the distance of a sampling site from surface water, the likelihood of Listeria species isolation decreased by 15% (OR, 0.85; 95% CI, 0.76, 0.95; Fig. 4). No strong correlations (i.e., correlation coefficient of >0.5) were observed between any of the factors that were significant by univariable analysis.
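A correlation screen of this kind can be sketched as a pairwise check across candidate factors. The example assumes Pearson's coefficient and treats an absolute value above 0.5 as strong; the type of coefficient is not stated in the text, and the distances below are made up purely for illustration.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongly_correlated_pairs(columns, cutoff=0.5):
    """Return (name_a, name_b, r) for every pair with |r| above the cutoff."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > cutoff:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up distances (m) for five plots, for illustration only.
distances = {
    "water": [10.0, 40.0, 90.0, 150.0, 300.0],
    "wetlands": [20.0, 55.0, 100.0, 170.0, 310.0],
    "scrubland": [200.0, 50.0, 120.0, 30.0, 90.0],
}
pairs = strongly_correlated_pairs(distances)
```

Pairs flagged by such a screen would not both be offered as candidates to the same multivariable model, since collinear predictors make individual effect estimates unstable.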
To identify the most important land cover factors associated with Listeria species isolation from produce production environments, a multivariable GLMM was developed. The five factors that were found to be significant by univariable analysis were included as candidate factors. In the final GLMM, only three land cover factors were retained (see Table S1 in the supplemental material), and no significant interactions were observed between the variables in the final model. For a 100-m increase in the distance of a sampling site from scrubland, the likelihood of Listeria species isolation decreased by 14% (OR, 0.86; 95% CI, 0.79, 0.93). For a 100-m increase in the distance of a sampling site from water, the likelihood of Listeria species isolation decreased by 24% (OR, 0.76; 95% CI, 0.65, 0.89). Last, for a 100-m increase in the distance of a sampling site from wetlands, the likelihood of Listeria species isolation decreased by 9% (OR, 0.91; 95% CI, 0.83, 0.99).
DISCUSSION
The primary objectives of this study were (i) to validate previously developed geospatial rules that predicted areas of significantly higher or lower prevalence of L. monocytogenes and (ii) to identify additional land cover factors that may be associated with an increased or decreased likelihood of L. monocytogenes isolation in produce production environments. Our study validated two of the four rules (i.e., the water and pasture rules) that comprised the CART model (25). Additionally, among land cover factors that were not included in the original CART model but were tested here, proximity to scrubland and proximity to wetlands were found to be significantly associated with an increased likelihood of L. monocytogenes isolation. These findings suggest that on-farm produce safety is complicated by the ecological context unique to each field and by the scale (e.g., the farm, field, and subfield levels) at which prevalence is assessed. Thus, it is essential to have tools that allow growers to account for both ecological context and scale when developing on-farm produce safety plans. The validation of the water and pasture rules in this study demonstrates the application of geospatial models for prospective and accurate prediction of pathogen prevalence on produce farms, suggesting that GIS is a promising tool for food safety.
Geospatial models have the ability to accurately predict the likelihood of L. monocytogenes isolation from produce production environments. In this study, proximity to surface water and proximity to pasture were significantly associated with L. monocytogenes isolation from produce production environments by logistic regression. These findings validated two of the four rules from the CART model adapted from the study by Strawn et al. (25). These findings were also consistent with other studies conducted on L. monocytogenes in NYS agricultural environments (22, 23, 26) and on L. monocytogenes and other foodborne pathogens in agricultural and nonagricultural environments (19, 47–50). For example, in a Canadian study, Lyautey et al. (47) found that proximity to dairy operations was one of the most important predictors of L. monocytogenes-positive surface water samples. The repeated identification of an association between L. monocytogenes isolation and proximity to water, pasture, and other livestock-associated areas suggests that our findings are translatable to other farms in NYS. In our study reported here, proximity to water and proximity to pasture were significantly associated with L. monocytogenes isolation by GLMM and logistic regression, further supporting the robustness of this association. By validating two of the rules adapted from the CART model, our study demonstrates that geospatial models can be used to accurately and prospectively predict the prevalence of L. monocytogenes in produce production environments.
Interestingly, while our findings were generally consistent with the previously reported CART model (25), neither the AWS nor the impervious cover rules were validated by our findings. This may be the result of small differences in sampling protocols between the study by Strawn et al. (25) and the study reported here. Strawn et al. (25) used drag swab, composite soil, fecal, and water samples in their analyses, while in the study reported here, only drag swab samples were collected. As each sample type likely represents a unique L. monocytogenes population from a distinct ecological niche (e.g., water versus soil), it seems plausible that different factors would be associated with the isolation of L. monocytogenes in each study. Therefore, the fact that the AWS and impervious cover rules were not validated may indicate that these rules are associated with L. monocytogenes isolation from one of the sample types that were collected by Strawn et al. (25) but not in the study here (e.g., water samples). Future studies that investigate geospatial factors associated with contamination risk for actual produce (i.e., not environmental samples) are thus needed to increase the accuracy of predictive models and allow growers to maximize surveillance efforts. However, these studies will require considerably larger sample sizes, as pathogen prevalence on produce tends to be significantly lower than that in environmental samples (22). Also, in the study reported here, more samples were collected from areas at low predicted risk than from areas at high predicted risk; this was due to the fact that samples were collected in commercial settings. Future studies should aim to collect comparable sample sizes from high- and low-risk areas as well.
Identification of additional factors (e.g., proximity to wetlands) that were not included in the original CART model but were found to be associated with the prevalence of L. monocytogenes in produce production environments may aid in the refinement of prediction models. Importantly, these same factors have also been identified as risk factors for Listeria and L. monocytogenes contamination in past studies of natural (26) and agricultural (23, 26) environments. However, while the study reported here did not find any significant interactions between the different landscape factors studied, a previous report did find that interactions between landscape and meteorological factors significantly affected the probability of isolating Listeria spp. from soil, vegetation, and water (24). Similarly, previous studies (9, 11, 19, 49, 51–53) found that management practices were significantly associated with the likelihood of isolating L. monocytogenes from on-farm environments. Management practices and meteorological factors, which were not considered in the study reported here, may thus affect the relationships between L. monocytogenes prevalence and landscape factors. Further improvement of geospatial models may therefore be achieved by integration of additional environmental (both landscape and meteorological) and management practice data. While the development of such models would require larger data sets, these models might account for temporal (e.g., changes in management practices or meteorological factors over time) and spatial variation and would thus facilitate the identification of additional risk factors and control strategies.
Issues of scale need to be considered when developing and validating geospatial models for preharvest produce safety assessment. Although the pasture rule was validated by logistic regression, proximity to pasture was not retained in the final multivariable GLMM. This difference may be a function of scale, which is defined by the resolution (i.e., grain) and extent of the available spatial data. Numerous studies (54–62) have found that changing the study scale changes the strength of associations and interactions. For example, in a study on habitat use by Eleodes hispilabris, McIntyre (62) found that E. hispilabris avoided shrubs at small scales but selectively occupied shrubland at larger scales, which may be due to different mechanisms influencing habitat selection at the different scales. Thus, studies that look at similar outcomes (e.g., L. monocytogenes prevalence) at different scales (e.g., field and subfield levels) may identify different predictor variables. The issue of scale is complicated by the grain and accuracy of the remotely sensed data available, particularly if the scale of the input data differs from the scale of the model (63). For example, while the 2006 NLCD has a national accuracy of 78% (64), the odds of misclassification increase as landscape heterogeneity increases (65). Therefore, in highly mosaic environments, such as produce farms, NLCD accuracy is lower. This may also explain why proximity to pasture was not retained in the final GLMM, particularly since misclassification of grass-dominated landscapes, such as pasture, accounted for 26% of all inaccuracies (64). It is therefore important that researchers are cognizant of the limitations associated with the use of remotely sensed data to develop geospatial predictive risk models. However, these limitations can be minimized by carefully designing studies and using appropriate analyses that account for scale (54, 63, 66). In addition, improved data collection strategies (e.g., using drones) could be used to address these issues in the future. Despite differences in study scale, it is important to note that proximity to pasture was significantly associated with L. monocytogenes prevalence by univariable GLMM, which does support the validation of the pasture rule by logistic regression.
Ecological and food safety implications of edge interactions on farm landscapes. In the present work, edge interactions between produce farms and four other land cover types (i.e., forest, scrubland, water, and wetland) were observed. The elevated prevalence of L. monocytogenes in ecotones (i.e., the transitional area where two ecological communities meet) is consistent with patterns observed in other disease systems (e.g., Lyme disease [67–70]). This is also consistent with our current understanding of infectious disease emergence, as emerging infectious diseases frequently arise at the interface between human habitats and other ecosystems (67–71). Ecotones are most abundant in fragmented landscapes, and their presence intensifies ecological processes. For example, ecotones are often more diverse than surrounding communities (69, 72, 73) and provide an ideal habitat for “edge species” (e.g., ticks and rodents [69]). Additionally, ecotones and the associated habitat fragmentation affect the nature and rate of species interaction (e.g., intensifying competition [69, 74]). In this context, our results suggest that food grown within short distances of ecotones, specifically the boundaries between farm fields and forests, water, scrublands, or wetlands, is at an increased risk for L. monocytogenes contamination. Thus, risk management plans need to consider the potential for increased preharvest food safety risks associated with produce grown in or near ecotones. For example, growers could create buffer zones of unharvested product near the edges of fields, increase surveillance and/or decontamination of produce grown near field edges, or stage harvesting and processing so that higher-risk material (i.e., produce grown near field edges) is harvested and processed last. These concerns are particularly pertinent for small farms that have a larger ratio of ecotone to field area; thus, future studies should account for farm size when developing and validating on-farm intervention strategies.
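The point about small farms having a larger ratio of ecotone to field area can be illustrated with simple geometry. The sketch below is not part of the study: it assumes an idealized square field and a hypothetical 10-m edge zone (the function name `edge_fraction` and all dimensions are illustrative assumptions), and it computes the share of the field's area that lies within that distance of the field boundary.

```python
def edge_fraction(side_m: float, buffer_m: float) -> float:
    """Fraction of a square field's area lying within buffer_m of its boundary.

    Idealized geometry only: real fields are irregular, and the width of the
    zone influenced by an ecotone is an assumption, not a value from the study.
    """
    if side_m <= 0 or buffer_m < 0:
        raise ValueError("side_m must be positive and buffer_m non-negative")
    # Side length of the interior square that is farther than buffer_m from any edge.
    inner_side = max(side_m - 2 * buffer_m, 0.0)
    return 1.0 - (inner_side / side_m) ** 2


if __name__ == "__main__":
    # Hypothetical field sizes (side length in meters) with a 10-m edge zone.
    for side in (50, 100, 200, 400):
        print(f"{side:>4} m field: {edge_fraction(side, 10):.3f} of area near an edge")
```

Under these assumptions, the edge-zone share falls as the field grows (from roughly two-thirds of a 50-m field to about one-tenth of a 400-m field), which is the geometric reason that edge-associated risk weighs more heavily on small fields.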
Predictive risk maps based on GIS-enabled models allow for the visualization of preharvest food safety risk at multiple scales. The CART model predicted prevalence at the field level, while the GLMMs developed in the study reported here predicted L. monocytogenes prevalence at every point within a field (i.e., at the subfield level). Thus, the CART model generated a map of discrete areas of high and low predicted prevalence (Fig. 1), while the GLMMs produced a risk gradient map (Fig. 5). As previously mentioned, different mechanisms drive ecological processes at different scales, so the factors that are significantly associated with L. monocytogenes isolation at the field and subfield levels may differ. Therefore, the model and predictive map that are most appropriate for use by the grower depend on the scale of their risk management plan (i.e., farm, field, or subfield level). In general, maps based on the GLMM are more appropriate, as those maps offer greater resolution than CART models, which allows for the development of more targeted mitigation strategies. However, the ability to develop both map types demonstrates the flexibility of geospatial tools and the utility of GIS for visualizing the output of different model types. Overall, GIS offers a unique opportunity to look at variation across scales and to account for cross-scale differences in predictive models by allowing for the integration and visualization of remotely sensed and field-collected data.
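The contrast between a discrete, rule-based map and a risk gradient map can be sketched in a few lines. The example below is illustrative only and does not reproduce either model from the study: the distance cut-off, intercept, and slope are hypothetical values chosen for demonstration, a single predictor (distance to water) stands in for the full set of landscape factors, and the gradient is a plain fixed-effects logistic curve rather than a GLMM with random effects.

```python
import math

# Hypothetical values for illustration only; none are taken from the study.
WATER_THRESHOLD_M = 100.0  # assumed cut-off for a rule-based classification
INTERCEPT = -1.0           # assumed log-odds of isolation at 0 m from water
SLOPE_PER_M = -0.01        # assumed change in log-odds per meter from water


def rule_based_risk(distance_to_water_m: float) -> str:
    """Discrete category, analogous to one rule in a tree-based model."""
    return "high" if distance_to_water_m <= WATER_THRESHOLD_M else "low"


def gradient_risk(distance_to_water_m: float) -> float:
    """Continuous predicted probability from a logistic curve (fixed effects only)."""
    log_odds = INTERCEPT + SLOPE_PER_M * distance_to_water_m
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Sample points at increasing distance from water within one hypothetical field.
    for d in (0, 50, 100, 150, 300):
        print(f"{d:>3} m: rule = {rule_based_risk(d):<4}  gradient = {gradient_risk(d):.3f}")
```

The rule assigns every point on one side of the cut-off to the same category, whereas the gradient changes smoothly with distance; this is the sense in which a subfield-level map may support more targeted mitigation than a map of discrete areas.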
Conclusions. This study yielded quantitative data showing that L. monocytogenes contamination on produce farms is dependent on the specific ecological context of a produce farm and that geospatial predictive risk maps can be used to accurately and prospectively predict L. monocytogenes prevalence in NYS produce production environments. Additionally, other land cover factors were identified that should be examined in future studies to develop higher-resolution models. The implementation of geospatial predictive models by the produce industry may increase the understanding of risk factors that promote foodborne pathogen prevalence and persistence in produce fields, and it will assist growers in focusing their food safety efforts. Geospatial models allow for the development of individualized preventive measures on produce farms, as they enable growers to proactively assess and address environmental factors that may increase the risk of contamination events on their specific farms. For example, predictive risk maps can identify areas of high predicted pathogen prevalence within farms and enable growers to make more informed decisions about the management of crops in these areas, including targeted pathogen surveillance programs and altered management practices. Thus, geospatial predictive risk models and maps have a promising future in preharvest food safety, as they can be applied to any location and utilize the unique combination of landscape characteristics (e.g., proximity to domestic animal operations), soil properties (e.g., available water storage), and climate (e.g., precipitation) of a farm in the prediction process.
ACKNOWLEDGMENTS
This work was supported by the Center for Produce Safety.
We thank Maureen Gunderson and Sherry Roof for their technical assistance and Erika Mudrak, David Kent, Saurabh Mehta, Julia Finkelstein, Francoise Vermeylen, and Sadie Ryan for their statistical support. We also thank Randy Worobo and Betsy Bihn for helping us enroll growers in the study.
FOOTNOTES
- Received 22 September 2015.
- Accepted 12 November 2015.
- Accepted manuscript posted online 20 November 2015.
Supplemental material for this article may be found at http://dx.doi.org/10.1128/AEM.03088-15.
- Copyright © 2016, American Society for Microbiology. All Rights Reserved.