Applied and Environmental Microbiology, November 2003, p. 6405-6411, Vol. 69, No. 11
0099-2240/03/$08.00+0 DOI: 10.1128/AEM.69.11.6405-6411.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Environmental Science and Engineering Program, Department of Environmental Health,¹ Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115,³ and Massachusetts Water Resources Authority, Charlestown Navy Yard, Boston, Massachusetts 02129²
Received 7 January 2003/Accepted 8 August 2003
An effective indicator should be both sensitive and specific in predicting the presence and absence of all possible pathogens. Sensitivity is defined in this study as the ability of an indicator variable to correctly classify a beach as unsuitable for swimming (geometric mean Enterococcus density > 104 CFU/100 ml). Specificity is defined as the ability of an indicator variable to correctly classify a beach as suitable for swimming (geometric mean Enterococcus density ≤ 104 CFU/100 ml). An appropriate indicator variable should be easy to use and provide an accurate assessment of water quality in a timely fashion.
The U.S. Environmental Protection Agency's (EPA's) Ambient Water Quality Criteria for Bacteria-1986 recommends that marine recreational waters not exceed a geometric mean density of Enterococcus of 35 CFU/100 ml and that no single sample exceed a maximum of 104 CFU/100 ml (13). An important limitation of using Enterococcus densities to manage beaches is that enumeration requires a minimum of 24 h. Because microbial densities in the waters being tested can change significantly in this time (2, 6), results of samples collected 24 h previously may not provide an accurate assessment of water quality and exposure at the time of use. A recent study demonstrated that approximately 70% of single-sample exceedances of a bacterial threshold standard at Huntington Beach, Calif., lasted less than 1 h and that approximately 40% lasted less than 10 min (1).
Environmental variables can also be used as indicators of elevated pathogen concentrations. In many areas, rainfall is a major factor affecting beach water quality due to the impact of contaminated storm water and sewer overflows on the shoreline (10). Rainfall-based alert curves can be constructed to describe a statistical relationship between rainfall events and pathogen concentrations at a specific site (15). This analysis corresponds to a multiple regression model that includes the amount of rainfall, the storm duration time, the number of dry days between rainfall events, and the lag time between rainfall event and beach pathogen appearance. Rainfall-based alert curves can require input data on several variables that may or may not be easy and cost-effective to collect.
Receiver operating characteristic (ROC) curves were developed in the field of statistical decision theory and later used in the field of signal detection for analyzing radar images during World War II (7). ROC curves enabled radar operators to distinguish between an enemy target, a friendly ship, and noise. ROC curves assess the value of diagnostic tests by providing a standard measure of the ability of a test to correctly classify subjects. The biomedical field uses ROC curves extensively to assess the efficacy of diagnostic tests in discriminating between healthy and diseased individuals (12). A test with good discriminatory ability has both a high sensitivity and high specificity. ROC curves can (i) assess the overall discriminatory ability of different potential indicator variables by generating a common metric for comparison and (ii) aid in the selection of a specific value of an indicator variable to use as a threshold or limit that provides a desired trade-off in sensitivity and specificity. With respect to beach water quality indicator variables, ROC curves can quantify the overall effectiveness of different indicator variables to correctly or incorrectly classify a beach as suitable for swimming and generate a single metric by which the different indicator variables can be compared.
Our objective was to determine the ability of Enterococcus density to correctly classify beach water quality as suitable or unsuitable for swimming 24 h after sample collection and to compare this to the ability of antecedent-rainfall volumes to correctly classify beach water quality as suitable or unsuitable for swimming. Another goal was to determine a maximum value, or threshold, for each of the indicator variables that would provide an optimal trade-off in sensitivity and specificity. The use of 104 CFU/100 ml in this study as a delineator of water suitable or unsuitable for swimming was based solely on current EPA recommendations. This work is not intended to comment on the appropriateness of this number or Enterococcus spp. as an indicator organism.
FIG. 1. Map of study beaches and rain gauge locations.
TABLE 1. Local pollution sources at the four study beaches, including the number of CSOs and storm drains
Rainfall data.
Rainfall was measured at three locations with Sierra Misco tipping bucket rain gauges (see Fig. 1 for locations). The rain gauges used were already in place for other purposes when we began this study, and, because sudden, localized, summer rain showers have an impact directly at the beach, the gauge closest to each study beach was selected as the most appropriate. The gauges were calibrated to read and electronically record rainfall in 0.01-in. increments. Rainfall records were stored electronically on-site, and 15-min-interval rainfall sums were obtained via telemetry with QuadraScan software (data storage and telemetry provided by ADS Equipment, Inc.; software was provided by QuadraScan 1995). The frequency of data collection allowed for precise calculation of rainfall sums prior to beach water sample collection. Rainfall collection gauges were located within 4.4 km of the sampling locations. To test whether more universally available rain data could effectively be used in this analysis, rainfall data for 2001 were obtained from the Logan Airport National Weather Service (NWS) rain gauge, which is within a 9.1-km radius of the four beaches.
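As a minimal sketch of how antecedent-rainfall sums might be derived from 15-min-interval records, the Python below adds up every interval that ends within a chosen window before the sample time. The function name `antecedent_rainfall`, the record layout, and the example values are hypothetical and are not taken from the study's data or software.

```python
from datetime import datetime, timedelta

def antecedent_rainfall(records, sample_time, window_hours):
    """Sum rainfall (in.) recorded in the window_hours before sample_time.

    records: iterable of (interval_end_time, rainfall_inches) pairs, assumed
    here to be 15-min-interval sums (hypothetical layout).
    """
    start = sample_time - timedelta(hours=window_hours)
    total = sum(rain for t, rain in records if start < t <= sample_time)
    return round(total, 2)  # gauges record in 0.01-in. increments

# Hypothetical example: three intervals, one of them outside a 24-h window.
sample = datetime(2001, 7, 10, 9, 0)
records = [
    (datetime(2001, 7, 8, 10, 0), 0.10),   # 47 h before sampling
    (datetime(2001, 7, 9, 18, 15), 0.05),  # 14.75 h before sampling
    (datetime(2001, 7, 10, 8, 45), 0.02),  # 0.25 h before sampling
]
rain_24h = antecedent_rainfall(records, sample, 24)
rain_48h = antecedent_rainfall(records, sample, 48)
```

Whether an interval that straddles the start of the window should be counted in full, in part, or not at all is a design choice that the text does not specify; this sketch counts an interval only if it ends inside the window.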
Data preparation.
In this study, multiple samples were collected from several sites along each beach. To eliminate pseudoreplication, we calculated the geometric mean Enterococcus density for each beach on each day and used this value in all subsequent analysis. As described earlier, we defined water suitable for swimming as water with a geometric mean Enterococcus density of 104 CFU/100 ml or less, which corresponds to the recommended EPA single-sample maximum. This definition of water suitable for swimming is not in conflict with the EPA recommendation that the geometric mean of Enterococcus counts not exceed 35 CFU/100 ml because that value is intended for samples collected over time. The replicate samples analyzed here can be considered a snapshot of a single point in time and analogous to a single sample.
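The data preparation step can be illustrated with a short Python sketch that collapses replicate samples into one geometric mean per beach per day and then classifies each beach-day against 104 CFU/100 ml. The function name, the beach labels, and the sample values are hypothetical, and the sketch assumes all densities are greater than zero, since the handling of nondetects is not described here.

```python
import math
from collections import defaultdict

THRESHOLD = 104.0  # CFU/100 ml, the EPA single-sample maximum used in the text

def daily_geometric_means(samples):
    """Collapse replicate samples into one geometric mean per (beach, day).

    samples: iterable of (beach, day, density) with density > 0 (CFU/100 ml).
    """
    logs = defaultdict(list)
    for beach, day, density in samples:
        logs[(beach, day)].append(math.log(density))
    return {key: math.exp(sum(v) / len(v)) for key, v in logs.items()}

# Hypothetical replicate samples from two beaches on one day.
samples = [
    ("Beach A", "2001-07-10", 10.0),
    ("Beach A", "2001-07-10", 1000.0),
    ("Beach B", "2001-07-10", 400.0),
    ("Beach B", "2001-07-10", 100.0),
]
means = daily_geometric_means(samples)
# True marks a beach-day as unsuitable for swimming (geometric mean > 104).
unsuitable = {key: gm > THRESHOLD for key, gm in means.items()}
```

In this made-up example Beach A has a geometric mean of about 100 CFU/100 ml and is classified as suitable even though one replicate exceeds 104 CFU/100 ml, which shows how the geometric mean damps the influence of a single high replicate.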
Construction of ROC curves.
An example of the information used to create an ROC curve is shown in Fig. 2. The hypothetical distributions of Enterococcus spp. in water suitable for swimming and water unsuitable for swimming on a given day are plotted with respect to the previous day's Enterococcus density as the indicator variable. Due to the distribution of the indicator variable population, the two distributions overlap. For each unique value of the indicator variable there is an associated true-positive rate (TPR; sensitivity) and a false-positive rate (FPR; 1 - specificity), which are shown in Fig. 2. The ROC curve is a function relating the TPR to the FPR. The TPR for a given indicator variable is the proportion of days having an Enterococcus density above 104 CFU/100 ml that are correctly identified as such by that indicator variable. A perfect TPR (1.0) means that all incidences of Enterococcus densities above 104 CFU/100 ml occur above the threshold value of the indicator variable. The FPR for a given indicator variable is the proportion of days having an Enterococcus density less than or equal to 104 CFU/100 ml that are incorrectly identified as being above 104 CFU/100 ml by that particular variable. Thus, if a threshold value for an indicator variable has an FPR of zero, no days with an Enterococcus density at or below 104 CFU/100 ml fall above that threshold value of the indicator variable. Calculating the TPR and FPR yields one TPR-FPR pair for each unique value of an indicator variable. For a given data set, the observed TPR-FPR pairs are plotted to form a sample ROC curve (Fig. 3).
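The construction described above can be sketched in Python as follows. Each unique value of the indicator variable is tried as a threshold, and the resulting FPR and TPR are recorded as one point. The function name and the data are hypothetical, and the convention that a day is flagged only when its indicator value is strictly above the threshold is an assumption, since the treatment of ties is not stated.

```python
def roc_points(indicator, unsuitable):
    """Return (FPR, TPR) pairs, one per unique indicator value used as a threshold.

    A day is flagged when its indicator value is strictly above the threshold
    (an assumption). Requires at least one suitable and one unsuitable day.
    """
    positives = sum(unsuitable)
    negatives = len(unsuitable) - positives
    points = [(1.0, 1.0)]  # threshold below every value: every day is flagged
    for threshold in sorted(set(indicator)):
        flagged = [x > threshold for x in indicator]
        tp = sum(f and u for f, u in zip(flagged, unsuitable))
        fp = sum(f and not u for f, u in zip(flagged, unsuitable))
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical data: 48-h antecedent rainfall (in.) and whether the beach
# exceeded 104 CFU/100 ml on the same day.
rainfall = [0.0, 0.0, 0.1, 0.3, 0.5, 1.2]
exceeded = [False, False, False, True, False, True]
points = roc_points(rainfall, exceeded)
```

Plotting the second element of each pair against the first gives the sample ROC curve; the highest observed value used as a threshold always yields the point (0, 0) under the strict-inequality convention.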
FIG. 2. Hypothetical distributions of previous day's Enterococcus density in water suitable for swimming and water unsuitable for swimming. Shown are a hypothetical TPR and FPR associated with a potential threshold value. An ROC curve was constructed by plotting the TPR (grey) and FPR (black) for each unique value of the indicator variable.
FIG. 3. ROC curve constructed by plotting the TPR and FPR associated with each unique value of the indicator variable. *, hypothetical TPR-FPR pair. An indicator variable with high discriminatory ability will have a curve with an AUC near 1, and an indicator variable with low discriminatory ability will have an AUC near 0.5.
Several different indicator variables for assessing beach water quality were examined, including 24-, 48-, and 96-h antecedent rainfall and previous day's Enterococcus density. ROC curves were constructed with Microsoft Excel software (Microsoft Corporation 1999). The resulting paired TPR-FPR points were plotted to form a sample ROC curve. Sample AUC values were calculated according to the trapezoid rule, and these values were verified by using the Mann-Whitney procedures described by DeLong et al. (8). Standard error values were calculated and chi-square tests of hypotheses involving ROC curves were performed by the methods of DeLong et al. (8).
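The agreement between the trapezoid rule and the Mann-Whitney statistic can be illustrated with a small Python sketch. The function names and the data are hypothetical, and the sketch covers only the AUC itself; it does not reproduce the standard-error or chi-square calculations of DeLong et al. (8).

```python
def auc_trapezoid(points):
    """Area under an ROC curve given (FPR, TPR) points, by the trapezoid rule."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def auc_mann_whitney(indicator, unsuitable):
    """Share of (unsuitable, suitable) day pairs in which the unsuitable day
    has the larger indicator value; ties count as one-half."""
    pos = [x for x, u in zip(indicator, unsuitable) if u]
    neg = [x for x, u in zip(indicator, unsuitable) if not u]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: 48-h antecedent rainfall (in.) and exceedance of
# 104 CFU/100 ml, with the sample ROC points that these data produce.
rainfall = [0.0, 0.0, 0.1, 0.3, 0.5, 1.2]
exceeded = [False, False, False, True, False, True]
points = [(1.0, 1.0), (0.5, 1.0), (0.25, 1.0), (0.25, 0.5), (0.0, 0.5), (0.0, 0.0)]

auc_a = auc_trapezoid(points)
auc_b = auc_mann_whitney(rainfall, exceeded)
```

For these made-up data both calculations give 0.875, meaning that in seven of the eight possible pairings of an unsuitable day with a suitable day, the unsuitable day had the higher rainfall.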
TABLE 2. Descriptive statistical analysis on the Enterococcus data to characterize each beach
FIG. 4. ROC curves of previous day's Enterococcus density (A) and 48-h antecedent rainfall (B) for Constitution Beach. These curves demonstrate the difference in the slopes of the ROC curves of previous day's Enterococcus density and 48-h antecedent rainfall.
TABLE 3. AUC associated with each indicator variable
TABLE 4. AUCs for the four indicator variables at each beach
TABLE 5. Interannual variability of AUCs for the four indicator variables
TABLE 6. Sensitivity and specificity associated with previous day's Enterococcus density threshold values at each beach in this study
TABLE 7. Comparison of the specificities associated with 0.75 sensitivity for each of the indicator variables
TABLE 8. Utility of 48-h antecedent rainfall for classifying bacterial water quality
The beaches in this study are relatively clean and conform to the EPA's 30-day geometric mean criterion of an Enterococcus density less than 35 CFU/100 ml. Statistical analyses of the 5-year data set showed that increased Enterococcus densities at these beaches were associated with wet weather. Therefore, we chose to compare rainfall indicator variables of beach water quality to the currently used previous day's Enterococcus density.
Using sample ROC curves we were able to compare the abilities of previous day's Enterococcus density and antecedent rainfall variables to correctly classify beach water quality as suitable or unsuitable for swimming by the common metric of the AUC. This analysis suggests that antecedent rainfall was both a more sensitive and more specific indicator of poor bacterial water quality than Enterococcus densities collected 24 h previously at the four beaches in the study. Each of the rainfall variables examined consistently had larger AUCs and less variability among beaches and among years than previous day's Enterococcus density.
The sensitivity and specificity associated with potential threshold values of previous day's Enterococcus density varied widely among beaches. This variability compromises the use of a uniform Enterococcus threshold as the sole indicator of water quality. The sensitivities associated with a threshold of previous day's Enterococcus density of 104 CFU/100 ml ranged from 0.14 to 0.33. Using previous day's Enterococcus density, beach managers posted the swimming advisory accurately only one-third of the time or less. At Boston Harbor beaches, the threshold value of previous day's Enterococcus density greater than 104 CFU/100 ml has a very high false-negative rate and is, therefore, a poor indicator variable to protect public health at these beaches if used as the only management criterion.
The desired level of sensitivity can be determined a priori; however, the results of this a priori selection may prove impractical. A sensitivity of 75% implies that 25% of the incidences of Enterococcus densities above 104 CFU/100 ml will not be correctly discriminated by the threshold value. Increasing the desired sensitivity with a lower threshold value will decrease the probability of failing to predict an Enterococcus density above 104 CFU/100 ml, but specificity will decrease as sensitivity increases, meaning that the beach will be closed more often. With respect to public health, sensitivity is a more important parameter than specificity.
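The trade-off described above can be made concrete with a Python sketch that, for a sensitivity chosen a priori, finds the largest observed threshold that still meets it and reports the specificity that results. The function name and the data are hypothetical, and the strict-inequality flagging rule is an assumption.

```python
def threshold_for_sensitivity(indicator, unsuitable, target=0.75):
    """Largest observed threshold whose sensitivity meets the target.

    A day is flagged when its indicator value is strictly above the threshold
    (an assumption). Returns (threshold, sensitivity, specificity), or None
    when no observed value reaches the target sensitivity.
    """
    positives = sum(unsuitable)
    negatives = len(unsuitable) - positives
    # Sensitivity can only rise as the threshold falls, so scan downward.
    for threshold in sorted(set(indicator), reverse=True):
        tp = sum(x > threshold and u for x, u in zip(indicator, unsuitable))
        fp = sum(x > threshold and not u for x, u in zip(indicator, unsuitable))
        sensitivity = tp / positives
        if sensitivity >= target:
            return threshold, sensitivity, 1.0 - fp / negatives
    return None

# Hypothetical data: 48-h antecedent rainfall (in.) and exceedance of
# 104 CFU/100 ml on eight days.
rainfall = [0.0, 0.0, 0.05, 0.1, 0.4, 0.3, 0.6, 1.2]
exceeded = [False, False, False, True, False, True, True, True]
result = threshold_for_sensitivity(rainfall, exceeded, target=0.75)
```

With these made-up values the selected threshold is 0.1 in., which flags three of the four unsuitable days and one of the four suitable days. Raising the target sensitivity would push the threshold lower and flag more suitable days, which is the loss of specificity noted in the text.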
Beach management balances two competing priorities: (i) maintaining the beach as an accessible recreational resource by minimizing unnecessary swimming advisories and (ii) minimizing public health risk by appropriately issuing swimming advisories. A constructive approach is to evaluate the practical consequences of using different indicator variables, in addition to directly comparing AUC values. Thresholds of rainfall variables providing a desired sensitivity offer a more reasonable trade-off between beach accessibility and public health risk.
The sensitivities and specificities associated with 0.21 in. of 48-h antecedent rainfall determined with the 2001 validation data at Tenean and Wollaston beaches were very close to the expected values based on ROC curve analysis of the 1996 to 2000 data. However, at beaches with few incidences of Enterococcus densities above 104 CFU/100 ml, such as Carson and Constitution, an accurate indicator variable is difficult to identify because the sources of contamination may not be associated with rain and may be transient. This underscores the necessity of gathering enough monitoring data to adequately characterize a beach.
In conclusion, this study has shown that, for Boston Harbor beaches, previous day's Enterococcus density was frequently a poor indicator of elevated Enterococcus densities at the time of use, had a high false-negative rate, and may not adequately protect bathers from increased pathogen concentrations. Antecedent rainfall was both a more sensitive and specific indicator of Enterococcus densities above 104 CFU/100 ml. Antecedent-rainfall threshold values did not result in unacceptably high posting rates and provided more spatial and temporal consistency. Furthermore, antecedent rainfall is easily available at the time a beach manager must make a decision about issuing a swimming advisory.
This study has also demonstrated that ROC analysis is a simple and practical tool for quantifying the ability of indicator variables to assess beach water quality and that ROC curves facilitate the selection of a beach-specific threshold for an indicator variable that yields a desirable sensitivity and specificity. ROC analysis can effectively evaluate the relationships between a risk-related variable and candidate indicator variables used to actually manage the beach.
We thank the Massachusetts Division of Urban Parks and Recreation (formerly the Metropolitan District Commission) for data used in this analysis. We also thank Mark Dolittle and Matthew Liebman, who reviewed the manuscript, and the anonymous reviewers, who offered very useful suggestions.
This paper represents the opinions and conclusions of the authors and not necessarily those of the MWRA, NIEHS, or NIH.