Applied and Environmental Microbiology, June 2006, p. 4012-4019, Vol. 72, No. 6
0099-2240/06/$08.00+0 doi:10.1128/AEM.02764-05
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Microbiology,1 BioTechnology Institute,2 and Department of Soil, Water, and Climate,3 University of Minnesota, St. Paul, Minnesota 55108
Received 22 November 2005/ Accepted 29 March 2006
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Phenotypic and genotypic techniques have been used to determine potential sources of fecal bacteria found in surface waters (4, 5, 9, 11, 14, 19, 27, 31, 33, 37-39, 41), and Escherichia coli and Enterococcus sp. strains are the most widely examined bacteria in such studies. The majority of these methodologies require construction of known-source libraries to differentiate among environmental isolates originating from different animal sources (41). However, since the size of the host source libraries is often limited (many libraries consist of about 35 to about 2,500 isolates [27]), they do not permit adequate determination of potential sources of environmental E. coli and Enterococcus isolates. Moreover, the utility of known-source libraries is further limited by the lack of representation due to temporal and geographic variations in bacterial genotypes within and between animal species (13, 16, 24, 38), the presence of multiple strains in a single animal (31), host animal diet variation (17), the presence of soil- and alga-borne indicator organisms (7, 21), the presence of transient inhabitants of gastrointestinal tracts, and the great genetic diversity of microorganisms used for source-tracking analyses (27, 31).
Based on these shortcomings, investigators have evaluated the use of library-independent methods to define sources of fecal bacteria in the environment. These methods, which avoid issues of library size and isolate diversity, use both growth-dependent and growth-independent technologies. Enteric viruses have been investigated for use in growth- and library-independent analyses of fecal pollution sources. These studies have revealed that viruses from various animal sources exhibit some level of host specificity (26, 28, 34), and molecular assays have been developed to examine the usefulness of viruses in microbial source-tracking studies (12, 25). Bernhard and Field have been developing 16S rRNA gene-based genetic markers for growth- and library-independent analysis of Bifidobacterium and Bacteroides-Prevotella for source identification purposes (4, 5). Recently, Dick and coworkers reported effective use of a microplate subtractive hybridization method to define host-specific 16S rRNA-based genetic markers for Bacteroides sp. strains (10). In a separate study, Dick and coworkers (9) analyzed Bacteroidales 16S rRNA gene sequences from the feces of eight animals and designed host-specific PCR primers to identify pig- and horse-derived fecal pollution in water. Similarly, Scott and coworkers (39) reported isolation of a host-specific marker gene of Enterococcus faecium, coding for a putative virulence factor (esp), that allows determination of sources of enterococci in waterways. While these methods show great promise as microbial source-tracking tools, they may be limited by the inability to obtain high-throughput data and by the expense and limitations associated with the use of PCR with environmental samples. In addition, neither system allows correlation with fecal coliform or E. coli counts that are commonly obtained by government agencies for freshwater systems.
In this paper, we describe the development and validation of host source-specific genetic markers for E. coli strains originating from Canada geese (Branta canadensis). These markers were shown to be useful for determining sources of fecal pollution in Lake Superior, and they are useful for high-throughput studies. Instead of randomly screening for host source-specific genes, we took a genomic comparison approach by using suppression subtractive hybridization (SSH) to define host source-specific markers. The SSH technique has been found to be useful for examining genetic diversity in E. coli (32), identifying genetic differences between closely related strains (2, 32), examining pathogenicity determinants in E. coli (22), and developing probes to examine natural bacterial communities (30). More importantly, the SSH approach has been found to be an effective tool for the development of strain- and host source-specific marker probes (1, 10, 15, 20, 23, 29).
| MATERIALS AND METHODS |
|---|
To determine if marker DNAs were capable of hybridizing with goose isolates from other geographic areas, 172, 100, 73, and 14 E. coli isolates were also obtained from Canada geese in Delaware, West Virginia, Wisconsin, and Indiana, respectively.
Isolation of environmental E. coli.
Offshore lake water samples were collected from Lake Phalen (St. Paul, MN), an urban lake frequented by Canada geese, using standard procedures (6). Water samples (2 liters) were filtered through 0.45-µm Nuclepore polycarbonate membranes (Whatman, Florham Park, NJ). Bacteria on the membranes were resuspended in phosphate-buffered saline (pH 7.0) using a sterile magnetic stir bar and vortexing to facilitate suspension of the bacterial cells. A total of 1,152 E. coli isolates were obtained from the concentrated samples as previously described (11) and stored at -80°C before use.
Suppression subtractive hybridization.
SSH was done using the CLONTECH PCR-Select bacterial genome subtraction kit (BD Biosciences CLONTECH, Mountain View, CA) according to the manufacturer's instructions. Genomic DNAs from the five goose E. coli strains and five human E. coli strains were prepared using a cesium chloride density gradient centrifugation method as previously described (35). Two-microgram aliquots of genomic DNAs from the five goose E. coli strains and five human E. coli strains were separately pooled and used as tester and driver DNAs, respectively. Prior to the subtraction procedure, 2-µg aliquots of each pooled sample were digested to completion with RsaI. SSH was repeated using PCR-amplified secondary subtraction products as tester DNAs to further enrich for tester-specific fragments. To create a library of potential DNA inserts that were specific for geese, the final subtraction products were cloned into the pGEM-T vector using a T/A cloning procedure (Promega, Madison, WI). A total of 192 clones were randomly selected and stored frozen at -80°C in 50% glycerol until use.
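RsaI recognizes the palindromic sequence GTAC and cuts in the middle of it (GT^AC), leaving blunt-ended fragments. The digestion step can be sketched in silico as follows. This is only an illustration under stated assumptions: the function name `rsai_digest` and the toy sequence are hypothetical, the DNA is treated as linear, and nothing here is taken from the kit protocol.

```python
def rsai_digest(seq):
    """Cut a linear DNA sequence at every RsaI site (GT^AC, blunt ends).

    Returns the fragments in 5'-to-3' order. Only the top strand is
    scanned, which is sufficient because GTAC is palindromic.
    """
    site = "GTAC"
    cut_offset = 2  # RsaI cuts between the T and the A of GTAC
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


if __name__ == "__main__":
    toy_sequence = "AAGTACTTGTACGG"  # hypothetical sequence, illustration only
    print(rsai_digest(toy_sequence))
```

Joining the returned fragments reproduces the input sequence, which is a quick sanity check when adapting the sketch to real genome sequences.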
Identification of DNA sequences specific for E. coli from geese.
The library of cloned potential goose-specific DNA fragments was screened for hybridization specificity using a dot blot procedure as described by Schleicher & Schuell, Keene, NH (http://www.schleicher-schuell.com/bioscience). Cloned insert DNAs were amplified by PCR using nested primers 1 (5'-TCGAGCGGCCGCCCGGGCAGGT-3') and 2R (5'-AGCGTGGTCGCGGCCGAGGT-3') provided in the CLONTECH SSH kit. PCRs were carried out using the following conditions: 94°C for 2 min, followed by 25 cycles of 94°C for 30 s, 68°C for 30 s, and 72°C for 1 min and a final elongation step of 2 min at 72°C. PCR products (0.5 µg) were spotted onto duplicate Nytran SuPerCharge nylon membranes (Schleicher & Schuell, Keene, NH) using a dot blot vacuum manifold (Gibco-BRL, Gaithersburg, MD) and the Minifold spotting protocol (Schleicher & Schuell, Keene, NH). Membranes were baked at 80°C for 2 h and prehybridized overnight at 42°C in a solution containing 6x SSC, 10x Denhardt's solution, 1% sodium dodecyl sulfate, and 100 µg denatured herring sperm DNA per ml (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate) (36). Aliquots (125 ng) of RsaI-digested pooled genomic DNAs from the five human E. coli strains or five goose E. coli strains were labeled with [α-32P]CTP using a random primer labeling kit (Invitrogen, Carlsbad, Calif.) according to the manufacturer's protocol. Probes were hybridized for 18 h at 46°C to membranes and washed under high-stringency conditions as previously described (36). Images were captured using a STORM 840 densitometer (Molecular Dynamics, Piscataway, NJ). Presumptive goose-specific DNA inserts were identified on the basis of visual differences in hybridization intensity.
Plasmids were isolated from presumptive goose-specific clones using a QIAprep Spin miniprep kit (QIAGEN, Valencia, CA) according to the manufacturer's protocol. Insert DNA was amplified by PCR using nested primers 1 and 2R as described above and electrophoresed on 2% agarose gels. DNAs were transferred to Nytran SuPerCharge nylon membranes as described previously (36). The membranes were probed with the RsaI-digested, pooled, genomic DNAs as described above.
DNA sequencing and analysis.
Confirmed goose-specific DNA inserts were sequenced in both directions using pUC/M13 universal forward (5'-CGCCAGGGTTTTCCCAGTCACGAC-3') and reverse (5'-TCACACAGGAAACAGCTATGAC-3') sequencing primers. Sequencing reactions were performed using BigDye (Applied Biosystems, Foster City, CA) sequencing chemistry at the Advanced Genetic Analysis Center, University of Minnesota, St. Paul. Translated sequences were analyzed using the BLASTX algorithm at NCBI (http://www.ncbi.nlm.nih.gov/BLAST) and the GenBank and E. coli databases.
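BLASTX compares a nucleotide query with protein databases by first translating the query in all six reading frames. The translation step can be sketched as below, assuming the standard genetic code; the function names and the toy input are hypothetical, and the sketch does not reproduce the BLASTX search itself.

```python
# Standard genetic code, laid out in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate a sequence in frames +1 to +3 and -1 to -3."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical toy input, not one of the sequenced inserts.
    for frame, protein in six_frame_translations("ATGGCCAAGTAA").items():
        print(frame, protein)
```

Stop codons are kept as `*` so that open reading frames remain visible in each translated frame.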
Colony hybridization for probe evaluation and environmental application.
The specificity of subtracted DNA inserts was evaluated by colony hybridization to 48 cat, 96 chicken, 96 cow, 96 deer, 96 dog, 81 duck, 135 goose, 42 goat, 78 horse, 210 human, 96 pig, 60 sheep, and 96 turkey E. coli isolates (27). An additional 27 E. coli strains isolated from Lake Superior Harbor in Duluth, MN, 1,152 isolates from Lake Phalen (St. Paul, MN), and 359 isolates from Canada geese obtained in Delaware, West Virginia, Wisconsin, and Indiana were also evaluated by colony hybridization. Probe specificity was evaluated using blind samples consisting of 96 randomly selected isolates obtained from geese, horses, pigs, sheep, and humans. E. coli strains were inoculated from frozen stocks onto Nytran SuPerCharge membranes (20 cm2; Schleicher & Schuell, Keene, NH) using a 48-pin multiple inoculator. The membranes were placed onto the surfaces of LB (36) agar plates (22 by 22 cm; Qtray Genetix, United Kingdom) and incubated at 37°C for approximately 5 h. Colonies were lysed, and DNA on the membranes was processed as described previously (36). Membranes were prehybridized at 68°C overnight in a solution containing 6x SSC, 10x Denhardt's solution, and 100 µg denatured herring sperm DNA per ml. Probes from insert DNAs (50 ng) were labeled using the Random Primer DNA labeling system (Invitrogen, Carlsbad, Calif.) according to the manufacturer's protocol. Membranes were hybridized overnight at 68°C in a solution containing 6x SSC, 10x Denhardt's solution, and 100 µg denatured herring sperm DNA per ml. Blots were finally washed in 0.1x SSC-0.1% sodium dodecyl sulfate at 65°C, and images were obtained as described below.
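Given the definition used in this paper (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate), the salt concentrations of the 6x SSC hybridization solution and the 0.1x SSC final wash follow by simple scaling. The short sketch below works through that arithmetic; the helper name `ssc_molarity` is hypothetical.

```python
# Molar composition of 1x SSC as defined in the text.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at a given strength."""
    # Rounding removes floating-point noise such as 0.8999999999999999.
    return {name: round(conc * fold, 6) for name, conc in SSC_1X_MOLAR.items()}


if __name__ == "__main__":
    print("6x SSC:  ", ssc_molarity(6))
    print("0.1x SSC:", ssc_molarity(0.1))
```

The 60-fold drop in salt between hybridization (6x SSC) and the final wash (0.1x SSC) is what makes the wash the more stringent step.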
Quantitative image analysis.
Quantitative image analysis was used to determine positive and negative signals on colony hybridization membranes. Images were captured using a STORM 840 densitometer (Molecular Dynamics, Piscataway, NJ) and were analyzed using the ScanAlyze version 2.50 software (http://rana.lbl.gov/EisenSoftware.htm). The normalized intensity of each spot was calculated by subtracting the median intensity of the background from the mean intensity of each spot. Normalized spot intensities were plotted using the Sigma Plot version 8.0 software (Systat Software, Point Richmond, CA), and a cutoff value was assigned based on normalized mean intensities of negative control spots plus three times the standard deviation.
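The scoring rule described above can be sketched in a few lines. This is an illustration under assumptions the text does not settle: the sample standard deviation is used, a spot is called positive only when its normalized intensity is strictly above the cutoff, and all function names and example values are hypothetical.

```python
from statistics import mean, median, stdev


def normalized_intensity(spot_pixels, background_pixels):
    """Mean spot intensity minus median background intensity."""
    return mean(spot_pixels) - median(background_pixels)


def cutoff(negative_control_intensities, k=3.0):
    """Mean of normalized negative-control intensities plus k standard deviations.

    Assumption: the sample standard deviation (statistics.stdev) is used;
    the text does not say whether sample or population SD was applied.
    """
    return mean(negative_control_intensities) + k * stdev(negative_control_intensities)


def call_spots(intensities, negative_control_intensities):
    """Map each isolate name to True (positive) or False (negative)."""
    threshold = cutoff(negative_control_intensities)
    return {name: value > threshold for name, value in intensities.items()}


if __name__ == "__main__":
    negatives = [10, 12, 14]  # hypothetical normalized control intensities
    isolates = {"isolate_A": 19.0, "isolate_B": 18.0}  # hypothetical values
    print(cutoff(negatives))
    print(call_spots(isolates, negatives))
```

Using the median for the background makes the normalization less sensitive to a few bright background pixels, while the three-standard-deviation margin keeps negative controls from being scored as positive by chance.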
Nucleotide sequence accession numbers.
The nucleotide sequences obtained in this study have been deposited in the GenBank database under accession numbers DQ300500 to DQ300502 and DQ300504 to DQ300507.
| RESULTS |
|---|
Environmental E. coli and geographic analyses.
To examine the correlation between the results obtained using the new markers described in this paper and the results obtained using other methods, we isolated about 200 E. coli strains from Duluth harbor water and analyzed them first by using the HFERP DNA fingerprinting technique and then by hybridization using combined 32P-labeled insert DNAs GB2 and GE11. Of the 200 E. coli isolates examined, 27 (13.5%) were identified as isolates that likely originated from geese using the HFERP DNA fingerprinting technique, a comprehensive known-source DNA fingerprint library, and ID bootstrap analysis with a P value threshold of 0.9 (27). When isolates were screened by colony hybridization to a pooled GB2/GE11 insert DNA probe, 22 of 27 strains hybridized with the probe. This corresponded to 81.5% agreement between HFERP classification and marker probe analysis using the GB2/GE11 screening system described here. The applicability of DNA marker technology was also demonstrated by screening randomly selected environmental E. coli isolates from Lake Phalen, a local urban lake frequently affected by Canada geese. Of the 1,152 isolates examined, 301 (26.1%) tested positive with the GB2 and GE11 probes.
To determine if the DNA markers used could identify E. coli from geese obtained from other geographic regions of the United States, we hybridized probes GB2 and GE11 with an additional 359 goose isolates obtained from Delaware, Indiana, Wisconsin, and West Virginia. The results of this experiment demonstrated that only 24.0% of the isolates hybridized to the marker DNAs (data not shown). Probes GB2 and GE11 hybridized to 20, 28, 38, and 20% of the goose E. coli strains from Delaware, Indiana, Wisconsin, and West Virginia, respectively.
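The overall 24.0% figure can be cross-checked against the per-state percentages by weighting each state by the number of isolates it contributed (172, 100, 73, and 14 isolates from Delaware, West Virginia, Wisconsin, and Indiana, respectively, as given in Materials and Methods). The sketch below does this; because the per-state percentages are rounded, the result is only an approximate reconstruction, and the names used are hypothetical.

```python
# state: (number of goose isolates, reported percent hybridizing with GB2/GE11)
REGIONAL_RESULTS = {
    "Delaware": (172, 20),
    "Indiana": (14, 28),
    "Wisconsin": (73, 38),
    "West Virginia": (100, 20),
}


def pooled_percent(regional):
    """Return (total isolates, isolate-weighted percent positive)."""
    total = sum(n for n, _ in regional.values())
    expected_positive = sum(n * pct / 100 for n, pct in regional.values())
    return total, 100 * expected_positive / total


if __name__ == "__main__":
    total, pct = pooled_percent(REGIONAL_RESULTS)
    print(f"{total} isolates, about {pct:.1f}% positive overall")
```

The weighting matters because the states contributed very different numbers of isolates: Wisconsin has the highest hybridization rate but accounts for only 73 of the 359 isolates.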
Sequencing and BLAST searches.
The seven confirmed goose- and duck-specific DNA inserts were sequenced in both directions, and translated sequences were subjected to BLASTX analyses using E. coli protein databases at NCBI. The sequenced inserts were between 332 and 885 bp long. The results of BLASTX homology searches are summarized in Table 2. The GB2 and GE11 inserts, each of which hybridized to about 48% of the E. coli strains from geese, were 93% identical to each other at the nucleotide level. When the sequences were translated, there was significant amino acid homology (65% and 66% amino acid identity, respectively) to the C-terminal fragment of the AIDA-I adhesin-like protein of E. coli O157:H7 (GenBank accession no. BAB33785). The GD5 insert product exhibited 89% amino acid identity to a fragment of the TraT complement resistance protein of E. coli (accession no. AAT85681), and the GF5 insert was 98% identical to ORF5 in E. coli, with no significant matches to any entries in the database. Other matches with less than 50% amino acid identity to proteins in the database included two type III secretion machinery proteins from E. coli O157:H7 (accession no. AAG57987 and BAB37142) and a NikB nickase (accession no. NP_052661).
| DISCUSSION |
|---|
One downside of using multiple tester DNAs is reduced subtraction efficiency due to the increased complexity introduced into the reaction. Generally, genome subtraction yields greater than 25% tester-specific sequences after screening (CLONTECH, Mountain View, CA), compared to the approximately 9% efficiency that was observed in this study. However, reduced efficiency was not found to be an issue with the screening procedures that we employed, and for our purposes increased hybridization specificity and the ability to identify more isolates are the most important parameters. Seven goose-specific insert DNAs exhibited increased hybridization with strains isolated from geese compared to the hybridization with isolates obtained from humans. While these insert DNAs each hybridized with less than one-half of the goose isolates tested, revealing genetic diversity in goose E. coli strains, together the inserts identified 76 and 72% of the E. coli isolates from geese and ducks, respectively. Consequently, subsequent field studies should be done using pooled insert DNAs as hybridization probes.
When the sequences were translated, the products of the nearly identical insert DNAs GB2 and GE11 exhibited 65% amino acid identity to the C-terminal portion of the AIDA-I adhesin-like protein of E. coli strain O157:H7. This result suggests that inserts GB2 and GE11 are fragments of an unidentified adhesin-like gene. As adhesins mediate the attachment of bacteria to host tissues (45), it seems plausible and logical that this putative gene may mediate the attachment of specific E. coli isolates to the goose intestinal tract. Attachment to the host intestinal epithelium is the necessary first step in gut colonization (45), and, therefore, the putative gene may be responsible for preferential colonization of the goose host. If this hypothesis is validated by experimental in vivo colonization data, other adhesin genes that participate in host-specific colonization may also represent ecologically meaningful markers that can be targeted for microbial source-tracking purposes.
Since together the seven DNA inserts hybridized with 76% of goose isolates, we examined whether the probes cross-hybridized with isolates from cats, chickens, cows, deer, ducks, goats, horses, humans, pigs, sheep, and turkeys. Interestingly, the seven probes cross-hybridized with 73% of the E. coli isolates from ducks and with 14.6 and 12.5% of the isolates from turkeys and chickens, respectively, but with only about 10% of the E. coli strains from other hosts. However, the results of preliminary studies indicated that the GB2 and GE11 probes cross-hybridized with 11 and 9% of gull and tern E. coli isolates, respectively. Presumably, these results are due to the close genetic relationship between chickens, ducks, geese, and turkeys and may indicate that the intestinal tracts of some avian species can be colonized by the same E. coli strain. Alternatively, they may reflect the cosmopolitan nature of some E. coli strains (47), a transient intestinal population structure (18), a lack of host specificity in this subgroup of E. coli, or the presence of multiple adhesins that mediate colonization (44).
Recently, Soule et al. (42) used a microarray approach to identify several DNA markers from Enterococcus sp. that were subsequently used to develop host-specific PCR primers. While many of the markers identified were specific for Enterococcus isolates from targeted host species, they often failed to detect a high percentage of the isolates from these hosts. However, other markers detected from 27 to 45% of the enterococci from targeted host species, but they also detected 1.1 to 7.1% of the nontargeted isolates. This result is similar to cross-reactions that we found in the current study using the DNA probes (Fig. 4). In contrast, Bernhard and Field (4) and Dick et al. (10) reported that PCR primers for Bacteroidales did not detect nontargeted hosts, suggesting that the markers which they used were more specific than those found in our study. However, these authors analyzed diluted fecal samples and DNAs rather than individual colonies, making direct comparisons to our method difficult.
Results obtained from screening water isolates from Lake Superior with the combined GB2/GE11 probe compared favorably with results obtained using the HFERP DNA fingerprinting method for assigning isolates to host source groups. Of the 27 isolates assigned to goose sources by HFERP, 22 (81.5%) had a positive hybridization signal with the GB2/GE11 probe. While the library-dependent HFERP method was previously shown to correctly identify about 70% of the waterfowl isolates in a known-source library (27) and far fewer environmental isolates, the method described here is a vast improvement for accurately and quantitatively determining the host origins of environmental isolates. Moreover, with the library-independent hybridization-based marker method there are fewer false-positive and false-negative reactions than there are with the HFERP and other techniques that have been evaluated recently, except for host-specific PCR analysis (14). The applicability of this DNA marker technology was also evaluated by screening E. coli isolates from Lake Phalen, a local urban lake frequently affected by Canada geese. The results of this analysis indicated that 26% of the 1,152 isolates examined hybridized with the GB2 and GE11 probes. These data further illustrate that the DNA markers identified can be used for environmental isolates. Considerably greater numbers of environmental isolates will most likely be found if hybridizations are done using the seven combined markers. Large-scale field studies using the combined seven probes will be done in the summer of 2006 to assess the impact of geese on Lake Superior beaches.
To assess whether the DNA markers allowed detection of goose E. coli strains from different geographic regions, we obtained isolates from the eastern and midwestern United States. The results of our studies indicated that the combined GB2 and GE11 probes identified only 24% of the isolates examined. While the level of identification most likely would increase if all seven marker probes were used, our results suggest that E. coli strains are geographically distributed. Since the library that we used was constructed with goose E. coli strains isolated from two locations in Minnesota, it is not surprising that the highest percentage of strains identified were isolated in Wisconsin, a bordering state. Consequently, future efforts in which SSH is used to generate DNA markers specific for animal hosts should be done with tester strains originating from several regions of the United States.
In the past, the development of microbial source-tracking techniques has focused on library-dependent methods (37, 41). However, these methods suffer from the need to develop and maintain large reference libraries for comparisons with environmental isolates. Additionally, geographic and temporal variability in isolates, transportability issues, the inability to assign many environmental isolates to source groups, the large library sizes needed to adequately capture genetic diversity, and the high levels of false-positive and false-negative assignments make these methods difficult to implement at a large and economically feasible scale (14, 27). In contrast, library-independent methods that screen for host-specific and ecologically meaningful genes alleviate many of these issues. These genes most likely would not be influenced by geographic and temporal variability, as they would be stable in bacterial isolates obtained recently from a specific host source. While library-independent marker gene approaches have recently been investigated as source-tracking tools with members of the genus Bifidobacterium and the Bacteroides-Prevotella group (4, 5), these organisms are rarely quantified in routine analyses of fecal bacteria in waterways. Conversely, E. coli is becoming one of the most frequently monitored indicators of fecal contamination of freshwater systems, and thus, source-tracking information obtained using the markers reported here can be easily coupled with existing and new fecal count data for TMDL analyses and abatement strategies. Recently, a library-independent marker gene method has also been developed for Enterococcus species (39), allowing similar analyses for saltwater environments.
Since waterways are most often contaminated by fecal bacteria originating from several different sources rather than a single animal host species, it is frequently necessary to screen large numbers of isolates for accurate determination of host sources (31). The development of host-specific DNA fragments for screening by colony hybridization provides a cost-effective quantitative method for simultaneous analysis of many bacterial isolates. Moreover, this method can be easily adapted for automated, rapid, and high-throughput macro- and microarray screening strategies, reducing the time and expense of analyzing the thousands of isolates needed for large-scale and accurate source-tracking studies.
In summary, our results provide evidence that SSH is an effective tool for identification of ecologically meaningful marker DNAs that can be used to identify a large number of genetically diverse E. coli isolates originating from geese. While our initial studies indicated that these markers can be effectively used as hybridization probes to determine the source of environmental E. coli isolates, more extensive field testing is needed before large-scale microbial source-tracking studies can be initiated. Nevertheless, we believe that the SSH approach will allow us to identify additional markers for E. coli strains from humans and other animals and to obtain more comprehensive information about sources of fecal contamination in waterways. Coupled with high-throughput, automated macro- and microarray screening, these markers may provide a cost-effective, quantitative, and accurate method for determining sources of genetically diverse E. coli strains for use in water quality analyses and TMDL determinations.
| ACKNOWLEDGMENTS |
|---|
We thank Satoshi Ishii, Sam Myoda, Cindy Nakatsu, Don Stoeckel, and Greg Kleinheinz for providing E. coli isolates and John Ferguson for help with the blind studies, cluster analyses, and library maintenance.
| FOOTNOTES |
|---|
| REFERENCES |
|---|