Applied and Environmental Microbiology, August 2004, p. 4478-4485, Vol. 70, No. 8
0099-2240/04/$08.00+0 DOI: 10.1128/AEM.70.8.4478-4485.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Mary B. Brown, Ethan A. Carruthers, John A. Ferguson, Priscilla E. Dombek, and Michael J. Sadowsky*
Department of Soil, Water, and Climate, University of Minnesota, St. Paul, Minnesota 55108
Received 18 December 2003/ Accepted 6 April 2004
Both phenotypic and genotypic methods have been explored as means to study the ecology of fecal bacteria in relation to host specificity and to determine potential sources of fecal bacteria found in surface water (6, 32, 34). The most widely investigated bacteria for these studies have been Escherichia coli and Enterococcus sp. strains. The use of these methods is based on the hypothesis that specific strains, or a strain's phenotypic or genetic attributes, are related to specific host animals. This hypothesis, however, has been tested in only a limited manner.
The majority of phenotypic and genotypic methodologies require the construction of known-source libraries (a host origin database) to differentiate among isolates, which are subsequently used to determine the host origin of unknown environmental isolates (34). However, in most cases, the sizes of the host origin databases are rather limited, consisting of 35 to about 500 isolates (2-4, 6, 9, 12, 13, 23-26, 31, 33, 42, 43), making broader comparisons to larger populations of E. coli and Enterococcus in the environment difficult. In addition, temporal and geographic variation in bacterial genotypes within and between animal species (7, 12, 16, 31), multiple strains within a single animal (23), and diet variation within a host animal (13) have been shown to influence the representativeness of known-source libraries. Moreover, while microbial source tracking studies done using phenotypic approaches and antibiotic resistance patterns have frequently used large known-source libraries, consisting of about 1,000 to 6,000 isolates (2, 8, 10, 15, 44-46), many of the strains examined were isolated from the same source material or sample, and thus libraries may be biased due to the presence of multiple replications (clones) of the same bacterial genotype from the same source animal.
The repetitive extragenic palindromic-PCR (rep-PCR) DNA fingerprinting technique uses the PCR and primers based on highly conserved and repetitive nucleotide sequences to amplify specific portions of the microbial genome (22, 29, 40, 41). When the PCR products are separated by agarose gel electrophoresis and visualized following staining with ethidium bromide, the resulting banding patterns produce a "fingerprint" unique to each strain. The rep-PCR technique has proven to be a valuable tool to identify and track medically and environmentally important microorganisms (5, 17, 30, 40), and it has also been recently evaluated for its use as a source-tracking tool (1, 4, 6, 20, 23). The rep-PCR DNA fingerprinting technique is relatively quick, easy, and inexpensive to perform and lends itself to high-throughput applications, making it an ideal method for microbial source-tracking studies.
Initial studies done in our laboratory indicated that rep-PCR done with Box A1R primers and E. coli yielded more consistent and complex DNA fingerprints than did studies done using REP primers (6). However, rep-PCRs done with Box, ERIC (enterobacterial repetitive intergenic consensus), and REP primers have all been evaluated in microbial source-tracking studies (1, 4, 6, 23). Dombek et al. (6) used a minimal data set consisting of about 200 nonunique E. coli isolates and reported that 100% of chicken and cow isolates and between 78 and 90% of human, goose, duck, pig, and sheep isolates were correctly assigned to host source groups by using rep-PCR DNA fingerprinting and Box A1R primers. Similarly, Carson et al. (4) reported that rep-PCR DNA fingerprinting done using Box A1R primers produced a 96.6% average rate of correct classification for human and nonhuman E. coli isolates, and McLellan et al. (23) reported a 79.3% average rate of correct classification for E. coli analyzed using rep-PCR and REP primers.
While all these initial analyses indicated that the rep-PCR technique may be useful for determining animal sources of E. coli, these studies were done with relatively small data sets. Moreover, since rep-PCR and most other source-tracking methods require the assembly of libraries of known-source fingerprints, which is labor-intensive and time-consuming, it is very important that the fingerprint database is unbiased, has high fidelity (36), and is representative of the diversity of E. coli strains potentially present in animal hosts and in environmental samples.
rep-PCR DNA fingerprints are usually analyzed using statistical tools. Binary similarity coefficients are used to analyze data for presence and/or absence (19), and simple banding data obtained from DNA fingerprints can be analyzed using binary coefficients such as Dice or Jaccard band matching algorithms. However, more quantitative algorithms, such as Pearson's product-moment correlation coefficient, can also be applied to complex DNA banding patterns, such as those found using rep-PCR. In this case, fingerprints are analyzed as densitometric curves, taking into account both peak position and height (intensity) (11).
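The distinction between band-based and curve-based coefficients can be illustrated with a minimal sketch. It assumes that bands have already been binned into discrete position classes (for the Jaccard and Dice coefficients) and that densitometric curves have been sampled at the same positions in both lanes (for Pearson's coefficient). The function names and example values are illustrative only and are not taken from the software used in this study.

```python
from math import sqrt


def jaccard(bands_a, bands_b):
    """Band-based Jaccard coefficient: shared bands / bands present in either lane."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dice(bands_a, bands_b):
    """Band-based Dice coefficient: 2 * shared bands / (bands in A + bands in B)."""
    a, b = set(bands_a), set(bands_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def pearson(curve_x, curve_y):
    """Pearson product-moment correlation between two densitometric curves."""
    n = len(curve_x)
    if n == 0 or n != len(curve_y):
        raise ValueError("curves must be non-empty and of equal length")
    mean_x = sum(curve_x) / n
    mean_y = sum(curve_y) / n
    dx = [x - mean_x for x in curve_x]
    dy = [y - mean_y for y in curve_y]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        raise ValueError("a constant curve has no defined correlation")
    return sum(p * q for p, q in zip(dx, dy)) / denom


# Hypothetical band position classes (in bp) for two lanes.
lane_a = {300, 450, 800}
lane_b = {450, 800, 1200}
print(jaccard(lane_a, lane_b), dice(lane_a, lane_b))

# Hypothetical intensity profiles: same peak positions, different peak heights.
print(pearson([0, 1, 4, 1, 0], [0, 2, 8, 2, 0]))
```

In this toy case the two curves differ only by a scale factor, so Pearson's coefficient is at its maximum, which reflects the insensitivity of curve correlation to overall signal intensity.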
In this study we created a large, known-source, rep-PCR and horizontal fluorophore-enhanced rep-PCR (HFERP) DNA fingerprint database from 2,466 E. coli isolates obtained from humans and 12 animal sources (cows, pigs, sheep, goats, turkeys, chickens, ducks, geese, deer, horses, dogs, and cats) and evaluated the usefulness of this method to differentiate human from animal sources of fecal E. coli.
E. coli preparation and rep-PCR conditions.
E. coli isolates were streaked onto plate count agar (Difco BD Diagnostic Systems) and grown overnight at 37°C. Single colonies were picked with a 1-µl sterile inoculating loop (Fisher Scientific, Pittsburgh, Pa.) and suspended in 100 µl of distilled H2O in 96-well microtiter plates, and 2 µl of the resulting suspension was used as template for PCR. The rep-PCR fingerprints were obtained using the Box A1R primer (5'-CTACGGCAAGGCGACGCTGACG-3'), and PCRs were done as described previously (6, 27, 28). PCR was performed using an MJ Research PTC 100 (MJ Research, Waltham, Mass.) thermocycler according to the protocol specific for this instrument and the Box A1R primer. PCR was initiated with an incubation at 95°C for 2 min, followed by 30 cycles consisting of 94°C for 3 s, 92°C for 30 s, 50°C for 1 min, and 65°C for 8 min (27). PCRs were terminated after an extension at 65°C for 8 min, and reaction mixtures were stored at 4°C. Reaction mixtures that were not used immediately for gel electrophoresis analysis were stored at -20°C.
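As a quick check on the time this cycling program takes, the stated hold times can be summed as in the following sketch. The temperatures and durations are those given above. Ramp times between steps are instrument dependent and are ignored here, so the result is a lower bound on the actual run time.

```python
# Thermocycler program from the text: (temperature in deg C, hold time in seconds).
INITIAL_DENATURATION = (95, 120)
CYCLE_STEPS = [(94, 3), (92, 30), (50, 60), (65, 480)]
N_CYCLES = 30
FINAL_EXTENSION = (65, 480)


def total_hold_seconds():
    """Sum of programmed hold times only; ramp times are not included."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


total = total_hold_seconds()
print(f"{total} s = {total / 3600:.2f} h")  # 17790 s = 4.94 h
```

The 8-min extension step accounts for most of each cycle, so the program occupies the instrument for roughly 5 h before ramping is considered.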
Electrophoresis was done at 4°C for 17 to 18 h at 70 V with constant buffer recirculation (6, 27). Gels were stained for 20 min in 0.5 µg of ethidium bromide/ml prepared in 0.5x Tris-acetate-EDTA buffer. Gel images were captured as tagged image file format files with a FOTO/Analyst Archiver electronic documentation system (Fotodyne Inc., Hartland, Wis.).
HFERP studies.
HFERP analyses were performed using a modification of the procedures of Versalovic et al. (39) as follows. Single E. coli colonies were picked with a 1-µl sterile inoculating loop (Fisher Scientific), suspended in 100 µl of 0.05 M NaOH in 96-well, low-profile PCR plates (MJ Research), heated to 95°C for 15 min, and centrifuged at 640 rpm for 10 min in a Hermle/Labnet Z383K (Edison, N.J.) centrifuge. A 2-µl aliquot of the supernatant in each well was used as template for PCR according to the protocol described above for rep-PCR. The primer consisted of a mixture of 0.09 µg of unlabeled Box A1R primer per µl and 0.03 µg of 6-FAM (6-carboxyfluorescein; Integrated DNA Technologies, Coralville, Iowa) fluorescently labeled Box A1R primer per µl. The primer mixture was used at a final concentration of 0.12 µg/25 µl of PCR mixture. A 6.6-µl aliquot of a mixture of 50 µl of Genescan-2500 ROX (6-carboxy-X-rhodamine) internal lane standard (Applied Biosystems, Foster City, Calif.) and 200 µl of nonmigrating loading dye (150 mg of Ficoll 400 per ml and 25 mg of blue dextran per ml) was added to each 25-µl PCR mixture prior to loading the PCR mixture into agarose gels; 12 µl of the resulting mixture was loaded per gel lane. DNA fragments were separated as described for rep-PCR, and HFERP images were captured using a Typhoon 8600 variable mode imager (Molecular Dynamics/Amersham Biosciences, Sunnyvale, Calif.) operating in the fluorescence acquisition mode with the following settings: green (532-nm) excitation laser, 610 BP 30 and 526 SP emission filters in the autolink mode with 580-nm beam splitter, normal sensitivity, 200-µm/pixel scan resolution, +3-mm focal plane, and 800-V power.
Computer-assisted rep-PCR fingerprint analysis.
Separated gel images (ROX-stained standards and HFERP banding patterns) were processed using ImageQuant image analysis software (Molecular Dynamics/Amersham Biosciences) and converted to 256 gray-scale tagged image file format images. Gel images were normalized and analyzed using BioNumerics v.2.5 software (Applied Maths, Sint-Martens-Latem, Belgium). rep-PCR gel lanes were normalized using the 1-kb ladder from 298 to 5,090 bp, as external reference standards, while HFERP gel lanes were normalized using the Genescan 2500 ROX internal lane standard from 287 to 14,057 bp. Band matching for rep-PCR DNA fingerprints was accomplished by using the following BioNumerics settings: minimum profiling, 5%; gray zone, 5%; minimum area, 0%; and shoulder sensitivity, 5. Band matching for HFERP DNA fingerprints was done by using 3% minimum profiling, 0% gray zone, 0% minimum area, and 0 shoulder sensitivity. DNA fingerprint similarities were calculated by using either the curve-based cosine or Pearson's product-moment correlation coefficient, with 1% optimization, or the band-based Jaccard coefficient. Dendrograms were generated using the unweighted pair group method with arithmetic means (UPGMA). The percentages of known-source isolates assigned to their correct source group were calculated by using Jackknife analysis, with maximum similarities (9).
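The Jackknife analysis with maximum similarities can be sketched as a leave-one-out procedure: each known-source fingerprint is removed from the library in turn and assigned to the source group of the single most similar remaining fingerprint. The code below is a minimal illustration under that assumption, not the BioNumerics implementation. The band sets, group labels, and the use of a Jaccard coefficient on pre-binned bands are hypothetical choices made to keep the example self-contained.

```python
def jaccard(bands_a, bands_b):
    """Band-based Jaccard coefficient on sets of band position classes."""
    union = bands_a | bands_b
    return len(bands_a & bands_b) / len(union) if union else 1.0


def jackknife_max_similarity(fingerprints, labels, similarity):
    """Return the percentage of isolates per source group assigned to their own group.

    Each isolate is left out in turn and given the label of the most similar
    remaining isolate (ties go to the first one encountered).
    """
    correct, total = {}, {}
    for i, (query, true_label) in enumerate(zip(fingerprints, labels)):
        best_label, best_sim = None, float("-inf")
        for j, (other, other_label) in enumerate(zip(fingerprints, labels)):
            if i == j:
                continue  # the query itself is excluded from the library
            sim = similarity(query, other)
            if sim > best_sim:
                best_sim, best_label = sim, other_label
        total[true_label] = total.get(true_label, 0) + 1
        if best_label == true_label:
            correct[true_label] = correct.get(true_label, 0) + 1
    return {group: 100.0 * correct.get(group, 0) / n for group, n in total.items()}


# Hypothetical library: band position classes per isolate and their source groups.
library = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9, 10}, {7, 8, 9, 11}, {1, 2, 9, 12}]
sources = ["cow", "cow", "human", "human", "human"]
rates = jackknife_max_similarity(library, sources, jaccard)
print(rates)
```

In this toy library the last human isolate most closely resembles the cow fingerprints and is therefore misassigned, which lowers the human rate of correct classification to two of three isolates.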
Influence of duplicate E. coli strains on classification of known-source library.
Since results from several studies suggest that E. coli is genetically diverse and clonal in origin and that this may influence the usefulness of this bacterium for source-tracking studies (7), we evaluated this technology using a large library of E. coli strains obtained from humans and 12 animal sources collected throughout Minnesota and western Wisconsin (Table 1).
TABLE 1. Animal source groups and rep-PCR DNA fingerprints generated from E. coli isolates
TABLE 2. Total and unique E. coli isolates correctly classified into source groups by the rep-PCR and HFERP DNA fingerprinting methods
Since identical DNA fingerprints from E. coli strains obtained from the same individual most likely represent isolates of clonal origin and can artificially bias subsequent analyses, we eliminated duplicate DNA fingerprints originating from E. coli strains obtained from the same individual human or source animal. Unique DNA fingerprints were defined as DNA fingerprints from E. coli isolates obtained from a single host animal whose similarity coefficients were less than 90%.
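One way to apply this definition is a greedy filter: within each individual host, a fingerprint is retained only if its similarity to every fingerprint already retained from that host is below 90%. The sketch below assumes this greedy, order-dependent approach, which is not necessarily the exact procedure used here. Host identifiers, band sets, and the Jaccard similarity function are hypothetical.

```python
def jaccard(bands_a, bands_b):
    """Band-based Jaccard coefficient on sets of band position classes."""
    union = bands_a | bands_b
    return len(bands_a & bands_b) / len(union) if union else 1.0


def unique_per_host(isolates, similarity, threshold=0.90):
    """Keep one representative of each group of near-identical fingerprints per host.

    isolates: iterable of (host_id, fingerprint) pairs.
    A fingerprint is kept only if it is less than `threshold` similar to all
    fingerprints already kept from the same individual host.
    """
    kept_by_host = {}
    kept = []
    for host_id, fingerprint in isolates:
        representatives = kept_by_host.setdefault(host_id, [])
        if all(similarity(fingerprint, rep) < threshold for rep in representatives):
            representatives.append(fingerprint)
            kept.append((host_id, fingerprint))
    return kept


# Hypothetical isolates: the second cow-1 fingerprint duplicates the first.
isolates = [
    ("cow-1", frozenset({1, 2, 3})),
    ("cow-1", frozenset({1, 2, 3})),
    ("cow-1", frozenset({4, 5, 6})),
    ("cow-2", frozenset({1, 2, 3})),
]
unique = unique_per_host(isolates, jaccard)
print(len(unique), "of", len(isolates), "fingerprints retained")
```

Note that the identical fingerprint from cow-2 is retained, because duplicates are removed only within an individual host and not across hosts.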
Results in Table 1 show that, of the 2,466 DNA fingerprints analyzed, 1,535 (62%) remained in the "unique" DNA fingerprint library. The influence of duplicate DNA fingerprints on the correct classification of library strains is shown in Table 2. When the 1,535 DNA fingerprints from the unique E. coli isolates were examined, Jackknife analyses indicated that only 44 to 74% of the isolates were assigned to the correct source group, with an average rate of correct classification of 60.5% (Table 2). Thus, there was a 21.7% reduction in the average rate of correct classification by using the unique DNA fingerprint library, relative to that seen with the complete library. This rate is also lower than those that we and others have previously reported with smaller libraries of E. coli strains containing duplicate DNA fingerprints from the same individual animal (4, 6, 23). Our results indicate that the clonal nature of E. coli (11, 20, 33) originating from the same source animal artificially biases the average rate of correct classification, alters the fidelity of the database, and overestimates the ability of the database to assign isolates to their correct source group.
Influence of library size on usefulness of DNA fingerprint libraries.
We also determined whether E. coli isolates obtained in this study were sufficient to capture the genetic diversity present within the E. coli populations sampled. E. coli isolates between animal source groups with rep-PCR DNA fingerprint similarities of 90% or greater (based on cosine coefficient, 1% optimization, and UPGMA) were assigned to the same genotype. By this definition, 657 genotypes were distinguished from the 1,535 unique E. coli isolates in the known-source database. The isolates were randomized, and a rarefaction curve was constructed by summing the number of genotypes that accumulated with the successive addition of isolates. Despite a library size of 1,535 DNA fingerprints, genetic diversity has not been saturated. This was evidenced by the apparent first-order relationship between isolate numbers (sampling effort) and accumulation of new genotypes (data not shown). Moreover, 58.75% of the genotypes from isolated strains, across all animal groups, occurred only once in the database, and a limited number occurred multiple times (Fig. 1).
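The accumulation procedure described above can be sketched as follows: isolates are shuffled, and the number of distinct genotypes seen so far is recorded after each successive isolate is added. The sketch assumes that each isolate has already been assigned a genotype identifier. The identifiers, the fixed random seed, and the function names are illustrative only.

```python
import random
from collections import Counter


def accumulation_curve(genotype_ids, seed=0):
    """Cumulative number of distinct genotypes after each isolate, in random order."""
    order = list(genotype_ids)
    random.Random(seed).shuffle(order)
    seen = set()
    curve = []
    for genotype in order:
        seen.add(genotype)
        curve.append(len(seen))
    return curve


def singleton_fraction(genotype_ids):
    """Fraction of genotypes that occur exactly once in the library."""
    counts = Counter(genotype_ids)
    return sum(1 for n in counts.values() if n == 1) / len(counts)


# Hypothetical genotype assignments for seven isolates.
genotypes = ["A", "A", "B", "C", "C", "C", "D"]
curve = accumulation_curve(genotypes)
print(curve)
print(singleton_fraction(genotypes))  # B and D occur once: 2 of 4 genotypes
```

A curve that is still rising at the final isolate, together with a high singleton fraction, indicates that further sampling would continue to reveal new genotypes. In practice the curve would be averaged over many random orderings rather than drawn from a single shuffle.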
FIG. 1. Frequency of occurrence of genotypes among rep-PCR DNA fingerprints from unique E. coli isolates. Analysis was limited to the 657 genotypes identified among the 1,535 unique E. coli isolates with rep-PCR DNA fingerprint similarities of 90% or greater.
One suggested strategy to avoid this underrepresentation problem in large regional or national libraries is to develop moderate-sized libraries for a highly confined geographical region, wherein isolates are obtained only from the animals in the study area. In this way only animals pertinent to the study site, and those likely to have an impact on the targeted watershed, need to be examined in detail. However, it is also important to note that, in some cases, the animals thought to be important to or prevalent in the study site may vary over time, depending on agricultural practices and migration. Thus, a careful inventory of potential animals in the study site needs to be made prior to, and during, sampling and analysis.
HFERP DNA fingerprinting.
In our studies we noted that cluster analysis of rep-PCR DNA fingerprint data often produced groupings that were more closely related to the gels from which they originated than to the host animal from which they were isolated. We hypothesized that within-gel clustering of DNA fingerprints was in part due to intrinsic gel-to-gel variation, differential DNA migration in repeated runs of the same and different PCR samples, and the inability to correct for heat- and buffer-induced gel distortion across and between single and multiple gels. Since DNA fingerprint libraries are assembled from many different gels, this could have a major impact on the fidelity of DNA fingerprint libraries and their subsequent use for tracking sources of unknown fecal bacteria.
To overcome these major limitations, we developed and evaluated the use of an HFERP technique as a means to differentiate human from animal sources of fecal bacteria. In this method, alignment, correction, and normalization of fluorescently labeled, rep-PCR DNA fingerprint bands within and between gels are facilitated by the use of internal ROX-labeled molecular weight markers that are present in each lane. The technique is similar to that previously described for use with a DNA sequencer (27, 39) but instead uses a standard horizontal agarose gel and a dual-wavelength scanner. An example of an unseparated HFERP gel displaying the ROX-labeled internal lane standard and 6-FAM-labeled Box A1R DNA fingerprints is shown in Fig. 2A, and the separated gel images are shown in Fig. 2B and C. Typically, and with our E. coli strains, 12 to 20 DNA bands per strain were revealed by the HFERP technique.
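The benefit of an internal lane standard can be illustrated with a simple sizing sketch: because marker bands of known size run in the same lane as the fingerprint, each fingerprint band can be sized from its own lane's markers, independently of distortion elsewhere in the gel. The code below assumes local linear interpolation of log10(fragment size) against migration distance between flanking markers. This is a common approximation chosen for illustration, not the normalization algorithm of the analysis software, and the marker positions and sizes are hypothetical.

```python
from bisect import bisect_right
from math import log10


def size_from_position(position, marker_positions, marker_sizes_bp):
    """Estimate fragment size (bp) from migration distance using in-lane markers.

    Interpolates log10(size) linearly between the two markers that flank
    `position`. Requires at least two markers at distinct positions.
    """
    pairs = sorted(zip(marker_positions, marker_sizes_bp))
    xs = [pos for pos, _ in pairs]
    ys = [log10(size) for _, size in pairs]
    if not xs[0] <= position <= xs[-1]:
        raise ValueError("position lies outside the range covered by the markers")
    i = min(bisect_right(xs, position), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    t = (position - x0) / (x1 - x0)
    return 10 ** (y0 + t * (y1 - y0))


# Hypothetical in-lane markers: migration distance (mm) and known size (bp).
marker_mm = [10.0, 20.0, 30.0]
marker_bp = [10000, 1000, 100]
print(round(size_from_position(15.0, marker_mm, marker_bp)))
```

Applying the same function lane by lane with each lane's own marker positions places all fingerprints on a common size axis before similarity coefficients are calculated.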
FIG. 2. Representative examples of HFERP DNA fingerprint images. Genomic DNAs from 24 E. coli strains were subjected to HFERP DNA fingerprint analysis with a mixture of unlabeled Box A1R and 6-FAM fluorescently labeled Box A1R primers. Each lane contained Genescan-2500 ROX internal lane standards and HFERP DNA fingerprints. The combined, dual-colored HFERP image (A) was captured using a Typhoon Imager and two emission filters. Values at right are sizes in base pairs. Individual images of the HFERP DNA fingerprints (B) and Genescan-2500 ROX internal lane standards (C) were acquired using one filter at a time.
FIG. 3. Comparison of DNA fingerprint patterns of a reference E. coli strain generated by rep-PCR and by HFERP. (A) rep-PCR DNA fingerprint patterns were assembled from 29 individual PCRs, each of which was run on a separate agarose gel. Fingerprints were generated using E. coli isolate P294 as template DNA and the Box A1R primer. (B) HFERP DNA fingerprint patterns were assembled from 29 individual PCRs, each of which was run on a separate agarose gel. Fingerprints were generated using E. coli isolate P294 as template DNA and a mixture of unlabeled Box A1R and 6-FAM fluorescently labeled Box A1R primers. Bands were aligned using Genescan-2500 ROX internal standards, which were present in each lane. Similarities were determined using the cosine algorithm of BioNumerics, and dendrograms were generated with UPGMA.
Previously, Versalovic et al. (39) and Rademaker et al. (27) reported on the use of FERP, whereby polyacrylamide gel electrophoresis and automated DNA sequencers were used to separate and detect bands generated by the FERP protocol. While the more automated method presented by these authors has some advantages, the increased cost of analyses and the limited dynamic range of fragment size separation on sequencing gels did not make this technique useful in our applications. In contrast, the HFERP method described here is relatively inexpensive to perform, can be done on standard electrophoresis apparatus, has high throughput, and allows for the separation of a large range of DNA band sizes. It should be noted, however, that the intensity of HFERP bands is more variable than that of those generated by rep-PCR and that some of the gains achieved by more precise alignment of bands may be offset by more variation in band intensity. We found that this variation in intensity can be overcome by the careful mixing of all reagents in the PCR master mix and greater pipetting precision when loading gels (data not presented). Further improvements in increasing the intensity of HFERP-generated DNA fingerprints may also be obtained by varying the ratio of labeled to unlabeled primer and the final concentration of the primer mixture in PCRs. Nevertheless, our results clearly show that HFERP-derived DNA fingerprint bands are more precisely aligned than the rep-PCR bands and reduce within-gel groupings of fingerprints, which can have profound ramifications for the assembly of libraries and the analysis of unknown environmental isolates. This technology will have application to other DNA fingerprinting methods that rely on the use of PCR primers.
Assignment of E. coli isolates to source groups by using HFERP DNA fingerprints.
Of the 1,535 previously selected unique E. coli isolates from animals and humans (Table 1), 1,531 were subjected to HFERP DNA fingerprinting with a combination of fluorescently labeled and unlabeled Box A1R PCR primers. Jackknife analyses of HFERP gels done with the curve-based Pearson's correlation coefficient indicated that 38 to 73% of the isolates were assigned to the correct source group by this technique (Table 2). For the curve-based analysis, the HFERP technique had the lowest percentage of correctly classified strains in cases where the numbers of analyzed fingerprints were relatively small (for sheep, horses, and goats). The average rate of correct classification for the unique HFERP-generated DNA fingerprints was 59.9%.
In contrast, Jackknife analyses of HFERP-generated DNA fingerprints done using the band-based Jaccard analysis showed that only 8 to 56% of the E. coli isolates were assigned to the correct source group, with a 43.0% average rate of correct classification. This indicates that, for this type of data, the Pearson's product-moment correlation coefficient was superior to Jaccard's band matching algorithm for assigning known isolates to the correct source groups. Interestingly, results in Table 2 also show that, despite problems associated with within- and between-gel variation, within-gel grouping of isolates, and repeatability issues, Jackknife analysis of rep-PCR DNA fingerprints, analyzed with Pearson's correlation coefficient, indicated that 48 to 74% of the isolates were assigned to the correct source group, a 60.9% average rate of correct classification.
While band matching data obtained from DNA fingerprints can be analyzed using binary similarity coefficients, which are mostly used to analyze data for presence and/or absence (19), quantitative similarity coefficients, which require a measure of relative abundance (18), can also be applied to DNA fingerprints if they are analyzed as densitometric curves that take into account both peak position and intensity (peak height). Results of our analysis of rep-PCR DNA fingerprint data indicated that the Jaccard band-based method was not as useful in separating E. coli isolates into their correct source group as was the curve-based quantitative method. This is similar to results reported by Häne et al. (11), who demonstrated that for complex DNA fingerprints, such as those produced with the techniques we used here, a curve-based method such as Pearson's product-moment correlation coefficient more reliably identified similar or identical DNA fingerprints than did band matching formulas, such as simple matching, Dice, or Jaccard. Similarly, Louws et al. (21) reported that curve-based statistical methods worked best for analysis of complex banding profiles generated by rep-PCR, since comparison of curve data is less dependent on DNA concentration in loaded samples and is relatively insensitive to background differences in gels. More recently, Albert et al. (1) performed a statistical evaluation of rep-PCR DNA fingerprint data and reported that k-nearest neighbor classification was similar to Pearson's product-moment coefficient in its ability to correctly classify fingerprints of 584 E. coli isolates.
Groupings of fingerprint data.
In some instances, it may be sufficient to identify unknown watershed E. coli isolates to the level of larger groupings, rather than to the level of individual animal types. To determine if the HFERP-generated DNA fingerprint data from our library of unique E. coli isolates grouped well into larger categories, we assembled DNA fingerprints from pets (dogs and cats), domesticated animals (chickens, cows, goats, horses, pigs, sheep, and turkeys), wildlife (deer, ducks, and geese), and humans and used Jackknife analysis to assess the percentage of correctly classified strains. Results in Table 3 show that the HFERP DNA fingerprints, analyzed with Pearson's product-moment correlation coefficient, correctly classified about 83, 54, 71, and 59% of the isolates into the domesticated animal, human, wildlife, and pet categories, respectively. The average rate of correct classification for these groups was 74.3%. In contrast, when DNA fingerprints were analyzed with Jaccard's coefficient, the average rate of correct classification was 66.2%. As before, the least precision was found in categories having the smallest number of fingerprints, pets and humans, suggesting that there is an apparent relationship between the number of fingerprints analyzed and the percentage of correctly classified isolates.
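The regrouping described above amounts to relabeling each isolate with a broader category before repeating the Jackknife analysis. The sketch below encodes the four categories as listed in the text and also contrasts two ways of averaging classification rates. The function names and the counts passed to `pooled_rate` are hypothetical, and the suggestion that the reported average reflects pooling over isolates is an assumption, since the group sizes are not given in this section.

```python
# Host-to-category assignments as listed in the text.
CATEGORY_OF_HOST = {
    "dog": "pets", "cat": "pets",
    "chicken": "domesticated", "cow": "domesticated", "goat": "domesticated",
    "horse": "domesticated", "pig": "domesticated", "sheep": "domesticated",
    "turkey": "domesticated",
    "deer": "wildlife", "duck": "wildlife", "goose": "wildlife",
    "human": "human",
}


def regroup(host_labels):
    """Replace each host label with its broader source category."""
    return [CATEGORY_OF_HOST[host] for host in host_labels]


def unweighted_mean(percentages):
    """Simple mean of per-group percentages, ignoring group size."""
    return sum(percentages) / len(percentages)


def pooled_rate(correct_by_group, total_by_group):
    """Percentage correct over all isolates, so larger groups weigh more."""
    return 100.0 * sum(correct_by_group.values()) / sum(total_by_group.values())


print(regroup(["dog", "cow", "goose", "human"]))

# Approximate per-category percentages reported in the text.
reported = {"domesticated": 83, "human": 54, "wildlife": 71, "pets": 59}
print(unweighted_mean(list(reported.values())))  # 66.75
```

The unweighted mean of the four rounded percentages is 66.75%, which is lower than the reported average of 74.3%. This would be consistent with an average computed over all isolates, in which the large domesticated animal category carries the most weight.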
TABLE 3. Percentage of E. coli isolates correctly classified into domestic, human, and wildlife source groups by the HFERP DNA fingerprinting method
TABLE 4. Percentage of E. coli isolates correctly classified into human and animal source groups by the HFERP DNA fingerprinting method
We thank Linda Kinkel and Anita Davelos for help with rarefaction analysis and Vivek Kapur, Todd Markowski, and Bruce Witthun for help with laser image analysis.
Present address: Minnesota Department of Agriculture, St. Paul, MN 55107.