Applied and Environmental Microbiology, September 2008, p. 5392-5401, Vol. 74, No. 17
0099-2240/08/$08.00+0 doi:10.1128/AEM.00151-08
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
,
Young-Gun Zo,1,3,
and
Rita R. Colwell1,3*
Center of Marine Biotechnology, University of Maryland Biotechnology Institute, 701 E. Pratt Street, Baltimore, Maryland 21202,1 National Center for Genetic Engineering and Biotechnology, 113 Phahonyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand,2 Center of Bioinformatics and Computational Biology, University of Maryland Institute of Advanced Computer Studies, University of Maryland College Park, College Park, Maryland 20742,3
Received 16 January 2008/ Accepted 30 June 2008
10 kb), highly reproducible amplicons were generated from V. cholerae isolates, including those from different geographical locations and historical strains isolated during the period 1931-2000. The amplicons yielded reduced variability in their densitometric band patterns to ≤10% and clonal distinction at <90% similarity. Rapid band-matching analysis was accomplished for fingerprints with ≥90% similarity, discriminating O serotypes and biotypes (classical versus El Tor) as well as pathogenic and nonpathogenic strains. Compared to genome similarity measured by DNA-DNA hybridization, the results showed good correlation (r = 0.7; P < 0.001), with five times less measurement error and without bias. The method permits both phylogenetic inference and clonal differentiation of individual V. cholerae strains, enables robust, high-throughput analysis, and does not require specialized equipment to perform. With access to a curated public database furnished with appropriate analytical software applications, the method should prove useful in large-scale multilaboratory surveys, especially those designed to detect specific pathogens in the natural environment.
Cholera is a disease that is highly amenable to these approaches. All major outbreaks of cholera prior to the seventh pandemic were associated with the Vibrio cholerae serogroup O1 classical biotype. In the seventh pandemic, serogroup O1 El Tor and O139 emerged and became dominant (24). This shift in epidemic clones emphasizes the importance of monitoring bacterial populations to detect toxigenic clones, as well as their close relatives. A surveillance program that employs high-throughput rapid screening, identification, and clonal delineation of bacterial isolates is essential in the case of rapidly evolving pathogens. V. cholerae is just such a rapidly evolving/recombining species (3, 13, 14, 18, 19) and is also native to the aquatic environment (10), with the greatest threat to developing countries (24).
Many methods that identify and delineate relationships between bacterial isolates are available. However, each has some limitation. For example, multilocus sequence typing (MLST) (29) is excellent with respect to portability of resulting data and fine-scale phylogenetic inference. However, the equipment required and its high cost are drawbacks for field application. Pulsed-field gel electrophoresis provides robust genomic inference, and implementation of the PulseNet database is based on this method (11). However, its capacity is limited, i.e., it has a low throughput and the equipment required is not readily available in most countries where cholera is endemic. Phenotypic methods, such as O serotyping with monoclonal antibodies, comprise frequently used rapid methods but do not cover the entire spectrum of existing or emerging pathogenic clones of V. cholerae.
Several genomic fingerprinting methods have been developed for V. cholerae clone differentiation, e.g., amplified fragment length polymorphism (AFLP), arbitrarily primed PCR, randomly amplified polymorphic DNA analysis, and repetitive element sequence-based PCR (rep-PCR) (7). However, most of these methods suffer from an insufficient phylogenetic signal (only a small number of informative bands), poor reproducibility, or limited portability. With AFLP or fluorescence AFLP, automated high-throughput analysis is possible but involves relatively complicated procedures, and equipment is a limiting factor for countries where cholera is endemic.
ERIC-PCR is a rep-PCR that employs the enterobacterial repetitive intergenic consensus (ERIC) sequence as the target for PCR and has been used to resolve V. cholerae clonal lineages successfully, but only to a limited extent. In previous studies carried out by Rivera et al. (23) and Colombo et al. (9a), ERIC-PCR fingerprints yielded only one to eight amplicons in the size range of 0.1 to 4 kb. Although ERIC-PCR was able to differentiate O1 and non-O1 V. cholerae isolates, successful differentiation of the closely related O1 and O139 strains has not been reported. Zo et al. (34a) used a modified protocol, with a lower amplification stringency and a higher-concentration agarose gel (3.6% Metaphore agarose; FMC, Rockland, ME), to obtain highly complex patterns, with more than 40 amplicons obtained from each isolate. However, only amplicons in the size range of 100 to 588 bp could be resolved. Moreover, because the low-stringency conditions were similar to those for arbitrarily primed PCR, fingerprint patterns from identical experiments could be analyzed only by variance decomposition analysis accounting for random variation, with limited portability for field use.
In this study, a novel ERIC-PCR and gel electrophoresis procedure was developed, adopting newly developed PCR technologies to provide longer amplicons than those obtained by traditional PCR. Long-range ERIC-PCR takes advantage of a commercially available, improved Taq polymerase and buffer system, with the higher-fidelity polymerase providing reliable and longer amplification fragments. Long-range PCR has proven useful in many areas of research, including molecular typing of bacteria (2, 8, 12, 17). Larger amplicons allow more genomic regions to be sampled, hence providing a better presentation of the total genomic polymorphism. The objective of this study was to develop a high-resolution ERIC-PCR protocol that is robust and scalable for high throughput, with minimal equipment. Both the reproducibility and the discriminative power of the method were assessed, and a protocol for computer-assisted interpretation of fingerprint patterns is provided. The accuracy of the estimation of genome relatedness and phylogenetic inference using ERIC-PCR fingerprinting was verified by comparing results obtained with the protocol with those obtained by DNA-DNA hybridization (DDH).
DNA preparation.
Genomic DNA from each bacterial isolate was extracted using a DNeasy tissue kit (Qiagen Inc., Valencia, CA). DNA concentrations were determined spectrophotometrically by measuring the absorbance at 260 nm (25).
Long-range ERIC-PCR.
Amplification reactions were carried out in 50-µl volumes, using the Takara Ex Taq DNA polymerase and buffer system (Takara Mirus Bio Corporation, Madison, WI). The final PCR mixture comprised 1x Ex Taq buffer (with 2 mM MgCl2), a 200 µM concentration of each deoxynucleoside triphosphate, an 800 nM concentration of each ERIC primer, 1.25 units of Ex Taq DNA polymerase, and 150 ng of template DNA. Primer sequences used were the universal ERIC primers of Versalovic et al. (32), as follows: ERIC-1, 5'-ATG TAA GCT CCT GGG GAT TCA C-3'; and ERIC-2, 5'-AAG TAA GTG ACT GGG GTG AGC G-3'. Polyacrylamide gel electrophoresis-purified primers (Sigma-Genosys, Woodlands, TX) were used to ensure reproducibility and to avoid variability among batches of primers.
The PCR cycle program was carried out using a Peltier PTC-200 thermal cycler (MJ Research Inc., Watertown, MA) and employing the calculated temperature control option, with initial denaturation at 95°C for 5 min, followed by 35 cycles of denaturation at 94°C for 45 s, annealing at 52°C for 1 min, and extension at 65°C for 10 min and a final extension step at 65°C for 20 min.
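Because each cycle includes a 10-min extension, the program is long, and it helps to know the total duration when planning a run. The following is a minimal sketch that adds up the programmed hold times stated above. The function and constant names are illustrative, and ramping time between temperatures is not included because it depends on the thermal cycler.

```python
# Hold times taken from the cycling program described above (in seconds).
INITIAL_DENATURATION_S = 5 * 60
CYCLES = 35
DENATURATION_S = 45
ANNEALING_S = 60
EXTENSION_S = 10 * 60
FINAL_EXTENSION_S = 20 * 60


def total_hold_time_s() -> int:
    """Sum of programmed hold times; temperature ramping is NOT included."""
    per_cycle = DENATURATION_S + ANNEALING_S + EXTENSION_S
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    total = total_hold_time_s()
    print(f"{total} s, i.e. about {total / 3600:.2f} h of programmed hold time")
```

The programmed holds alone add up to 26,175 s, or roughly 7.3 h, so a run is effectively an overnight or full-day step.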
Gel electrophoresis.
After the PCR was completed, each 50-µl reaction mixture was mixed with 10 µl of 6x loading buffer (15% Ficoll 400 in water, 0.01% bromophenol blue), a modification from the work of Sambrook et al. (25), to avoid interference in band imaging by dye in the gel, and 5 µl of the mixture was used for gel electrophoresis. Three microliters of molecular weight marker (HyperLadderI; Bioline USA Inc., Canton, MA) was loaded in the first, middle, and last lanes of the gel for gel image normalization. One percent (wt/vol) agarose was selected as the appropriate concentration for electrophoresis separation of the long-range ERIC-PCR products. Electrophoresis was performed at 120 V for 2.25 h, using a Bio-Rad GT Subcell electrophoresis system (electrode distance, 30 cm) with a 15-cm by 20-cm tray. Two buffers commonly used for DNA electrophoresis, Tris-borate with EDTA (TBE; 89 mM Tris-borate, 1 mM EDTA), and Tris-acetate with EDTA (TAE; 40 mM Tris-acetate, 1 mM EDTA), and two different comb thicknesses, 0.75 mm and 1.00 mm, were compared for the ability to provide sharp, high-resolution fingerprint patterns. After electrophoresis, the gels were stained in ethidium bromide solution (5 µg ml–1) for 3 to 5 min and destained in tap water for 20 min, with shaking. The fingerprint banding patterns were recorded using a FluorImager 575 imaging system (Molecular Dynamics Inc., Sunnyvale, CA) at a resolution of 100 µm, 16 bits, at a photomultiplier tube setting of 650 V.
Gel image analysis.
The fingerprint patterns in the gels were analyzed using a computer software package, GelCompar II, version 3.0 (Applied Maths BVBA, Belgium). After background subtraction and gel normalization, the fingerprint patterns were subjected to cluster analysis using the unweighted-pair group method using average linkages (UPGMA). Two methods for measuring similarity, one based on binary data on the occurrence of the band (band-based), calculated using the Dice coefficient, and another based on the overall densitometric profile (curve-based) of the banding pattern, calculated using Pearson's product moment correlation (rERIC), were compared.
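To make the clustering step concrete, the following is a minimal pure-Python sketch of UPGMA applied to a matrix of pairwise similarities (in percent). It is not the GelCompar II implementation; the function name `upgma`, the input layout, and the toy similarity values are assumptions made for illustration.

```python
from itertools import combinations


def upgma(names, sim):
    """Cluster items by UPGMA from a symmetric similarity matrix (percent).

    Returns the merges in order as (members_a, members_b, similarity_at_merge).
    """
    # Each active cluster is a tuple of indices into `names`.
    clusters = [(i,) for i in range(len(names))]
    merges = []

    def avg_sim(a, b):
        # UPGMA: unweighted mean over all cross-cluster pairs of original items.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Join the two clusters with the highest average similarity.
        a, b = max(combinations(clusters, 2), key=lambda pair: avg_sim(*pair))
        merges.append(
            (tuple(names[i] for i in a), tuple(names[i] for i in b), avg_sim(a, b))
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    # Toy values only; not data from this study.
    names = ["A", "B", "C"]
    sim = [[100, 95, 40], [95, 100, 50], [40, 50, 100]]
    for left, right, level in upgma(names, sim):
        print(left, right, f"{level:.1f}%")
```

In this toy example, A and B join first at 95%, and C then joins the A/B cluster at the mean of its similarities to A and B (45%). That mean is the level a dendrogram node would show.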
Genomic DDH.
To estimate genome-genome similarity, dot blot hybridization was performed on duplicate dot blots of genomic DNAs from 176 strains that yielded unique ERIC-PCR patterns. Genomic DNA (500 ng) was dot blotted onto nylon membranes, using a Bio-Dot microfiltration apparatus (Bio-Rad Laboratories, Hercules, CA). Genomic DNAs of three probe strains (V. cholerae N16961, RC395, and RC466) were sheared to an approximate size of 400 to 600 bp by sonication and labeled using thermostable alkaline phosphatase (Geneimages AlkPhos direct labeling kit; Amersham Biosciences Ltd., Buckinghamshire, United Kingdom). Hybridization buffer and washing solution were prepared following the manufacturer's protocol. The membrane was prehybridized at 60°C for 30 min and hybridized (10 ng of probe DNA per ml of hybridization buffer) at 60°C overnight in a rotary hybridization oven. Each membrane was subjected to a high-stringency wash twice for 10 min at 70°C, followed by a low-stringency wash twice for 5 min at room temperature. Chemifluorescent signals were generated using ECF substrate (Amersham Biosciences). The fluorescent signals were recorded using a Typhoon 9410 apparatus (Molecular Dynamics Inc., Sunnyvale, CA), and the signal intensity was quantified by ImageQuant software, version 5.1 (Molecular Dynamics, Inc.). Results are expressed in relative binding units (RBR), which show the ratio of signal from the target DNA to that from the probe DNA itself (i.e., positive control) as the target DNA. Duplicates of V. cholerae ATCC 14035T, Vibrio mimicus ATCC 33653T, Vibrio fluvialis RC442, and Aeropyrum pernix K1 were included on every membrane as control strains providing different levels of genome relatedness to the probe. A. pernix, an archaeon, was included as a negative control.
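The ratio defined above can be sketched as follows. This is only an illustration of the definition in the text: averaging the duplicate dots and the optional background subtraction (off by default) are assumptions, as are the function names and the signal values in the example.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of signal intensities."""
    return sum(values) / len(values)


def rbr(target_signals, self_signals, background=0.0):
    """Ratio of target-DNA signal to the probe-against-itself signal.

    target_signals / self_signals: duplicate dot intensities for one strain and
    for the probe strain used as its own target (the positive control).
    background: optional value to subtract from both (an assumption; the text
    does not say how background was handled).
    """
    corrected_self = mean(self_signals) - background
    if corrected_self <= 0:
        raise ValueError("self-hybridization signal must exceed background")
    return (mean(target_signals) - background) / corrected_self


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    print(rbr([820.0, 780.0], [1000.0, 1000.0]))
```

By construction, the probe strain hybridized to itself gives a value of 1.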
Statistical analyses.
A battery of statistical analyses was performed to measure precision and to test the agreement of rERIC with RBR. Significance was determined at the 5% type I error level. Analysis of variance (ANOVA) and analysis of covariance were performed with SAS, version 8.2 (SAS Institute Inc., Cary, NC), according to general procedures (27). See the supplemental material for further description of the analysis.
FIG. 1. Effect of electrophoresis buffer on banding patterns in ERIC-PCR fingerprinting, using a 1.0% agarose gel and a 0.75-mm comb. (A) 1x TAE buffer. (B) 0.5x TBE buffer. Lanes: M, size markers; 1 to 3, V. mimicus RC 217; 4 to 6, V. cholerae RC 561; 7, V. cholerae O1 El Tor N16961. Triangles between lanes show bands that are more clearly visible in gels using TBE.
Optimization of electrophoresis.
Since the aim of this study was to develop a rep-PCR and electrophoresis procedure with good reproducibility and portability, sufficient to be deployed in any minimally equipped laboratory in any geographical location, especially countries where cholera is endemic, agarose gel electrophoresis was standardized with respect to physical and temporal dimensions to achieve optimal conditions. The steps involved in resolving and visualizing PCR amplicons determine the resolution power of the fingerprint, especially for complex banding patterns, such as those obtained using PCR-based fingerprinting. Parameters considered were the concentration of the agarose gel and the choice of loading buffer, running buffer, comb thickness, and staining dye. Gel concentrations tested ranged from 0.7% to 3%, with 1% agarose gel giving the best banding pattern, i.e., providing the most bands with good band separation, effectively separating bands in the size range of 0.4 to 10 kb for amplicons obtained by long-range PCR. To compensate for the volume of water evaporating during melting of the agarose in the microwave oven, the agarose concentration in gel preparations was always readjusted to the prerecorded total weight of the agarose, buffer, and container by adding prewarmed distilled water.
To reduce the "smiling effect" of bands, Ficoll 400 was employed as a sinking agent for the loading buffer, instead of glycerol. Glycerol-based loading buffer has a lower molecular weight and allows DNA to stream up the sides of the well before electrophoresis is started, resulting in a U-shaped band (A handbook for gel electrophoresis; FMC Bioproducts, Rockland, ME). The dye concentration was also reduced to a minimum to avoid interference with visualization of the bands by dye color. The modified loading buffer provided a sufficient visual aid in sample loading, with minimal effect on the banding pattern. However, since the dye is nearly invisible after electrophoresis begins, the run should be stopped at a preset time rather than by visual inspection of the running distance of the dye front. This criterion is also essential for standardizing the electrophoresis conditions.
In general, a thinner comb yielded sharper bands and a better resolution, with a 0.75-mm comb producing thin bands while not making sample loading too difficult. The choice of electrophoresis buffer also has an effect on the banding pattern, and two commonly used buffers, 1x TAE and 0.5x TBE, were tested, with TBE buffer proving superior to TAE in resolving the banding patterns (Fig. 1), i.e., showing more bands and causing less background smearing. Although TBE buffer provided lower DNA migration rates, longer running times can be used to obtain the same migration distance as that obtained with TAE.
Two DNA staining dyes, ethidium bromide and Sybr gold, were compared. Sybr gold, which is 10-fold more sensitive than ethidium bromide in DNA staining (25), had the disadvantage that discerning distinct banding patterns was hampered by strong staining of the background by smearing caused by nonspecific products. The less sensitive ethidium bromide stain was deemed preferable because of the clearness of the band patterns.
Depending on the gel preparation, migration in the middle lanes was faster than that in the lanes closer to the edge of the gel, resulting in U shaping or "smiling" of gels. Although gel-to-gel and lane-to-lane normalization by GelCompar II software can be used to standardize migration rates among lanes and gels, it is preferable to minimize lane-to-lane variation of the migration rate. After much trial and error, it was found that preparing the agarose gel at a high temperature minimizes "smiling" of the gels. Melted agarose poured within 5 min after being microwaved ensures two-dimensional homogeneity in the gelling process. However, a heat-resistant gel casting tray must be used.
Calculation of similarity among fingerprints.
The DNA size range for comparison was set at 0.4 to 10 kb. DNA bands smaller than 400 bp often yielded diffuse banding, which is to be expected when 1% agarose gel electrophoresis is run for >2 h. The 10-kb upper limit for the comparison range was based on the reproducibility of bands with 150 ng of template DNA. Although the amplicon size was larger than 10 kb, a larger amplicon may not be reliable in all PCRs if a low concentration of heavily sheared DNA is used as the template.
Comparison of fingerprints for different strains was performed after gel images had been processed by GelCompar II. Two options for calculating pattern similarity are band-matching similarity and densitometric curve similarity, and both of these were used. To calculate the similarity of bands, binary data representing the presence or absence of bands were generated for each of the band positions, i.e., band migration distances. The Dice coefficient, which disregards the significance of the band position when both fingerprints in a pair do not yield a band, was used as the measure of similarity. Bands and band positions were assigned by "band calling," a process requiring at least two arbitrary criteria, i.e., setting a minimum signal intensity to distinguish bands from the background and assigning band positions for thick or diffused bands for which the position of maximum intensity was not readily apparent. For these reasons, although the analysis was computer assisted, the majority of lanes and gels required manual inspection of problematic bands. This limitation reduced not only the reliability and reproducibility of the band-calling process but also the automation of high-throughput analysis.
To analyze the similarity of the densitometric curves, a similarity value was calculated as the Pearson correlation coefficient based on the densitometric curve of the fingerprint pattern (Fig. 2). Since the method takes into account the overall pattern of the fingerprint, it is less sensitive to small variations arising from faint bands or small shifts in the overall pattern. By skipping the band-calling step, the method provides consistent results with less manual intervention. A set of 21 fingerprint patterns obtained for the same strain (V. cholerae N16961) provided a sample set. Since all fingerprint patterns were from the same genome, the expected similarity was 100%. However, due to random measurement errors in band positions and in the signal intensity at each position, observed values should be distributed close to but less than 100%. Thus, a method yielding a higher similarity value is a better estimate of similarity. When similarity was calculated, based on the Dice coefficient and Pearson product moment correlation, and the similarity values compared, Pearson correlation coefficients (rERIC) usually yielded higher similarity values than the band-based Dice coefficients did (Fig. 2). Therefore, for robustness, consistency, and simplicity of calculation, the curve-based method using Pearson correlation was selected as the densitometric similarity coefficient.
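The two similarity measures compared above can be written out in a few lines. This is a minimal sketch, not the GelCompar II code: it assumes that called bands have already been assigned to discrete position classes (represented here as Python sets) and that both densitometric curves have been normalized to the same number of points. All names are illustrative.

```python
from math import sqrt


def dice_percent(bands_a, bands_b):
    """Band-based Dice coefficient (%) from two sets of called band positions.

    Positions where neither fingerprint has a band do not enter the formula.
    """
    if not bands_a and not bands_b:
        raise ValueError("at least one fingerprint must contain a called band")
    shared = len(bands_a & bands_b)
    return 100.0 * 2 * shared / (len(bands_a) + len(bands_b))


def pearson_percent(curve_a, curve_b):
    """Curve-based Pearson correlation (%) between two densitometric profiles."""
    if len(curve_a) != len(curve_b) or len(curve_a) < 2:
        raise ValueError("curves must have the same length (at least 2 points)")
    n = len(curve_a)
    mean_a = sum(curve_a) / n
    mean_b = sum(curve_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(curve_a, curve_b))
    var_a = sum((x - mean_a) ** 2 for x in curve_a)
    var_b = sum((y - mean_b) ** 2 for y in curve_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a flat curve has no defined correlation")
    return 100.0 * cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    # 15 and 16 called bands sharing 14 positions (hypothetical counts).
    print(round(dice_percent(set(range(15)), set(range(1, 17))), 2))
    print(pearson_percent([1, 2, 3], [2, 4, 6]))
```

As a worked instance of the Dice formula, 14 shared positions between fingerprints with 15 and 16 called bands give 2 × 14/31 ≈ 90.32%. The text does not state the band counts behind the Dice value reported in Fig. 2, so these counts are only an example.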
FIG. 2. Example of calculation of similarity values between pairs of ERIC-PCR fingerprints. Matching of band positions was assigned by automated band calling for two fingerprints of V. cholerae N16961 in GelCompar II. The Dice coefficient (90.32%) was calculated for match/mismatch data. The line graph shows the intensity of ethidium bromide fluorescence along normalized migration distances for the two fingerprints. The Pearson correlation coefficient (94.40%) was calculated from the intensity values. The thick black line shows a densitometric curve for the right fingerprint, and the thin gray line tracing the shaded area shows a densitometric curve for the left fingerprint.
Variability arising from electrophoresis and image analysis was evaluated by examining 56 lanes of DNA size markers run in 25 electrophoresis gels. Pearson correlation values (rERIC) ranged from 95.6 to 99.6%. Thus, a variation of ca. 5% in the similarity coefficient could be attributed to differences in electrophoresis running distance, background signal intensity, and random variation in the normalization of gel images. To estimate the total random variation, 21 fingerprints of V. cholerae N16961 were produced from 21 genomic DNA extractions of the strain (Fig. 3). Although all fingerprints revealed similar banding patterns, slight variations in positions and relative intensities of the bands were observed. The densitometric similarity coefficients ranged from 83.2 to 99.6%, with a mean of 94.5%. If a 5% type I error rate is accepted in determining the identity of a pair of isolates, i.e., allowing erroneous rejection of clonal identity for truly identical clones, the 5% quantile of the coefficient can serve as a criterion for accepting or rejecting a null hypothesis of clonal identity. The 5% quantile of the N16961 fingerprints was calculated as 89.5% from the pairwise similarity coefficients for 21 fingerprints. Therefore, strains yielding fingerprints with similarities ranging from 90% to 100% were considered indistinguishable when the Pearson correlation coefficient was employed in the analysis of normalized densitometric curves.
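The threshold derivation can be sketched as follows: take all pairwise similarities among replicate fingerprints of one strain and use their lower 5% quantile as the cutoff for clonal identity. The quantile definition used in the study is not stated, so the linear-interpolation ("inclusive") method of Python's `statistics.quantiles` is an assumption, as are the function names and the toy data.

```python
from statistics import quantiles


def number_of_pairs(n_fingerprints):
    """Number of distinct pairwise comparisons among n fingerprints."""
    return n_fingerprints * (n_fingerprints - 1) // 2


def identity_threshold(pairwise_similarities):
    """Lower 5% quantile of replicate-versus-replicate similarities (%).

    quantiles(..., n=20) returns the 5%, 10%, ..., 95% cut points, so the
    first element is the 5% quantile.
    """
    return quantiles(pairwise_similarities, n=20, method="inclusive")[0]


def same_clone(similarity, threshold=90.0):
    """Accept clonal identity when similarity reaches the rounded 90% cutoff."""
    return similarity >= threshold


if __name__ == "__main__":
    print(number_of_pairs(21))                    # comparisons among 21 replicates
    toy = list(range(80, 101))                    # toy similarities, not study data
    print(identity_threshold(toy))
    print(same_clone(94.5), same_clone(89.0))
```

With 21 replicate fingerprints there are 210 pairwise coefficients, which is the sample from which a quantile such as the reported 89.5% would be taken.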
FIG. 3. Variability in similarities of fingerprints. The results of 21 PCRs and electrophoresis of genomic DNA of V. cholerae O1 El Tor N16961 are shown. The dendrogram shows a tree created by UPGMA clustering based on the Pearson correlation coefficient for the range of 0.4 to 10 kb. The scale bar at the top of the dendrogram shows the Pearson correlation coefficient (%); the minimum similarity coefficient in the dendrogram is 89.66%.
Resolving power of the densitometric similarity coefficient.
The accuracy of the criterion for rERIC values of <90% in predicting differences between bacterial strains was tested by employing 213 strains for which background information, such as taxonomy, serotype, biotype, and source of isolation, was known. This set of strains provided information relative to genus, species, clinical versus environmental source, serotype and/or biotype, and clonal complex, e.g., O1 classical, O1 El Tor, and O139. Among different species, rERIC was always <50% (Fig. 4), making rERIC useful for differentiating V. cholerae from other species, i.e., Vibrio harveyi, V. mimicus, V. fluvialis, Aeromonas, and Shewanella spp. Among V. cholerae strains, rERIC values of <90% distinguished environmental and toxigenic strains. However, differences between the genomes of toxigenic strains of O1 classical (e.g., ATCC 14035T, DK 64, and O395), O1 El Tor (e.g., CDC 2164-78 and N16961), and O139 complexes were not resolvable by the rERIC of <90% criterion (Fig. 4). Fingerprints (Fig. 5) revealed detectable differences in banding patterns among those groups, but pairs of O1 El Tor and O139 strains had similarities of >90%, and thus the rERIC of <90% criterion did not distinguish O139 from O1 El Tor strains. On the other hand, similarities between O1 classical and O1 El Tor or O139 strains were <89.5%, meeting the rERIC of <90% criterion for distinguishing O1 classical strains from O1 El Tor and O139 strains, with a narrow margin of error. Inspection of the banding patterns showed that two band positions, 1.4 kb and 1.0 kb, discriminated the three closely related clones (Fig. 5). Both bands were present in all O1 El Tor strains, while only the 1.4-kb band, not the 1.0-kb band, was present in V. cholerae O139. The 1.4-kb band was absent from O1 classical strains (Fig. 5).
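The two-band rule described above can be expressed as a small decision function. This is a sketch under stated assumptions: it applies only to strains already known to belong to one of the three toxigenic groups, the ±0.05-kb matching tolerance is hypothetical, and the status of the 1.0-kb band in O1 classical strains is not given in the text, so it is ignored for that group.

```python
def has_band(band_sizes_kb, target_kb, tolerance_kb=0.05):
    """True if any called band lies within the (assumed) tolerance of target_kb."""
    return any(abs(size - target_kb) <= tolerance_kb for size in band_sizes_kb)


def classify_toxigenic_group(band_sizes_kb):
    """Assign O1 El Tor, O139, or O1 classical from the 1.4-kb and 1.0-kb bands.

    Only meaningful for fingerprints of strains from these three groups.
    """
    if not has_band(band_sizes_kb, 1.4):
        return "O1 classical"          # 1.4-kb band absent
    if has_band(band_sizes_kb, 1.0):
        return "O1 El Tor"             # both bands present
    return "O139"                      # 1.4-kb band only


if __name__ == "__main__":
    # Hypothetical band lists (kb), for illustration only.
    print(classify_toxigenic_group([0.6, 1.0, 1.4, 3.2]))
    print(classify_toxigenic_group([1.4, 2.0]))
    print(classify_toxigenic_group([0.8, 2.0]))
```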
FIG. 4. Genomic fingerprints of selected bacterial strains. The dendrogram was created by UPGMA. Gray lines, rERIC < 50%; black lines, rERIC ≥ 50%. Abbreviations: Sp., species; SG, O serogroup; Sm., type of specimen from which the strain was isolated; Vc, Vibrio cholerae; Ae, Aeromonas spp.; Vf, V. fluvialis; Vm, V. mimicus; Vh, V. harveyi; Sh, Shewanella spp.; X, non-O1/non-O139 strain; R, rough strain; E, environmental isolate; C, clinical isolate; F, fish isolate.
FIG. 5. Genomic fingerprints of toxigenic V. cholerae strains. The dendrogram is a hypothetical cladogram based on UPGMA clustering and band matching. Closed circle, strains 10 to 15; open circle, strains 19 and 20. Plus and minus signs indicate additions and deletions of bands from the fingerprints of the upper nodes (closed circle and open circle). The numbers show the sizes of the bands added or deleted (kb). SG, O serogroup; ST, O serotype; BT, biotype; CT, presence/absence of cholera toxin gene; Og, Ogawa; In, Inaba; ET, El Tor; CL, classical.
V. cholerae O37 provides an example of a non-O1 strain associated with localized, cholera-like outbreaks in African and European countries (1, 34), isolated during the early stages (1968 and 1965, respectively) of the seventh pandemic that began in Southeast Asia in 1961. There have been suggestions concerning the phylogeny of this serogroup, namely, that it is a derivative of the O1 classical biotype, based on rep-PCR fingerprinting of IS1004 sequences (4), multilocus enzyme electrophoresis (3), and allele profiles of multilocus virulence genes (21). The suggestion that it is a derivative of O1 El Tor was based on the DNA sequence of recA (30). Another suggestion, that it is an independent lineage, with slight divergence from both the O1 El Tor and O1 classical biotypes, was based on ribotyping and DNA sequences of toxin and toxin-related genes (14a). According to results of band-matching analysis by long-range ERIC-PCR fingerprinting (strain 18) (Fig. 5), V. cholerae O37 is a derivative of the O1 classical biotype, but with significant divergence. This conclusion is in agreement with previous studies based on genome-wide random sampling or multiple genetic loci, but not if results of polymorphism analyses of single or several linked loci are considered. Recent reports provide strong molecular evidence for serotype conversion of V. cholerae O1 to O37 or O139 in the aquatic environment (5, 15), which occurred in the sixth pandemic strains (classical biotypes O1 to O37) and the seventh pandemic strains (El Tor biotypes O1 to O139) in the 1960s and 1990s, respectively.
It is concluded that the rERIC of <90% criterion is sufficiently powerful to distinguish V. cholerae clones from clades that have conventionally been classified as cholera-pathogenic clades. In addition, when band-matching analysis is done, pandemic V. cholerae O1 classical, O1 El Tor, and O139 strains and their endemic variants can be distinguished by long-range ERIC-PCR fingerprinting. Since the short-range ERIC-PCR fingerprinting method previously used (23) reported that nearly all pathogenic V. cholerae O1 and O139 strains yielded the same pattern, comprising four bands with sizes of 0.5 to 1.75 kb, the power of differentiation is significantly enhanced by long-range ERIC-PCR.
Estimation of genome relatedness.
The similarity coefficient rERIC can provide estimates of genome relatedness for pairs of strains. Since DDH is accepted as the standard method, the precision and accuracy of rERIC were assessed in comparison to the RBR of DDH. To meet assumptions of normality and homoscedasticity in ANOVA, it was necessary to transform RBR and rERIC to the square root of RBR ($\sqrt{\mathrm{RBR}}$) and the arcsine of the square root of rERIC ($\sin^{-1}\sqrt{r_{\mathrm{ERIC}}}$), respectively. Transformed values from the two methods were highly correlated (Pearson correlation coefficient [r] = 0.7; P < 0.001) (Fig. 6) when V. cholerae N16961 was used as the probe strain. Highly significant correlations (r > 0.5; P < 0.001) were also found when other V. cholerae strains (RC395 and RC466, both isolated from the Chesapeake Bay) were used. From these results, it was concluded that the two measurement variables are in good agreement, with a strong proportional relationship.
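The two variance-stabilizing transformations are simple to apply. The sketch below assumes that RBR is a ratio (1 for the probe against itself) and that rERIC is supplied in percent and converted to a fraction before the arcsine square-root transform. The function names are illustrative.

```python
from math import asin, sqrt


def transform_rbr(rbr):
    """Square-root transform of the relative binding value."""
    return sqrt(rbr)


def transform_r_eric(r_eric_percent):
    """Arcsine square-root transform of rERIC given in percent (0 to 100)."""
    if not 0.0 <= r_eric_percent <= 100.0:
        raise ValueError("rERIC must lie between 0 and 100%")
    return asin(sqrt(r_eric_percent / 100.0))


if __name__ == "__main__":
    print(transform_rbr(1.0))                  # self-hybridization
    print(round(transform_r_eric(100.0), 2))   # identical fingerprints
```

At 100% similarity the transformed value is π/2 ≈ 1.57, and an RBR of 1 stays at 1. These are the reference points at which the two scales are expected to coincide for identical genomes.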
FIG. 6. Distribution of RBR and rERIC values for 161 V. cholerae strains when genomic DNA of V. cholerae N16961 was used as the probe strain.
rERIC was due, for the most part, to a treatment effect rather than measurement error (99:1 ratio of treatment effect to measuring error). On the other hand, the contribution of measurement error was higher for $\sqrt{\mathrm{RBR}}$ (7:3 ratio of treatment effect to measuring error), indicating a higher level of measurement error associated with DDH than with ERIC-PCR fingerprinting. The larger measurement error was also confirmed by the se/sA ratio, which was 5.4-fold higher for RBR. Therefore, it was concluded that rERIC can provide a >5-fold better resolution in differentiating bacterial genomes. The least significant difference value, which can readily be used for pairwise comparisons of strains to determine significant differences, if any, in their genomes, also showed a threefold better precision for rERIC.
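Ratios of treatment effect to measurement error, such as those quoted above, come from ANOVA variance components. The sketch below shows one common way to estimate them, assuming a balanced one-way random-effects layout in which every strain has the same number of replicate measurements. The exact model used in the study is described in its supplemental material and may differ; the function name and toy data are illustrative.

```python
def variance_components(groups):
    """Estimate (strain variance, error variance) from balanced replicates.

    groups: one list of replicate measurements per strain, all the same length.
    Uses the method-of-moments estimators for one-way random-effects ANOVA.
    """
    a = len(groups)
    n = len(groups[0]) if groups else 0
    if a < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 strains with the same number (>= 2) of replicates")
    grand_mean = sum(sum(g) for g in groups) / (a * n)
    means = [sum(g) / n for g in groups]
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (a - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    ) / (a * (n - 1))
    var_error = ms_within
    # Negative estimates are truncated at zero by convention.
    var_strain = max((ms_between - ms_within) / n, 0.0)
    return var_strain, var_error


if __name__ == "__main__":
    # Toy duplicate measurements for two strains; not study data.
    var_strain, var_error = variance_components([[1.0, 3.0], [5.0, 7.0]])
    print(var_strain, var_error)
```

In this toy case the strain (treatment) component is 7 and the error component is 2, about a 78:22 split. An error-to-treatment standard deviation ratio like se/sA would be the square root of 2/7.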
TABLE 1. Estimates of variance components from ANOVA on transformed RBR and rERIC values
$\sqrt{\mathrm{RBR}} = 1$) at the point where rERIC is 100% ($\sin^{-1}\sqrt{r_{\mathrm{ERIC}}} = 1.57$). From those results, it was found that the predicted 95% confidence interval always included the expected RBR value for all three probes (see the supplemental material). Therefore, it was concluded that rERIC is an unbiased and accurate estimate of genome relatedness.
Determination of clonal identity and diversity.
A common method of isolating bacterial species from environmental samples is enrichment, based on selection for a specific metabolic function or physiology of the target bacteria. Alkaline peptone water enrichment was employed to isolate V. cholerae following a standard method (31). The principle of enrichment is that it provides a suitable environment for bacterial species of interest to multiply, thereby increasing the chances of their being detected in samples where they are outnumbered by other, competitive bacteria. However, it also has the drawback that it creates "artificial" redundancy in the clones multiplying in the enrichment medium that does not reflect the actual abundance, i.e., the "natural" redundancy, in the original sample. In contrast, direct plating, that is, directly spreading a water sample aliquot on agar plates to obtain colonies of the bacterium of interest, does not generate clonal redundancy artificially. If more than one isolate belonging to the same clone is produced by direct plating, it can indicate clonal redundancy that is naturally occurring, i.e., the abundance of a clonal lineage in the sample. In previously reported studies of V. cholerae from the Chesapeake Bay (16), V. cholerae was isolated by both enrichment and direct plating during surveys carried out in 1999 and by enrichment alone in 1998. As an example of the successful application of ERIC-PCR fingerprinting, a subset of isolates from the 1998-1999 surveys was analyzed for both redundancy of clones and clonal diversity.
Clonal identity was analyzed using the rERIC of ≥90% criterion, with 38 isolates from seven alkaline peptone water enrichment flasks inoculated with size-fractionated samples and 8 isolates from direct plating on alkaline peptone agar plates. Fingerprints of V. cholerae O135 isolates from several sources, using both enrichment and direct plating, are shown in Fig. 7 (clone 1). The results validated the efficacy and robustness of both ERIC-PCR fingerprinting and the rERIC of ≥90% criterion in determining the clonal identity of V. cholerae strains. In general, the distinction of clones by fingerprints was supported by O serotyping (Fig. 7). Clonal distinction among other clones was also supported by phenotypic/genotypic divergences observed for the strains. For example, clones 2 and 4 yielded fingerprints that were similar to, but distinct from, that of clone 1, and their traits of high ecological importance also differed from those of clone 1: clone 2 was negative for heat-stable enterotoxin (stn), and clone 4 was positive for bioluminescence (luxA).
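As an illustration of how a 90% similarity criterion can delineate clones, the sketch below groups fingerprints by average-linkage (UPGMA) agglomeration and stops merging once the best between-cluster similarity drops below the cutoff. It rests on several assumptions that are not taken from the protocol itself: rERIC is treated as the Pearson correlation of two densitometric curves expressed as a percentage, the curves are assumed to be aligned and of equal length, and the function names and toy curves are hypothetical. It is not the software used in the study.

```python
from math import sqrt


def pearson_percent(x, y):
    """Pearson product-moment correlation of two curves, in percent."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("curves must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a flat curve")
    return 100.0 * sxy / sqrt(sxx * syy)


def upgma_clones(curves, threshold=90.0):
    """Group fingerprints into clones by UPGMA with a similarity cutoff.

    `curves` maps an isolate ID to its densitometric curve. Clusters are
    merged in order of decreasing average pairwise similarity until no
    pair of clusters reaches `threshold`.
    """
    names = list(curves)
    sim = {
        frozenset((a, b)): pearson_percent(curves[a], curves[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def average_similarity(c1, c2):
        # UPGMA: unweighted mean over all member pairs of the two clusters.
        total = sum(sim[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[name] for name in names]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = average_similarity(clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        if best[0] < threshold:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy curves, for illustration only (not from the study).
toy_curves = {
    "a": [0, 5, 1, 0, 8, 1, 0, 3],
    "b": [0, 5, 1, 0, 7, 1, 0, 3],
    "c": [6, 0, 0, 4, 0, 0, 5, 0],
}

print(upgma_clones(toy_curves))
```

In this toy example, curves "a" and "b" differ at a single position and fall into one clone, whereas "c" has peaks at different positions and remains separate.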
FIG. 7. Determination of clonal identity and diversity of V. cholerae isolates in water and plankton net samples from Chesapeake Bay. The dendrogram was created by the UPGMA clustering method. Gray lines, rERIC < 90%; black lines, rERIC ≥ 90%. ID, isolate number; Sero, O serogroup; St, sampling site; Y, year; M, month; D, day; F, size fraction; T, isolation method; Lm, bioluminescence in marine broth 2216e; Lx, luminescence gene luxA+; Ms and Mt, acid production from mannose and mannitol, respectively; Mr, methyl red test; Vp, Voges-Proskauer test; Pb, resistance to polymyxin B; Tr, Ou, and Ow, similarity to toxR, ompU, and ompW, respectively, in V. cholerae N16961, on a discrete scale from 0 to 4; Tx, similarity to heat-stable toxin gene (stn) in V. cholerae RC66; R, untypeable rough strains; HP, Horn Point, MD (38°35.59'N, 76°07.80'W); SE, the Rhode River subestuary, MD (38°53.20'N, 76°32.51'W); W, whole water; P, particles of 20 to 64 µm; Z, particles of >64 µm; E, enrichment method; H, oligonucleotide probe hybridization of colony lift blots from direct plating. Boxes show ERIC-PCR clonal clusters and properties of strains supporting the clustering. 1, characteristic is present; 0, characteristic is absent.
Strain data are readily portable and are preferred, since regulations instituted to prevent trafficking of infectious pathogens make it difficult to obtain reference strains, at least not without significant delay. The ideal data are molecular sequences, such as MLST types, since molecular sequence data are definitive discrete values and do not require comparison with standard or reference cultures. Furthermore, interlaboratory communication of sequence data by public depositories or from curated MLST databases is not limited by regulations. Pulsed-field gel electrophoresis data are formatted to the sizes of the enzyme-digested genome fragments and can be communicated from curated databases, such as PulseNet (11). Band-matching analysis using ERIC-PCR fingerprints produces data in the same format as that of the PulseNet database. However, densitometric curves may be a preferred format for communicating data, since the densitometric correlation coefficient method is sensitive to the inequality of relative densities of gels and provides a more rapid interpretation of the data. Portability of fingerprint data is best achieved by standardization. Therefore, a curated, centralized, and dynamic database built on standardized protocols for long-range ERIC-PCR can achieve the goal of a real-time, networked method for identifying microorganisms, as shown here for V. cholerae. One benefit is that a single generation of genome fingerprints can be used to identify V. cholerae isolates by genome similarity, serotype, and biotype. Another is that identification of clones by matching densitometric curves or band calling by minimally equipped laboratories can be facilitated by software support and use of a centralized database. As done for the PulseNet and MLST databases, clone identification can be accomplished by the central database via communication of digital images.
In conclusion, an optimized protocol is provided for genomic fingerprinting using long-range ERIC-PCR, taking advantage of an improved long-range Taq polymerase enzyme to produce a highly robust PCR, with well-defined gel electrophoresis conditions to achieve high resolution of the fingerprint patterns. For fingerprint similarity calculations, Pearson's product moment correlation of densitometric curves is accurate (i.e., unbiased) and precise (fivefold greater precision than RBR, measured by DDH). The protocol yields highly reproducible fingerprints and has the power of resolution to discriminate closely related strains, e.g., V. cholerae O1 El Tor and O139 serogroup strains, when band-matching analysis is supplemented. The method can be used to identify clonal identity, relatedness of strains, diversity, and phylogenetic structure for large collections of bacterial isolates and is suitable for epidemiological and ecological applications.
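For reference, Pearson's product moment correlation between two densitometric curves can be written as follows. The notation is an assumption introduced here for illustration: x_i and y_i denote the densitometric values of the two fingerprints at position i along the lane, n is the number of positions, and the bars denote lane means. Expressing r as a percentage to compare against the 90% criterion is likewise an assumption about how rERIC is reported.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```

Because each curve is centered on its own mean and scaled by its own spread, r is unchanged when one lane is uniformly brighter or darker than the other.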
Published ahead of print on 7 July 2008.
Supplemental material for this article may be found at http://aem.asm.org/.
N.C. and Y.-G.Z. contributed equally to the manuscript and are listed in alphabetical order.
Copyright © 2009 by the American Society for Microbiology.