Applied and Environmental Microbiology, June 2007, p. 3695-3704, Vol. 73, No. 11
0099-2240/07/$08.00+0 doi:10.1128/AEM.02735-06
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Division of Infectious Diseases and Geographic Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, California 94305,¹ Department of Civil and Environmental Engineering, Stanford University, Stanford, California 94305²
Received 22 November 2006/ Accepted 9 April 2007
The ability of V. cholerae to colonize multiple ecological niches is consistent with its extensive phenotypic and genotypic diversity. Over 200 serogroups of V. cholerae have been identified (12), and isolates have been found to be genetically and biochemically diverse (1, 3, 8, 11). Even when restricting the analysis to toxigenic isolates, significant interannual and interepidemic diversity has been documented (12, 61) for both gene content and allelic variation (8, 9, 21, 29, 42). The extent of variability is greater when considering nonepidemic pathogenic isolates and even more so with nonpathogenic environmental isolates (3, 11, 23).
The extensive genomic diversity described for environmental V. cholerae isolates using genomic markers and sequence analysis (13) is consistent with patterns described for the intraspecific variability of other Vibrio species and also of more divergent bacteria. By comparing hsp60 gene sequences with total genome sizes, Thompson et al. recently demonstrated that a coastal Vibrio splendidus population contains hundreds of genotypes per milliliter of seawater, with each genotype present at <1 ml⁻¹ (55). Detailed analysis of the fully sequenced genomes from two pathogenic Vibrio vulnificus isolates points to significant genomic plasticity, much of which is due to large genomic islands specific to one strain (44). A comparison of the whole genome sequences of eight isolates of Streptococcus agalactiae led to the division of the S. agalactiae pan-genome into universally conserved genes that comprise the core genome of this species and sporadically conserved dispensable genes. Computational modeling indicates that the set of dispensable genes, present in some but not all S. agalactiae isolates, is inordinately large (54).
Multilocus enzyme electrophoresis and DNA sequence analysis show that V. cholerae O1 and O139 toxigenic isolates from humans with cholera are phylogenetically tightly clustered with each other within V. cholerae populations. Some clones, both toxigenic and nontoxigenic, have been observed to persist in the environment for years (3, 11, 52). Nonetheless, results from both genome-wide and single-locus analyses point to extensive recombination between V. cholerae lineages (4, 5, 29, 30, 52). The emergence in 1992 of epidemic isolates of serogroup O139 exemplifies this phenomenon. While the mechanism of gene transfer is still unclear, current evidence suggests that the O1 progenitor of the O139 epidemic strains had acquired new O-antigen genes from an environmental V. cholerae strain of serogroup O22 and that these genes had been recombined into the O1-encoding locus (57). Thus, while a small number of V. cholerae clones have the combination of genes necessary to efficiently infect humans (3), these organisms can also acquire DNA encoding novel functions from nontoxigenic lineages of environmental V. cholerae strains (43).
Several mechanisms have been described by which new DNA from potentially very distant sources can be introduced into the genome of a V. cholerae lineage. The genes encoding the cholera enterotoxin reside on a functional prophage (58). Furthermore, Vibrio pathogenicity islands, VPI-1 and VPI-2, which encode other virulence determinants, bear hallmarks of recent acquisition by lateral gene transfer (20, 26). An integrating conjugative transposon, SXT, has been detected in clinical and environmental isolates (59). This transposon can mobilize not only itself but also plasmids and linked chromosomal DNA (17). Recently, a more generalized mechanism for genetic exchange between different V. cholerae strains has been described. V. cholerae O1 becomes competent for natural transformation when grown on chitin (36), a nutritive substrate which it colonizes in aquatic habitats (18). While the acquisition of sequences by natural competence, which requires homologous recombination, is expected to be restricted to donors that are closely related to V. cholerae, it is potentially capable of transforming any part of the genome between V. cholerae lineages. Transformation proceeds without the requirement for specialized sequence elements (e.g., repeats) or for genes encoding dedicated functions (e.g., integrases). Genetic exchange via transformation has been implicated in the extensive diversity observed for Neisseria spp. (32) and Helicobacter pylori populations (10). In addition to these mechanisms by which lineages can gain new sequences, gene loss through deletion (49) can play a role in generating diversity.
In this study, we used comparative genome hybridization (CGH) to identify genes that are conserved and genes that vary between 41 non-O1/O139 environmental V. cholerae strains isolated from eight ecologically diverse sites along the central California coast. We demonstrate the transfer of two clusters of variable genes encoding different metabolic functions into the genomes of California isolates induced to competence by growth on chitin. These results suggest that recombination through natural transformation can facilitate the exchange of variable genes between different V. cholerae lineages residing in the same aquatic habitat.
Comparative genome hybridization analysis of gene content.
Genomic DNA (gDNA) samples from isolates and the N16961 reference strain were prepared in parallel from overnight cultures grown in LB (2) using QIAGEN DNeasy columns according to the manufacturer's protocol. DNA was concentrated by ethanol precipitation, labeled with Cy3 or Cy5 by primer extension (6), and then mixed, purified, and concentrated using Microcon YM-10 centrifugal concentrators. PCR amplicon microarrays with spots for 3,357 of 3,891 annotated genes in V. cholerae N16961 were hybridized as described in reference 37. Only spots with a regression correlation of >0.6 were considered for analysis. For genes with multiple probes, log2(isolate/N16961) values were averaged after normalization.
A two-step protocol was designed to (i) identify genes which were highly conserved in all isolates and then (ii) normalize signal intensity based only on universally present genes. This strategy allows normalization using the majority of the probes on the array while avoiding biasing the normalization constant based on the number of genes missing from any given isolate. First, channel 2 (typically Cy3) intensity was adjusted so the log2 ratio of the net signal from features for genes encoding ribosomal proteins was equal to 0. The 2,222 genes for which the log2(isolate/N16961) was >−1 for all isolates were considered always conserved. Each array was then renormalized based on these 2,222 always-conserved genes, resulting in a biphasic distribution of the normalized log2 ratio of hybridization intensities with a large, narrow peak (conserved genes) centered near 0 and a small, wider peak (divergent genes) centered at −2 to −3.
For each isolate genome, genes were categorized as "positive," "negative," or "uncertain," corresponding to "present," "absent," and "uncertain," respectively, as described in reference 46. For each array, we calculated the median for the two sets of genes in which at least two chromosomally adjacent genes had log2(isolate/N16961) values of ≤−1.5 (negative) or >−1.5 (positive). Genes were then determined to be ("called") negative if the log2(isolate/N16961) value was less than or equal to the median of the negative set minus 2 standard deviations and positive if log2(isolate/N16961) was greater than or equal to the median of the positive set plus 2 standard deviations. The final determination ("call") for each gene within a genome was based on the consensus of calls from three arrays.
The hybridization data describing the conservation of genes within each isolate genome were aggregated to examine the pattern of conservation across the sampled California V. cholerae isolates. Considering all of the California isolates, genes were grouped into the categories "conserved" (no isolate called negative), "absent" (no isolate called positive), or "variable" (some isolates called negative and others positive). As a quality control measure, genes with calls in fewer than 70% of isolates were categorized as "uncalled." The 70% cutoff was introduced to prevent categorizing genes based on unreliable hybridization data. Some genes were found to have hybridization ratios between the cutoffs for being called "positive" and "negative" in a large proportion of isolates. These intermediate hybridization ratios could have arisen from either technical (e.g., nonspecific background hybridization or spot contamination) or biological (e.g., paralog hybridization) causes. These microarray data cannot distinguish between the two kinds of causes, so the 70% cutoff was imposed to eliminate these genes from the analysis.
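The aggregation rule described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `categorize_gene`, the string labels, and the choice to count only "positive" and "negative" as definitive calls are assumptions made for the example.

```python
from collections import Counter


def categorize_gene(calls, min_called_fraction=0.70):
    """Assign one gene to a gene set from its per-isolate calls.

    calls: list with one entry per isolate, each "positive",
    "negative", or "uncertain" (labels are hypothetical).
    """
    if not calls:
        raise ValueError("at least one isolate call is required")
    counts = Counter(calls)
    n_pos = counts["positive"]
    n_neg = counts["negative"]
    # Quality control: a gene with definitive calls in fewer than
    # 70% of isolates is left uncalled.
    if (n_pos + n_neg) < min_called_fraction * len(calls):
        return "uncalled"
    if n_neg == 0:
        return "conserved"  # no isolate called negative
    if n_pos == 0:
        return "absent"     # no isolate called positive
    return "variable"       # some isolates negative, others positive
```

Under this sketch, a gene called negative in a single isolate out of 41 is already "variable", which matches the low threshold the text describes.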
It should be noted that the categories used here considered a gene missing from just one isolate to be variable, while genes missing from as many as 5% of isolates were considered part of the "core" genome, as defined by Keymer et al. (27), for comparison to previous analyses. The lower threshold was used in this study to maximize detection of variable genes.
Clusters were defined as contiguous, absent, and/or variable genes separated by no more than one unprobed or uncalled gene for N16961. Uncalled or unprobed genes within or immediately upstream of clusters were included in the cluster. This approach requires the assumption of synteny between the N16961 and the California isolate genomes, an assumption supported by the PCR validation and transformation results presented below.
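As a rough sketch of this cluster definition, the following Python function walks genes in N16961 chromosome order and groups absent and variable genes, bridging gaps of at most one uncalled or unprobed gene. It is a simplified illustration under stated assumptions: the function name `find_clusters`, the category labels, and the gene identifiers are hypothetical, the chromosome is treated as linear, single-gene clusters are reported, and the rule that includes uncalled or unprobed genes immediately upstream of a cluster is omitted.

```python
def find_clusters(genes, max_gap=1):
    """Group absent/variable genes into clusters.

    genes: list of (gene_id, category) in chromosome order, where
    category is "conserved", "absent", "variable", "uncalled",
    or "unprobed" (labels are hypothetical).
    """
    hits = {"absent", "variable"}
    gaps = {"uncalled", "unprobed"}
    clusters = []
    current = []  # genes in the cluster being built
    pending = []  # gap genes seen since the last absent/variable gene
    for gene_id, category in genes:
        if category in hits:
            if current and len(pending) > max_gap:
                # Gap too long: close the old cluster, start a new one.
                clusters.append(current)
                current = []
            else:
                # Short gap (or none): gap genes join the cluster.
                current.extend(pending)
            current.append(gene_id)
            pending = []
        elif category in gaps and current:
            pending.append(gene_id)
        else:
            # A conserved gene (or a gap gene outside any cluster)
            # ends the open cluster; trailing gap genes are dropped.
            if current:
                clusters.append(current)
            current, pending = [], []
    if current:
        clusters.append(current)
    return clusters
```

Because the grouping relies on gene order in N16961, it inherits the synteny assumption noted above.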
Growth experiments.
Vibrio isolates were grown at 37°C in M9 (2) supplemented with minimum essential medium vitamins (Invitrogen), 0.001% Casamino acids, and the indicated carbon source at 0.2%. Plates were solidified with 2% Noble agar. Cultures were inoculated in triplicate at an optical density at 600 nm (OD600) of 0.01 from washed, overnight cultures grown in M9 with 0.5% sodium lactate. Plates were inoculated by frogging from washed, overnight cultures diluted to an OD600 of 0.1.
Transformation experiments.
Transformations were performed on crab shell as described previously (36). gDNA (2 µg) was added to washed biofilms grown on crab shell for 18 h, and then transformants were selected 18 to 20 h later. All strains grew equally well under transformation conditions. Transformants were selected on LB medium with antibiotic or M9 plus a carbon source. Transformation efficiency was calculated as transformant CFU/total CFU on LB or M9 plus N-acetylglucosamine. Newly acquired DNA was detected by PCR using three primers, two flanking the variable region and one within. The recombined DNA was mapped based on improved hybridization to an oligonucleotide probe microarray containing probes for all the annotated protein-encoding genes in strain N16961. gDNA samples from transformants and parental strain W6G were labeled as described above and hybridized as described in reference 36. Data from four arrays were averaged for each transformant. Recombined DNA was detected as improved hybridization at probes directed against genes absent from or containing sequence differences in isolate W6G. Recombination junctions were confirmed and mapped to higher resolution by sequencing PCR products.
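The transformation efficiency defined above is a simple ratio, sketched here in Python. The function name `transformation_efficiency` and the colony counts in the usage line are hypothetical; the sketch assumes both counts are expressed per unit of the same culture volume, after any dilution correction.

```python
def transformation_efficiency(transformant_cfu, total_cfu):
    """Return transformant CFU divided by total CFU.

    transformant_cfu: CFU on selective medium (antibiotic or a
    selective carbon source).
    total_cfu: CFU on nonselective medium (LB or M9 plus
    N-acetylglucosamine in the text).
    """
    if total_cfu <= 0:
        raise ValueError("total_cfu must be positive")
    if transformant_cfu < 0:
        raise ValueError("transformant_cfu cannot be negative")
    return transformant_cfu / total_cfu


# Illustrative counts only, not data from this study.
example = transformation_efficiency(50, 1_000_000)
```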
These individual hybridization results allowed us to categorize each ORF according to its representation within the collection of 41 California isolates (Table 1). A total of 2,727 genes (81% of 3,357 genes probed) gave positive hybridization results for all tested isolates. These compose the "conserved" gene set because they are universally present within all of the tested environmental isolates and in the N16961 V. cholerae O1 sequenced strain. We identified 133 genes (4.0%) which were present in the genome of the V. cholerae O1 sequenced strain N16961 but which were not detected in any of the California isolates. These compose the "absent" gene set. Finally, we identified 364 genes (11%) which were present in the V. cholerae sequenced strain and in some, but not all, of the California isolates. These were designated the "variable" gene set since their presence varies across the California isolate collection. In addition, 133 (4.0%) genes were not categorized because they gave a definitive hybridization result in less than 70% of the tested isolates. That these genes gave ambiguous hybridization results in a large portion of isolates suggested that their microarray probes were not specific enough to make confident calls. These genes were assigned to the "uncalled" gene set (Table 1).
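As a quick consistency check on the counts reported in this paragraph, the short Python snippet below confirms that the four gene sets sum to the 3,357 probed genes and recomputes each share. The variable names are illustrative; the counts are those given in the text.

```python
# Gene-set counts as reported in the text.
probed = 3357
gene_sets = {
    "conserved": 2727,
    "absent": 133,
    "variable": 364,
    "uncalled": 133,
}

# The four sets partition the probed genes.
assert sum(gene_sets.values()) == probed

# Share of probed genes in each set, in percent.
percentages = {name: 100 * n / probed for name, n in gene_sets.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```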
TABLE 1. Number (fraction) of probed N16961 genes conserved, absent, and variable in California Vibrio cholerae isolates
To gain a better understanding of the distribution of the different gene sets across the two V. cholerae chromosomes, we mapped the location of genes in the conserved, absent, and variable categories onto the fully sequenced genome of N16961 (Fig. 1). As expected given its previously reported high variability (8), chromosome 2 is more variable than chromosome 1 among California isolates (Table 1). In particular, multiple large clusters of absent and variable genes lie nearly adjacent to each other at the integron on chromosome 2 (VCA0296 to VCA0506), and two additional clusters are predicted to encode transposases (VCA0198 to VCA0200 and VCA0790 to VCA0795).
FIG. 1. Genes variably present in or absent from all California Vibrio cholerae isolates mapped onto N16961 chromosomes. Circle one shows each of the protein-coding genes (16) in reference strain N16961 probed in this study (blue); circle two shows genes not probed. Circle three shows genome landmarks. Absent (green) and variable (orange) genes are plotted in the fourth circle. Conserved and uncalled genes are not shown. Circle five shows genes identified by previous CGH studies as missing from at least one V. cholerae isolate (black) (8, 9). Circle six shows the percentage of G+C (gray) in a 5,000-bp window in 1,000-bp steps, determined using GACK software (28). Generated with GenoMap (47).
Some of the genes that are present in the sequenced strain N16961 but absent from California isolates are known to have been acquired on transmissible genetic elements which encode dedicated machinery for gene exchange. Notable among these are the genes encoding cholera toxin (VC1456 to VC1464), which reside within a prophage (58). To discover if this association is a common feature of genes which are present in N16961 but absent in California isolates, we determined if they are more likely than genes in the variable gene set to have chromosome positions near mobility genes. We measured the distance between each gene in the N16961 genome and the closest gene annotated to encode mobile and extrachromosomal element functions (Fig. 2). Assuming synteny, these values are also a good approximation for the distances in the California isolate genomes. Nearly all (94%) of the genes which are present in the N16961 sequenced strain, but absent in California isolates, fall within 40 kbp of a gene specifying a mobility function in N16961, a distance that corresponds to the size reported for several vibriophages (15). By contrast, only 52% of genes whose presence varies among California isolates and 16% of genes conserved in all isolates meet this criterion (Fig. 2). Furthermore, analysis of this association showed that differences in cluster size and distance from a coding region for a mobile genetic element parallel differences in the G+C content of genes in the absent, conserved, and variable gene sets. Genes absent from California isolates but present in the N16961 genome were found to have a median G+C content of 39.3% compared to the median G+C content of 49.0% for conserved genes. The lower G+C content of the former is largely due to the fact that they are usually associated with pathogenicity islands or phage and suggests that they were recently acquired by the epidemic lineages (16, 46). 
Genes that vary between California isolates have an intermediate G+C content of 45.4%. The same pattern is observed when using more nuanced methods which use G+C content as only one of several factors to estimate the probability of any gene having been introduced into the V. cholerae genome by horizontal transfer (14, 56). When comparing our CGH results to horizontal gene transfer (HGT) predictions (56), genes categorized as absent are most likely to be predicted to have originated by HGT (79%), and conserved genes are least likely (3.2%). Variable genes fall in between (34%). This reflects the fact that while most variable genes are unassociated with known mobile elements and pathogenicity islands, a portion are found in or near known islands.
FIG. 2. Distance to mobility genes. The distance from the midpoint of each gene probed by CGH to the midpoint of the closest N16961 gene annotated as "mobile and extrachromosomal element functions" (16) is plotted in a cumulative histogram. Absent (green), variable (orange), and conserved (black) genes are plotted by chromosome. Because the chromosomes vary in size, the maximum possible distances are 1,481 kbp for chromosome 1 (Ch. 1) and 536 kbp for Ch. 2. This, along with uneven distribution of mobility genes, accounts for much of the differences between the shapes of the curves for the conserved genes.
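The distance measure behind Fig. 2 can be sketched as follows. The maximum possible distances quoted in the legend are about half of each chromosome's length, which is what one would expect if distance is measured along the shorter arc of a circular chromosome; the sketch below makes that assumption explicit. The function names and the toy coordinates in the usage lines are hypothetical, and the chromosome length used is simply twice the 1,481-kbp maximum given in the legend.

```python
def circular_distance(pos_a, pos_b, chromosome_length):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(pos_a - pos_b) % chromosome_length
    return min(d, chromosome_length - d)


def distance_to_nearest_mobility_gene(gene_midpoint, mobility_midpoints,
                                      chromosome_length):
    """Distance from a gene midpoint to the closest mobility-gene midpoint."""
    return min(
        circular_distance(gene_midpoint, m, chromosome_length)
        for m in mobility_midpoints
    )


def fraction_within(distances, cutoff):
    """Fraction of genes whose distance is at or below the cutoff."""
    return sum(d <= cutoff for d in distances) / len(distances)


# Toy usage with made-up midpoints (bp) on a chromosome whose length
# is twice the 1,481-kbp maximum distance quoted for chromosome 1.
ch1_length = 2 * 1_481_000
mobility = [100_000, 2_900_000]
d = distance_to_nearest_mobility_gene(20_000, mobility, ch1_length)
```

Applying `fraction_within` with a 40,000-bp cutoff to the per-gene distances of each gene set would give the kind of comparison reported in the text (94%, 52%, and 16%).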
Variable regions encoding metabolic functions are mobile via natural competence and transformation.
In the preceding section, CGH was used to study 41 California environmental isolates to identify genes in the conserved, absent, and variable gene sets. Reports of frequent interlineage recombination among V. cholerae (3, 5, 29) led us to ask if variable genes outside of known mobile elements can be transferred from one genome to another via natural competence. We focused on two gene clusters (VC1280 to VC1286 and VC1820 to VC1827 in the genome of strain N16961) that encode different metabolic functions and vary between California isolates. These clusters are of interest because their presence or absence in California isolates was found to be significantly associated with water temperature (27), one of the physical parameters measured at the time of sampling. Strains simultaneously positive for gene cluster VC1820 to VC1827 and negative for VC1280 to VC1286 (e.g., strain W6G, shown in Fig. 3A) were associated with cold water. By contrast, strains positive for VC1280 to VC1286 and negative for VC1820 to VC1827 were more often isolated from warmer water (e.g., strain Sa5Y).
FIG. 3. Transformation of variable loci. (A) Positive (black), negative (green), or uncertain (gray) genes at three loci on chromosome 1 for each isolate genome. Strains are grouped by overall genomic similarity based on a report by Keymer et al. (27). (B) Growth in M9 with 0.001% amino acids and 0.2% mannose (filled symbols) or (GlcN)2 (open symbols). W6G (blue) lacks genes VC1280 to VC1286. Sa5Y (green) lacks both VC0269 to VC0270 and VC1820 to VC1827. N16961 (orange) encodes all three loci. (C) Isolates Sa5Y (green) and W6G (blue) were grown on crab shell and transformed with gDNA from strain VCXB21. Transformants were selected on LB medium containing the indicated antibiotic or M9 with (GlcN)2 (W6G) or mannose (Sa5Y). (D) Metabolic phenotypes of transformants were confirmed by growth on M9 containing the indicated carbon source. D, gDNA donor strain; R, recipient isolate; T, nine independent transformants. (E and F) The selected locus transformed from the donor strain into the recipient isolate was detected by PCR. M, molecular weight marker (kbp). (E) Probes VC0269 to VC0270 transformed into Sa5Y using primers in VC0268, an intervening gene present in Sa5Y but not in VCXB21, and VC0271. (F) Probes VC1280 to VC1286 transformed into W6G using primers in VC1279, VC1280, and VC1287.
The cluster VC1820 to VC1827 is annotated as encoding, among other functions, mannose-6-phosphate isomerase (VC1827), which catalyzes the interconversion of fructose-6-phosphate and mannose-6-phosphate. In addition to the VC1827 locus, the N16961 genome contains a second gene (VC0269) which also is predicted to encode a mannose-6-phosphate isomerase. However, CGH results indicate that VC0269 is missing from W6G and Sa5Y and from 37 other California isolates (Fig. 3A). The absence of both mannose-6-phosphate isomerase genes VC1827 and VC0269 from Sa5Y likely explains its failure to grow in a medium containing mannose as the sole carbon source (Fig. 3B). By contrast, strain W6G, which retains the VC1820-to-VC1827 gene cluster, grows on a mannose-containing medium. The N16961 sequenced strain, which has both of the mannose-6-phosphate isomerase genes, also grows on a mannose-containing medium.
In a prior report from our laboratory, Meibom et al. used laboratory strains of V. cholerae O1 El Tor and antibiotic resistance genes to establish the phenomenon of chitin-induced natural transformation as a mechanism of genetic exchange in this species (36). However, that report also showed that not all V. cholerae strains can be transformed in this manner and that, in some strains, this defect is due to a mutation in the quorum-sensing regulator hapR (36). To determine if W6G and Sa5Y can acquire genes by chitin-induced natural transformation, these California isolates were individually grown as biofilms on a crab shell fragment in artificial seawater. Then, we added genomic DNA from strain VCXB21 (31), a kanamycin- and streptomycin-resistant derivative of the N16961 sequenced strain. The resulting antibiotic-resistant transformants of W6G and Sa5Y were selected on media containing either kanamycin or streptomycin, and the transformation efficiencies were determined. W6G and Sa5Y were able to acquire the kanamycin and streptomycin resistance loci encoded in gDNA from VCXB21 (Fig. 3C). Transformation efficiencies of W6G and Sa5Y for the acquisition of kanamycin resistance, conferred by neoR integrated upstream of lacZ, were comparable to those reported for laboratory strains. Streptomycin resistance, conferred by a spontaneous mutation, was transformed into the two California environmental isolates with an efficiency 5- to 10-fold higher than that of kanamycin resistance.
These results showed that W6G and Sa5Y could acquire genes by chitin-induced natural transformation. These strains then were tested to determine if they can use chitin-induced natural transformation to acquire the VC1280-to-VC1286 and VC1820-to-VC1827 gene clusters and corresponding metabolic functions. For this purpose, we exploited the carbohydrate-specific nutritional phenotypes of W6G or Sa5Y (Fig. 3C) to select transformants of these strains which could grow on (GlcN)2 or mannose, respectively. As for the antibiotic resistance transformation experiments described above, each strain was propagated as a biofilm on a crab shell fragment. Then, DNA from VCXB21 was added, and transformants were selected by using a medium that contains (GlcN)2 or mannose as the carbon source. After selection, transformants of W6G or Sa5Y were tested individually and confirmed to have stably acquired the new metabolic trait by growth on medium containing (GlcN)2 or mannose, respectively (Fig. 3D).
PCR assays were used to determine if acquisition of the new metabolic phenotypes by the W6G and Sa5Y transformants was correlated with acquisition of sequences for the corresponding genes. Each of the nine tested mannose-metabolizing transformants of Sa5Y was found to have acquired the VC0269-to-VC0270 locus, which encodes a mannose-6-phosphate isomerase, from the VCXB21 donor (Fig. 3E). These results also showed that the acquired VC0269-to-VC0270 locus from VCXB21 apparently replaced the intervening segment found between ORFs VC0268 and VC0271 in the Sa5Y recipient strain. Independent mapping showed that Sa5Y contains 9 kbp of DNA in place of VC0269 to VC0270 (data not shown). This region does not amplify under the conditions used in Fig. 3E. However, none of the tested mannose-utilizing transformants acquired the VC1820-to-VC1827 locus from the VCXB21 donor (data not shown), even though VC1827 is also predicted to encode a mannose-6-phosphate isomerase.
We also detected newly acquired DNA from the VCXB21 donor in the (GlcN)2-metabolizing transformants of W6G (Fig. 3F). A PCR assay that can detect both the parental W6G locus and the donor VCXB21 locus was used to test if nine independent (GlcN)2-utilizing transformants had acquired the VC1280-to-VC1286 gene cluster (Fig. 4). Each transformant generated a PCR product indicative of only the VCXB21 locus. These results show that the VC1280-to-VC1286 gene cluster of VCXB21 had been recombined into the W6G chromosome, between VC1279 and VC1287.
FIG. 4. Recombination of large chromosome fragments at VC1260 to VC1300 by naturally competent V. cholerae. (A) Strategy for detecting transformed DNA by microarray hybridization. (B) Oligonucleotide microarray CGH comparison of W6G transformants selected for growth on (GlcN)2 to the original W6G isolate. Lane 1 is colored as in Fig. 3A and reflects processed calls, not raw hybridization data. Lane 2 is colored to reflect W6G (red) hybridization relative to N16961 (green). Lanes 3 to 11 reflect transformant hybridization (red) relative to that of environmental isolate W6G (green) as depicted in panel A. (C) Dark green ORFs (VC1280 to VC1286) are missing from W6G and were acquired by the transformants. Light green ORFs are present in W6G but can be distinguished from homologous genes in VCXB21 by virtue of reduced microarray hybridization and/or sequenced single-nucleotide polymorphisms. For each of nine transformants, yellow reflects DNA donated from VCXB21, and black reflects DNA of W6G origin. Recombination junctions were resolved to the gene. The dotted line reflects DNA of ambiguous origin due to a lack of sequenced single-nucleotide polymorphisms. The arrowheads above ORF map reflect primers used in Fig. 3F.
Transformants acquire large DNA fragments from the donor.
We performed CGH assays using oligonucleotide microarrays with probes for each of the annotated ORFs in V. cholerae N16961 to map the recombination events which generated the (GlcN)2-metabolizing W6G transformants (Fig. 4A). Sequencing demonstrated that the W6G genome has polymorphisms relative to N16961 at each of the ORFs (Fig. 4C). By combining the hybridization results with sequencing at sites flanking the apparent recombination site, we mapped the junction of transformed DNA with the parental chromosome to within a few kilobase pairs (Fig. 4C; see Table S3 in the supplemental material). Remarkably, we observed the recombination of quite large DNA fragments. The mean size of the recombined fragment was 22.7 kbp, ranging from 7.9 to 44.9 kbp. This indicates that while recombination at sites immediately flanking the insertion/deletion at VC1280 to VC1286 is possible, it is more likely to happen further from the selected locus. Fragments of DNA substantially larger than the 5.6-kbp region that differs between W6G and VCXB21 recombine freely. Additionally, even at this relatively low resolution, it is clear that eight of the nine transformants underwent different recombination events, since only two transformants shared the same pair of junctions.
Previous studies using CGH to explore the genomes of V. cholerae isolates (8, 9) focused on a limited number of strains with the ability to cause disease in humans or animal models. Here, we have examined a larger number of environmental isolates from a geographically limited, but ecologically diverse, region. This set of isolates is optimal for characterizing the genomic and functional diversity of V. cholerae attributable to an association with different aquatic niches and diverse enough to allow identification of variably present genes with readily selectable phenotypes. Because many of these encode metabolic and sensing functions, their presence in a particular strain might determine its capacity to occupy a specific niche within an aquatic habitat. This is consistent with studies showing that geographic area, habitat, and ecological parameters, such as salinity and water temperature, influence V. cholerae population structure and dynamics (22, 23, 61). Keymer et al. (27) correlate the presence of several of the genes that vary between California isolates with environmental parameters measured during isolation, suggesting that genome content specializes isolates for different aquatic niches. An alternative, nonexclusive hypothesis is that the correlation with environmental parameters reflects common ancestry of isolates collected from waters with similar environmental parameters.
The microarray employed in this study could only detect genes present in sequenced strain N16961. Consequently, we could not monitor genes which are absent from N16961, including any which might be limited to aquatic or California V. cholerae. Two observations suggest that this set of genes is likely to be large. The draft genome of RC385, a V. cholerae strain which was repeatedly isolated from the Chesapeake Bay (TIGR Vibrio Genome Project, http://msc.tigr.org/vibrio/), encodes over 300 proteins (9.3% of the 3,221 predicted ORFs) with no BLASTp hit in the N16961 predicted proteome with an E value of less than 1e-10 (data not shown). In addition, results from suppressive subtractive hybridization performed with two V. cholerae isolates from southern California coastal sites were used to estimate that 3 to 20% of the sequence in these genomes lacks homology to N16961 (43). In both of these cases, the set of genes present in aquatic V. cholerae but absent from N16961 includes many genes that are annotated as hypothetical; others are predicted to have functions in cell surface modification, metabolism, and DNA mobility. These same types of functions are overrepresented in the variable gene set of our California isolates.
Despite this limitation, our analysis identified at least two classes of genes which are missing from or vary between the V. cholerae strains used in this study. The first class includes genes mobilized by dedicated mechanisms of horizontal gene transfer, such as phage and integrases. These include the well-described cholera toxin phage as well as an island containing VSP-II (8), which also encodes phage-like proteins. These are completely missing from the California isolates described here, and thus, are likely to have been recently acquired by epidemic V. cholerae lineages from non-V. cholerae sources. The second class of genes shows variability within the California isolates; most of these are not associated with known mobility elements or other clear markers of horizontal gene transfer. The interisolate variability observed for genes outside of the known mobile elements is likely to have arisen by more than one mechanism. Rapid gene loss through deletion plays a major role in bacterial genome evolution (38), and so a substantial portion of variability may be due to gene loss. In some cases, the variable genes may be on horizontally acquired mobile elements without clear signatures of horizontal gene transfer.
In addition to transduction (24) and conjugation (17), we expect natural transformation, which ordinarily incorporates acquired sequences by homologous recombination, to play an important role in the movement of genes in the variable gene set between different V. cholerae genomes. This prediction is supported by our observation that chitin-induced natural transformation allows California isolates long separated from clinical strains to take up and incorporate DNA from VCXB21, which was derived from an O1 El Tor Bangladeshi isolate. This result also shows that there is no significant physiological or sequence barrier to recombination between distant V. cholerae lineages. Taken together, the CGH data and results from the transformation experiments suggest the great diversity of genome content in these California isolates could have been generated by intraspecific recombination. This in turn may have been the consequence of natural transformation.
Perhaps the most notable of our findings is the capacity of V. cholerae isolates to take up and recombine chromosomal fragments over 40 kbp in length. The average recombined fragment length of 22.7 kbp in the W6G transformants is substantially longer than averages reported for Bacillus subtilis (7). Fragments of this size could encode complex metabolic and biosynthetic pathways. This is exemplified by the capacity of W6G, a California coastal isolate, to acquire DNA sufficient to encode components of the (GlcN)2 utilization pathway both from Sa5Y, another California isolate, and from an O1 El Tor strain. Since (GlcN)2-containing chitin polymers occur in natural chitin sources (39), these W6G transformants might be more fit than the parent strain to occupy aquatic habitats where chitin is available as a nutrient.
Neisseria spp. are naturally competent and contain variable genome segments that appear to have moved among Neisseria strains via transformation (48). These minimal mobile elements (MMEs) are short segments of one to five genes and are defined by their presence at homologous locations in multiple Neisseria species, by a mosaic structure, and by the lack of an obvious mechanism for horizontal transfer. Many of the V. cholerae genes in the variable set defined in this study fall in clusters that resemble MMEs; the best characterized example is the locus between VC0268 and VC0271. In strain N16961, this locus contains VC0269 to VC0270, two genes which were readily transferred to environmental isolate Sa5Y. By contrast, in Sa5Y, this region contains 9 kbp of DNA apparently not present in N16961. It remains to be seen whether genes in the V. cholerae variable gene set originated from or can be mobilized to other Vibrio species.
DNA taken up during chitin-dependent natural competence has several potential fates. Here we have demonstrated homologous recombination into the chromosome. If DNA taken up during competence includes the appropriate sequence elements, then the integron integrase (35) and transposases (59) encoded in the V. cholerae genome would be expected to use it as a substrate for site-specific recombination into the genome. No doubt, these recombinational processes would compete with nucleases that degrade DNA to nucleotides that can be used for DNA replication or further metabolism (51).
The approach used here has several important limitations. As noted above, the microarray we used for CGH contained probes only for the N16961 genome. Consequently, we could not monitor the distribution of genes absent from this clinical isolate. Since these genes would be expected to play a significant role in the adaptation of these isolates to their niches, their distribution among California isolates (and the functions that they encode) is of great interest. Likewise, probes for genes in the large, highly variable integron are underrepresented on the microarray, and so this particularly interesting part of the genome is poorly sampled. Due to these biases, the isolate collection may be even more diverse than is apparent. By probing more of the pan-genome, we would expect the set of conserved genes to remain largely unchanged (27) but the set of variable genes, which are to date largely unidentified, to increase dramatically. Finally, we focused our transformation analysis on large, multigene segments of DNA to establish the potential of this mechanism to contribute to the variability observed by CGH. However, the frequency of these events in the wild is unknown and might be substantially lower than that of the recombination of subgenic segments or of sequences located at sites without large insertions/deletions.
Funding was provided by the Ellison Foundation (to G.K.S.), the NIH (to G.K.S.), the Giannini Family Foundation (to M.C.M.), and the Woods Institute for the Environment at Stanford (to D.P.K., A.B.B., and G.K.S.).
Published ahead of print on 20 April 2007.
Supplemental material for this article may be found at http://aem.asm.org/.