Applied and Environmental Microbiology, October 2003, p. 6099-6105, Vol. 69, No. 10
0099-2240/03/$08.00+0 DOI: 10.1128/AEM.69.10.6099-6105.2003
Foodborne and Diarrheal Diseases Branch, Division of Bacterial and Mycotic Diseases, National Center for Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia 30333
Received 25 November 2002/ Accepted 11 July 2003
The 2,523 different serotypes currently described in the Kauffmann-White scheme (29) are derived from the considerable number of permutations of 46 O serogroups, 11 additional O antigens, and 119 H antigens (3). The O antigen is the outermost component of lipopolysaccharide (LPS), an immunogenic glycolipid that is a major component of the outer membrane in gram-negative bacteria (31, 35). Salmonella O antigens, which are composed of multiple repeats of an oligosaccharide unit (O unit), are highly diverse and contribute much of the antigenic diversity of the cell surface that is used to serotype Salmonella isolates. Variation in O antigen structure arises from differences in sugar composition, the arrangement of the sugars in the O unit, the specific linkages between the O units, and the addition of branch sugars and modifying side groups.
Much of the Salmonella O antigen variation is a consequence of the extensive genetic diversity within rfb (O antigen) gene clusters, which encode many of the enzymes involved in O antigen biosynthesis and assembly. Typically, three different types of genes are seen within rfb gene clusters: (i) genes that encode enzymes involved in the synthesis of the sugars that form the O subunit; (ii) genes that encode transferases, which assemble sugar substituents into the O subunit; and (iii) genes that encode proteins involved in processing and assembly steps to build the O antigen from the O subunit, such as the O antigen transporter (wzx) and O antigen polymerase (wzy).
Wzx proteins are involved in the transport of completed O antigen subunits across the cytoplasmic membrane (21). The mechanism of O antigen export in Salmonella enterica has been shown to be Wzx dependent, and all of the Salmonella rfb gene clusters sequenced to date have a wzx gene. The O antigen polymerase, which is encoded by the wzy gene (formerly rfc), polymerizes the O units to form the O antigen. While most S. enterica rfb gene clusters contain a wzy homolog, serogroups B and D have a wzy gene elsewhere in the genome.
The rfb gene clusters from 10 Salmonella serogroups have been studied: A (19), B (14, 45), C1 (18), C2 (4), D1 (19), D2 (47), D3 (6), E1 (42), O:54 (16), and O:35 (44). The rfb regions from serogroups A, B, C2, D, and E, which all have a trisaccharide O subunit containing mannose, rhamnose, and galactose, are related. The rfb gene cluster from serogroup C1, whose O subunit is composed of four mannose residues, one N-acetylglucosamine residue, and a glucose side branch, shows little homology to them (18). The S. enterica O:35 rfb gene cluster encodes the same O antigen as Escherichia coli O111, and the two gene clusters have been shown to be closely related (44). Synthesis of the O:54 antigen is mediated by a plasmid-encoded gene cluster (16). More recently, the genetic variations in the dTDP-L-rhamnose (rml) (30) and GDP-mannose (man) pathway genes of an additional 11 and 13 serogroups that contain rhamnose and mannose in their respective O antigen structures have been described (13).
Little is known about the rfb gene clusters from other Salmonella O serogroups. As part of a larger project to develop a DNA-based approach for serotyping Salmonella (7), we have sequenced the rfb gene cluster from two serogroup O:6,14 isolates, Salmonella enterica serotype Sundsvall (I 6,14,25:z:e,n,x) and S. enterica serotype Carrau (I 6,14,24:y:1,7), with the aim of identifying serogroup-specific DNA targets. We found that the serogroup O:6,14 rfb gene cluster is most related to serogroup C1. Comparison of the S. enterica serotypes Sundsvall and Carrau rfb sequences, which differ in the expression of O factor 24 or 25, revealed essentially no difference in the rfb region between the two serogroup O:6,14 isolates. We targeted the wzy gene in a PCR detection assay and found the sequence to be specific for serogroup O:6,14.
DNA manipulation.
General DNA procedures and bacterial transformations were performed as previously described (34). The rfb region was amplified by PCR (Expand Long PCR kit; Roche, Indianapolis, Ind.) with oligonucleotide primers corresponding to the 5' end of gnd and the middle of the JUMPstart sequence as previously described (43). The 8.4-kb amplicon was doubly digested with EcoRI and ClaI (New England Biolabs), and the restricted fragments were cloned into DH5α (Gibco BRL). Three clones of different sizes were identified.
DNA sequencing.
The ends of each of the fragments were sequenced with universal primers, and the sequences were extended via primer walking. Sequencing was performed on a PE Applied Biosystems 377 automated DNA sequencer with BigDye terminator cycle sequencing ready reaction mix according to the manufacturer's instructions (PE Applied Biosystems, Foster City, Calif.). DNA sequence data were processed by using Lasergene 99 (DNASTAR, Madison, Wis.) software and assembled into a contiguous sequence. The TMpred program (12) was used to predict transmembrane domains. This program uses an algorithm based on the statistical analysis of a database of naturally occurring transmembrane proteins to predict membrane-spanning regions and protein orientation. The National Center for Biotechnology Information (NCBI) BLAST network server was used to search sequence databases for sequences with homology to the open reading frames (ORFs) found in our sequence.
Serogroup O:6,14 PCR assay.
Bacterial genomic DNA extracted with the Puregene DNA isolation kit (Gentra Systems, Minneapolis, Minn.) was resuspended in sterile distilled water at a concentration of 50 ng/µl. A total of 46 pools were made, with 1 to 12 samples of DNA per pool. A forward primer (5'-GTCTCCGCTAAGCTATTTCGGTTTGTA-3') and a reverse complementary sequence primer (5'-TACCGCAATAATTCAATCACAAGGG-3') were derived from sequence within wzy to generate a 501-bp PCR amplicon. PCR amplification was performed in 25-µl volumes with Ready-to-Go PCR beads (Amersham Biosciences, Piscataway, N.J.) in a thermal cycler (MJ Research, Waltham, Mass.) with the following cycle parameters: initial denaturation at 96°C for 2 min; 25 cycles of 94°C for 30 s (denaturation), 58°C for 30 s (annealing), and 72°C for 45 s (extension); and a final extension at 72°C for 10 min. Amplification products were analyzed with a bufferless E-Gel 96 high-throughput agarose electrophoresis system (Invitrogen, Carlsbad, Calif.) and a UV transilluminator (GelDoc1000; Bio-Rad). If a positive PCR result was seen for a particular pool, all DNA samples from the pool were individually retested by PCR, and amplification products were then confirmed by conventional agarose gel electrophoresis with 1% (wt/vol) agarose gels.
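The primer geometry described above can be checked with a small in silico PCR sketch. This is an illustration only, not part of the published method: it assumes exact primer matches, and the template is a hypothetical toy sequence (arbitrary flanks and filler), not the real wzy sequence. The function names are invented for this example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_amplicon(template: str, forward: str, reverse: str):
    """Return the predicted amplicon for an exact-match primer pair, or None.

    The forward primer must occur on the given strand, and the reverse
    complement of the reverse primer must occur downstream of it.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Primer sequences as given in the text (5' to 3').
FORWARD = "GTCTCCGCTAAGCTATTTCGGTTTGTA"
REVERSE = "TACCGCAATAATTCAATCACAAGGG"

# Hypothetical toy template: the filler is sized so that the product is 501 bp.
filler = "A" * (501 - len(FORWARD) - len(REVERSE))
toy_template = "CCCC" + FORWARD + filler + reverse_complement(REVERSE) + "GGGG"

amplicon = in_silico_amplicon(toy_template, FORWARD, REVERSE)
```

Run against a real template, the same function would return `None` whenever either primer site is absent, which corresponds to the expected outcome for serogroups that lack this wzy sequence.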
Nucleotide sequence accession number.
The sequences of the rfb O antigen gene cluster determined in this study have been deposited in the GenBank database under accession no. AY334017.
FIG. 1. O antigen gene cluster of S. enterica serogroup O:6,14. Putative ORFs are represented by arrows, with the corresponding assignment of gene name. The percentage G+C ratios were calculated and plotted for each 100 bases.
TABLE 1. Characteristics of each ORF, including gene identity and length, G+C content of each gene, and P values for the corrected average G+C contents for codon positions 1, 2, and 3
TABLE 2. Features of the initiation regions of the ORFs of S. enterica serotype Sundsvall strain #185a
TABLE 3. Comparison of the size of the intergenic region between the last gene in the rfb gene cluster and the start of gnd for a number of Salmonella serogroups
TABLE 4. Comparison of GDP-mannose pyrophosphorylases and phosphomannomutases from serogroups B, C1, C2, and O:6,14
TABLE 5. Amino acid sequence homologies between Salmonella mannosyl transferases
FIG. 2. PCR amplification products with serogroup O:6,14-specific primers using template DNA from different S. enterica serogroups. Lanes 1 and 8 contain a 100-bp ladder (Invitrogen, Carlsbad, Calif.). Lanes 2, 5, and 6 represent serogroup H strains; lane 3 represents a serogroup B strain, lane 4 represents a serogroup C2 strain, and lane 7 represents the negative control.
Mannose biosynthetic genes.
manB, encoding phosphomannomutase, and manC, encoding mannose-1-phosphate guanylyltransferase, are responsible for the biosynthesis of GDP-mannose from mannose-6-phosphate (13). Mannose is present in the O antigen side chain of serogroup O:6,14 (22), and the rfb gene cluster of this serogroup contained two genes with sufficient similarity to identify them as manB and manC. At the amino acid level, serogroup O:6,14 manC (ORF 2.33) had 58% identity to manC from serogroup C1 and 57% identity to manC from serogroup B. Serogroup O:6,14 manB (ORF 3.72) had 58 and 57% amino acid identity to the manB genes from serogroups C2 and B, respectively, with much lower homology to manB from serogroup C1 (14.9%). Both genes from the S. enterica serotype Carrau strain had 100% identity to a previously published sequence of manC and manB from S. enterica serotype Carrau (13). Two genes from the cps gene cluster, cpsB and cpsG, are considered isogenes of manC and manB (37). The serogroup O:6,14 manC and manB genes share a similar level of identity with the respective isogenes of serogroup B, as previously reported for LT2 (37).
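For readers unfamiliar with how identity percentages such as those above are derived, the following is a minimal sketch of a percent-identity calculation over two sequences that have already been aligned. It is not the procedure used in this study; the convention of skipping gapped columns is an assumption (other conventions exist), and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: columns in which either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real ManB or ManC sequences).
toy_a = "MKLV-AGT"
toy_b = "MRLVQAGS"
identity = percent_identity(toy_a, toy_b)  # 5 matches over 7 compared columns
```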
Mannosyl transferase genes.
ORF 0.13 of serogroup O:6,14 shared 47% identity with wbaC (ORF 6.17) from S. enterica serotype Choleraesuis (serogroup C1). Both ORFs have a low G+C content: 29% for serogroup O:6,14 and 30% for serogroup C1. ORF 1.12 shared 49% identity with wbaD (ORF 7.17) in serogroup C1 (S. enterica serotype Choleraesuis), and again both have a low G+C content: 33% for serogroup O:6,14 and 31% for serogroup C1. The wbaC and wbaD genes from the rfb gene cluster of serogroup C1 are thought to encode the mannosyl transferases that assemble the serogroup C1 O antigen subunit (18). Until now, however, the sequences of these genes had shown little similarity to the rfb regions of other Salmonella serogroups. Although mannosyl transferases have been identified in rfb regions from serogroups A, B, D, C2, and E1 (19, 20), they share little homology with wbaC and wbaD (Table 5), suggesting that these transferases perform different specific functions in the different serogroups.
O antigen transport and polymerase genes.
Neither ORF 5.33 nor ORF 6.67 had significant homology to any sequence in GenBank. Wzx and Wzy proteins from different O antigen gene clusters have little sequence homology even at the amino acid level; however, their predicted structural homology (multiple transmembrane domains) has been used to putatively identify them. Analysis with the TMpred program (12) showed that ORFs 5.33 and 6.67 contained 13 and 12 transmembrane helices, respectively (Fig. 3A and B). As is characteristic of flippases and O antigen polymerases, both of these predicted proteins are hydrophobic.
FIG. 3. Suggested transmembrane topology for Wzy (A) and Wzx (B).
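The idea of flagging candidate membrane-spanning segments from sequence alone can be illustrated with a sliding-window hydropathy profile. The sketch below is not the TMpred algorithm used in this study; it substitutes the simpler Kyte-Doolittle hydropathy scale, and the 19-residue window and 1.6 threshold are a commonly quoted heuristic adopted here as assumptions. The toy protein is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Mean hydropathy of every full-length window along the protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(protein: str, window: int = 19,
                         threshold: float = 1.6) -> list:
    """Start indices (0-based) of windows whose mean hydropathy meets the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i for i, value in enumerate(profile) if value >= threshold]


# Hypothetical toy protein: a 20-residue leucine stretch between charged ends.
toy_protein = "DDDD" + "L" * 20 + "KKKK"
toy_profile = hydropathy_profile(toy_protein)
toy_hits = candidate_tm_windows(toy_protein)
```

A profile of this kind only suggests hydrophobic stretches; it does not predict helix orientation or topology as a dedicated predictor does.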
O antigen polymerases share a characteristic secondary structure of multiple transmembrane regions and a large cytoplasmic loop. Serogroup H wzy encoded a protein predicted to have a similar secondary structure. However, there was no sequence homology between this ORF and any of the other known O antigen polymerases. When the predicted protein of ORF 5.33 was then used to search databases with PROPSEARCH (11), a program designed to find putative protein families for protein sequences with little sequence identity, 87% reliability was found, linking ORF 5.33 to the O antigen polymerases of S. enterica serogroup C2, E. coli K-12, and Shigella flexneri. Other features common to Wzy-like polymerases include similar amino acid compositions and codon usage (25). The predicted protein from ORF 5.33 had a high content of leucine, isoleucine, and phenylalanine and contained a high percentage of rare or modulating codons, 12.9% (10), as seen in other wzy genes (25). Thus, the putative serogroup O:6,14 wzy gene (ORF 5.33) has many features in common with other wzy genes, but definitive identification requires further experimental validation.
Other features of the rfb region.
The intergenic region between the last rfb gene and gnd was much larger in serogroup O:6,14 than in the other Salmonella serogroups (Table 3). A 103-base sequence starting 3 bp downstream from the end of the wzx gene shared 89 and 82% homology to the IS3-like insertion element IS1230B of E. coli (32) and S. enterica serotype Enteritidis (5), suggesting that it is the remnant of an IS3-like element. Sequences with homology to IS3 have been previously reported in Salmonella, located near the invH gene of S. enterica serotype Choleraesuis (9) and the sef operon of S. enterica serotype Enteritidis (5). An IS5 element located about 500 bp upstream from the gnd promoter in E. coli K-12 has been described (15) and was thought to play a role in gene expression. Analysis of the gnd promoter region revealed that the -10 region of gnd (AGGAG) is identical to the corresponding region of gnd in S. enterica serotype Typhimurium and E. coli K-12, and the 30 bp upstream of gnd share between 96 and 100% homology with the corresponding region in S. enterica serotypes Typhimurium and Choleraesuis and E. coli K-12 and O111.
G+C content.
Although bacterial species display large variation in their overall G+C content, the genes within a particular bacterial species' genome are usually relatively similar in base composition (27). However, the S. enterica rfb gene clusters that have been sequenced to date all contain segments of differing G+C content; these segments are thought to have evolved in organisms of largely low G+C content and to have been independently acquired and incorporated into the same Salmonella locus (31). Consistent with this, Shepard et al. (36) reported interspecies transfer of an entire O antigen gene cluster via a plasmid. All of the genes within the serogroup O:6,14 gene cluster had a low G+C content, ranging from 28.9 to 37.4% (Table 1), which is significantly lower than the 51% G+C content seen in the majority of S. enterica coding sequences, suggesting that all of the genes in the serogroup O:6,14 rfb gene cluster were also acquired by transfer from a different species.
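A G+C profile of the kind plotted in Fig. 1 (one value per 100 bases) can be sketched as follows. This is an illustrative calculation only; the use of non-overlapping windows and the discarding of a trailing partial window are assumptions, and the toy sequence is hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(1 for base in seq if base in ("G", "C"))
    return 100.0 * gc_count / len(seq)


def windowed_gc(seq: str, window: int = 100) -> list:
    """G+C percentage for consecutive, non-overlapping windows.

    Assumption: a trailing partial window is discarded.
    """
    if window < 1:
        raise ValueError("window must be positive")
    return [
        gc_percent(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


# Hypothetical toy sequence: an A+T-only block, a G+C-only block, a mixed block.
toy_sequence = "AT" * 50 + "GC" * 50 + "ATGC" * 25
toy_windows = windowed_gc(toy_sequence)
```

Applied to a whole gene cluster, windows falling well below the genome-wide average would mark the low G+C segments discussed above.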
Within a bacterial species, codon positions 1, 2, and 3 have a characteristic base composition, with differences due to biases in the mutation rates for the 4 bases (26, 39). Consequently, each species has a characteristic G+C content for each codon position and specific codon preferences. A positive, linear relationship exists between the genomic base composition and the G+C content for each codon position (26), with the effect greatest at position 3, where most changes are synonymous and not under strong selective constraints. Therefore, if base composition and codon usage patterns are primarily caused by mutational biases, horizontally acquired DNA will often have unusual sequence characteristics that distinguish it from ancestral DNA. At the time of introgression, newly acquired DNA will reflect the base composition of the donor genome. However, the introgressed genes will be subject to the same mutational pressures as the recipient genome, and so over time, these sequences will change or "ameliorate" to reflect the base composition of the new genome (17). This is most evident at sites with few functional constraints, such as codon position 3. The P values for the corrected average G+C contents for codon positions 1, 2, and 3 (P1, -2, and -3) (39) are shown in Table 1. These properties are similar to those of the other Salmonella serogroups that have been sequenced, with the P3 value being the lowest. Given that the average G+C content at P3 is 58% for E. coli and S. enterica serotype Typhimurium, the observed P3 value of 23.8% for the serogroup O:6,14 rfb gene cluster suggests that this region is still in the process of amelioration.
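The position-specific G+C values discussed above can be illustrated with a short calculation over an in-frame coding sequence. Note that this sketch computes raw, uncorrected percentages; the corrected averages reported in Table 1 follow reference 39, and that correction is not reproduced here. The toy coding sequence is hypothetical.

```python
def gc_by_codon_position(cds: str) -> tuple:
    """Raw G+C percentage at codon positions 1, 2, and 3.

    Assumption: the sequence is in frame and its length is a multiple of 3.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a nonzero multiple of 3")
    percentages = []
    for position in range(3):
        bases = cds[position::3]  # every third base, starting at this position
        gc_count = sum(1 for base in bases if base in ("G", "C"))
        percentages.append(100.0 * gc_count / len(bases))
    return tuple(percentages)


# Hypothetical toy coding sequence: G+C at positions 1 and 2, A at position 3.
toy_cds = "GCA" * 10
p1, p2, p3 = gc_by_codon_position(toy_cds)
```

In this toy case the third position carries no G or C at all, an exaggerated version of the pattern in which the third-position value is the lowest of the three.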
Models of amelioration can be used to estimate the time of introgression of foreign genes into a chromosome. The G+C content of the third codon positions can each be back-ameliorated until the sequences conform to the Muto and Osawa relationships (25), providing estimates of both time of introgression and nucleotide composition of the donor genome (15). Unfortunately, however, rfb genes are not good candidates for amelioration because they have seen too many mutational contexts. Moreover, there is likely too little information to use: the amelioration algorithm becomes reliable only when the DNA sequence under scrutiny exceeds 15 to 20 kb; below that, the estimates have very large variances. Wang and Reeves (44) recently applied this algorithm to back-ameliorate the E. coli O111 and S. enterica O35 rfb gene clusters, which they showed to be highly conserved. They found that the program did not give meaningful data and suggested that either the genes of interest deviated too much from the Muto and Osawa models or incorrect parameters were used in the amelioration program.
Comparison of S. enterica serotypes Sundsvall and Carrau.
Only minor sequence differences were identified for the rfb region of S. enterica serotypes Sundsvall (O:6,14,25) and Carrau (O:6,14,24), indicating that O antigen factors 24 and 25 are encoded outside the rfb gene cluster, possibly by phage or plasmid. A number of Salmonella O antigens have been reported to be bacteriophage encoded, such as factors O1 (41) and O34 (46). A plasmid-encoded rfb gene cluster has also been reported necessary for biosynthesis of the O54 antigen in S. enterica serotype Borreze (16).
PCR for identification of serogroup O:6,14 isolates.
A PCR based on wzy was developed for the rapid identification of Salmonella serogroup O:6,14. Several serotypes of this serogroup were among the top 20 emerging serotypes in a study investigating the trend in Salmonella serotypes isolated from humans in the United States from 1987 to 1997 (28). In addition to the serogroup PCR assays that have been previously described (23), we intend to identify additional rfb targets specific for other Salmonella serogroups (8). The overall goal is to combine this serogroup identification system with DNA targets specific for each H antigen (J. R. McQuiston, M. Ortiz-Rivera, L. Gheesling, F. Brenner, and P. I. Fields, submitted for publication) to establish a comprehensive DNA-based scheme for identification of the major Salmonella serotypes, without the need for serological testing (7). This will provide a rapid and convenient alternative for identification of Salmonella serotypes that is within the reach of nonspecialized laboratories.
34. J. Bacteriol. 105:927-936.
Copyright © 2009 by the American Society for Microbiology.