Applied and Environmental Microbiology, February 2004, p. 1160-1168, Vol. 70, No. 2
0099-2240/04/$08.00+0 DOI: 10.1128/AEM.70.2.1160-1168.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado 80309
Received 2 September 2003/ Accepted 28 October 2003
Integrons are horizontal gene transfer (HGT) systems containing elements necessary for site-specific recombination and expression of foreign DNA (17, 45). Integrons consist of two parts: (i) the stationary integron platform, including the integrase gene (intI), a strong promoter (pANT), and a recombination site (attI); and (ii) the mobile gene cassettes, which are promoterless open reading frames (ORFs) with a recombination site (attC) (Fig. 1A). IntI catalyzes a site-specific recombination event between the attI and attC sites and integrates or excises gene cassettes. Gene cassettes that are integrated into the integron platform are expressed from the promoter pANT (Fig. 1B).
FIG. 1. (A) Cartoon showing the stationary integron platform containing the intI gene, attI recombination site, and the pANT promoter and the mobile gene cassette containing an ORF and attC site (shaded circle). The recombination site is marked with an X. (B) Multiresistance (MR) integron with one gene cassette. (C) SI with several gene cassettes and homogeneous attC sites. Gene cassette primer sites are indicated with arrows.
Recently, genome sequence analysis of Vibrio cholerae led to the discovery of a superintegron (SI) (5, 32). SIs are found on bacterial chromosomes and consist of an array of gene cassettes adjacent to the integrase gene. They differ from MR integrons because they contain many gene cassettes, the activities of the gene cassettes are not limited to antibiotic resistance, and the attC sites associated with the arrays are homogeneous (for reviews, see references 5, 32, and 44 to 46) (Fig. 1C). SIs are also found in other species of Vibrio (8) and in an assortment of Pseudomonas species (20, 55). Genome sequencing projects have also uncovered SI intI-like genes in Gamma-, Delta-, and Betaproteobacteria and in the spirochete Treponema denticola (45).
Integrons are found in several major lineages of bacteria, and the view of integron diversity is expanding. There are currently 32 unique integron integrase genes available in GenBank and various sequencing projects, an eightfold increase in the past 3 years. However, the culture-based methods and mostly pathogenic Proteobacteria used for the study of integrons leave two central questions unanswered: (i) what is the phylogenetic distribution of these elements? and (ii) what types of genes can be transferred by integrons?
Soil microbial communities contain phylogenetically diverse organisms living in an array of niches and present an ideal environment to investigate these questions. Molecular techniques, typified by 16S rRNA gene analysis, have revolutionized environmental microbiology and provide the tools necessary to explore microbial communities without the biases associated with cultivation (3, 39). Recently Nield et al. (35) developed degenerate primers for integrons and identified three new classes of integrons from soil microbial communities. The same group (49) also designed primers targeted to the attC recombination sites of gene cassettes. These primers amplify genes flanked by two recombination sites, an arrangement present in any multicassette integron (Fig. 1C). Their study revealed gene cassette diversity in environmental samples. However, most sequences from that study did not share significant sequence similarity with known genes, and the functions of these gene cassettes are unknown.
In this study, we used molecular techniques to examine the diversity of integrons in heavy-metal-contaminated mine tailings. We chose this environment because it is under intense selection pressure for traits other than antibiotic resistance. We used both published primers and newly designed primers to uncover 14 new integron integrase genes and 11 new gene cassettes from this environment. One of the gene cassettes that we discovered is similar to a gene that codes for a step in a pathway for nitroaromatic catabolism. We also discuss the evolution of chromosomally associated SIs by statistically comparing the phylogenies of 16S rRNA and SI integrase gene trees from the same organisms, using sequences available from GenBank and sequencing projects. We found significant differences between the organismal (16S rRNA) and integrase trees, and we suggest that these differences may be due to HGT.
DNA extraction and clone libraries.
DNA was extracted from the tailings using a modification of the protocol described by Zhou et al. (56). Five grams of soil was added to 10 ml of buffer (100 mM Tris-HCl [pH 8.0], 100 mM EDTA [pH 8.3], 100 mM phosphate buffer [pH 8.0], 1.5 M NaCl, 1% cetyltrimethylammonium bromide), 50 µl of proteinase K (20 mg/ml), 60 µl of lysozyme (100 mg/ml), and 9 µl of RNase (10 mg/ml). Samples were incubated at 37°C with shaking at 80 rpm for 30 min. A 1.5-ml volume of 20% sodium dodecyl sulfate was added, and the tubes were gently agitated and then incubated at 65°C for 2 h. Samples were centrifuged at 3,850 × g for 10 min. The supernatant was removed and extracted twice with phenol-chloroform-isoamyl alcohol (25:24:1). DNA was precipitated with 0.6 volumes of isopropanol and washed with 1 ml of 70% ethanol. Four separate DNA extractions were pooled and purified over Sepharose 4B (Sigma, St. Louis, Mo.) packed columns as described by Jackson et al. (24).
Approximately 30 ng of DNA was amplified with a variety of primer sets (Table 1). 16S rRNA genes were amplified with 27f and 1492r (28), integrase genes were amplified with int1.F/R and intlld F/R, and gene cassettes were amplified with HS286 and HS287. The reaction conditions consisted of a 400 nM (27f/1492r and int1.F/R) or 4 µM (intltdF/intltdR and HS286/HS287) concentration of each primer, a 200 µM concentration of each deoxynucleoside triphosphate, and 1.25 U of Taq DNA polymerase (Promega, Madison, Wis.) in Taq DNA polymerase buffer containing MgCl2 (Promega). After an initial denaturation step at 94°C for 1 min, 35 cycles of 94°C for 1 min, 58°C for 30 s, and 72°C for 2.5 min with a terminal 10-min extension at 72°C were performed. PCR products from the 16S rRNA and integrase gene amplifications were gel purified and ligated into the vector TOPO 2.1 (Invitrogen, Carlsbad, Calif.) and transformed into Escherichia coli cells following the manufacturer's instructions. Gene cassette PCR products were ligated into the pGEM 2.1 vector (Promega) and transformed into E. coli following the manufacturer's instructions. For each cloning reaction, 96 colonies were selected for plasmid extraction.
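As a planning aid, the thermal program described above can be written down as data and its nominal run time summed. This is a minimal sketch that assumes zero ramp time between temperatures, which real thermocyclers do not achieve; the variable and function names are illustrative and not part of the original protocol.

```python
# Thermal program from the text: (label, temperature in deg C, hold time in seconds).
INITIAL = ("initial denaturation", 94, 60)
CYCLE = [
    ("denaturation", 94, 60),
    ("annealing", 58, 30),
    ("extension", 72, 150),
]
N_CYCLES = 35
FINAL = ("terminal extension", 72, 600)


def nominal_run_seconds(initial, cycle, n_cycles, final):
    """Sum hold times only; ramping between temperatures is ignored (assumption)."""
    per_cycle = sum(seconds for _, _, seconds in cycle)
    return initial[2] + n_cycles * per_cycle + final[2]


total = nominal_run_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s = {total / 60:.0f} min")  # 9060 s = 151 min
```

Actual instrument time will be longer because of heating and cooling between steps.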
TABLE 1. Primers used in this study
Sequence and phylogenetic analysis.
Sequences were edited in Sequencher 4.1 (Gene Codes Co., Ann Arbor, Mich.) and subjected to BLAST (2) or BLASTX searches for protein sequences. 16S rRNA gene sequences were subjected to chimera check in RDP (9) and aligned in an ARB database (http://www.arb-home.de/). Closely related sequences from the ARB database and from BLAST searches were used as reference taxa for phylogenetic analyses, and two archaeal sequences from the ARB database were used as an outgroup. We selected putative integron integrase genes by screening for the presence of the integron integrase-specific insertion (36) and aligned these sequences with other integron integrase proteins and the XerC outgroup (13) in ClustalX.
Alignments were subjected to Bayesian phylogenetic analysis as implemented in MRBAYES (21). Separate analyses were performed for all data combined (environmental genes and published gene sequences collected from GenBank and various sequencing projects) and for reduced taxa data sets for which both intI and 16S rRNA genes were available from the same organisms. For the IntI amino acid data, Bayesian analysis employed the Jones model of sequence evolution (26). For the 16S rRNA data, we used the GTR + gamma model of evolution. For the IntI analyses, 250,000 generations were run and trees were sampled every 100 generations. For the 16S rRNA analyses, 1,000,000 generations were run and trees were sampled every 100 generations. Burn-in values were determined by plotting the likelihood scores against generation number and retaining trees for which stationarity was evident. In addition, all alignments were subjected to phylogenetic analyses in PAUP* (version 4; D. L. Swofford, Sinauer Associates, Sunderland, Mass.) using both the maximum parsimony optimality criterion and the neighbor-joining tree-building algorithm. Maximum parsimony and neighbor-joining phylogenetic inferences were subject to bootstrap analyses with 1,000 replicates. Finally, all phylogenies were tested with collections of outgroup sequences to confirm the robustness of these estimates.
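The sampling scheme above fixes how many trees each run yields before burn-in is removed. The sketch below works through that arithmetic. The burn-in of 50,000 generations is a hypothetical value chosen only for illustration: the burn-in here was chosen by inspecting likelihood plots, and no value is reported. The function names are likewise illustrative and are not MRBAYES commands.

```python
def n_samples(generations, sample_every):
    """Trees written during the run, not counting the starting tree."""
    return generations // sample_every


def discard_burnin(samples, burnin_generations, sample_every):
    """Drop samples taken before the chosen burn-in generation."""
    return samples[burnin_generations // sample_every:]


inti = list(range(n_samples(250_000, 100)))     # stand-ins for IntI trees
rrna = list(range(n_samples(1_000_000, 100)))   # stand-ins for 16S rRNA trees

# Hypothetical burn-in; not a value reported in the article.
kept = discard_burnin(inti, 50_000, 100)
print(len(inti), len(rrna), len(kept))  # 2500 10000 2000
```

The consensus tree is then built only from the retained samples.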
We examined the relationship between the organismal and SI integrase trees by comparing inferred phylogenies of SI integrase and 16S rRNA genes from the same organisms, using sequences from GenBank and various sequencing projects. IntI genes are available for two subspecies (BAM and Q) of Pseudomonas stutzeri and for two subspecies (badrii and campestris) of Xanthomonas campestris; however, individual 16S rRNA genes are not available for these organisms. P. stutzeri integrase genes are nearly identical (<0.05% difference in amino acid sequence), and so we arbitrarily selected one integrase gene, Q, to represent this lineage. X. campestris genes are also nearly identical, and we selected the subspecies campestris integrase gene to represent this lineage because there was more sequence information available for this gene. A test of concordance between 16S rRNA gene and IntI protein trees was accomplished using the Shimodaira-Hasegawa test (48) and the Wilcoxon signed-rank test (52). These tests compare the likelihood and parsimony scores, respectively, of alternative trees using both the 16S rRNA gene and integrase data. Finally, the extent to which the two genes recorded the same evolutionary history was estimated by determining agreement subtrees (implemented in PAUP* [version 4; Sinauer Associates]).
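To make the parsimony-based comparison concrete, the sketch below applies a Wilcoxon signed-rank test to per-character step counts under two alternative trees, which is one common way this test is used to compare topologies. It is an illustration only: the step counts are made up, the p value uses a plain normal approximation without tie or continuity correction, and the implementation used for the analysis may differ in such details.

```python
import math


def signed_rank_test(steps_a, steps_b):
    """Wilcoxon signed-rank test on per-character step counts under two trees.

    Returns (W_plus, W_minus, two_sided_p). Characters with equal step
    counts on both trees are dropped, as is standard for this test.
    """
    diffs = [a - b for a, b in zip(steps_a, steps_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
    return w_plus, w_minus, p


# Made-up step counts for eight characters on tree A and tree B.
tree_a = [1, 2, 1, 3, 2, 1, 1, 2]
tree_b = [1, 1, 1, 1, 2, 2, 1, 1]
print(signed_rank_test(tree_a, tree_b))
```

With so few informative characters the normal approximation is crude; real alignments supply many more differing characters.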
For the gene cassette analysis, sequences were determined to be cassettes if (i) they possessed the eight invariant residues of attC sites (50), (ii) the ends were flanked by two putative IntI-like simple sites including complementary 1R and 1L recombination sequences (51), and (iii) the recombination sites flanked an ORF of greater than 80 amino acids. For some sequences, the stop codon was derived from the 1L sequence. Putative translated sequences were subjected to pBLAST searches, and matches were considered significant if the e value was <0.001.
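The three cassette criteria and the E-value cutoff can be expressed as a simple filter. In this sketch the attC-residue and flanking-site checks are passed in as booleans, because the consensus motifs are not given in the text. The ORF scan is a simplification that considers only ATG starts in the three forward frames and requires an in-frame stop, so it would miss cases where the stop codon comes from the 1L sequence outside the scanned region. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_CODONS = 80   # criterion (iii): ORF of greater than 80 amino acids
MAX_E_VALUE = 0.001   # cutoff for a significant protein match


def longest_orf_codons(seq):
    """Longest ATG-to-stop ORF in the forward frames, in codons (stop excluded)."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def is_candidate_cassette(seq, has_attc_residues, has_flanking_sites):
    """Apply criteria (i) to (iii); (i) and (ii) are supplied by the caller."""
    return (has_attc_residues and has_flanking_sites
            and longest_orf_codons(seq) > MIN_ORF_CODONS)


def is_significant(e_value):
    """True when a database match passes the E-value cutoff."""
    return e_value < MAX_E_VALUE


toy = "ATG" + "GCT" * 100 + "TAA"  # synthetic 101-codon ORF, not real data
print(longest_orf_codons(toy), is_candidate_cassette(toy, True, True))  # 101 True
```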
Nucleotide sequence accession numbers.
The GenBank accession numbers for the gene cassettes are AY271679 to AY271689. The accession numbers for the 16S rRNA gene sequences are AF337861 to AF337888 and AY274120 to AY274164. The accession numbers for the integrase gene sequences are AY283623 to AY283638.
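For readers who want to retrieve these records, the accession ranges above can be expanded programmatically. The sketch below only counts the records in each range; it assumes each range is contiguous and shares one letter prefix, and the helper name is illustrative. The number of deposited records need not equal the number of clones discussed elsewhere in the article.

```python
import re


def accession_count(first, last):
    """Number of accessions in an inclusive range such as AY271679 to AY271689."""
    m1 = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m2 = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m1 or not m2 or m1.group(1) != m2.group(1):
        raise ValueError("range endpoints must share the same letter prefix")
    return int(m2.group(2)) - int(m1.group(2)) + 1


ranges = {
    "gene cassettes": [("AY271679", "AY271689")],
    "16S rRNA genes": [("AF337861", "AF337888"), ("AY274120", "AY274164")],
    "integrase genes": [("AY283623", "AY283638")],
}
for label, spans in ranges.items():
    print(label, sum(accession_count(a, b) for a, b in spans))
```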
TABLE 2. Metal concentrations in the tailings used for this study
FIG. 2. A 50% majority rule consensus tree of 16S rRNA genes derived from Bayesian phylogenetics. An asterisk indicates a node with a Bayesian posterior probability of >0.95, maximum parsimony bootstrap support of >80, and neighbor-joining bootstrap support of >80. The sequences from this work are the D series. The tree is rooted with Methanosarcina acetivorans (M59137) and Natronobacterium chahannaoensis (AJ004806). Branch lengths are drawn proportional to the amount of evolution based on uncorrected genetic distances. Accession numbers are as follows: Telluria chitinolytica, X65590; Nitrosovibrio tenuis, M96405; Thiobacillus thioparus, M79426; Methylophilus methylotrophus, L15475; X. campestris ATTC, 339113; WD260, AJ292673; O. anthropi, D63837; MNF4, AF292996; MNG7, AF292997; 19514, AF097791; Sphingomonas sp. strain JSS-28, AF031240; Flavobacterium ferrugineum, M28237; Microscilla sericea, M58794; 49511, AF097805; SBR1071, AF268996; SBR2013, AF269000; C105, AF013530; C002, AF013515; 611, Y11629; ii3_15, Z95725; iii1_8, Z95729; Pelobacter acetylenicus, X70955; Acidimicrobium ferrooxidans, U75647; BA149, AF323777.
FIG. 3. A 50% majority rule consensus tree of IntI proteins derived from Bayesian phylogenetics. The sequences from this work are the I series, and numbers in parentheses denote the number of identical sequences obtained in this study. The tree was rooted using XerC from E. coli (P22885) and Salmonella enterica serovar Typhimurium (AAF33443) (14). Branch lengths were drawn proportional to the amount of evolution based on uncorrected genetic distances. An asterisk indicates a Bayesian posterior probability of >0.95, maximum parsimony bootstrap support of >80, and neighbor-joining bootstrap support of >80; + indicates a Bayesian posterior probability of >0.95 and either maximum parsimony bootstrap support of >90 or neighbor-joining bootstrap support of >90. Novel lineages are indicated with arrows. Accession numbers (when available) and sequencing project homepages are as follows: S. putrefaciens, AAK01408; S. oneidensis, MR-1 (The Institute for Genomic Research [TIGR] website [http://www.tigr.org]); IntI9, AAK95987; N. europaea (DOE Joint Genome Initiative website [http://www.jgi.doe.gov]); IntI1, AAM89398; IntI3, AAO32355; G. sulfurreducens (TIGR website); P. stutzeri BAM, AAN16071; P. stutzeri Q, AAN16061; P. alcaligenes, AAK73287; X. campestris pv. campestris, AAK07444; X. campestris pv. badrii, AAK07443; Xanthomonas sp. strain CIP, AAK07447; T. denticola (TIGR website); IntI6, AAK00307; IntI7, AAK00305; IntI8, AAK00304; Gemmata obscuriglobus (TIGR website); V. fischeri, AAK02079; Microbulbifer degradans (DOE website); V. salmonicida, CAC35342; V. cholerae, NP_232687; V. mimicus, AAD55407; V. metschnikovii, AAK02074; V. vulnificus CMCP6, AAO10775; V. vulnificus, AAN33109; Listonella pelagia, AAK02082; Vibrio natriegens, AAO38263; V. parahaemolyticus, AAK02076; G. metallireducens (Environmental Biotechnology Center, University of Massachusetts website [http://zdna.micro.umass.edu/]); IntI_IEL, AAN16072.
FIG. 4. Gene trees for the SI integrase and 16S rRNA genes. Sequences were obtained from GenBank and various sequencing projects. Trees are majority rule consensus trees determined using Bayesian analysis of the amino acid (IntI) and DNA (16S rRNA) alignments. An asterisk indicates a Bayesian posterior probability of >0.95, maximum parsimony bootstrap support of >80, and neighbor-joining bootstrap support of >80. A + indicates a Bayesian posterior probability of >0.95 and either maximum parsimony bootstrap support of >90 or neighbor-joining bootstrap support of >90. Thick lines indicate shared cospeciation events for the 16S rRNA gene and integrase trees. The 16S rRNA gene tree was rooted using Methanococcus jannaschii (L77117). The integrase tree was rooted using XerC from E. coli (P22885) and S. enterica serovar Typhimurium (AAF33443) (14). Accession numbers for integrase genes are presented in the legend to Fig. 3. 16S rRNA gene accession numbers were as follows: G. obscuriglobus, X54522; T. denticola, M71236; N. europaea, AF037106; X. campestris, X95917; Microbulbifer degradans, AF055269; P. stutzeri, U65012; P. alcaligenes, D84006; S. oneidensis, AF039055; S. putrefaciens, X81623; V. salmonicida, X70643; V. fischeri, X70640; V. metschnikovii, X74712; Listonella pelagia, X74722; V. natriegens, X74714; V. parahaemolyticus, M59161; V. vulnificus, X76334; V. cholerae, X76337; V. mimicus, X74713; G. metallireducens, L07834; G. sulfurreducens, U13928.
TABLE 3. Structures of gene cassettes from heavy-metal-contaminated mine tailings
The bacterial community in a metal-contaminated site.
The bacterial divisions represented in the mine tailings (Fig. 2), Proteobacteria, Bacteroidetes, Acidobacteria, Actinobacteria, and candidate division TM7, are common in other, noncontaminated and contaminated soil surveys (7, 12, 22). We found no evidence for the existence of clades unique to the contaminated site. However, we did find a significant number of sequence repeats: of the 67 rRNA genes sequenced, only 40 were unique. This lack of diversity has been reported from other contaminated soils (7, 43). There were several sequence repeats related to O. anthropi, a sequence from the genus Sphingomonas, a sequence that falls in the candidate division TM7, and a sequence that falls in the Bacteroidetes. Organisms from the Sphingomonas genus are common in contaminated soils (42, 43). O. anthropi is known for the ability to utilize a variety of nitrogen sources, including D and L amino acids (4), urea-formaldehyde (25), and methylammonium (14). This result may suggest the presence of unusual nitrogen sources in these tailings. The other two highly repeated sequences have no known cultured close relatives and, therefore, the significance of their presence in this sample is unknown.
Diversity of integrons in a metal-contaminated site.
In this study we discovered 14 new integron integrase genes (Fig. 3). Nield et al. (35) used a similar molecular approach and identified three new integrase genes from several environments. Although our primers amplify only 444 nucleotides of intI genes and, unlike the primers used by Nield et al. (35), they do not amplify associated gene cassettes, the primer sets used here amplified a more diverse array of genes.
Currently, there are 32 unique integron integrase genes available in GenBank and various sequencing projects, and 22 of these genes are found exclusively in the Gammaproteobacteria. Our 16S rRNA gene library (Fig. 2) included only one sequence that was closely related to an organism known to harbor an integron (X. campestris) and only included four Gammaproteobacteria-like sequences. This may suggest that some of the 14 new integrase genes that we sequenced (Fig. 3) are from organisms previously not known to contain integrons. Although all IntI sequences from our library form a clade with known integron integrases that is exclusive of the XerC outgroup, the relationships between these integrase genes are largely unresolved (Fig. 3). Therefore, it is difficult to speculate on the phylogenetic origins of these integrase genes. More integrase genes from known organisms need to be sequenced and the evolution of integrase genes (see below) needs to be better understood before we can make inferences about the phylogenies of these previously undescribed genes.
Evolution of SI integrase genes.
We further examined the relationship between the organismal and SI integrase trees by comparing inferred phylogenies of SI integrase and 16S rRNA genes from the same organisms, using sequences from GenBank and various sequencing projects (Fig. 4). We found some shared speciation events between the IntI and 16S rRNA gene trees (Fig. 4), including support for the clustering of P. stutzeri and P. alcaligenes, support for the clustering of Shewanella oneidensis and Shewanella putrefaciens, and support for the Vibrio integrase clade (excluding V. fischeri). However, a statistical comparison of trees derived from 16S rRNA and SI integrase genes from the same taxa allowed us to reject the hypothesis that SI integrases track organismal phylogeny. Both HGT and the sampling of paralogous genes could explain the lack of concordance between the two gene trees.
We cannot exclude the possibility that other, paralogous intI gene families exist. Indeed, there is support for gene duplication and translocation to plasmids in Vibrio vulnificus (Fig. 3). However, we note that this gene duplication is likely to be a recent event, because the sequences are nearly identical, and it is therefore unlikely to explain the lack of concordance between the integrase and 16S rRNA gene trees. In addition, some of the integrase gene sequences we used are from entire genome sequences (e.g., N. europaea, V. cholerae), and there were no paralogous integrase genes found. Other integrase genes, however, are from partial genome sequences or pure-culture studies that may have missed paralogous integrase genes.
HGT of integrase genes could also explain the differences between the integrase and 16S rRNA gene trees. For example, V. fischeri is "out of place" on the integrase gene tree, clustering with M. degradans and falling outside the well-supported Vibrio clade (Fig. 4). IntI genes are likely to have been present in the ancestor of the entire Vibrio clade, because these genes largely mirror the organismal (16S rRNA gene) tree (Fig. 4). Therefore, it is possible that the ancestor of V. fischeri lost its Vibrio-type integrase gene and inherited a divergent integrase gene by HGT. The phylogeny of Geobacter sulfurreducens and Geobacter metallireducens integrase genes is also different from the organismal relationships. Comparison of 16S rRNA genes from these two organisms indicates that they have approximately the same number of pairwise differences as Vibrio parahaemolyticus and Vibrio metschnikovii yet, unlike V. parahaemolyticus and V. metschnikovii, their integrase genes do not cluster together (Fig. 4). It is possible that integrase genes are evolving more quickly in the Geobacter lineage. However, HGT could also explain the difference between the Geobacter 16S rRNA gene and integrase gene trees. For example, an integrase gene could have been present in the ancestor of the two Geobacter species, and this gene could have been lost in one lineage and another, more distant integrase gene gained. Alternatively, both lineages could have gained different integrase genes after their divergence. More Geobacter lineages should be sampled for integrase genes to test these two hypotheses.
Rowe-Magnus et al. (45) suggested that SIs were present in the ancestor of the Proteobacteria. If we assume that X. campestris marks the most basal lineage of Gammaproteobacteria sampled (as suggested by the 16S rRNA tree), then its basal placement on the IntI tree may indicate that SIs were present in the ancestor of the Gammaproteobacteria (45). We surveyed for the presence of integrons in the published, completed genome sequences available in GenBank. Surprisingly, we failed to detect integron integrase genes in many Gammaproteobacteria, even in organisms with close relatives that contain integrons. For example, neither P. aeruginosa nor Pseudomonas putida has intI genes, yet other species of Pseudomonas are known to contain integrons (20, 55). Furthermore, we only found intI genes in 5 of the 25 sequenced Gammaproteobacteria. Assuming an ancient origin, the apparent lack of integrons in many Gammaproteobacteria suggests that integrons have been lost numerous times in diverging lineages. The implications of this inference remain to be explored.
Selection pressures and gene cassettes.
Unlike Stokes et al. (49), we found many repeats in our gene cassette library. Although this could be due to PCR bias or lower diversity of the microbial community, it could also be a reflection of selection and could imply that these genes are important in this environment. Further support for the importance of these repeated gene cassettes came from examination of the associated recombination sites. In many cases, the ORF was entirely conserved while the recombination sites were not complementary, preventing them from being recognized by the integrase enzyme. If these gene cassettes were essential for survival, this would select for the loss of these recombination sites to prevent gene excision.
Most of the gene cassettes that we sequenced were related to genes of unknown function (Table 3). For each sequence that had a significant pBLAST hit, we searched the DNA adjacent to the matches, looking for integrase genes or integron recombination sites. With the exception of HMIC7, which is related to an SI cassette, we found no indication that these genes were also located on integrons. We sequenced four repeats of a cassette related to a gene of known function, hydroxylaminobenzene mutase. Homologues of this gene have been found on other mobile gene elements, including a plasmid in P. putida (40) and a transposon in Pseudomonas pseudoalcaligenes JS45 (11, 18), and our results suggest that it is also mobilized by integrons. Hydroxylaminobenzene mutase catalyzes a reduction reaction in the catabolism of nitroaromatics (40). Nitroaromatics are xenobiotic compounds that are used as explosives and solvents, both of which are used in gold extraction. The ability to catabolize nitroaromatics may be a metabolic advantage, or it may be important in detoxification. This suggests that integrons may be important for gene transfer in response to selective pressures other than the presence of antibiotics.
We thank Kevin Shiley for help with construction and sequencing of the 16S rRNA gene library and Hatch Stokes for the sequence of primer HS287. We also thank members of the Schmidt and Martin labs and three anonymous reviewers for insightful comments on the manuscript.