Applied and Environmental Microbiology, December 2005, p. 8846-8854, Vol. 71, No. 12
0099-2240/05/$08.00+0 doi:10.1128/AEM.71.12.8846-8854.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Mary E. Lidstrom,1,3 and
Ludmila Chistoserdova1*
Departments of Chemical Engineering,1 Biology,2 and Microbiology,3 University of Washington, Seattle, Washington 98195
Received 20 June 2005/ Accepted 7 September 2005
Sediment sample collection.
Sediment samples were collected on 24 July 2003 and on 11 January 2005 from a 63-m deep station in Lake Washington, Seattle, Wash. (47°38.075'N, 122°15.993'W), as described previously (11). The former sample was used for constructing fosmid libraries, and the latter was used for RNA extraction.
Metagenome library construction.
DNA isolation from the sediment was carried out as described before (11, 13). The DNA was electrophoresed in 1% low-melting-point agarose (Invitrogen), and the fraction representing fragments of approximately 40 kb was excised from the gel and recovered using agarase, as directed by the manufacturer (Fermentas). The resulting DNA fragments were cloned into the CopyControl pCC1FOS vector (Epicenter) as instructed by the manufacturer. The ligated DNA was packaged into MaxPlax lambda (Epicenter) and transfected into the EPI300-T1 plating cells, as instructed by the manufacturer. Fosmid-containing E. coli colonies were selected on Luria-Bertani (LB) solid medium supplemented with chloramphenicol (15 µg/ml). The packaging extract was titered and appropriately diluted to yield 1,000 E. coli colonies per plate. Library 1 was constructed by pooling approximately 36,000 colonies resulting from multiple transfections, as described below. To construct library 2, a total of 36 separate transfections were performed, resulting in a metagenomic library consisting of 36,000 clones distributed between 36 separate plates (1,000 colonies per plate). Colonies from each of the 36 plates were washed off with LB medium as pools, and cells were precipitated and resuspended in a minimal medium (9). Each pool was divided into two aliquots, one of which was frozen after adding 10% (vol/vol) dimethyl sulfoxide and stored at -80°C, while the second was used for DNA extraction using the QIAGEN Miniprep kit, as instructed by the manufacturer.
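The scale of library 2 follows from the numbers given above. The short sketch below works through that arithmetic. It assumes that every colony carries a single insert of about 40 kb, which is the size fraction excised from the gel and not a measured per-clone value, so the total should be read as a rough estimate. The function name `library_size` is illustrative only.

```python
# Rough size estimate for library 2, using figures stated in the text.
POOLS = 36                 # separate transfections, one plate each
COLONIES_PER_POOL = 1_000  # colonies per plate
INSERT_KB = 40             # ASSUMPTION: approximate insert size (gel fraction)


def library_size(pools, colonies_per_pool, insert_kb):
    """Return (number of clones, total cloned DNA in Mb)."""
    clones = pools * colonies_per_pool
    total_mb = clones * insert_kb / 1_000  # kb -> Mb
    return clones, total_mb


if __name__ == "__main__":
    clones, total_mb = library_size(POOLS, COLONIES_PER_POOL, INSERT_KB)
    print(f"{clones} clones, about {total_mb:.0f} Mb of cloned DNA")
```

Under these assumptions the library holds 36,000 clones and roughly 1,440 Mb of environmental DNA. Inserts shorter than 40 kb would lower the total proportionally.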
DGGE analysis.
Fragments of the small subunit rRNA gene of approximately 195 and 585 bp, respectively, were PCR amplified from each of the 36 pools using the 341fGC/536r and 341fGC/926r primer pairs (21). PCR amplifications were carried out under the following conditions: 95°C for 3 min, followed by 25 cycles of 95°C, 55°C for 40 s, and 72°C for 1.5 min, with a final extension for 10 min. Denaturing gradient gel electrophoresis (DGGE) was performed using the DGene system (Bio-Rad). Aliquots (20 µl) of the PCR products were loaded onto a 10% acrylamide gel (37.5:1; Bio-Rad) containing a linear gradient of formamide-urea from 30 to 60%. Three reference samples of 16S rRNA gene fragments amplified from E. coli EPI300 were included on each gel. Gels were run for 15 h at 60 V in 0.5x TAE electrophoresis buffer (18). After electrophoresis, gels were soaked in ethidium bromide solution for 30 min, illuminated with UV light, and manually analyzed.
Analysis of fae and fhcD genes in the metagenomic library.
DNA preparations isolated from each of the 36 pools of library 2 were used as templates to PCR amplify fae and fhcD genes. Details of the amplification protocols have been described previously (11, 13). The resulting PCR products were cloned into the pCR2.1 vector, and three to five randomly selected clones from each cloning were sequenced by using the M13F primer and the sequencing kit BigDye3.1 (Applied Biosystems) according to the manufacturer's instructions. Sequence analysis was carried out by the Department of Biochemistry DNA sequencing facility at the University of Washington, using an ABI 3700 high-throughput capillary DNA analyzer.
DNA-DNA hybridization.
Library 1 and pool 10 of library 2 were appropriately diluted and plated onto solid media to produce single colonies. Clones were manually arrayed on nylon filters, and the filters were treated as previously described (18). DNA probes were labeled with [32P]dCTP by using the Random Primed DNA Labeling Kit (Roche). Hybridizations were carried out overnight at 45°C in 30 ml of hybridization buffer (2x SSC [1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate], 5x Denhardt solution, 20% formamide, 0.1% sodium dodecyl sulfate) containing 50 µl of the labeled probe. Filters were then washed in 0.5x SSC-0.1% sodium dodecyl sulfate buffer three times for 15 min at 50°C and then dried and exposed to X-ray film (Kodak). Clones identified as positive were inoculated into 100 ml of liquid LB medium, grown to early exponential phase (A600 = 0.4), and induced by 1 ml of the CopyControl induction solution (Epicenter) for 5 h. Fosmid DNA was extracted by alkaline lysis and ethanol precipitation, as previously described (18). DNA was resuspended in 0.3 ml of H2O and further purified by using the QIAGEN PCR purification kit according to the manufacturer's instructions.
RNA extraction and RT-PCR amplification.
RNA from the sediment sample was extracted as described previously (16), with the following modification. An additional purification step was carried out after the DNase I digestion step, using the RNeasy columns (QIAGEN). Reverse transcription-PCR (RT-PCR) amplifications were carried out by using the One-Step RT-PCR kit (QIAGEN) and 0.2 µg of RNA. Reaction mixtures were incubated at 60°C (fhcD and fae) or 55°C (mch) for 2.5 h, followed by 15 min of denaturation at 96°C and then 45 cycles of 95°C for 40 s, 60° or 55°C for 40 s, and 72°C for 1.5 min, with a final extension at 72°C for 10 min. The resulting PCR products were cloned into the pCR2.1 vector, and 20 randomly selected clones from each cloning were sequenced by using the M13F primer and the BigDye3.1 kit. To increase specificity of RT-PCR amplifications targeting divergent fhcD, mch, and fae genes, new primer sets were designed as follows. All available sequences of divergent fhcD, mch, and fae were aligned by using the AlignX program of the VectorNTI package (Invitrogen). Regions of high conservation were identified and the following primers were designed for RT-PCR amplification: fae, fae-NGf (5'-CACACATCGACCTGATCATSGG-3') and fae-NGr (5'-GGATGAAVACGCCGACCAGGA-3'); fhcD, fhcD-NG58f (5'-GAGGCYTTCGACATGCGSGCGG-3') and fhcD-NG895r (5'-GGAAGTGGTGCTTSCCGAG-3'); and mch, mch-NG422f (5'-GGCCTCSCAGTACGCCGGCTGGG-3') and mch-NG1000r (5'-GGGATCGAYCTTRTAGAAGTC-3'). The new primer sets were tested on fosmid DNA pools, as well as on total DNA isolated from the Lake Washington sediment, with positive results (data not shown). As negative controls, DNA samples isolated from cultured proteobacteria were used, with negative results (data not shown).
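The primers listed above contain degenerate positions written in standard IUPAC nucleotide codes (S, V, Y, and R), so each primer name stands for a small mixture of oligonucleotides. The sketch below enumerates those mixtures. It is an illustration only and not part of the published protocol. The primer sequences are copied from the text, and the helper names `degeneracy` and `expand` are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences (5' to 3') as given in the text.
PRIMERS = {
    "fae-NGf": "CACACATCGACCTGATCATSGG",
    "fae-NGr": "GGATGAAVACGCCGACCAGGA",
    "fhcD-NG58f": "GAGGCYTTCGACATGCGSGCGG",
    "fhcD-NG895r": "GGAAGTGGTGCTTSCCGAG",
    "mch-NG422f": "GGCCTCSCAGTACGCCGGCTGGG",
    "mch-NG1000r": "GGGATCGAYCTTRTAGAAGTC",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer)
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)}, degeneracy {degeneracy(seq)}")
        for variant in expand(seq):
            print(f"  {variant}")
```

All six primers have low degeneracy (two to four variants each), which fits the stated aim of increasing the specificity of the RT-PCR amplifications.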
Fosmid insert sequencing and sequence annotation.
Three chosen fosmid inserts were entirely sequenced by primer walking using the BigDye3.1 kit. Sequence assembly and editing were performed by using the VectorNTI software package. Open reading frame (ORF) identification and gene annotation were performed by using the BLAST programs (National Center for Biotechnology Information). Translated protein sequences were also compared to the sequences in the Gemmata obscuriglobus genome (http://tigrblast.tigr.org/ufmg/) by using TBLASTN analysis.
Phylogenetic analysis.
Polypeptide sequences were aligned by using the CLUSTAL W program (17) and manually curated. Phylogenetic analyses were carried out by using the PHYLIP program package (8). Distance, parsimony, and maximum-likelihood analyses were performed, with 100 bootstrap analyses for each.
Nucleotide sequence accession numbers.
Sequences of the fosmid inserts have been deposited with GenBank under the accession numbers DQ084247, DQ084248, and DQ084250. Partial sequences of fhcD, fae, and mch genes identified in the present study have been deposited under accession numbers DQ173653 to DQ173667 and DQ176037, DQ173643 to DQ173652, and DQ176322 to DQ176324, respectively.
FIG. 1. Sample of 16S rRNA gene DGGE fingerprint patterns obtained with the primer set 341fGC/536r. Lanes: 1, DGGE patterns of 16S rRNA gene PCR products obtained from sediment DNA; 2, DGGE pattern for E. coli EPI100; 3 to 11, DGGE patterns of PCR products obtained from metagenomic library 2 clone pools 3 to 11.
FIG. 2. Distribution of 16S rRNA, fae, and fhcD genes in clone pools of library 2. Numbers in the top right corners of each box in the grid indicate the number of bands detected by DGGE analysis. Phylotypes were designated as sequences revealing <95% identity to each other at the amino acid level.
FIG. 3. Consensus phylogenetic tree showing relations of the Fae phylotypes uncovered in the present study (in boldface) to previously known Fae sequences. In parentheses, alternative names are shown (11, 16; the present study). Sequences that were also detected via RT-PCR are marked by asterisks. The nodes that separate the novel sequences (denoted by gray circles) are supported by bootstrap values of at least 94% in at least two out of three analyses performed (see Materials and Methods).
FIG. 4. Consensus phylogenetic tree showing relations of the FhcD phylotypes uncovered in the present study (in boldface) to previously known FhcD sequences. In parentheses, alternative names are shown (13; the present study). The sequence that was also detected via RT-PCR is marked by an asterisk. The nodes that separate the novel sequences (denoted by gray circles) are supported by bootstrap values of at least 75% in at least two out of three analyses performed (see Materials and Methods).
Clone LWBAC-L1N9 was shown to contain an insert of approximately 25.3 kb with an average G+C DNA content of 68%. The fae gene identified within this fragment corresponded to the L1N9 sequence. The sequence of a part of the fosmid insert estimated to be approximately 50 bp remained unresolved, apparently due to a strong hairpin structure (indicated by an arrow in Fig. 5). A total of 14 potential ORFs were identified within the insert (Fig. 5), including three of the H4MPT-linked C1 transfer gene homologs: the fae, the previously described mtdC (20), and an orf9 homolog. The predicted functions of the remaining 11 ORFs and their best hits in the databases are listed in Table S1 in the supplemental material. None of these ORFs showed high similarity to any known sequences. Identities with top hit sequences ranged between 23 and 50%, and top hit sequences belonged to Proteobacteria, gram-positive bacteria, and Archaea.
FIG. 5. Gene content and organization of fosmid clones analyzed in this work. Genes conserved between the genomic fragments are connected by shaded areas. Other bacterial genera include Cytophaga, Flavobacterium, Cyanobacteria, Aquificae, Chloroflexi, Bacteroides, etc. Arrows show location of gaps in sequence. Numbers show percent amino acid identities between genes in BAC10-4 and BAC10-10; numbers in parentheses show the percent identity between genes in BAC10-4 and BAC-L1N9.
Clone LWBAC10-10 was shown to contain an insert of approximately 33.6 kb, with an average G+C DNA content of 67%. The fhcD sequence identified in this insert corresponded to the FhcD3 phylotype. The sequence of a part of the insert estimated to be approximately 50 bp remained unresolved, apparently due to a strong hairpin structure (indicated by an arrow in Fig. 5). Sequence analysis revealed the presence of 30 potential ORFs. Of these, 10 were homologous to the H4MPT-linked C1 transfer genes, with identities ranging from 26 to 59%, distributed between Proteobacteria, Planctomycetes, and Archaea. Two genes predicted to encode selenocysteine biosynthesis enzymes are located immediately upstream of orf17. The remaining ORFs did not reveal significant similarity to known sequences, with top hit identity levels ranging from 24 to 46%, distributed between Proteobacteria, gram-positive bacteria, Planctomycetes, Archaea, and bacteria of deeply diverging divisions (Table S1 in the supplemental material).
No strong conservation in gene order or content was found between the three analyzed genomic fragments, with the exception that the fae-mtdC pair conservation is common to Planctomycetes (12, 20). Only one hypothetical protein-coding ORF (located downstream of mtdC and encoding a conserved archaeal hypothetical protein) was conserved between the inserts of LWBAC-L1N9 and LWBAC10-10. Although nine H4MPT-linked C1 transfer genes were shared between LWBAC10-10 and LWBAC10-4 inserts, little conservation in their order was observed, with the exception that the orf1-orf9 pair is typical of Proteobacteria (12). The stretch of genes in the LWBAC10-10 insert, mch-orf5-orf7-orf17, has been found to be highly conserved in Proteobacteria (12). The polypeptide sequences translated from the C1 transfer genes present in the inserts were compared to each other and to the sequences in public databases, using BLAST analyses. The newly identified sequences were found to be more related to each other than to the previously known sequences (with the exception of orf9 in the BAC-L1N9 insert, which was more related to proteobacterial orf9 sequences), with the levels of sequence identity ranging from 52 to 79% (Fig. 5). Phylogenetic analyses of these sequences further confirmed that they all diverged significantly from the sequences known for cultivated microbes belonging to Proteobacteria, Planctomycetes, or Archaea, forming deep branches on phylogenetic trees (Fig. 6). In many cases, however, the branching pattern differed for different proteins and different analyses, and in general, nodes for the novel sequences and the planctomycete sequences were poorly resolved, apparently resulting from high protein divergence within both the novel group and the planctomycete sequences.
Based on the low sequence similarity, the low degree of gene clustering conservation with known microbial groups, and the phylogenetic tree patterns, it is likely that the fosmid insert sequences described above belong to microbes of an as-yet-undescribed, uncultivated phylum within Bacteria. Possibly, these sequences represent yet unknown, deeply branching members of Planctomycetes.
FIG. 6. Consensus phylogenetic trees of Mch (A), Orf9 (B), Orf5 (C), and Orf7 (D) polypeptides. Bootstrap values are shown for distance, parsimony, and maximum-likelihood analyses. Nodes without bootstrap values are not strongly supported. M. extorquens, B. xenovorans, M. capsulatus, and M. flagellatus are Proteobacteria; G. obscuriglobus, Gemmata sp. strain Wa1-1, and R. baltica are Planctomycetes; M. thermoautotrophicus, M. kandleri, M. mazei, M. barkeri, M. acetivorans, A. fulgidus, and M. jannaschii are Archaea.
The metagenome analysis confirmed the relative abundance of the genes involved in H4MPT-linked C1 transfers in the microbial population inhabiting Lake Washington sediment. A comparison of the total number of fae and fhcD genes recovered from the metagenome to the number of 16S rRNA genes suggests that approximately 21 to 25% of the microbes represented in the metagenome possess these genes. This number is likely an underestimate for the total population, since no gammaproteobacterial sequences related to the Methylomonas/Methylobacter group were uncovered, apparently due to cloning biases noted previously (11), and it is known that methanotrophs of this group are abundant in the sediment (1, 5, 6). Our analyses demonstrate that a large fraction of the H4MPT-linked C1 transfer genes is represented by the divergent genes (38% in the fhcD sequence subset and 77% in the fae sequence subset), a finding consistent with the previous PCR-based surveys using total environmental DNA (11, 13).
Three fosmid inserts carrying divergent fae and fhcD genes were sequenced in order to obtain data supporting (or rejecting) the deeply branching nature of the organisms in question. Indeed, we demonstrated that genes representative of the divergent groups previously detected in three independent PCR surveys (fae, fhcD, and mtdA/B) cluster together within the genomic fragments sequenced. Other C1 transfer genes identified within the fosmid inserts also diverged deeply from the genes previously identified in cultured Proteobacteria, Planctomycetes, or Archaea. The genes outside of the C1 transfer gene clusters provided little useful phylogenetic signal, since they shared extremely low levels of similarity with known genes, and their top hits were distributed between various representatives of Bacteria or Archaea (Table S1 in the supplemental material). The three sequenced genomic fragments, while more similar to each other than to known organisms, still showed a high level of divergence in gene sequence for the homologous ORFs, in gene order conservation, and in gene content, suggesting that the organisms from which these genomic fragments originated are not closely related to each other. These data provide additional evidence that these organisms may comprise a deeply branching clade. This new clade would most likely fall within the bacterial kingdom of life, based on higher hits with bacterial sequences for most genes and based on phylogenetic analyses. Possibly, the new sequences represent novel, deeply branching Planctomycetes. Our data indicate that this novel clade represents a significant fraction of the total microbial population in Lake Washington sediment, thus implying a potential ecological significance. The data on mRNA detection presented here demonstrate that the novel C1 transfer genes are expressed under in situ conditions, suggesting a function in C1 cycling. The proposed function in C1 cycling is also supported by previous enrichment data (13, 16).
However, the exact role in C1 cycling and the nature of primary substrates for these organisms in the environment remain unknown. These will be addressed in our future studies, which will include expanded metagenome sequencing and expression analyses.
Mark Dodobara is acknowledged for technical assistance, and Sergey Stolyar and David Stahl are acknowledged for providing access to the DGGE apparatus. The Institute for Genomic Research is acknowledged for early release of the G. obscuriglobus genomic sequence (funded by the U.S. Department of Energy).
Supplemental material for this article may be found at http://aem.asm.org/.
Present address: CEA/Cadarache, DSV/DEVM/LEMiR, Bât 161, 13108 St-Paul lez Durance, France.