Applied and Environmental Microbiology, December 2005, p. 8506-8513, Vol. 71, No. 12
0099-2240/05/$08.00+0 doi:10.1128/AEM.71.12.8506-8513.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
University of Delaware, College of Marine Studies, Lewes, Delaware 19958
Received 2 May 2005/ Accepted 9 September 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Cytophaga-like bacteria are hypothesized to be important in the hydrolysis and mineralization of biopolymers in the oceans. Cultured isolates of Cytophaga-like bacteria are proficient in degrading carbohydrate biopolymers, such as cellulose and chitin (21), which are constituents of high-molecular-weight dissolved organic matter (DOM) (3). Efficient utilization of biopolymers in high-molecular-weight DOM and in detritus particles might explain the high levels of free-living Cytophaga-like bacteria and their especially high levels on detritus particles in the ocean (12-14, 29). The hypothesis that high-molecular-weight DOM is consumed by Cytophaga-like bacteria is supported by radiotracer studies that examined the consumption of protein and chitin by uncultured bacteria (8). Finally, Cytophaga-like bacteria grow more rapidly in seawater incubation mixtures supplemented with concentrated high-molecular-weight DOM (11). Examining hydrolase genes in environmental DNA could be another approach to link uncultured bacteria to biopolymer degradation and other biogeochemical processes.
In this study we examined genes in the Sargasso Sea Whole Genome Sequence (WGS) data set (32) encoding hydrolases potentially used by marine Cytophaga-like bacteria for degrading biopolymers in high-molecular-weight DOM. Our focus on Cytophaga-like bacteria was motivated by the desire to understand the role of these bacteria in the utilization of high-molecular-weight DOM. PCR primers were designed for the most abundant type of endoglucanase identified in the Sargasso data set and used to screen a fosmid library constructed with prokaryotic DNA from the western Arctic Ocean. The cloned hydrolase was expressed in Escherichia coli and assayed for various hydrolase activities because gene function could not be determined with confidence using amino acid similarity alone due to the similarity of the hydrolase to both cellulases and peptidases. Our results highlight the value of experimental data that support actual gene function. The complete sequence of the fosmid bearing Arctic Cytophaga-like DNA supports the hypothesis that Cytophaga-like bacteria are especially adapted to utilizing biopolymers in high-molecular-weight DOM.
| MATERIALS AND METHODS |
|---|
Fosmid library screening.
The Arctic fosmid library was constructed using DNA isolated from cells in the <0.8-µm size fraction of a seawater sample collected at a depth of 10 m in the Chukchi Sea (72°19.33'N, 151°59.07'W) in July 2004. For DNA isolation and library construction we used procedures described previously (10), except that the bacterial biomass was collected by vacuum filtration rather than by tangential flow filtration. Pools of 96 fosmid clones were screened for 16S rRNA genes by denaturing gradient gel electrophoresis (DGGE) of PCR amplicons generated with primers GC358F and 517R (28). Selected bands resolved on an 8% polyacrylamide gel containing a 25% to 55% denaturant gradient (13.8% to 22% formamide and 10.5% to 23% urea) were reamplified and sequenced. Phylogenetic classification was performed using BLAST and the ARB sequence analysis tool (26).
PCR primers were designed to amplify the celM-like genes identified in the Sargasso WGS data set using the tool for identifying consensus primers in the Oligo primer analysis software package (Molecular Biology Insights, Inc.). The celM gene primers CelM298F (5'-GCTCCTTCAAAATGGG-3') and CelM559R (5'-AACCTCAGCAATCATAAATCC-3') were selected based on successful in silico amplification of four of the most complete (>990-bp) Sargasso celM-like genes. These primers were then used to screen the Arctic fosmid library. Pools of 96 fosmid clones were screened by performing a celM gene PCR with primers at a concentration of 200 nM in a buffer containing 0.7 mM MgCl2 provided with the Taq polymerase (Promega). The thermal cycling conditions consisted of 35 cycles of denaturation at 96°C, primer annealing at 50°C, and DNA polymerization at 72°C. The celM-bearing clones in PCR-positive plate pools were identified by further PCR screening of the pooled rows and columns of the 96-well plate.
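The in silico amplification step mentioned above can be illustrated with a minimal Python sketch. It assumes exact, mismatch-free primer matching on a single strand, which is stricter than what primer analysis software typically allows and is not a description of how the Oligo package works. The primer sequences are those quoted in the text; the template (`toy_template`) is a made-up sequence built only for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template, forward, reverse):
    """Return (start, end, length) of the predicted amplicon, or None.

    Exact matching only: the forward primer must occur on the given strand,
    and the reverse complement of the reverse primer must occur downstream.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    rc_reverse = reverse_complement(reverse)
    site = template.find(rc_reverse, start + len(forward))
    if site == -1:
        return None
    end = site + len(rc_reverse)
    return start, end, end - start


# Primer sequences quoted in the text.
CELM298F = "GCTCCTTCAAAATGGG"
CELM559R = "AACCTCAGCAATCATAAATCC"

# Hypothetical template: invented flanks and spacer around the two priming sites.
toy_template = (
    "ATGACC" + CELM298F + "ACGT" * 10 + reverse_complement(CELM559R) + "TTGA"
)

print(in_silico_pcr(toy_template, CELM298F, CELM559R))
```

On a real contig, the same function would report where the two priming sites fall and how long the predicted product is; a return value of `None` simply means that no exact match was found, not that amplification is impossible.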
The celM-bearing Arctic clone was completely sequenced by the Joint Genome Institute. The sequence was analyzed using the annotation tools available in the Artemis DNA sequence viewer obtained from the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/Software/Artemis/) and using the FGENESB website (http://www.softberry.ru/) (Softberry, Inc.).
Phylogenetic analysis.
16S rRNA gene sequences were aligned using the ARB fast aligner (26), and celM gene sequences were aligned using a CelM amino acid sequence alignment generated using ClustalW (6). Distance matrices and neighbor-joining trees were constructed using PHYLIP (15).
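As a rough illustration of what a distance matrix built from an alignment contains, the following Python sketch computes uncorrected pairwise distances (the proportion of differing sites). This is an assumption made for illustration only: the text does not state which distance model was used, and PHYLIP's distance programs apply substitution models rather than raw proportions. The sequences in `toy` are invented.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either row."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def distance_matrix(alignment):
    """Symmetric matrix of pairwise p-distances keyed by (name, name)."""
    matrix = {(name, name): 0.0 for name in alignment}
    for n1, n2 in combinations(alignment, 2):
        d = p_distance(alignment[n1], alignment[n2])
        matrix[(n1, n2)] = d
        matrix[(n2, n1)] = d
    return matrix


# Hypothetical aligned amino acid fragments, not taken from the study.
toy = {
    "seq_a": "MKT-LLAG",
    "seq_b": "MKTALLSG",
    "seq_c": "MRT-LIAG",
}

for pair, d in sorted(distance_matrix(toy).items()):
    print(pair, round(d, 3))
```

A matrix of this kind is the input to neighbor joining, which repeatedly merges the pair of taxa that minimizes the total branch length of the tree.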
Expression analysis of CelM.
The celM gene of the Arctic fosmid clone (Arctic CelM) was subcloned into an expression vector for activity analyses. The entire celM gene of the Arctic CelM clone was amplified using PCR primers CelM516U (5'-CACCATGGCAACAAAAAAAATACTT-3') and CelM1648L (5'-AGATGTTTCTACTTTACTTAACCCAGA-3') and was cloned into the pET101 expression vector and E. coli TOP10 (Invitrogen) by following the procedure provided by the manufacturer. The celM pET101 construct was subsequently transformed into E. coli BL21(DE3) for expression analysis.
In order to examine the size and amino acid sequence of Arctic CelM, overnight cultures (500 ml) of E. coli bearing the celM-pET101 construct or the pET101 vector alone were pelleted by centrifugation and washed with phosphate-buffered saline (PBS) (8 g of NaCl per liter, 0.2 g of KCl per liter, 1.44 g of Na2HPO4 per liter, 0.24 g of KH2PO4 per liter; pH adjusted to 7.4). The pellet was resuspended in 5 ml of PBS, and the cells were lysed using sonication. Cell debris was removed by centrifugation at 100,000 x g for 60 min at 4°C. The molecular weight of the expressed protein was estimated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and compared to that expected from the 1,094-bp celM gene. Amino acid sequencing by liquid chromatography-mass spectrometry of the expressed protein was performed by the Campus Chemical Instrument Center, Ohio State University, using a procedure described by Hanson and Tabita (18). The Arctic CelM was digested with trypsin, and the sequences of 21 peptides covering 63% of the original protein were then matched to the sequence of Arctic CelM.
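The PBS recipe above is given in grams per liter; converting it to molar concentrations makes it easier to compare with other buffer formulations. The sketch below assumes anhydrous salts, reads the potassium phosphate as KH2PO4, and uses standard molar masses. None of these details is stated in the text.

```python
# Molar masses in g/mol for the anhydrous salts (assumed forms).
MOLAR_MASS = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "Na2HPO4": 141.96,
    "KH2PO4": 136.09,
}

# Recipe from the text, in grams per liter.
PBS_G_PER_L = {
    "NaCl": 8.0,
    "KCl": 0.2,
    "Na2HPO4": 1.44,
    "KH2PO4": 0.24,
}


def millimolar(grams_per_liter, molar_mass):
    """Convert a mass concentration (g/liter) to millimolar."""
    return grams_per_liter / molar_mass * 1000.0


pbs_mm = {
    salt: millimolar(grams, MOLAR_MASS[salt])
    for salt, grams in PBS_G_PER_L.items()
}

for salt, conc in pbs_mm.items():
    print(f"{salt}: {conc:.1f} mM")
```

Under these assumptions the recipe works out to roughly 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, and 1.8 mM KH2PO4.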
Cell lysates were assayed for glutamyl aminopeptidase activity using glutamine-p-nitroanilide (Glu-pNA) (Peptide Institute, Inc.) as described by I'Anson et al. (25). Twenty microliters of 5 mM Glu-pNA was added to 180 µl of lysate and incubated for 18 h at 37°C or 60°C (cloned Clostridium thermocellum CelM). The reaction was stopped by addition of 100 µl of 30% (vol/vol) acetic acid. The sample was then centrifuged at 10,000 x g for 5 min, and the absorbance at 410 nm was determined.
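The working concentrations implied by these volumes follow from simple dilution arithmetic. The sketch below assumes that the volumes are additive; the function name is ours and is not taken from the cited procedure.

```python
def final_concentration(stock, stock_volume_ul, other_volume_ul):
    """Concentration after mixing a stock with another volume (additive volumes).

    The result is in the same unit as `stock` (for example mM or % vol/vol).
    """
    total = stock_volume_ul + other_volume_ul
    return stock * stock_volume_ul / total


# 20 µl of 5 mM Glu-pNA added to 180 µl of lysate.
substrate_mm = final_concentration(5.0, 20.0, 180.0)

# 100 µl of 30% (vol/vol) acetic acid added to the 200-µl reaction mixture.
acetic_acid_percent = final_concentration(30.0, 100.0, 200.0)

print(f"Glu-pNA during incubation: {substrate_mm} mM")
print(f"Acetic acid after stopping: {acetic_acid_percent} % (vol/vol)")
```

With these inputs the substrate is present at 0.5 mM during incubation, and the stopped reaction contains 10% (vol/vol) acetic acid.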
Hydrolysis of carboxymethylcellulose (CMC) was assayed by monitoring the production of reducing sugars using the procedure outlined by Wu et al. (34). The reaction mixture contained 20 µl of 1% CMC (Polysciences, Inc.) and 180 µl of lysate and was incubated at 37°C for 18 h. The reaction was terminated by addition of 0.2 ml of 2% sodium carbonate and 1 ml of cyanide-carbonate solution (10 mM KCN, 50 mM Na2CO3). Two milliliters of 0.05% potassium ferricyanide was added, and the solution was vortexed and boiled for 30 min. The tubes were cooled, and the absorbance at 420 nm was determined.
Cellulase activity was assayed by measuring the hydrolysis of fluorogenic and chromogenic substrate analogs, including 4-methylumbelliferyl-β-D-glucopyranoside, 4-nitrophenyl-β-D-cellobioside, and 4-nitrophenyl-β-D-cellotetraoside (Sigma-Aldrich). The reaction mixture contained 20 µl of 5 mM substrate and 180 µl of lysate. The reaction mixtures were incubated for 18 h at 37°C or 60°C (cloned C. thermocellum CelM) and terminated by addition of glycine carbonate buffer (pH 9.7). The sample was then centrifuged at 10,000 x g for 5 min, and the absorbance at 410 nm was determined (nitrophenyl-linked substrate) or the preparation was assayed for blue fluorescence under UV light (methylumbelliferyl-linked substrate).
Protease activity was measured using fluorescein isothiocyanate-conjugated bovine albumin. The reaction mixture contained 20 µl of a 1-mg/ml labeled albumin solution and 180 µl of lysate. The mixture was incubated at 37°C for 18 h, and the reaction was terminated by addition of 5% trichloroacetic acid. The sample was centrifuged at 10,000 x g for 15 min, and the green fluorescein fluorescence in the supernatant was assayed under UV light. Controls were treated with proteinase K.
Nucleotide sequence accession number.
The nucleotide sequence of the Arctic fosmid was deposited in GenBank under accession number DQ272742.
| RESULTS AND DISCUSSION |
|---|
The Cytophaga-like bacteria identified in the Sargasso WGS data set were related to other Cytophaga-like bacteria seen previously in marine environments, based on 16S rRNA gene phylogeny. The Sargasso Cytophaga-like sequences were grouped into nine clusters of marine Cytophaga-like bacteria (Fig. 1), although one Sargasso gene (contig AACY01458001) remained separate from the other Cytophaga-like bacterial sequences (Fig. 1). Because the shotgun sequencing approach has no PCR step, we might have expected the Sargasso WGS data set to include completely new types of Cytophaga-like bacteria not seen previously in PCR-based studies. However, the Cytophaga-like bacteria in the Sargasso study were closely related to bacteria already identified in PCR clone libraries.
Abundance of CelM in the Sargasso WGS.
CelM is the most abundant cellulase-like hydrolase in the Sargasso data set (Fig. 2) based on a BLASTP analysis using 113 cellulases (EC 3.2.1.4) available from the Swiss-Prot database (March 2005). CelM occurs 30 times in the Sargasso data set, outnumbering all other cellulases classified in glycosyl hydrolase families, according to the nomenclature used by the Carbohydrate-Active Enzymes (CAZY) database (http://afmb.cnrs-mrs.fr/CAZY/) (Fig. 2). Family 5 glycosyl hydrolase is the second most abundant type of cellulase and occurs on 27 contigs. Cellulases in families 8, 9, 10, and 12 occur on eight or fewer contigs. No contig contained more than one cellulase.
Despite possible problems with the deep branches in the celM gene phylogeny, the presence of closely related clusters of celM genes does provide some insight into the types of bacteria in the Sargasso Sea that have celM. Although celM genes from related bacteria, such as Clostridium sp. strains, do not always group together, when celM genes do group together, they come from related bacteria. For example, all 14 Firmicutes celM genes belong to a single clade, as do the six Pyrococcus celM genes (Fig. 3). Similarly, all 15 of the Sargasso celM genes cluster together with celM from C. hutchinsonii (Fig. 3), suggesting that all of these genes are associated with Cytophaga-like bacteria. Although most of the celM genes in the Sargasso data are not linked to 16S rRNA genes, the phylogenetic analysis suggests that despite the broad diversity of bacteria that potentially have this gene, celM seems to be restricted to Cytophaga-like bacteria in the Sargasso Sea.
Gene content of an Arctic fosmid bearing celM.
To test the hypothesis that Cytophaga-like bacteria having celM also possess genes encoding other hydrolases, we examined a celM-containing genome fragment from an uncultivated Cytophaga-like bacterium. This analysis was performed using a fosmid library of prokaryotic DNA collected from the Arctic Ocean.
The Arctic fosmid library had 4,800 clones (average insert size, 40 kb) that were largely derived from prokaryotic DNA. Twenty-seven clones carried 16S rRNA genes, as determined by DGGE screening. BLASTN analysis of the DGGE band sequences revealed that 23 clones contained prokaryotic DNA and four clones carried DNA from the photosynthetic picoeukaryote Mantoniella squamata. The clones carrying prokaryotic DNA included 10 clones bearing Cytophaga-like 16S rRNA genes and four clones carrying DNA from Gammaproteobacteria. The remaining clones with 16S rRNA genes belonged to the Alphaproteobacteria, Betaproteobacteria, and Actinobacteria (three each). No archaeal clones were detected with DGGE primers for this group.
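The clone counts in this paragraph can be cross-checked with a short tally. The sketch below uses only the numbers given in the text; the derived quantities (total cloned DNA and the percentages) are simple arithmetic on those numbers and assume that every insert has the average size.

```python
# Library size and average insert size as reported in the text.
clones = 4800
average_insert_kb = 40

# 16S rRNA gene-bearing clones by group, as reported in the text.
prokaryotic_16s = {
    "Cytophaga-like": 10,
    "Gammaproteobacteria": 4,
    "Alphaproteobacteria": 3,
    "Betaproteobacteria": 3,
    "Actinobacteria": 3,
}
picoeukaryote_clones = 4  # Mantoniella squamata

prokaryotic_total = sum(prokaryotic_16s.values())
rrna_total = prokaryotic_total + picoeukaryote_clones

# Derived values, assuming all inserts are of average size.
total_cloned_mb = clones * average_insert_kb / 1000
rrna_clone_percent = 100 * rrna_total / clones
cytophaga_percent = 100 * prokaryotic_16s["Cytophaga-like"] / prokaryotic_total

print(f"Prokaryotic 16S rRNA clones: {prokaryotic_total}")
print(f"All 16S rRNA clones: {rrna_total}")
print(f"Approximate cloned DNA: {total_cloned_mb:.0f} Mb")
print(f"Clones with a 16S rRNA gene: {rrna_clone_percent:.2f}%")
print(f"Cytophaga-like share of prokaryotic 16S clones: {cytophaga_percent:.1f}%")
```

The group counts sum to the 23 prokaryotic and 27 total 16S rRNA gene-bearing clones stated in the text, and the library holds on the order of 192 Mb of cloned DNA.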
One clone in the Arctic fosmid library (Arctic CelM) was positive as determined by the celM gene PCR. Similar to our results with the Sargasso data set, the Arctic CelM clone also carried a Cytophaga-like 16S rRNA gene detected in the DGGE analysis. A sequence analysis extending upstream and downstream of the celM gene PCR priming sites revealed a 1,089-bp open reading frame. A BLASTP analysis indicated that this open reading frame encodes a protein that is 63% identical and 77% similar to CelM in C. hutchinsonii. The amino acid sequence of Arctic CelM does not appear to have a signal peptide of the type found in the secreted proteins examined so far, based on an analysis using SignalP (http://www.cbs.dtu.dk/services/SignalP/).
Sequence analysis of the Arctic CelM fosmid clone revealed 28 genes, including 25 open reading frames, 16S and 23S rRNA genes, and a 16S-23S rRNA internal transcribed spacer (Table 2; see Table S1 in the supplemental material). Phylogenetic analysis placed the 16S rRNA gene in marine Cytophaga-like clade 1, which also has genes from the Sargasso Sea and other marine environments (Fig. 1). Many of the protein-encoding genes were also most similar to genes in Cytophaga-like bacteria, which further indicates that the Arctic CelM fosmid clone contains DNA from a Cytophaga-like bacterium. Sixty percent of the genes on the Arctic fosmid were most similar to genes in Bacteroidetes, and 40% were most similar to genes in Cytophaga-like bacteria. Seven (30%) of the fosmid genes were most similar to genes in C. hutchinsonii. The celM gene exhibited the highest level of similarity (63% identity) to a gene in C. hutchinsonii. Other genes similar to genes in C. hutchinsonii include genes encoding acyl coenzyme A synthetase (COG0365), dihydroorotate dehydrogenase (COG0167), and a hypothetical protein, which were 62 to 78% identical to genes in the Arctic fosmid (see Table S1 in the supplemental material). Two genes were 50% identical to peptidyl-prolyl cis-trans isomerase and glycerol kinase genes.
Peptidase activity of the Cytophaga-like celM gene.
Most studies of genes cloned from environmental DNA have relied on sequence analysis to infer what metabolic function the genes might mediate. Function is commonly assigned according to the highest level of similarity in a BLAST analysis using a large database, such as the GenBank database. Although this approach is easy and widely used, it can be misleading because there is often little or no experimental evidence for the function of many, if not all, of the genes identified by the search. This was the case in our analysis of celM. The only experimental evidence for endoglucanase activity of CelM comes from work on the celM gene in C. thermocellum (24). The Arctic CelM was 29% identical and 48% similar to the C. thermocellum CelM (Table 1). Other genes similar to the Arctic celM gene were genes encoding family M42 peptidases. The peptidase most similar to the Arctic CelM was a protein in Bacillus cereus, which was 32% identical and 55% similar to the Arctic CelM.
To clarify the true identity of Arctic CelM, PCR primers were designed to amplify the entire gene and then were used to subclone the celM gene into an expression vector. The molecular mass of the protein expressed from the Arctic CelM clone determined by polyacrylamide gel electrophoresis was 34 kDa, which was consistent with the size of the Arctic celM open reading frame (1,089 bp). The amino acid sequence of the expressed protein determined by liquid chromatography-mass spectrometry (21 peptide fragments and 63% coverage) was identical to that of the expected protein based on the celM sequence in the Arctic CelM fosmid clone.
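Sequence coverage of the kind reported here (21 peptides covering 63% of the protein) is the fraction of residues that fall inside at least one matched peptide. The following sketch shows one way to compute it under the assumption of exact substring matches; the protein and peptide strings are invented and are not the Arctic CelM data.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one exactly matching peptide."""
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            # Keep searching so repeated occurrences are also marked.
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


# Hypothetical 22-residue protein and three overlapping peptides.
toy_protein = "MATKKILLAGRSDEFKWYTPQR"
toy_peptides = ["ILLAGR", "SDEFK", "LLAGRSD"]

print(f"{100 * sequence_coverage(toy_protein, toy_peptides):.0f}% coverage")
```

Overlapping peptides are counted once per residue, so the result cannot exceed 100% regardless of how many peptides are matched.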
CelM encoded in the Arctic fosmid tested positive for glutamyl aminopeptidase activity when it was assayed with Glu-pNA (Fig. 4). The peptidase activity of the Arctic CelM in crude E. coli extracts was much higher than the activity in extracts of E. coli expressing the C. thermocellum CelM; the activity of the C. thermocellum CelM was not different from that of the negative control, the expression vector alone (Fig. 4). Because of the amino acid similarity of the Arctic CelM and the C. thermocellum CelM to zinc metalloproteases, we expected that the peptidase activity of the Arctic CelM would be higher with the addition of Zn (30). In fact, Zn addition reduced the activity of the Arctic CelM and had no effect on the activity of the C. thermocellum CelM (Fig. 4). The absence of peptidase activity in the C. thermocellum CelM is noteworthy because, based solely on amino acid similarity, one might predict that this protein would have peptidase activity due to the presence of a putative peptidase M42 domain (PFAM 05343).
Conclusions.
Sequence analysis of environmental DNA from uncultivated microbes is a potentially powerful tool for uncovering the metabolic capabilities of microbes in the environment and broadening our view of microbial ecology. However, gene sequences must be translated into biological functions, and this is a limiting step in using environmental sequence data to address ecological questions. This study exposed the shortcomings of assigning gene function based on sequence similarity alone. The difficulty in distinguishing CelM from the M42 family of peptidases based on sequence similarity was overcome only by determining the activity of the expressed protein. While expression analysis is labor-intensive, it extracts the most complete and accurate information from environmental sequence data.
Functional assays circumvent the problem of assigning biological functions to genes recovered from metagenomic libraries, but this approach has its shortcomings. Sophisticated approaches based on complementation of genes in mutant E. coli host cells have proven to be useful in detecting genes that are active in the fermentation of glycerol (23). In contrast, rather simple assays using selective media (31) and looking for distinctive coloration of colonies (17) have been effective for detecting antibiotic resistance genes. Similarly straightforward assays are available to detect genes encoding hydrolases, such as xylanase (5) and chitinase (9). However, the effectiveness of any functional assay depends on successful expression in E. coli or alternative hosts, such as those proposed for studying uncultured soil bacteria (16, 27).
The combination of sequence and expression analyses should allow us to start to examine the role of CelM in biopolymer degradation by Cytophaga-like bacteria, although we need more information about signal peptides used by these bacteria and other marine microbes. If CelM is exported using a signal peptide unlike those studied to date, then it may liberate low-molecular-weight peptides that either are transported directly into the cell or are hydrolyzed further to even smaller peptides or free amino acids. However, if CelM remains in the cell, it may play a role analogous to that of the PepA glutamyl aminopeptidase in Lactococcus lactis (25), which acts on intracellular peptides. Regardless, CelM is probably one part of a multicomponent system used by Cytophaga-like bacteria for the consumption of biopolymers in high-molecular-weight DOM and particulate detritus. CelM and other peptidases probably work together with proteases and peptide transporters in the degradation of protein-containing DOM by Cytophaga-like bacteria. Characterizing the genes encoding components of biopolymer utilization systems should be useful in linking uncultured Cytophaga-like bacteria and other bacterial groups to biopolymer degradation and DOM cycling.
| ACKNOWLEDGMENTS |
|---|
We thank Rex Malmstrom for his assistance and the Chief Scientists of the Shelf Basin Interaction project, Jackie Grebmeier and Lee Cooper, for their support during sample collection in the western Arctic Ocean. David Wilson kindly provided the clone of celM from C. thermocellum.
| FOOTNOTES |
|---|
Supplemental material for this article may be found at http://aem.asm.org/.
| REFERENCES |
|---|