Applied and Environmental Microbiology, August 2005, p. 4840-4849, Vol. 71, No. 8
0099-2240/05/$08.00+0 doi:10.1128/AEM.71.8.4840-4849.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Edward F. DeLong,2 and C. Richard Hutchinson1
Kosan Biosciences, Inc., Hayward, California 94545,1 Monterey Bay Aquarium Research Institute, Moss Landing, California 950292
Received 28 August 2004/ Accepted 9 March 2005
Sponges harbor large numbers of diverse bacteria in their tissue. At present, microbial sponge communities and their genomes are poorly understood. The results of recent studies primarily relying on fluorescence in situ hybridization (14, 16, 28) or 16S rRNA sequencing (19, 26, 56, 57) indicate that most sponges harbor uniform but phylogenetically complex microbial populations that are quite distinct from those of marine plankton or marine sediments. It has been hypothesized that the bacteria in these communities synthesize many of the associated bioactive compounds (1, 8, 52). Whereas cultivating the invertebrates for natural product synthesis is generally cost-prohibitive or impossible, the cultivation of a microorganism could secure an unlimited and inexpensive supply of a potentially important compound. However, many bacteria associated with sponges have not been cultured despite considerable effort (26, 55, 57), pointing to the need for alternative strategies. Genomic approaches aimed at isolating the biosynthetic genes and expressing them in surrogate hosts are one such alternative. Such approaches usually begin with the construction and screening of metagenomic libraries, in which large pieces of DNA isolated from mixed populations without prior cultivation are cloned and screened for targeted genes or bioactivities (45). This strategy has resulted in the identification of a number of short biosynthetic pathways and novel biocatalysts from several soil metagenomic libraries (2, 7, 17, 54). Recently, parts of the putative pederin, bryostatin, and onnamide biosynthesis genes have been cloned from metagenomic libraries of a beetle (34, 37), a bryozoan (20, 21), and a sponge (35, 36), respectively.
We are interested in the marine sponge Discodermia dissoluta because a promising antitumor compound, discodermolide (18), can be isolated from it. The structure of discodermolide is consistent with biosynthesis by a bacterial type I modular polyketide synthase (PKS) (49), which generally consists of a set of multifunctional modules. Each module is responsible for one cycle of polyketide chain elongation. The minimal module contains a ketosynthase (KS), an acyltransferase (AT), and an acyl carrier protein (ACP) domain that together catalyze a two-carbon extension of the chain. After each such extension, the resulting ketone can either be left as is or converted to a β-hydroxyl, a double bond, or an alkane group by stepwise processing of ketoreductase (KR), dehydratase (DH), and enoyl reductase (ER) domains. The substrate specificities of the ATs, the order and number of the modules, and the composition of catalytic domains within each module provide a "code" for the structure of the synthesized polyketide. Furthermore, the PKS genes for a particular polyketide are usually clustered and colinear with the order of its biosynthesis. Consequently, the structure of the polyketide product of an unknown PKS can be approximated from the sequence of its genes (49).
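The module "code" described above can be illustrated with a short Python sketch. It is not a tool used in this study: the function name `beta_carbon_state` and the representation of a module as a set of domain names are assumptions introduced here, and the sketch models only the stepwise KR, DH, and ER processing of a module that carries its own AT.

```python
# Illustrative sketch: predicted beta-carbon state after one elongation cycle.
# The function name and the set-of-domains representation are hypothetical.

MINIMAL = {"KS", "AT", "ACP"}  # minimal module as described in the text


def beta_carbon_state(domains):
    """Return the predicted state of the beta-carbon for one module."""
    domains = set(domains)
    if not MINIMAL <= domains:
        raise ValueError("a minimal module needs KS, AT, and ACP domains")
    # Reductive domains act stepwise: KR first, then DH, then ER.
    if "KR" not in domains:
        return "ketone"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "double bond"
    return "alkane"


modules = [
    ["KS", "AT", "ACP"],
    ["KS", "AT", "KR", "ACP"],
    ["KS", "AT", "DH", "KR", "ACP"],
    ["KS", "AT", "DH", "ER", "KR", "ACP"],
]
states = [beta_carbon_state(m) for m in modules]
print(states)
```

Reading the modules in order in this way gives the kind of approximate structure prediction that the colinearity rule allows.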
Although we did not identify a PKS with a modular structure consistent with biosynthesis of discodermolide, we obtained a detailed picture of the diversity of genes for biosynthesis of polyketides and nonribosomal peptides (15) in the complex microbial community of Discodermia dissoluta. Here we describe this diversity and a method to screen extensive sponge-derived metagenomic libraries for diverse PKS genes that should be useful for identifying target PKS genes in other sponges. Based on DNA sequence data, we also show the potential of the filamentous, sponge-specific bacteria Entotheonella spp. to encode nonribosomal peptide synthetases (NRPS) and mixed PKS-NRPSs, and we describe a novel, giant PKS that could be involved in the biosynthesis of multimethyl-branched fatty acids in sponge-associated bacteria.
Enrichment of sponge-associated bacteria.
Procedures for the separation or enrichment of sponge-associated bacteria from sponge cells have been described for several sponges (1, 42, 44) and were applied in a similar way in the present study. Frozen sponge tissue was cut into small pieces (<1 cm3) and placed into sterile PBSE buffer (1x phosphate-buffered saline plus 10 mM EDTA at 10 ml/g of sponge). Collagenase (Roche Applied Science, Indianapolis, IN) was added at a final concentration of 500 µg/ml, and the mixture was shaken on ice for 30 min. For further disruption, the sponge suspension was blended briefly. The suspension was filtered (45-µm pore size) and centrifuged at 500 x g (6 min, 4°C). The pellet was washed twice with PBSE. Microscopic examination indicated that the pellet was enriched for filamentous bacteria, although sponge nuclei and unicellular bacteria were still present. The supernatant was centrifuged at 8,800 x g (20 min, 4°C). The pellet was resuspended in 20 ml of PBSE, and 10-ml samples were layered onto 20-ml Percoll cushions (15% Percoll in PBSE). The samples were centrifuged at 750 x g (10 min, 4°C), and the cells carefully removed from the top of the Percoll cushion and washed twice with PBSE. This fraction was highly enriched for coccoid unicellular bacteria.
DNA isolation.
Cell pellets from bacterial enrichments were resuspended in buffer (0.5 M NaCl, 100 mM EDTA, 10 mM Tris [pH 8.0]) and treated with lysozyme (150 µg/ml for 1 h at 37°C) and proteinase K (0.5 mg/ml)-1% sodium dodecyl sulfate (SDS) for 2 h at 50°C. After lysis, DNA was extracted with phenol-chloroform (two to three times) and chloroform and was either concentrated by using Centricon 100 spin concentrators (Millipore, Billerica, MA) or precipitated with isopropanol. DNA was treated with DNase-free RNase and/or further purified by cesium chloride equilibrium density gradient ultracentrifugation (43). For total sponge DNA isolation, a section of frozen sponge was placed under liquid nitrogen in a mortar, broken into pieces, and pulverized to a fine powder. The nitrogen was allowed to boil off, and small aliquots of the powder were dispersed into lysis buffer (8 M urea, 2% Sarkosyl, 350 mM NaCl, 50 mM EDTA, 50 mM Tris [pH 7.5]) at 100 ml per g of sponge tissue with slow stirring at room temperature. The lysate was extracted twice with phenol-chloroform, and the DNA was spooled from the upper aqueous phase under cold ethanol. The DNA was digested with DNase-free RNase, extracted with phenol-chloroform and ethanol precipitated.
PCR and cloning.
In order to isolate KS gene fragments from sponge-derived DNA samples, the following degenerate PCR primers were used: degKS2F.gc (5'-GCSATGGAYCCSCARCARCGSVT) and degKSR5.gc (5'-GTSCCSGTSCCRTGSSCYTCSAC), which were biased for G or C in some positions, and degKS2F.i (5'-GCIATGGAYCCICARCARMGIVT) and degKS5R.i (5'-GTICCIGTICCRTGISCYTCIAC), which contained inosine in some positions. A typical reaction volume of 50 µl contained 100 to 500 ng of sponge-derived DNA, 200 pmol of each primer, 0.2 mM deoxynucleoside triphosphate (containing 7-deaza-dGTP), 10% dimethyl sulfoxide, and 2.5 U of Taq DNA polymerase (Roche Applied Science). Cycle steps consisted of denaturation (94°C for 40 s), annealing (55°C for GC-biased primers and 40°C for inosine-containing primers for 40 s), and extension (72°C for 75 s) for 35 to 40 cycles. In order to isolate adenylation domain gene fragments of nonribosomal peptide synthetases (NRPSs), degenerate PCR primers degNRPS-1F.i (5'-AARDSIGGIGSIGSITAYBICC) and degNRPS-4R.i (5'-CKRWAICCICKIAIYTTIAYYTG) were used; these primers were designed as described previously (51). PCR conditions were identical except for longer extension times (105 s). PCR products of approximately 700 bp (KS domains) or 1,000 bp (NRPS domains) were gel purified (QIAGEN, Inc., Valencia, CA) and cloned by using a TOPO TA cloning kit as described by the manufacturer (Invitrogen Life Technologies, Carlsbad, CA). Clones with correctly sized inserts were sequenced. Eubacterial 16S rRNA gene fragments were amplified with PCR primers Eubac-27F and 1492R as described previously (9).
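To make the notion of primer degeneracy concrete, the following sketch counts how many distinct oligonucleotides each forward KS primer represents under the standard IUPAC nucleotide codes. The helper name `degeneracy` is hypothetical, and treating inosine (I) as a single non-degenerate position is a simplifying assumption of this sketch rather than a statement about the primer design.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
# "I" (inosine) is counted as one non-degenerate position (assumption).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "I",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


# Forward KS primers as given in the text.
primers = {
    "degKS2F.gc": "GCSATGGAYCCSCARCARCGSVT",
    "degKS2F.i": "GCIATGGAYCCICARCARMGIVT",
}
degeneracies = {name: degeneracy(seq) for name, seq in primers.items()}
for name, value in degeneracies.items():
    print(f"{name}: {len(primers[name])} nt, {value}-fold degenerate")
```

Under these assumptions the inosine-containing primer is less degenerate than its GC-biased counterpart, because inosine replaces several of the two-base S positions.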
Construction and screening of metagenomic libraries.
To determine whether varying the methods of library construction would increase the diversity of the metagenomic sponge libraries, we built fosmid libraries by using randomly sheared DNA from enrichments of unicellular or filamentous bacteria and cosmid libraries by using partially digested DNA from total sponge tissue. Including the DNA isolated from total sponge tissue for library construction would capture genomes that might have been lost during cell separation. For fosmid libraries, a copy control fosmid library production kit (vector pCC1FOS) from Epicentre (Madison, WI) was used according to the manufacturer's instructions. The protocol for cosmid libraries followed that of Stratagene (La Jolla, CA) for construction of libraries in SuperCos-1 with two modifications. First, the cosmid vector was a derivative of SuperCos-1 in which the 4.2-kb AfeI fragment was self-ligated to remove the simian virus 40 and neo marker sequences. Second, after scaling up the optimal partial Sau3AI digests chosen from a pilot experiment, they were size fractionated on 10 to 50% sucrose gradients (35 ml in 25 mM Tris-HCl [pH 8], 25 mM EDTA, 100 mM NaCl) (43). Chloramphenicol (fosmids)- or ampicillin (cosmids)-resistant clones were grown in 384-well microtiter plates (LB plus 7% glycerol plus antibiotic) and spotted onto 22-by-22-cm nylon membranes (18,432 clones in duplicate; Amplicon Express, Pullman, WA). The membranes were hybridized overnight at 42°C in 30% formamide-containing hybridization buffer (43) and washed at low stringency with 2x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate) plus 0.1% SDS four times for 10 min each time at room temperature and two times for 30 min each time at 55°C. DNA probes were labeled with a random-primed DNA labeling kit (Roche Applied Science) by using [α-32P]dCTP, and hybridization signals were visualized by using a Typhoon 9410 scanner (Amersham Biosciences, Piscataway, NJ). All Southern blot hybridizations were carried out under the same conditions.
DNA sequencing and phylogenetic analysis.
DNA sequencing was carried out by using an ABI 3730 sequencer (Applied Biosystems, Foster City, CA). PCR amplicons cloned into pCR2.1-TOPO were sequenced with M13 universal and reverse primers, and the ends of cosmid inserts were sequenced with modified T3 and T7 primers (5'-TCTCTGTTTTTGTCCGTGG and 5'-TTCCCCGAAAAGTGCCAC). After induction to high copy number, the ends of fosmid inserts were sequenced with primers recommended by the manufacturer (Epicentre). To determine the entire DNA sequence of fosmid or cosmid inserts, random shotgun libraries of 2- to 3-kb Sau3A fragments were prepared and sequenced to about sixfold coverage, and small gaps were closed by "primer walking." Sequence data were analyzed on a Macintosh computer by using Sequencher, MacVector, and the BLAST server of the National Center for Biotechnology Information. 16S rRNA sequences were aligned by using the ARB software package (27). Preliminary phylogenetic analysis was performed by using the ARB parsimony algorithm. Aligned sequences were then exported and subsequently analyzed by using PHYLIP 3.62 (J. Felsenstein, University of Washington, Seattle). Evolutionary distances were calculated by using the F84 distance correction, a transition/transversion ratio of 2, and one rate substitution category, and empirical base frequencies were determined by using dnadist. Phylogenetic trees were generated by neighbor joining and randomized order of input species. Bootstrap datasets (1,000 each for both parsimony and distance analyses) were generated by using seqboot. Parsimony trees were generated by using dnapars, randomizing the species input order. KS and NRPS domain sequences were aligned by using CLUSTAL W, and phylogenetic trees were created by using the neighbor-joining algorithm of CLUSTAL X (6). A total of 1,000 bootstraps were performed, and trees were visualized with the TREEVIEW program.
Nucleotide sequence accession numbers.
Fifty-six representative 16S rRNA sequences and all KS and NRPS domain sequences shown in Fig. 2 and 4 were deposited in GenBank and given accession numbers AY897070 to AY897125 (16S rRNAs), AY897138 to AY897201 (KS domains), and AY897126 to AY897137 (NRPS domains). The two sponge-associated PKS gene clusters described have the accession numbers AY907537 (SA1_PKS) and AY907538 (SA2_PKS).
FIG. 2. Phylogenetic analysis of type I PKS KS domains from sponge-associated microorganisms. Bootstrap values of >500 calculated from 1,000 bootstrap trees (neighbor-joining algorithm) are indicated at the nodes. The bar indicates 10% sequence divergence. The tree was rooted with Escherichia coli KasI (6573501) as an outgroup (not shown). Sponge-derived KS domains are in boldface and have the following letter code: sp, from genomic DNA; fos/cos, from fosmid/cosmid DNA (highlighted by asterisks); T, from total sponge; and F or U, from enrichment of filamentous or unicellular bacteria, respectively. Known and hypothetical KS domains from GenBank used in this tree, which are given either with their gene symbol or accession number, represent the closest homologs of the sponge-derived sequences (as judged by BLAST) and selected more distant relatives. Thsw-sp, KS domains isolated from the sponge Theonella swinhoei.
FIG. 4. Phylogenetic analysis of NRPS adenylation (A) domains from sponge-associated microorganisms. Bootstrap values of >500 calculated from 1,000 bootstrap trees (neighbor-joining algorithm) are indicated at the nodes. The bar indicates 10% sequence divergence. The tree was rooted with E. coli Lys1 (P40976) as an outgroup. Sponge-derived NRPS A-domains are in boldface and have the same letter code as in Fig. 2. Known and hypothetical NRPS A-domains from GenBank used in this tree, which are given either with their gene symbol and amino acid specificity or accession number, represent the closest homologs of the sponge-derived sequences (as judged by BLAST) and selected more distant relatives.
Phylogenetic diversity of the sponge-microbe community.
The phylogenetic diversity of bacteria associated with D. dissoluta was surveyed from 160 eubacterial 16S rRNA sequences cloned from total sponge DNA or DNA of bacterial enrichments. The majority of 16S rRNA sequences in the unicellular and filamentous fractions were related to acidobacteria (n = 22; 45%) and δ-proteobacteria (n = 23; 45%), respectively. Members of other phylogenetic groups that were identified multiple times were Chloroflexi (n = 21), -Proteobacteria (n = 17), -Proteobacteria (n = 8), Thermus/Deinococcus group (n = 9), Spirochaetes (n = 5), and Actinobacteria (n = 5). The presence of eubacteria from most of these groups has also been reported for other sponges (19). Not surprisingly, 76% of all eubacterial 16S rRNA sequences were most similar (91 to 99% identical) to other sponge-derived 16S rRNAs (see Table S1 in the supplemental material). Interestingly, all of the δ-proteobacteria, which make up the majority of the filamentous fraction, were closely related and found to be affiliated with Entotheonella, a proposed novel sponge-specific taxon (46). Figure 1 shows the phylogenetic relationship of the six different strains identified (96 to 98% identical to each other over 1,480 bp of 16S rRNA) with the unculturable "type strain" "Candidatus Entotheonella palauensis" (ca. 97% identical over 1,335 bp of 16S rRNA). Moreover, the morphology of the filamentous bacteria found throughout the Discodermia tissue matches that of E. palauensis (46).
FIG. 1. Phylogenetic tree of 16S rRNA sequences that make up the majority of the filamentous bacteria associated with D. dissoluta (bold) showing their close affiliation with "Candidatus Entotheonella palauensis" and classification within the proteobacteria.
Phylogenetic distribution in metagenomic sponge libraries.
To assess the range of prokaryotic DNA inserts in the metagenomic libraries made from either enriched bacterial fractions or total sponge, we sequenced the ends (500 to 600 bp) of ca. 340 random fosmids or cosmids. All sequences were searched against the NCBI nonredundant databases and assigned to the Prokarya (best BLASTX expectation value of ≤10⁻⁴ to a prokaryotic entry) or Eukarya (best BLASTX expectation value of ≤10⁻² to a eukaryotic entry) or considered not assignable (not meeting cutoff values or comparable similarity to prokaryotic and eukaryotic entries). Based on this analysis, at least 46 and 82% of fosmid inserts from the filamentous and unicellular libraries, respectively, encode prokaryotic proteins or open reading frames (ORFs). The number of clearly assigned eukaryotic fosmid inserts was low in both bacterial libraries and yet significantly higher in the filamentous library (20% compared to 3%). This finding correlates with the higher number of sponge cells observed in the enriched filamentous fraction compared to the enriched unicellular fraction. At least 87% of cosmid and 93% of fosmid inserts from two different total sponge libraries encoded prokaryotic and only 1 to 2% eukaryotic proteins or ORFs, although the construction of these libraries did not include any physical enrichment of sponge-associated bacteria.
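The assignment rule for end sequences can be sketched as a small decision function. This is a hypothetical illustration, not the procedure actually used: it assumes that the cutoffs are expectation values of 1e-4 (prokaryotic) and 1e-2 (eukaryotic), and the 100-fold window used to call two hits "comparable" (`COMPARABLE_LOG10`) is invented here because the text does not define that criterion.

```python
import math

PROK_CUTOFF = 1e-4       # assumed reading of the prokaryotic cutoff
EUK_CUTOFF = 1e-2        # assumed reading of the eukaryotic cutoff
COMPARABLE_LOG10 = 2.0   # hypothetical: E-values within 100-fold are "comparable"


def assign_domain(best_prok_e, best_euk_e):
    """Classify one end sequence from its best E-values (None means no hit)."""
    prok_ok = best_prok_e is not None and best_prok_e <= PROK_CUTOFF
    euk_ok = best_euk_e is not None and best_euk_e <= EUK_CUTOFF
    if prok_ok and euk_ok:
        # Floor at a tiny value so that an E-value of 0.0 can be log-transformed.
        gap = abs(math.log10(max(best_prok_e, 1e-300))
                  - math.log10(max(best_euk_e, 1e-300)))
        if gap < COMPARABLE_LOG10:
            return "not assignable"
        return "Prokarya" if best_prok_e < best_euk_e else "Eukarya"
    if prok_ok:
        return "Prokarya"
    if euk_ok:
        return "Eukarya"
    return "not assignable"


calls = [
    assign_domain(1e-30, None),   # clear prokaryotic hit
    assign_domain(None, 1e-10),   # clear eukaryotic hit
    assign_domain(1e-5, 1e-5),    # both pass, similar strength
    assign_domain(0.5, None),     # no hit meets a cutoff
]
print(calls)
```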
Distribution of type I PKS gene clusters in metagenomic sponge libraries.
We used a probe pool consisting of 44 diverse KS domain sequences isolated from sponge-derived genomic DNAs (see above and Fig. 2) to screen high-density filters of macro-arrayed clone libraries at low stringency (an example is shown in Fig. 3A). Not only do these macro arrays allow for high-throughput screening, they also reduce the number of false positives because each clone is spotted in duplicate, thus facilitating the identification of low-homology targets. To capture the genomes of the less-abundant sponge-associated microbes, as well as the abundant ones, we arrayed about 155,000 clones with an average insert size of 35 kb. Allowing for the proportion of bacterial inserts in each library, this screen covered more than 4 Gb, the equivalent of more than 1,000 sponge-associated bacterial genomes of average size (Table 1). The frequency of PKS positive clones was about one in 140 (0.7%) for the unicellular and total sponge library and one in 200 (0.5%) for the filamentous library, yielding 1,025 PKS gene-containing fosmids or cosmids [from here on referred to as PKS f(c)osmids]. Based on end sequencing of 467 PKS f(c)osmids, ca. 80% of the PKS f(c)osmids from the unicellular and total sponge library had a G+C content of >60%. In contrast, the G+C content of 65% of PKS fosmids from the filamentous library was <60%, indicating that this library contained a very different set of PKS gene clusters.
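The coverage estimate in this paragraph is simple arithmetic, sketched below. The clone number and insert size are taken from the text; the overall bacterial fraction of 0.8 and the average genome size of 4 Mb are illustrative assumptions (the per-library values belong to Table 1 and are not reproduced here), chosen only to show that the stated figures of more than 4 Gb and more than 1,000 genome equivalents are mutually consistent.

```python
def screened_bacterial_bp(n_clones, mean_insert_bp, bacterial_fraction):
    """Total bacterial DNA represented on the arrays, in base pairs."""
    return n_clones * mean_insert_bp * bacterial_fraction


def genome_equivalents(total_bp, genome_size_bp):
    """Number of average-sized genomes that the screened DNA corresponds to."""
    return total_bp / genome_size_bp


# 155,000 clones and 35-kb inserts are from the text.
# bacterial_fraction=0.8 and a 4-Mb genome are assumptions for illustration.
total_bp = screened_bacterial_bp(155_000, 35_000, 0.8)
genomes = genome_equivalents(total_bp, 4_000_000)

print(f"bacterial DNA screened: {total_bp / 1e9:.2f} Gb")
print(f"genome equivalents: {genomes:.0f}")
print(f"hit rate at 1 in 140: {100 / 140:.1f}%")
print(f"hit rate at 1 in 200: {100 / 200:.1f}%")
```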
FIG. 3. Hybridization experiments at low stringency using a KS probe pool. (A) Colony hybridization of a section of a macroarrayed total sponge cosmid library (9,216 clones). (B) Southern blot hybridization of a BamHI digest of randomly chosen KS-positive fosmids from the unicellular library. Arrows at the top and bottom depict fosmids containing an apparently abundant PKS gene cluster and fosmids containing large multimodular PKSs, respectively.
TABLE 1. Distribution of modular PKS gene clusters in metagenomic libraries of D. dissoluta
Distribution of mixed PKS-NRPS and NRPS gene clusters in metagenomic sponge libraries.
Another striking difference between the filamentous and unicellular libraries was that 13% of the PKS fosmids from the filamentous library had end sequences with high homology to NRPS genes, whereas no NRPS-related end sequences were found on any PKS fosmids from the unicellular library (Table 1). In agreement with this, degenerate primers specific for NRPS adenylation (A)-domains amplified correctly sized DNA fragments from some PKS fosmids in the filamentous library but from none in the unicellular library. These results encouraged us to examine the frequencies of mixed PKS-NRPS and NRPS gene clusters in the bacterial sponge libraries. NRPS A-domain amplicons were cloned and sequenced from several mixed PKS-NRPS fosmids and from total sponge DNA. A total of 12 different NRPS A-domain sequences were identified, which were only 42 to 61% identical to their closest homologs in GenBank, mostly cyanobacterial NRPS A-domains (a phylogenetic tree is shown in Fig. 4). These A-domain amplicons were used as probes to screen parts of the filamentous and unicellular libraries at low stringency. About 1% of the fosmids in the filamentous library hybridized strongly to these probes, including 33% of the PKS fosmids, which indicates that these fosmids probably contain mixed PKS-NRPS gene clusters. In contrast, no NRPS-positive fosmids could be detected in the unicellular library (see Fig. S2 in the supplemental material for hybridization data).
Distribution of "trans-AT"-type PKS gene clusters in metagenomic sponge libraries.
It was shown recently that the KS domains of the unusual "trans-AT" type PKSs can be distinguished from those of the more usual "cis-AT"-type PKSs by a phylogenetic approach (35). Thus, we searched the end sequences of the PKS f(c)osmids for sequences with highest homology to "trans-AT" type PKS genes. As with NRPS-containing gene clusters, Table 1 shows there were no "trans-AT" type PKS genes in the unicellular library, whereas ca. 11% of the end sequences of PKS fosmids from the filamentous library appeared to be of the "trans-AT" type. The most abundant PKS gene cluster in the filamentous library belonged to this type, too. We sequenced one fosmid of this cluster in order (i) to confirm that it truly was a "trans-AT" type PKS and (ii) to obtain an insight into the nature of this abundant PKS gene cluster in the filamentous library, which could be encoded by Entotheonella spp. The 31-kb DNA sequence had an overall G+C content of 54.8% and contained the last three modules of a modular PKS terminated by a thioesterase. As expected, none of the modules contained ATs. Adjacent to the PKS we identified a pathway that is thought to be responsible for the incorporation of exomethyl or exomethylene groups at the β-keto position of polyketides (11, 37). A more detailed description of this PKS cluster (SA2_PKS) can be found in Fig. S3 and Table S2 in the supplemental material.
Distribution of large multimodular PKS gene clusters in metagenomic sponge libraries.
The modular PKSs that synthesize bioactive polyketides are usually large and comprise multiple modules; for example, the erythromycin PKS contains six, the avermectin PKS 12 and the nystatin PKS 18 modules (49). The discodermolide PKS is predicted to consist of 10 or 11 modules. Of approximately 1,000 PKS f(c)osmids, 13 (1.3%) contained only PKS sequence (
35 kb) and thus were part of large PKSs with more than five modules. Restriction mapping and Southern blot hybridization revealed that these PKS f(c)osmids belonged to four distinct PKS gene clusters, and additional overlapping PKS f(c)osmids containing the start and/or end of some of these clusters were identified. The most abundant large multimodular PKS gene cluster was covered by 10 overlapping f(c)osmids spanning 125 kb (Fig. 5). Its full DNA sequence was determined from four f(c)osmids. Overlapping segments 1 to 14 kb long of nine f(c)osmids had identical sequences and therefore come from the same genome and not a set of closely related genomes, as is possible in metagenomic libraries (53). Because these f(c)osmids gave strong hybridization signals in the initial screen and our analysis of all PKS f(c)osmids by Southern blot hybridization and/or end sequencing was very rigorous, it seems unlikely that we missed a significant number of additional clones of this PKS gene cluster. Therefore, we can estimate the relative abundance of the microorganism that encodes this PKS: seven cosmids (representing 250 kb) were identified in the total sponge library, which corresponds to a twofold coverage of this 125-kb locus. Since the total sponge library contained an equivalent of about 400 average bacterial genomes (Table 1), this PKS gene cluster would be encoded by a bacterium that makes up ca. 0.5% (2 in 400) of the total sponge microbial community (assuming no significant cloning bias).
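The abundance estimate at the end of this paragraph can be retraced with a few lines of Python. The function name `relative_abundance` is hypothetical; the inputs (seven cosmids, a 125-kb locus, about 400 genome equivalents) come from the text, while the 35-kb mean insert size is the library-wide average quoted earlier and is assumed here to apply to these seven clones.

```python
def relative_abundance(n_clones, mean_insert_kb, locus_kb, library_genomes):
    """Return (fold coverage of the locus, fraction of the community)."""
    locus_coverage = n_clones * mean_insert_kb / locus_kb
    return locus_coverage, locus_coverage / library_genomes


# Seven cosmids of about 35 kb each cover a 125-kb locus in a library
# holding about 400 average bacterial genome equivalents.
coverage, fraction = relative_abundance(7, 35, 125, 400)

print(f"locus coverage: {coverage:.2f}-fold")
print(f"estimated share of community: {fraction * 100:.1f}%")
```

With unrounded inputs the coverage comes out slightly below twofold, which the text rounds to 250 kb, twofold coverage, and ca. 0.5% of the community. As the text notes, the estimate holds only if there is no significant cloning bias.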
FIG. 5. Organization of the most abundant large multimodular PKS gene cluster in metagenomic libraries from D. dissoluta. The module and color-coded domain structure of the PKS genes is shown at the top, and the predicted polyketide product at the bottom. Crossed-out domains within the PKS are predicted to be inactive. The closest homologs and amino acid identity of ORFs flanking the PKS are as follows: ORF1, 34% to Stigmatella aurantiaca phosphopantetheinyl transferase MtaA (AAF19809); ORF2, 46% to Streptomyces avermitilis SIMX4 (AAK06800); ORF3, 60% to Sphingomonas sp. strain KA1 insertion sequence ATP-binding protein (BAC56754); ORF4, 57% to Magnetococcus sp. strain MC-1 transposase (ZP_00291288); ORF5 to ORF7, 42 to 49% to Mesorhizobium loti integrase/recombinase ORFs NP10678 to NP10680.
2.7 MDa. The PKS contains a starter module and 14 complete and 1 incomplete extender modules. The last module is truncated by what appears to be a transposon insertion. Although this transposition event could render the PKS inactive, it does not necessarily have to (see Discussion). The starter module consists of a KSS domain (active-site cysteine replaced by serine), which is similar to starter modules of polyene polyketide synthases, e.g., nystatin PKS (3). The KS domains were only 55 to 59% identical to their closest KS domain homologs in the database (hypothetical PKS from Mycobacterium avium, accession NP_961164) and belonged to the sponge-specific KS domain group (Fig. 2). All modules contain ATs, which appear to be specific for the incorporation of malonyl extender units. The most remarkable features of this large PKS gene cluster are the presence of (i) a complete set of reductive domains (KR, DH, and ER) in all except one module, which lacks the ER, and (ii) C-methyltransferase domains in 8 of the 14 modules (Fig. 5). Upon close inspection of amino acid sequences of all domains, only two KR domains (Fig. 5) showed signs of degeneration of sequence homology, suggesting that they are inactive. All other domains seemed well conserved, with an interesting variation: in five KRs the putative catalytic tyrosine (40) is replaced by a histidine. It is very likely that this particular change in so many KR domains leaves these otherwise normal KRs active. According to the well-established rules for polyketide biosynthesis (49) and assuming the loading of acetate by the starter module, the putative product of this unusual PKS would not be a complex polyketide but rather a multimethyl-branched C30 fatty acid, as depicted in Fig. 5.
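The C30 prediction follows from counting backbone carbons, as the sketch below shows. It assumes, as the text does, an acetate starter and malonyl-derived two-carbon extensions, and it further assumes that only the 14 complete extender modules contribute (the truncated 15th module is left out). The names used are hypothetical, and the positions of the methyl branches are not modeled because they are given only in Fig. 5.

```python
STARTER_CARBONS = 2         # acetate starter unit (assumed, as in the text)
CARBONS_PER_EXTENSION = 2   # each malonyl extender adds two backbone carbons


def backbone_carbons(n_extender_modules):
    """Backbone chain length after the given number of elongation cycles."""
    return STARTER_CARBONS + CARBONS_PER_EXTENSION * n_extender_modules


chain = backbone_carbons(14)   # 14 complete extender modules
methyl_branches = 8            # modules carrying a C-methyltransferase domain

print(f"backbone: C{chain}, methyl branches: {methyl_branches}")
```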
The metagenomic libraries made from total sponge tissue surprisingly contained about 90% prokaryotic inserts. Only a few sponge genome sizes have been estimated and range from as little as 60 Mb (31) to as much as 1,700 Mb (22). If there are actually 100 to 200 bacterial cells per sponge nucleus as we estimated, the genome of D. dissoluta would have to be at the small end of the size range in order to give rise to metagenomic libraries with such high proportions of prokaryotic DNA inserts. Alternatively, sponge DNA might not clone as well as bacterial DNA, biasing the metagenomic libraries toward bacterial inserts. There has been at least one report that, for unknown reasons, genomic sponge DNA was difficult to digest with restriction endonucleases (47). Interestingly, we recently made a fosmid library from total genomic DNA of another marine sponge and found that this metagenomic library also contains >90% bacterial inserts (unpublished results). At least for the two sponge metagenomic libraries we made, it appears that, with regard to the proportion of bacterial DNA inserts, the enrichment of bacteria does not provide any advantage over the use of total sponge tissue.
Metagenomic libraries made from total sponge and enriched unicellular bacteria showed several similarities, e.g., similar G+C profiles and PKS frequencies, which agrees with our microscopic observation that the vast majority of sponge-associated bacteria in D. dissoluta are unicellular. The apparent lack of NRPS genes in the unicellular library indicates either that such gene clusters are rare in the unicellular bacteria associated with D. dissoluta or that the NRPS genes encoded by these unicellular bacteria are significantly divergent from the primers and probes used for their detection.
The metagenomic library made from enriched filamentous bacteria was notably different from the other libraries; for example, the G+C content was significantly lower, and it contained numerous NRPS and mixed NRPS-PKS gene clusters. Nonribosomal peptides or mixed polyketides and peptides have not been reported for D. dissoluta, but several such compounds exhibiting bioactivity have been isolated from other Discodermia species, e.g., discodermins from D. kiiensis (29) or calyculins from D. calyx (24). Since the filamentous bacterial fraction appears to be dominated by Entotheonella spp., it is tempting to speculate that the abundance of NRPS genes might reflect the high abundance of Entotheonella genomes in this library. Interestingly, Entotheonella spp. have been found thus far only in sponges of the family Theonellidae, where they are thought to produce dicyclic glycopeptides such as theopalauamide (1, 46). Based on microscopy, the filamentous bacteria are only a minor part of the sponge-associated microbial community of D. dissoluta. The example shown here demonstrates the importance of enriching for subpopulations whenever possible, since this yielded a metagenomic library that was enriched in otherwise rarely found NRPS and mixed NRPS-PKS gene clusters.
We successfully isolated, assembled, and characterized a >100-kb PKS gene cluster encoded by a microbe that appears to make up <1% of the sponge bacterial community. If active, this PKS is predicted to synthesize a multimethyl-branched C30 fatty acid. Interestingly, two of the other three large PKS gene clusters identified in the sponge metagenomic libraries contained PKSs with very similar structures (partially sequenced; data not shown), suggesting the existence of a family of PKSs for multimethyl-branched fatty acids in sponge-associated bacteria. Very similar multimethyl-branched fatty acids are known as part of complex glycolipids in mycobacteria, e.g., mycocerosic and phthioceranic acids (30). They are formed by monomodular, multifunctional PKS enzymes, MAS and Pks2, which act like iterative vertebrate fatty acid synthases. In contrast to those of the vertebrate enzymes, however, the AT and KS domains of these enzymes are predicted to show selectivity for methylmalonyl coenzyme A (CoA) over malonyl-CoA, producing multimethyl-branched acids from n-acyl primers. If the PKS from the D. dissoluta metagenome described here indeed produces such multimethyl-branched fatty acids, it would do so by a different and novel mechanism: a large multimodular PKS would synthesize these fatty acids de novo from malonyl-CoA, placing methyl branches at specific positions by the action of C-methyltransferase domains within its modules. It is conceivable that these fatty acids are directly transferred to lipid components and do not require a thioesterase for release, which is the case for mycocerosic and phthioceranic acids (30). If so, the transposon insertion in module 15 could still leave an active PKS enzyme with 14 modules. The fatty acid composition of D. dissoluta has not been studied. However, mono-, di-, tri-, and tetramethyl-branched, as well as polyunsaturated, fatty acids have been identified in marine and freshwater sponges (4, 10, 41).
It has been hypothesized that these unusual fatty acids come from the sponge-associated bacteria. Our findings support this idea and provide a first clue into the biosynthesis of multimethyl-branched fatty acids in sponges.
It is difficult to determine whether a metagenomic sponge library is truly representative of the sponge-microbial community. Bacterial species may be lost during enrichment procedures. For this reason, the use of total sponge tissue to prepare metagenomic libraries seems to be the least biased. However, differential cell lysis and DNA stability, as well as cloning biases during library construction, can lead to genome misrepresentation. Since these parameters are difficult to measure, it seems advisable to build in as many variations as possible when one considers exploring metagenomic sponge libraries. At present, it cannot be known whether the abundant PKS gene clusters observed come from truly abundant species or rather easily accessible and clonable sponge-associated bacterial genomes. Much more work is needed to address this issue. Our observation that (i) none of the KS domains identified from sponge-associated bacteria were related to the common "cis-AT" type polyketide synthases of actinobacteria and (ii) the clear majority of PKS gene clusters isolated from the sponge metagenome were apparently rather small (one to three modules) might indicate that the genomes of some polyketide producing sponge-associated bacteria could have been underrepresented for the aforementioned reasons or are so rare that they were difficult to access in these metagenomic libraries. This could explain why none of the isolated multimodular PKS gene clusters appear to be responsible for the biosynthesis of discodermolide, the most abundant polyketide in this sponge.
This study was supported in part by a Small Business Innovative Research grant from the National Institutes of Health (1R43CA97889-01).
Supplemental material for this article may be found at http://aem.asm.org/.
Present address: University of Nebraska, E249 Beadle Center, 19th & Vine St., Lincoln, NE 68588.
Present address: Massachusetts Institute of Technology, Department of Civil and Environmental Engineering & Division of Biological Engineering, 48-427 MIT, 15 Vassar St., Cambridge, MA 02139.
-proteobacterium, "Candidatus Entotheonella palauensis". Mar. Biol. 136:969-977.
-proteobacterium. Mar. Biol. 138:843-851.