Applied and Environmental Microbiology, May 2009, p. 2841-2849, Vol. 75, No. 9
0099-2240/09/$08.00+0 doi:10.1128/AEM.02698-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Max Planck Institute for Terrestrial Microbiology, Karl von Frisch Strasse, 35043 Marburg, Germany,¹ and U.S. Department of Energy Joint Genome Institute, 2800 Mitchell Drive B100, Walnut Creek, California 94598-1698²
Received 25 November 2008/ Accepted 26 February 2009
Recently, we were able to isolate strain Pei191T, the first pure-culture representative of the TG1 phylum, from the gut of a humivorous scarab beetle larva, Pachnoda ephippiata (14). Based on the 16S rRNA gene sequence, strain Pei191T is a member of the "intestinal cluster," which consists of sequences derived from invertebrate guts and cow rumen (20) and is only distantly related to the so-called "endomicrobia," a lineage of TG1 bacteria comprising endosymbionts of termite gut protozoa (24, 42, 54). It is an obligately anaerobic ultramicrobacterium that grows heterotrophically on glucose and produces acetate, hydrogen, ethanol, and alanine as major products (14). The species description of "Elusimicrobium minutum," with strain Pei191T as the type strain, and the proposal of "Elusimicrobia" as the new phylum name are published in a companion paper (14).
Here, we report the complete genome sequence of E. minutum, focusing on a reconstruction of the metabolism of this strictly anaerobic bacterium. The implications of these findings are discussed in light of physiological data, and potential functions indicated by the genome annotation are compared to requirements imposed by the intestinal environment. Using the concatenated sequences of 22 single-copy marker genes of E. minutum and of the uncultivated "Candidatus Endomicrobium" strain Rs-D17, an endosymbiont of termite gut flagellates (22), we also investigated the phylogenetic position of Elusimicrobia relative to other bacterial phyla.
Genome sequencing, assembly, and gap closure.
The genome of E. minutum was sequenced at the Joint Genome Institute (JGI) using a combination of 8-kb and 40-kb Sanger libraries and 454 pyrosequencing. All general aspects of library construction and sequencing performed at the JGI can be found at http://www.jgi.doe.gov/. The 454 pyrosequencing reads were assembled using the Newbler assembler (Roche). Large Newbler contigs were chopped into 1,871 overlapping fragments of 1,000 bp and entered into the assembly as pseudo-reads. The sequences were assigned quality (q) scores based on Newbler consensus q-scores with modifications to account for overlap redundancy and to adjust inflated q-scores. A hybrid assembly of 454 and Sanger reads was performed using the Paracel Genome Assembler. Possible misassemblies were corrected, and gaps between contigs were closed by custom primer walks from subclones or PCR products. The error rate of the completed genome sequence of E. minutum is less than 1 in 50,000.
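The pseudo-read step described above can be illustrated with a minimal Python sketch. It assumes the stated fragment length of 1,000 bp and an overlap of 200 bp; the overlap is a hypothetical value chosen for illustration, since the text does not report the one actually used. The function `chop_contig` is likewise hypothetical and is not part of Newbler or the Paracel Genome Assembler.

```python
def chop_contig(seq, frag_len=1000, overlap=200):
    """Split a contig into overlapping fragments of length frag_len.

    overlap=200 is an assumed value; the source does not state it.
    """
    if not 0 <= overlap < frag_len:
        raise ValueError("overlap must satisfy 0 <= overlap < frag_len")
    if len(seq) <= frag_len:
        return [seq]  # short contigs are kept whole
    step = frag_len - overlap
    fragments = []
    start = 0
    while start + frag_len < len(seq):
        fragments.append(seq[start:start + frag_len])
        start += step
    # Anchor the last fragment at the contig end so no bases are dropped.
    fragments.append(seq[-frag_len:])
    return fragments
```

Anchoring the final fragment at the contig end keeps every pseudo-read at full length, at the cost of a larger overlap between the last two fragments.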
Annotation.
Sequences were automatically annotated at the Oak Ridge National Laboratory according to the genome analysis pipeline described in Hauser et al. (18). All automatic annotations with functional predictions were also checked manually with the annotation platform provided by Integrated Microbial Genomes (IMG) (37). For each gene, the specific functional assignments suggested by the matches with the NCBI nonredundant database were compared to the domain-based assignments supplied by the COG (Clusters of Orthologous Groups), PFAM (Protein Families Database of Alignments and HMMs), TIGRFAM, and InterPro databases and, if necessary, corrected accordingly. When it was not possible to infer function or COG domain membership (reverse-position-specific BLAST against COG position-specific scoring matrices with E-value of >10⁻²), genes were annotated as predicted to be novel. For all the genes, the subcellular location of their potential gene products was determined based on the presence of transmembrane helices and signal peptides. Putative transport proteins were compared to those in the Transport Classification Database (http://www.tcdb.org). Genes were viewed graphically with IMG. Metabolic pathways were reconstructed using MetaCyc as a reference data set (7). Detailed information about the automatic genome annotation can be obtained from the JGI IMG website (http://img.jgi.doe.gov/w/doc/about_index.html). Insertion sequences were detected with IS Finder (http://www-is.biotoul.fr/).
Phylogenetic analyses.
A concatenated gene tree was created using a set of 22 conserved single-copy phylogenetic marker genes derived from the set used by Ciccarelli et al. (9). The marker genes were extracted from E. minutum and 279 microbial reference genomes (including "Candidatus Endomicrobium" strain Rs-D17) in the IMG database, version 2.50 (38), concatenated, and aligned with MUSCLE (11). The alignment and sequence-associated data (e.g., organism name) were then imported into ARB (33) and manually refined. A mask was created using the base frequency filter tool (20% minimal identity) to remove regions of ambiguous positional homology, yielding a masked alignment of 3,982 amino acids, which is available on request from the authors. Several combinations of outgroups to the TG1 taxa (E. minutum and "Candidatus Endomicrobium" strain Rs-D17) were selected for phylogenetic inference to establish the monophyly of the TG1 phylum and to identify any specific associations with other phyla that may exist (10). Maximum-likelihood trees were constructed from the masked datasets using RAxML-VI-HPC, version 2.2.3 (53).
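The masking step can be sketched in Python as follows. This is only an approximation of a frequency-based column filter, not a reimplementation of the ARB tool: it assumes that a column is kept when its most common residue occurs in at least 20% of all sequences, with gaps counting against the column. The function names `identity_mask` and `apply_mask` are hypothetical.

```python
from collections import Counter

def identity_mask(alignment, min_identity=0.20, gap_chars="-."):
    """Return one boolean per column: True if the column is kept.

    Assumption: identity = count of the most common residue divided by
    the total number of sequences (gaps lower the score).
    """
    ncols = len(alignment[0])
    if any(len(row) != ncols for row in alignment):
        raise ValueError("all aligned sequences must have equal length")
    mask = []
    for col in range(ncols):
        residues = [row[col] for row in alignment if row[col] not in gap_chars]
        if not residues:
            mask.append(False)  # gap-only column
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        mask.append(top_count / len(alignment) >= min_identity)
    return mask

def apply_mask(alignment, mask):
    """Keep only the columns flagged True in mask."""
    return ["".join(c for c, keep in zip(row, mask) if keep) for row in alignment]
```

Applied to the concatenated marker alignment, a filter of this kind removes columns in which no residue is shared widely enough to support positional homology.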
The phylogenetic relationships of the [NiFe] hydrogenase were determined using the ARB program suite (33). The sequences of E. minutum and Thermoanaerobacter tengcongensis were aligned with the sequences of the large subunit given in Vignais et al. (57). Highly variable positions (<20% sequence similarity) were filtered from the data set, resulting in 560 unambiguously aligned amino acids, and phylogenetic distances were calculated using the protein maximum-likelihood algorithm provided in the ARB package.
Clustered, regularly interspaced short palindromic repeats (CRISPR) arrays were identified using PILER-CR (12). Prophages or other elements targeted by CRISPRs were identified by pairwise comparison of spacers to the rest of the genome using BLASTN (2).
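The idea behind the spacer comparison can be shown with a simplified Python sketch. It swaps BLASTN for an exact-match search on both strands, so it will miss the imperfect matches that BLASTN would report; it is meant only to illustrate how hits inside the CRISPR array itself are excluded. The function names and the `array_span` parameter (half-open start and end coordinates of the array) are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def find_spacer_targets(genome, spacers, array_span):
    """Map spacer index -> sorted list of (position, strand) hits
    that lie outside the CRISPR array. Exact matches only."""
    array_start, array_end = array_span
    hits = {}
    for i, spacer in enumerate(spacers):
        found = []
        for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
            pos = genome.find(query)
            while pos != -1:
                outside = pos + len(query) <= array_start or pos >= array_end
                if outside:
                    found.append((pos, strand))
                pos = genome.find(query, pos + 1)
        if found:
            hits[i] = sorted(found)
    return hits
```

A spacer whose only match is its own copy in the array yields no entry, whereas a match elsewhere in the chromosome points to a candidate prophage or other targeted element.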
Nucleotide sequence accession number.
The complete nucleotide sequence and annotation of E. minutum have been deposited in the GenBank database under accession number CP001055.
FIG. 1. Genomic organization of the E. minutum chromosome. The two outermost rings show the genes encoded on the forward and reverse strand (scale in megabase pairs). The third ring depicts the location of tRNA genes. The fourth ring shows the G+C content, and the innermost ring shows the GC skew. The polyketide synthase (PKS) and rRNA operons have relatively high G+C contents; a prophage and several predicted novel genes have relatively low G+C contents. GC skew was used to identify the origin of replication (Ori).
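How GC skew can point to an origin of replication is sketched below in Python. The sketch computes the skew (G − C)/(G + C) in non-overlapping windows and takes the minimum of the cumulative skew as the predicted origin, which is a common convention; the text does not say which variant or window size was used, so the 1,000-bp window and both function names are assumptions.

```python
from itertools import accumulate

def gc_skew_windows(seq, window=1000):
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews

def predict_origin(seq, window=1000):
    """Coordinate at which the cumulative GC skew reaches its minimum."""
    skews = gc_skew_windows(seq, window)
    if not skews:
        raise ValueError("sequence shorter than one window")
    cumulative = list(accumulate(skews))
    i_min = min(range(len(cumulative)), key=cumulative.__getitem__)
    # The minimum is reached at the end of window i_min.
    return (i_min + 1) * window
```

The cumulative curve is used rather than the raw window values because it smooths local fluctuations and turns the strand switch at the origin into a single extremum.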
TABLE 1. Summary of the functional assignment, according to COG domain, of the 1,529 protein-coding genes in the E. minutum genome
Phylogeny and taxonomy.
As expected for the first cultivated representative of a candidate phylum, many genes from the E. minutum genome are only distantly related to homologs identified in genomes from other bacterial phyla. The recent publication of a composite genome of "Candidatus Endomicrobium" strain Rs-D17, recovered from a homogeneous population of endosymbionts isolated from a single protist cell in a termite hindgut (22), provides a phylogenetic reference point for analysis. A comparative analysis of 22 concatenated single-copy marker genes confirmed a highly reproducible relationship between E. minutum and "Candidatus Endomicrobium" strain Rs-D17 (Fig. 2), as predicted already by 16S rRNA-based phylogeny (20). The analysis also reinforced the phylum-level status proposed for the Elusimicrobia lineage (formerly TG1) (23) since no robust associations to other bacterial phyla were identified.
FIG. 2. An unrooted maximum-likelihood tree of 280 bacterial genomes, including the two sequenced representatives of the phylum Elusimicrobia, representing the regions of the bacterial domain currently mapped by genome sequences. The tree is based on a concatenated alignment of 22 single-copy genes. Reproducibly monophyletic groups of taxa (>98% bootstrap values, except for the Deltaproteobacteria at 82%) are grouped into wedges for clarity. The apparent relationship between Elusimicrobia and the Synergistetes is not stable.
FIG. 3. Schematic overview of the energy metabolism in E. minutum. Sugars are degraded via the EMP and PFOR (blue box). NADH is recycled by reduction of acetyl-CoA to ethanol or, at low hydrogen partial pressure, by the cytoplasmic [FeFe] hydrogenase. Reduced ferredoxin is regenerated by the membrane-bound [NiFe] hydrogenase. Amino acids are metabolized by transamination with pyruvate and subsequently oxidatively decarboxylated to the corresponding acids by several homologs of PFOR (yellow box). Alanine can be generated not only by transamination but also by reductive amination of pyruvate (green box). The export of alanine generates a sodium-motive force, which is coupled to the proton-motive force, the synthesis/hydrolysis of ATP via ATP synthase, and the proton-dependent uptake of amino acids or oligopeptides. Pathways were reconstructed based on the manually annotated genome and results from batch culture experiments (14).
FIG. 4. Maximum-likelihood tree of [NiFe] hydrogenases, based on the deduced amino acid sequences of the large subunit. The sequences of E. minutum and T. tengcongensis fall within the radiation of the sequences assigned to group IV [NiFe] hydrogenases in reference 54. The topology of the tree was tested separately by neighbor-joining and RAxML, with bootstrapping provided in the ARB package (31).
FIG. 5. Organization of the genes encoding the subunits of the [FeFe] hydrogenase of T. tengcongensis (48) and their predicted homologs in E. minutum. The displayed length is proportional to the size of the corresponding open reading frame. hydA, hydB, and hydC have deduced amino acid sequence identities of 46, 56, and 40%, respectively; hydD is not present in E. minutum. White symbols, hypothetical function.
Anabolism.
Although the presence of fructose 1,6-bisphosphatase indicates the possibility for gluconeogenesis via the EMP, E. minutum requires a hexose for growth (14). The absence of genes coding for 2-oxoglutarate dehydrogenase, succinate dehydrogenase, and succinyl-CoA synthetase is typical for strict anaerobes and shows that E. minutum does not possess a complete tricarboxylic acid (TCA) cycle. The reductive branch of the incomplete TCA cycle is initiated by phosphoenolpyruvate (PEP) carboxykinase and allows the interconversion of oxaloacetate, malate, and fumarate. The oxidative branch of the pathway starts with citrate synthase and allows the formation of 2-oxoglutarate. As is typical for anaerobic microorganisms, the citrate synthase of E. minutum belongs to the Re-type (32). The products of the incomplete TCA cycle are precursors of several amino acids. The biosynthetic pathways for the formation of glutamate, glutamine, proline, aspartate, lysine, threonine, and cystathionine are present. The pathways for the formation of alanine, cysteine, glycine, histidine, and serine, starting with intermediates of the EMP, are also almost fully represented by the corresponding genes (see Table S1 and Fig. S1 in the supplemental material). However, the genes for the synthesis of other proteinogenic amino acids (arginine, asparagine, isoleucine, leucine, methionine, phenylalanine, tyrosine, tryptophan, and valine) are lacking, which would explain why E. minutum requires small amounts of yeast extract in the medium (14).
The genome of E. minutum does not possess an oxidative pentose phosphate pathway, which is typically involved in the regeneration of NADPH. This important coenzyme is probably regenerated by the alternative route of pyruvate formation from PEP (formation of oxaloacetate by PEP carboxykinase, NADH-dependent reduction of oxaloacetate by malate dehydrogenase, and NADP+-dependent oxidative decarboxylation of malate by malic enzyme) (Fig. 3), as proposed for Corynebacterium glutamicum (45). NADP+ is required for the de novo biosynthesis of nucleic acids. The presence of the genes required for the nonoxidative pentose phosphate pathway (transaldolase and transketolase) allows the reconstruction of the pathways for purine and pyrimidine nucleotide biosynthesis almost completely (see Table S1 in the supplemental material) and also explains the catabolism of ribose via the EMP (14).
The genes coding for the synthesis of lipopolysaccharides and peptidoglycan are also well represented (see Table S1 in the supplemental material). This is in agreement with the results of electron microscopy, which showed that E. minutum possesses the typical cell envelope architecture of gram-negative bacteria (14). The pathways for vitamin synthesis are absent or at most rudimentary (see Table S1 in the supplemental material), which would be another reason why the bacterium requires small amounts of yeast extract in the growth medium (14).
A large open reading frame (3,008 amino acids) was assigned to the polyketide synthase gene family. Interestingly, the polyketide synthase gene shows a relatively high G+C content (46%) (Fig. 1), suggesting an origin from horizontal gene transfer. The presence of a polyketide synthase and of a putative nonribosomal peptide synthetase (1,284 amino acids) is rather unusual for anaerobic bacteria (48). The function of the two enzymes remains to be investigated.
Peptide degradation.
E. minutum has a particular pathway for catabolic utilization of amino acids, which may lead to additional energy conservation (Fig. 3). The pathway comprises the transfer of amino groups from peptide-derived amino acids to pyruvate via a homolog of a nonspecific aminotransferase (58), resulting in alanine formation. The 2-oxoacids produced by the transamination can be oxidatively decarboxylated to the corresponding acyl-CoA esters, probably by the gene products annotated as 2-oxoacid:ferredoxin oxidoreductases. Substrate-level phosphorylation is accomplished via an acyl-CoA synthetase (ADP-forming), resulting in the formation of ATP and the corresponding fatty acid. The genome also encodes proton-dependent oligopeptide transporters, ABC-type transport systems for peptides, and numerous proteolytic and peptolytic enzymes, some of which have typical signal peptides, indicating extracellular proteinase activity (see Table S1 in the supplemental material).
A comparable peptide utilization pathway is also present in Pyrococcus furiosus (19, 34, 36). Besides the PFOR, a homodimer that typically oxidizes only pyruvate and a few other oxoacids, e.g., 2-oxoglutarate (39), E. minutum also possesses a homolog of a heterotetrameric 2-oxoisovalerate:ferredoxin oxidoreductase with a broad substrate specificity, especially for branched-chain 2-oxoacids (19). In addition, a putative two-subunit indolepyruvate:ferredoxin oxidoreductase is present. The many different acyl-CoA esters resulting from the oxidative decarboxylation of various amino acids seem to be converted to their corresponding acids by a single ADP-dependent acetyl-CoA synthetase; the homolog in P. furiosus is reportedly rather unspecific and also processes branched-chain derivatives (35).
The operation of this peptide utilization pathway in E. minutum is supported by the observation that most proteinogenic (and even some nonproteinogenic) amino acids are converted to their corresponding oxidative decarboxylation products during growth on glucose. Further evidence was provided by 13C labeling, which demonstrated that the carbon skeleton of the putative transamination product, alanine, is derived from glucose (14). In principle, E. minutum also possesses the capacity for the net amination of pyruvate to alanine (Fig. 3), which has been proposed to function as an additional electron sink in P. furiosus (26).
A combination of glucose fermentation with the oxidative decarboxylation of an amino acid can increase the free-energy change of the metabolism, as exemplified by the case of valine (ΔG°' values are calculated according to reference 56; data for isobutyrate are from reference 60):
[Two reaction equations with their ΔG°' values appeared here as images and are not recoverable from this copy.]
However, since substrate-level phosphorylation in the peptide utilization pathway occurs at the expense of ATP generation from carbohydrates (i.e., pyruvate oxidation), the cofermentation of amino acids becomes energetically productive only if this opens up the possibility for additional energy conservation. Interestingly, E. minutum possesses a Na+/alanine symporter, which could couple export of the accumulating alanine with the generation of an electrochemical sodium gradient. Together with the H+/Na+ antiporter encoded in the genome, the sodium gradient can be converted into a proton-motive force, which would either drive the generation of additional ATP via ATP synthase or avoid the hydrolysis of ATP necessitated by the dissipation of the proton motive force in other transport processes (27), such as the proton-dependent import of amino acids or oligopeptides (Fig. 3).
Secretion.
A large number of proteins (40%) encoded in the genome of E. minutum contain a signal peptide, indicating their export from the cell (see Table S1 in the supplemental material). These putatively exported proteins comprise almost all of the proteins in COG category U (intracellular trafficking, secretion, and vesicular transport) and more than half of the predicted novel proteins.
The results of the manual annotation revealed that E. minutum possesses a variant of the general secretion pathway. The Sec translocon (encoded by secADFYEG) lacks a SecB subunit; SecB is probably replaced by one of the more general chaperones (DnaJ or DnaK) (59). There are numerous genes encoding the typical type II secretion system (T2SS), but several essential components of the machinery are missing in the annotation (Table 2). Most of these components (encoded by gspABCNS) are poorly conserved (8) and might have simply escaped detection. Some of the missing elements might have been annotated as elements of type IV pili (T4P), which are related structures with numerous similar components (55). T4P are probably absent in E. minutum because the PilMNOP components, which are essential for functional pili (5, 6), are lacking, and no pilus-like structures are seen in ultrathin sections of E. minutum (14). The absence of gspL and gspM in E. minutum is more critical because the encoded proteins have no homologs in T4P and are usually indicative of a T2SS. However, the T2SSs of Acinetobacter calcoaceticus and Bdellovibrio bacteriovorus also lack the GspLM components (8), and the pathogen Francisella tularensis subsp. novicida uses a T2SS that even lacks the GspLMC components to export chitinases, proteinases, and β-glucosidases (17). The presence of two ATPases in E. minutum, which are typical for T4P, does not necessarily argue against a T2SS; the T2SS of Aeromonas hydrophila also has two ATPases, and they are thought to increase the efficiency of the secretory process (47).
TABLE 2. Comparison of the components of the T2SS (gsp genes) and T4P (pil genes) present in A. hydrophila and F. tularensis subsp. novicida with those of E. minutum
Comparative analysis revealed that only the encoded N-terminal methylase domain is conserved between the E. minutum pilE-like genes and pilE genes from other organisms. This effectively reduces the comparable region to only ~50 amino acids and compromises phylogenetic inference. However, it appears that most of the E. minutum copies (57/60) form a monophyletic group, which suggests a large lineage-specific expansion of this gene family or at least an expansion of the gene domain (data not shown). Indeed, the numerous copies of the pilE-like genes of the E. minutum genome alone increase the size of the COG4968 family in the IMG database by almost 10% because there are only 682 representatives present in 1,087 other microbial genomes (38). Since E. minutum lacks observable pili and since many of the pilE-like genes appear in operons of diverse function, we speculate that this gene family is involved in some other aspect(s) of endogenous regulation, perhaps not related to pili or secretion at all, and has undergone a lineage-specific expansion in response to environmental selection.
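The size of this expansion can be checked with a short calculation. The sketch below assumes that the 60 copies implied by the "57/60" figure are the ones added to the 682 existing COG4968 representatives; the text does not state this link explicitly, and the function name `percent_increase` is hypothetical.

```python
def percent_increase(new_members, existing_members):
    """Percentage by which a family grows when new_members are added."""
    return 100.0 * new_members / existing_members

# Assumed inputs: 60 pilE-like copies in E. minutum, 682 existing members.
growth = percent_increase(60, 682)        # about 8.8%, i.e. "almost 10%"
monophyletic_share = 100.0 * 57 / 60      # 95% of copies in one clade
```

Under this assumption the family grows by roughly 8.8%, which is consistent with the "almost 10%" stated in the text.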
In addition to the type II-like secretion system, the genome contains numerous ABC transporters (see Table S1 in the supplemental material). Together with outer membrane efflux proteins (outer membrane protein and membrane fusion protein), they may constitute type I secretion systems with various functions.
Oxygen stress.
In agreement with the obligately anaerobic nature of E. minutum, the genome contains no cytochrome genes and no pathways for the biosynthesis of quinones, corroborating the absence of any respiratory electron transport chains. However, E. minutum has a six-gene "oxygen stress protection" cluster consisting of rubrerythrin (rbr), superoxide reductase (sor), rubredoxin:oxygen oxidoreductase (roo), and rubredoxin (rub) (Fig. 6). The roo gene of E. minutum has similarity to the corresponding genes of Desulfovibrio gigas and Moorella thermoacetica, which have been shown to reduce molecular oxygen by reduced rubredoxin (15, 49). The presence of an oxygen-reducing system may explain the ability of E. minutum to retard the diffusive influx of oxygen into deep-agar tubes (14) and may play an important role in survival in the intestinal tract of insects, a habitat constantly exposed to the influx of oxygen (4, 31).
FIG. 6. Organization of the genes encoding the oxidative stress protection cluster in M. thermoacetica, D. gigas, and their predicted homologs in E. minutum. The displayed length is proportional to the size of the corresponding open reading frame. The genes for rubrerythrin (rbr), superoxide reductase (sor), rubredoxin:oxygen oxidoreductase (roo), rubredoxin (rub), and rubredoxin-like (rbl) in E. minutum have high sequence similarities to their homologs in Desulfovibrio spp. and other Deltaproteobacteria. White symbol, hypothetical function.
These activities were supported by the 2007 Community Sequencing Program. D.H. and W.I.-O. were supported by stipends of the International Max Planck Research School for Molecular, Cellular, and Environmental Microbiology and the Deutscher Akademischer Austauschdienst. This work was financed in part by a grant of the Deutsche Forschungsgemeinschaft in the Collaborative Research Center Transregio 1 (SFB-TR1) and by the Max Planck Society. Other parts of this work were performed under the auspices of the U.S. Department of Energy's Office of Science, Biological and Environmental Research Program and by the University of California, Lawrence Berkeley National Laboratory, under contract DE-AC02-05CH11231, Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344, and Los Alamos National Laboratory under contract DE-AC02-06NA25396.
Published ahead of print on 6 March 2009.
Supplemental material for this article may be found at http://aem.asm.org/.