Applied and Environmental Microbiology, February 2006, p. 1532-1541, Vol. 72, No. 2
0099-2240/06/$08.00+0 doi:10.1128/AEM.72.2.1532-1541.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Amir Ghadiri,3 and
Alison E. Murray1*
Desert Research Institute, 2215 Raggio Parkway, Reno, Nevada 89512,1 Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139,2 Amersham Biosciences, 928 E. Argues Avenue, Sunnyvale, California 940863
Received 25 April 2005/ Accepted 8 November 2005
and
),
Bacteroidetes, and high-G+C gram-positive bacteria.
Gene-finding and annotation analyses identified 244 total open reading
frames. Amino acid comparisons of 123 and 113 Antarctic
bacterial amino acid sequences to mesophilic homologs from
G+C-specific and SwissProt/UniProt databases, respectively,
revealed widespread adaptation to the cold. The most significant
changes in these Antarctic bacterial protein sequences included a
reduction in salt-bridge-forming residues such as arginine, glutamic
acid, and aspartic acid, reduced proline contents, and a reduction in
stabilizing hydrophobic clusters. Stretches of disordered amino acids
were significantly longer in the Antarctic sequences than in the
mesophilic sequences. These characteristics were not specific to any
one phylum, COG role category, or G+C content and imply that
underlying genotypic and biochemical adaptations to the cold are
inherent to life in the permanently subzero Antarctic
waters.
Water temperatures in the coastal waters off the Antarctic Peninsula do not exceed 2°C and remain at −1.8°C for most of the year (Palmer Station [http://iceflo.icess.ucsb.edu:8080/data/default.htm]). While little is known about the specific niches that different Antarctic bacterial species occupy or the extent and nature of species diversity, extensive work has been done to study microbial activity and the impact of microbial processes on carbon and nitrogen cycling (10, 22). Clearly, many Antarctic bacteria have adapted to cold conditions and achieve growth rates comparable to those in temperate environments (16). However, only recently have comparative genomic studies been used to reveal possible cold adaptations at the predicted amino acid level in bacterial and archaeal genomes (25, 34). One of the major adaptations to cold is the modification of structural features of proteins. Specific amino acid usage patterns and structural characteristics have emerged from analyses of a limited number of psychrophilic protein crystal structures (12, 14), mesophile and thermophile genomes (20), and psychrophile genomes (25, 34). Amino acid modifications are necessary to overcome structural stability and thermodynamic hurdles that occur at both temperature extremes (32, 33). In general, when cold-adapted proteins are compared to mesophilic proteins, they have fewer structural features that promote stability and more that increase protein flexibility. Structural features of cold-adapted proteins include fewer ion pairs, fewer arginine residues, fewer polar, H-bond-forming residues, fewer proline residues in protein loops, and fewer aromatic interactions than those in mesophilic proteins (12, 15). These conclusions are based on relatively few cloned and sequenced genes and not on genome or genome fragment analyses (33). Myriad, disparate strategies are employed between and among cold-adapted proteins in a fashion that is not yet predictable (14).
Protein families may adopt certain strategies, but a given microbial genome may employ any combination of strategies.
The motivation for this
investigation was twofold. First, we sought to explore environmental
genomes of several uncultured Antarctic bacterioplankton isolates.
These genomic data, in combination with previous studies
(4, 8) of the same library,
can be used in the future to target these organisms and their gene
expression profiles in the natural environment. Second, we were
interested in testing hypotheses of cold adaptation derived from
limited psychrophilic enzymes across a large data set from diverse
bacteria. This study reports on six environmental fosmid clones from a
library created from DNAs collected in nearshore waters off Palmer
Peninsula, Antarctica (4).
These clones were selected for complete sequencing based on their
ecological and evolutionary relevance. Each represents a different
uncultivated marine bacterial group, four of which contain ribosomal
operons and two of which contain phylogenetically conserved coding
regions that we used to infer their affiliation with the
Cytophaga/Flavobacteria/Bacteroidetes (CFB)
group and α-proteobacterial lineages. Here we
present a comparative genomic analysis of the six fragments and data
concerning amino acid modifications in cold-adapted microorganisms that
appear to be consistent themes in bacterial
genomes.
Sequencing of the small-subunit (SSU) rRNA gene PCR fragments was performed on an ABI Prism 3730 instrument at the Nevada Genomics Center. Well locations of fosmid clones selected for complete sequencing were then identified via row and column DGGE screening in which the correct bands were matched with the original pooled 96-well plate reactions. DNAs from the selected fosmid clones were purified using a Plasmid Mega kit (QIAGEN). Fosmid DNAs were treated with plasmid-safe DNase (Epicenter) to remove remaining host genomic DNA. Fosmid DNAs were then subcloned using a TOPO ShotGun subcloning kit (Invitrogen Life Technologies). In brief, approximately 10 to 20 µg of fosmid DNA was sheared using a nebulizer at 16 psi for 75 s, resulting in fragmented DNA in the size range of 1,500 bp to 3,000 bp. The DNA was then blunt end repaired, dephosphorylated, inserted into vectors, and transformed into One Shot TOP10 electrocompetent Escherichia coli cells. Successful clones were selected for both ampicillin and kanamycin resistance. Genome sequencing was conducted at Amersham Biosciences (now part of GE Healthcare) and the Joint Genome Institute.
Two gene finding programs, FgenesB (Softberry Inc.) and GLIMMER (TIGR), identified open reading frames (ORFs). The results from gene finding using GLIMMER were then processed through TIGR's annotation engine and packaged in the TIGR database MANATEE. A BER search (BLAST Extend Repraze) was performed for each ORF included in the MANATEE database. The two automated annotations were compared manually and combined.
Evolutionary distance was employed to determine relationships between homologous protein coding regions, using PHYLIP (v3.62). Protein distance matrices were calculated using the Jones-Taylor-Thornton model after bootstrapping using the ProtDist package. Bootstrap analysis (1,000 iterations) was performed in PHYLIP using SeqBoot. The PHYLIP package Fitch was used to reconstruct phylogenetic trees from maximum likelihood distance matrices generated with nonresampled amino acid sequence data.
A method for amino acid usage analysis was developed. A combined SwissProt/UniProt protein database and custom G+C-specific databases were created locally. A G+C-specific database was created for each Antarctic bacterial fosmid from the genome sequences of at least 10 mesophilic bacteria whose G+C contents were within ±2.5% of the fosmid G+C content. BLASTP analysis was performed for all Antarctic predicted proteins against the two types of databases. All results that had at least two BLAST hits with an expect value of <10⁻¹⁵ were parsed along with their best matches (up to five). These data were stored in a customized MySQL database and subjected to further amino acid analyses. Amino acid compositions and the protein parameters GRAVY (grand average of hydropathicity) and aliphaticity were calculated on the ExPASy website (http://us.expasy.org and references therein). Predictions of natural disordered residues, lengths of the disordered regions, and overall PONDR scores in the protein coding regions were calculated according to the algorithms VLXT and VL3 described by Dunker et al. (11), using licensed software (PONDR; Molecular Kinetics). The VLXT algorithm was used only to predict regions of disorder of >45 residues. All of the parsed Antarctic and mesophilic amino acid sequences (described above) of >30 residues were analyzed. The PONDR results were uploaded to the customized MySQL database and analyzed. Statistical comparisons of all amino acid contents and of associated parameters and disorder-associated values of the mesophile averages (n = 2 to 5) versus the Antarctic data values were made using a one-sample t test in Statistica (StatSoft).
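The two protein parameters named above can be sketched in a few lines of Python. This is only an illustration and not the ExPASy implementation: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula based on the mole percentages of Ala, Val, Ile, and Leu. The function names are hypothetical, and residues outside the 20 standard amino acids are simply ignored.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids
# (assumed here to be the scale behind the GRAVY values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _clean(seq):
    """Upper-case the sequence and drop non-standard residue codes."""
    return [aa for aa in seq.upper() if aa in KYTE_DOOLITTLE]


def composition(seq):
    """Return the mole percentage of each standard amino acid."""
    residues = _clean(seq)
    counts = Counter(residues)
    return {aa: 100.0 * counts[aa] / len(residues) for aa in KYTE_DOOLITTLE}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    residues = _clean(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aliphatic_index(seq):
    """Aliphatic index: Ala + 2.9 * Val + 3.9 * (Ile + Leu), in mole percent."""
    comp = composition(seq)
    return comp["A"] + 2.9 * comp["V"] + 3.9 * (comp["I"] + comp["L"])
```

Applying both functions to an Antarctic predicted protein and to each of its mesophilic homologs would give the per-sequence values that feed the statistical comparison.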
Nucleotide sequence accession numbers.
The annotated sequences
determined for this study have been submitted to GenBank under
accession numbers DQ295237 to
DQ295242.
- and
-proteobacteria, respectively) represent commonly detected
phylogenetic groups, although Ant4D3 is only distantly related (91% SSU
rRNA gene sequence identity) to its nearest cultivated relative (Fig.
1). Four of the fragments contained rRNA genes, while Ant29B7 and Ant24C4
did not. These inserts contained conserved protein coding regions that
made phylogenetic identification possible.
TABLE 1. Summary
of the six environmental genome fragments analyzed
FIG. 1. Distribution
of SSU rRNA gene phylotypes identified in the late-winter Antarctic
bacterioplankton clone library. Solid bars represent numbers of unique
phylotypes in the designated phyla/classes, and open bars represent
total numbers of phylotypes found in the
phyla/classes.
-proteobacteria contained
50 ORFs, with only 5 ORFs identified as either hypothetical or
conserved hypothetical proteins. The majority of the coding regions
(68%) were most closely related to
-Proteobacteria
and were typically >70% identical to
-Proteobacteria ORF
homologs.
Affiliation of genomic fragments without SSU rRNA genes.
BLASTP searches against a newer
nonredundant database than the one used for the BER search were
performed with each of the ORFs identified in Ant24C4. Silicibacter
pomeroyi, a common coastal marine bacterium and the only
Roseobacter sp. with a known genome sequence, was added to
this newer database and was consistently identified as the closest
homolog (e.g., ORFA019 had 77% identity and ORFA038 had 73% identity;
see Table S2 in the supplemental material for protein distances of six
of the predicted proteins). Protein distance matrices for six amino
acid sequences versus the top BLASTP hits in the nonredundant database
helped substantiate the conclusion that Ant24C4 is most likely
affiliated with the α-Proteobacteria, specifically the
Roseobacter clade (see Table S2 in the supplemental
material).
Phylogenetic analysis of the universal target region of the groEL gene product (21) placed Ant29B7 within the CFB group (Fig. 2). Ant29B7 branches with 14 members of the CFB for which groEL sequences are available. The closest related genome sequences to Ant29B7, those of Bacteroides thetaiotaomicron and Porphyromonas gingivalis (18/30 predicted proteins listed in Table S1 in the supplemental material), were also identified from protein distance comparisons to top 10 BLASTP hits (data not shown).
FIG. 2. Neighbor-joining
tree representing the 184- to 186-amino-acid universal
target region of groEL gene products from Ant29B7 and its
nearest
neighbors.
FIG. 3. Linear
ORF maps for the six fully sequenced fosmids from the Antarctic marine
picoplankton library. ORFs are color coded according to their COG
affiliations and to highlight ribosomal operons, where they exist.
Highly conserved genes (equivalogs) with gene names in TIGR families
(hidden Markov models) are numbered in the figure and highlighted in
bold in Table S1 in the supplemental
material.
(ii) Amino acid synthesis and transport.
COG categories
identified by color in Fig.
3 indicate the large
number of amino acid transport and metabolism genes identified in the
Antarctic genome fragments, especially for Ant4E12 and Ant4D3. Ant4E12
contains a large operon for glutamate biosynthesis that spans almost 13
kb (ORFF007 to -17). Ant4D3 contains genes for two hydrolases
responsible for histidine biosynthesis (ORFD002 and -3), asparaginase
(ORFD019), and a suite of amino acid transporters (ORFD021 to -24).
There were also genes for aspartate and glutamate biosynthesis (ORF045,
-047, and -050). Two copies of the gene for an ABC-type polar amino
acid transport protein appear consecutively in this environmental clone
(ORFD022 and ORFD023). The coding regions are significantly divergent
and are duplicated in other bacterial genomes. Phylogenetic analysis
showed clustering of these two proteins in distinct branches (see Fig.
S1 in the supplemental material). The ABC transporter from ORFD022
branches together with representatives from the
-Proteobacteria, while the ABC transporter from
ORFD023 branches between two groups of
-Proteobacteria.
(iii) Cold metabolism ORF products.
Numerous predicted proteins from each
bacterial genome fragment could play an important role in cold
tolerance. Chaperones were identified in three fragments,
namely, Ant4D5 (groES; ORFE011), Ant24C4 (dnaK;
ORFA058), and Ant29B7 (groES and groEL;
ORFB033 and -34). Three other cellular role categories
suspected to be important for life in the cold were protein synthesis,
DNA metabolism and transcriptional regulation, and DNA binding
proteins. Numerous tRNA aminoacylation and nucleotide base-modifying
proteins were annotated, including synthetases (Ant4D5, ORFE025;
Ant39E11, ORFC031; and Ant4E12, ORFF005), transferases (Ant39E11,
ORFC008; and Ant29B7, ORFB026), and a hydrolase (Ant24C4, ORFA014).
Genes involved in DNA metabolism included those encoding a
DNA helicase (Ant39E11, ORFC013), topoisomerase I (Ant29B7, ORFB025), and DNA polymerase IIIb (Ant39E11, ORFC). Ant4D3 contained
recG (ORFD025), an RNA polymerase subunit gene (rpoZ;
ORFD027), a tyrosine recombinase gene (ORFD043), and a CbbY-like
protein gene (ORFD053). At least one transcriptional regulatory-like
protein was identified in each bacterial fragment, except Ant4E12, and
these proteins ranged in size from 70 to 420 amino
acids.
Amino acid usage. (i) Arginine and proline.
There were 123 and
113 predicted proteins that had at least two homologs (E value,
<10⁻¹⁵) in the G+C-specific and
SwissProt/UniProt databases, respectively. Amino acid sequences from
all Antarctic predicted proteins, except those for Ant24C4, showed
significantly reduced Arg/(Arg + Lys) ratios compared to their
mesophile homologs (Table
2). These results were consistent regardless of the G+C content of
the genomes under comparison. Ant4D3 had significantly reduced arginine
usage in 22 of 34 ORF products analyzed against the SwissProt/UniProt
database. Half of the amino acid sequences analyzed were also
significantly different from homologs in the G+C-specific
protein database. Only two and five ORFs had significantly increased
Arg/(Arg + Lys) ratios compared to the SwissProt/UniProt and
G+C-specific databases, respectively.
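The Arg/(Arg + Lys) comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the Statistica procedure used in the study: the mesophile homolog values (n = 2 to 5) are treated as the sample and the single Antarctic value as the reference for a one-sample t statistic. The function names and the example numbers in the usage note are hypothetical.

```python
import math
from statistics import mean, stdev


def arg_lys_ratio(seq):
    """Return Arg/(Arg + Lys) for a protein sequence, or None if it has neither."""
    seq = seq.upper()
    arg = seq.count("R")
    lys = seq.count("K")
    if arg + lys == 0:
        return None
    return arg / (arg + lys)


def one_sample_t(sample, reference):
    """Return (t, degrees of freedom) for sample values against one reference value.

    `sample` holds the values for the mesophilic homologs and `reference`
    is the value for the Antarctic protein (an assumed reading of the test).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two homolog values are needed")
    standard_error = stdev(sample) / math.sqrt(n)
    if standard_error == 0:
        raise ValueError("homolog values are identical; t is undefined")
    return (mean(sample) - reference) / standard_error, n - 1
```

For example, `one_sample_t([0.6, 0.7, 0.8], 0.4)` returns a t statistic of about 5.2 with 2 degrees of freedom; the P value would then be read from a t distribution, a step left out of this sketch.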
TABLE 2. Results
of amino acid analysis for predicted proteins from six Antarctic
bacterial genome fragments versus their mesophilic
homologs
(ii) Aliphaticity and GRAVY.
The
aliphatic index was significantly lower for 35 of 107 (33%) sequences
analyzed against custom G+C-specific databases. Seventeen amino
acid sequences (16%) had significantly increased aliphatic
indices compared to those of the mesophiles. The
results were similar for amino acid sequences analyzed against the
SwissProt/UniProt database (Table
2). ORFs in Ant4D3,
Ant24C4, Ant4D5, and Ant39E11 showed significant reductions in
aliphaticity. The calculations of GRAVY indicated that for the
sequences with significant BLASTP results, there was no increase in
hydrophilicity for the ORF products from the Antarctic genome
fragments, except for Ant4D5. Sequences analyzed against the
G+C-specific databases indicated that 24/107 had significantly
lower GRAVY indices than their homologs, while 31/107 were calculated
to have significantly higher indices than their mesophile homologs.
These results were similar for sequence comparisons to the
SwissProt/UniProt database.
(iii) Reduced Glu and Asp content.
In every genome fragment
except Ant24C4, there was an overwhelming reduction in the acidic
residues Glu and Asp compared to mesophile homologs in either database
(Table 2). Over 60% of the
sequences analyzed against the G+C-specific database had a
significant reduction in Glu and/or Asp. Less than 15% of the putative
proteins had increases in these residues, and almost all (11 of 14) of
these came from Ant24C4. Ant24C4 was the only fragment in this or any
amino acid usage category to exhibit a strong trend with the
G+C content of the genomes under comparison. Of the 19 Ant24C4
sequences analyzed against the G+C-specific database, 11 had
increased Glu and Asp usage and 1 had decreased usage. Fewer sequences
had significant alterations, with either increased or decreased usage
(three and two, respectively), compared to the SwissProt/UniProt
database.
Protein disorder.
One hundred twelve amino acid sequences
from Antarctic genome fragments and 509 mesophilic homologs from
genomes whose G+C contents were similar
(±2.5%) to those of the Antarctic fosmid inserts were
analyzed for regions of disorder, using the VL3 and VLXT algorithms
(for regions of disorder of >45 residues)
(11). The longest region
of disordered residues was longer in 27/111 sequences from the
Antarctic fragments than in their homologs, in contrast to only 9 that
were shorter (Table
3). Ant4D3 had only eight predicted proteins with significantly more total
disordered residues than their closest homologs, while 19 sequences had
fewer total disordered residues. We did not exclude data based on a
minimum number of disordered residues or based on the average
prediction score (PONDR score; Table
3), and there is a high
variance in the data with low PONDR scores (i.e., few disordered
residues). Amino acid sequences from Ant4E12 appeared to be more
disordered than those from their mesophilic homologs. The same trends
were found for the average PONDR scores. The VLXT algorithm is a better
predictor of long disordered regions (>45 residues). The
stretches of disordered residues of >45 amino acids calculated
with the VLXT algorithm were also compared. In every instance but two,
the Antarctic amino acid sequences had longer disordered
stretches.
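The disorder measures compared here (longest disordered stretch, total disordered residues, and average score) can be derived from a list of per-residue scores. The sketch below is not the licensed PONDR software and does not reproduce the VL3 or VLXT predictors. It assumes that per-residue scores between 0 and 1 are already available and that a residue counts as disordered at a score of 0.5 or higher, a commonly used cutoff that is an assumption here. The function name is hypothetical.

```python
def disorder_summary(scores, threshold=0.5):
    """Summarize per-residue disorder scores for one protein.

    Returns the longest run of consecutive disordered residues, the total
    number of disordered residues, and the mean score. A residue is called
    disordered when its score is >= threshold (assumed cutoff).
    """
    longest = 0
    current = 0
    total = 0
    for score in scores:
        if score >= threshold:
            current += 1
            total += 1
            longest = max(longest, current)
        else:
            current = 0
    mean_score = sum(scores) / len(scores) if scores else 0.0
    return {
        "longest_run": longest,
        "total_disordered": total,
        "mean_score": mean_score,
    }
```

Comparing `longest_run` for an Antarctic protein with the values for its mesophilic homologs mirrors the comparison of longest disordered regions, and filtering for `longest_run > 45` mirrors the long-region criterion applied with VLXT.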
TABLE 3. Results
of PONDR for predicted proteins from six Antarctic bacterial genome
fragments versus their mesophilic homologs
Environmental genomic library and description of sequenced fosmid inserts.
The library contained at least 105
bacterial rRNA gene-containing inserts representing 56 different rRNA
gene sequences (Fig. 1).
We selected fosmid clones that were ecologically relevant (e.g.,
Ant4D3, an abundant γ-proteobacterial phylotype) and/or
phylogenetically unique (e.g., Ant4D5, with one
Gemmatimonadetes phylotype).
The Ant4D3 SSU rRNA gene affiliated with the γ-Proteobacteria was the most
commonly encountered SSU rRNA gene sequence in the late-winter clone
library (14/105 sequences detected). PCR-DGGE analysis of Antarctic
planktonic rRNA gene fragments revealed this phylotype at different
depths and throughout the year (see Fig. S2 in the supplemental
material). This is in contrast to Ant39E11, which was absent at these
depths and during these time points (see Fig. S2, ladder marker B, in
the supplemental material). Ant4D3 is distantly related to a cultivable
representative (91%) (28)
within the OMG and much more closely related (95 to 99%) to SSU rRNA
clones from polar environments, including a common phylotype detected
in DGGE surveys of the Arctic ocean
(2). The Ant4D3 genome
fragment sequence contributes functional information regarding amino
acid biosynthesis, DNA metabolism, and protein translocation and
transport of this potentially important microorganism in polar
waters.
Genomic sequences from two other fosmid clones
are related to bacterioplankton from the commonly occurring
Roseobacter group (Ant24C4) and the abundant CFB
group (Ant29B7) in Antarctic waters. Ant24C4 was closely affiliated
with α-Proteobacteria, most
specifically with the recently reported
Silicibacter pomeroyi
(26).
Phylogenetic analysis of the universal target region
(21) of the
groEL gene indicated that Ant29B7 was closely related to
bacteria in the CFB group.
Ant4E12 is the first marine
actinobacterium-related genome fragment to be sequenced. This group has
often been encountered in oceanographic surveys (3.5% of
environmental sequences and isolates recovered from seawater)
(37). Although the roles
of this group in ocean systems are not known, their abilities to
degrade high-molecular-weight carbon and to produce antimicrobial
compounds are of interest.
Two clones, Ant4D5 and Ant39E11, were sequenced that were distantly related to cultivated bacteria (<90% over the entire SSU rRNA gene; Table 1) but appear to be periodically active in the environment. Each clustered with several uncultivated rRNA gene environmental clones from either deep-sea or polar latitudes (92 to 96% identical). In the case of Ant4D5, this is the first description of genome characteristics of this newly described bacterial phylum, i.e., Gemmatimonadetes (41).
Three themes: biogeochemistry, amino acid biosynthesis/transport, and cold tolerance.
Little is known
about the physiology of the Antarctic microbes targeted in this study.
Small genome fragments provide interesting snapshots of the
physiological and metabolic capabilities of an organism and are
practical for complex community screening
(3, 4, 8, 19). Caution should be
used when interpreting the biogeochemical relevance of genes or genome
fragments without expression studies or rate measurements. However,
gene contents may accelerate our ability to bring related organisms
into culture or to measure their activities in situ.
Many predicted proteins fell within the broad themes of biogeochemical relevance, amino acid synthesis and transport, and cold tolerance. Gene finding identified an anaerobic ferrous iron transport operon (ORFC019 and -020) in Ant39E11. Total iron, but especially ferrous iron, concentrations in the surface ocean are very low (24); therefore, it is assumed that iron acquisition is an important bacterioplankton process. Two peptidases in Ant39E11 were identified from the M25 family (ORFC043) and the M23 family (ORFC032). Uncommon peptidases such as these two may have ecological relevance in Antarctic marine bacteria that rely on the short periods of high phytoplankton biomass for their carbon. Members of the CFB group in marine environments are commonly thought to be associated with particles (9) and are abundant in the Southern Ocean (1). Ant29B7, also a CFB member, contained genes for gliding motility and ammonia metabolism.
Ant24C4, the Roseobacter-affiliated bacterial genome fragment, contained predicted proteins implicated in phosphonate utilization and ammonia oxidation. Interestingly, ammonia oxidation was not reported for Silicibacter pomeroyi, the only Roseobacter for which a complete genome sequence is available (26). The putative amoA gene from Ant24C4 is more similar to that of Magnetospirillum magnetotacticum (37% amino acid identity) than to that of Silicibacter pomeroyi (29% identity).
Given the importance and cost of protein synthesis, there are two main reasons to hypothesize that amino acid transport and biosynthesis pathways might be interesting in cold environments. First, given the high cost of amino acid biosynthesis (especially the high-molecular-weight residues Trp, Tyr, Arg, and Phe), scavenging amino acids in cold environments may be important and cost-effective. Second, when this scavenging is not possible, the proteins involved in biosynthesis will be modified in comparison to their mesophilic homologs. All of the putative proteins for Ant4D3 and Ant4E12 that were implicated in amino acid biosynthesis had at least two amino acid modifications (described in Table 2) that are indicative of cold adaptation. Twelve of these 14 putative proteins had significant reductions in the Arg/(Arg + Lys) ratio compared to their mesophilic homologs (further described in the following section).
Putative proteins involved in cellular roles (chaperones, protein synthesis, DNA metabolism, transcriptional regulation, and DNA binding proteins) that might be particularly affected by cold were abundant in our data set. These putative proteins appear to be highly modified, particularly in terms of decreased Arg/(Arg + Lys) ratios and decreased polar residue usage. In Ant4D3, all five of the putative proteins within these role categories had significantly reduced Arg/(Arg + Lys) ratios. Four of the five had decreased usage of polar residues, and four had significantly increased serine usage (data not shown). Our global analysis did not turn up significant increases in serine usage in the Antarctic bacteria compared to their mesophilic homologs; however, a significant increase in serine usage was reported for Colwellia psychrerythraea 34H (25) compared to mesophile and thermophile genomes.
Amino acid usage and cold adaptation.
For enzymes, increased flexibility and
decreased stability translate into greater entropy
(17). The thermodynamic
effect of cold adaptation is a reduction in the temperature dependence
of the maximum catalytic rate
(14, 15). This can
be achieved through structural plasticity and decreased stability
during the activation of an enzyme-substrate complex, resulting in a
reduction of the activation enthalpy and an increase in the activation
entropy of the reaction. These consequences are similar to reactions
involving intrinsically disordered protein motifs
(11). These are regions
without a defined three-dimensional structure or a time-averaged
canonical set of Ramachandran angles
(11). The kinetic effects
of disorder mimic the effects of cold adaptation, including high
specificities coupled to low affinities, binding plasticity, the
creation of very large interaction surfaces, and higher rates of
association and dissociation
(11, 13, 15, 17, 18).
The analysis of all Antarctic ORFs with significant BLASTP results (at least two BLASTP hits with E values of <10⁻¹⁵) indicated widespread adaptations to the cold, stenothermal environment. About half of the predicted proteins and their closest homologs in the two types of databases used (SwissProt/UniProt and G+C-specific databases) were analyzed for amino acid usage, disordered regions, hydrophilicity, polarity, and aliphaticity. The results suggest that, for these analyses, there is no significant G+C effect on amino acid usage. Finally, it appears that proteins from psychrotrophs have more extensive regions of natural disorder than do those from mesophilic bacteria.
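The selection criterion restated above (at least two BLASTP hits below an E-value cutoff of 10⁻¹⁵, with up to five best matches retained per query) can be sketched as a simple filter. This is an illustration only and not the authors' parsing code. The input is assumed to be a list of `(query, subject, evalue)` tuples, for example parsed from tabular BLAST output, and the function name and default parameter names are hypothetical.

```python
from collections import defaultdict


def select_homologs(hits, max_evalue=1e-15, min_hits=2, max_keep=5):
    """Keep queries with enough significant hits and return their best matches.

    `hits` is an iterable of (query_id, subject_id, evalue) tuples.
    A query is retained only if at least `min_hits` of its hits have an
    E value below `max_evalue`; up to `max_keep` subjects are returned,
    ordered from lowest to highest E value.
    """
    significant = defaultdict(list)
    for query, subject, evalue in hits:
        if evalue < max_evalue:
            significant[query].append((evalue, subject))

    selected = {}
    for query, scored in significant.items():
        if len(scored) >= min_hits:
            scored.sort()  # lowest E value first
            selected[query] = [subject for _, subject in scored[:max_keep]]
    return selected
```

Queries that pass this filter would then be paired with their retained homologs for the amino acid usage and disorder comparisons.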
(i) Reductions in arginine and proline.
Decreased
arginine and proline usage in psychrophilic enzymes has two main
effects, namely, a reduction in salt bridges and an increase in the
entropy of the unfolded protein
(13, 33). Arginine-to-lysine
substitutions that lower the Arg/(Arg + Lys) ratio were
significant in a previous study comparing 21 psychrophilic enzymes
(18). In almost every
protein included in our analysis, the ratio Arg/(Arg + Lys) was
reduced compared to those of significant BLASTP matches. Perhaps given
the frequency of this occurrence, the psychrophilic benefit, and the
simplicity of the Arg-to-Lys substitution (a second-position G-to-A
purine substitution), this amino acid usage "rule" in
cold-adapted proteins is G+C independent. As more psychrophile
genomes become available, this hypothesis will be interesting to
test.
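The codon-level basis of the Arg-to-Lys substitution can be checked directly against the standard genetic code. The short sketch below builds the standard codon table and lists what each arginine codon becomes after a G-to-A change at the second position. The function name is hypothetical. Under the standard code, only the AGA and AGG codons yield lysine by this single change, while the four CGN codons yield histidine or glutamine.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def second_position_g_to_a(amino_acid="R"):
    """Map each codon of `amino_acid` with G at position 2 to its G-to-A mutant.

    Returns {original_codon: (mutant_codon, mutant_amino_acid)}.
    """
    changes = {}
    for codon, aa in CODON_TABLE.items():
        if aa == amino_acid and codon[1] == "G":
            mutant = codon[0] + "A" + codon[2]
            changes[codon] = (mutant, CODON_TABLE[mutant])
    return changes
```

Calling `second_position_g_to_a()` and keeping the entries whose mutant amino acid is `"K"` gives the arginine codons from which lysine is one transition away.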
(ii) Reductions in aliphaticity and GRAVY.
In general, it is
assumed that an increase in protein flexibility is an advantage at cold
temperatures (13). An
increase in the protein core flexibility through reduced hydrophobic
and/or nonpolar interactions may increase cold-temperature reaction
rates of enzymes and increase the efficiency of other
structure-dependent processes such as substrate and DNA binding
(14, 17, 18). The analyses of
aliphatic indices and GRAVY scores did not yield the same results. This
could be a result of our analysis methods, which do not account for the
class or structural regions (exposed versus buried) of putative
proteins being analyzed. Presumably, there are various degrees of
selective pressure on amino acid usage for the various functional
classes of proteins. A more detailed analysis separating functional
classes and including whether residues are buried or exposed is needed
(18, 25, 34). Nonetheless, our
results suggest that there are significant characteristics of amino
acid usage associated with these Antarctic bacterial genome fragments.
For example, the arginine repressor argR (ORFF08) from Ant4E12
encodes a signal protein that shows numerous adaptations to cold,
including a lowered proline content and a deletion of 12 hydrophobic
residues immediately following the disordered DNA binding domain. The
tatA gene (ORFD04) from Ant4D3 also encodes an amino acid
deletion of a stretch of 10 extremely hydrophobic residues, including
Val, Ile, and Leu. While these deletions do not significantly change
the GRAVY score of the entire protein, a comparison of the
tertiary structures of these Antarctic bacterial proteins with a
mesophile structure would help elucidate the importance of such
hydrophobic deletions on structure and function.
The
artM gene duplication is found in many bacteria.
Phylogenetic analysis of the two predicted artM
protein products shows that they branch in separate clusters (see Fig.
S1 in the supplemental material). One branch is highly ordered based on
phylogenetic relationships within the Bacteria and within
class in the case of the Proteobacteria. ORFD022 from Ant4D3
in the phylogenetically ordered branch has a 16-amino-acid deletion of
hydrophobic residues. The predicted protein from ORFD023 has numerous
insertions and deletions compared to its nearest
neighbors, including members of the
- and
-Proteobacteria. The deletions are
almost always from very hydrophobic regions, suggesting that the
structural rigidity imparted by hydrophobicity is minimized in this
cold-adapted protein. Future work will identify if the two proteins are
expressed.
(iii) Reductions in Glu and Asp.
The ratio (Asn +
Gln)/(Asn + Gln + Glu + Asp) strongly favored
the less polar residues in the Antarctic bacterial amino acid
sequences. Only ORFs from Ant24C4 did not demonstrate this tendency.
Otherwise, the trend was independent of the G+C content of the
database. Using substitution matrices, Gianese et al. found that a
Glu→Ala substitution was one of the most significant in their
cold adaptation structural analysis
(18). Glutamate
substitutions were favored in helix structures and in exposed positions
of the tertiary structure. The effect of this substitution is to reduce
ion pairing, H bonding, and other electrostatic interactions. Also, the
substitution of a hydrophilic residue for a hydrophobic one on the
surface may destabilize the protein
(7). While we did not
discriminate between exposed and buried residues, the overwhelming
trend in our data suggests that these polar residues are often
substituted.
(iv) Increases in disorder.
Based on analyses
of disorder in the predicted proteins of the Antarctic bacterial genome
fragments, our data suggest that disordered regions of amino acids are
longer in psychrotrophic and psychrophilic bacteria. Our hypothesis is
that disorder increases entropy and is necessary to compensate for the
structural rigidity encountered at lower temperatures. Cold
temperatures place demands on access to enzymatic active sites and
binding regions. These demands can be mitigated via increased disorder,
decreased hydrophobicity, reduced salt bridges, and hydrophilic
insertions and are common phenomena in the Antarctic bacterial amino
acid sequences analyzed here.
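The length of disordered stretches can be measured once a per-residue disorder score is available. The sketch below assumes that such scores have already been produced by some disorder predictor and that a score at or above 0.5 marks a residue as disordered. Both the cutoff and the function names are illustrative assumptions, not the procedure used in this study.

```python
from itertools import groupby
from typing import List, Sequence


def disordered_stretches(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Lengths of consecutive runs of residues scoring at or above threshold."""
    flags = (score >= threshold for score in scores)
    return [sum(1 for _ in run) for is_disordered, run in groupby(flags) if is_disordered]


def longest_disordered_stretch(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Length of the longest disordered run, or 0 if no residue is disordered."""
    return max(disordered_stretches(scores, threshold), default=0)


# Toy per-residue scores (hypothetical values).
scores = [0.1, 0.6, 0.7, 0.2, 0.9, 0.9, 0.9, 0.1]
print(disordered_stretches(scores))
print(longest_disordered_stretch(scores))
```

Applying the same cutoff to an Antarctic sequence and to its mesophilic homologs keeps the stretch lengths comparable.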
We have found that proteins involved in regulation and signal transduction have altered amino acid usages compared to their mesophilic homologs. This is probably due to the need for the structural plasticity necessary for signal binding or recognition, as reported in other cold adaptation genome studies (11, 34). For example, tatB (ORFD05) encodes a significant C-terminal modification to a proline-rich region following an extremely disordered putative binding site. The transcriptional regulator encoded by atoC (ORFB029) contains two binding sites that under cold, rigid conditions require enhanced flexibility and binding access. A reduced number of prolines and a more disordered region around the ATP binding site define the C-terminal section of the protein. The N-terminal helix-turn-helix DNA binding site is defined by much weaker order-disorder transitions, resulting in a higher average disorder strength than those of the mesophiles. These are definitive cold adaptation mechanisms (13).
Among the chaperones, the groES ORF (ORFE011) from Ant4D5 encoded a polar insertion of 16 amino acids that lengthened a stretch of disordered amino acids. The chaperonin genes groES and groEL were also present in the Ant29B7 fosmid. Similar to the groES gene from Ant4D5, ORFB033, the groES gene from Ant29B7, encoded a hydrophobic amino acid deletion. The disordered stretch of the Ant29B7 protein was longer than those of similar mesophile proteins, providing more flexibility.
A sulfur transferase with a rhodanese domain was found in Ant29B7 (ORF061). This protein was significantly more hydrophilic in Ant24C4 than in its mesophilic homologs because of a large 16-amino-acid deletion of hydrophobic amino acids. The N-terminal side of the protein consisted of a disordered region of 40 bases. Stretches of disordered residues were consistently found to be longer in the Antarctic bacterial amino acid sequences than in their mesophilic homologs (Table 3).
These Antarctic bacterial shotgun clones (40 to 44 kb each) have provided a diverse suite of genomic information about central metabolism, environmental sensors, stress responses, cellular transport, and amino acid modifications that demonstrate cold adaptation. Furthermore, our approach has enabled us to study both environmentally relevant organisms and unique members of the community that lack cultivated close relatives or available genome sequence information (with the exception of Ant24C4, which appears to fall in the Roseobacter clade). Amino acid analysis of the coding regions from these psychrophilic/psychrotrophic marine bacteria spanning four phyla revealed pervasive amino acid modifications characteristic of cold adaptability or decreased structural rigidity. If these modifications are ubiquitous in Antarctic marine organisms, we expect to find similar adaptations in the deep-sea microbes that inhabit >75% of the ocean below 4°C.
We are especially indebted to our collaborators E. Rubin and associates at the Joint Genome Institute for DNA sequencing services. We thank the DRI IT staff, including S. Liu and P. Neeley, and B. Beck at the Nevada Center for Bioinformatics for providing direction in protein structure analysis. We thank Mihailo Kaplarevic and Garrett Taylor for sharing their programming and database expertise. We also thank J. Campbell and Integrated Genomics, Chicago, IL, for access to Polaribacter filamentus genome data and the Joint Genome Institute for providing DNA sequencing.
Supplemental material for this article may be found at http://aem.asm.org/.
Present address: Symbio Corporation, 1455 Adams Dr., Menlo Park, CA 94025.