Applied and Environmental Microbiology, April 2009, p. 2139-2147, Vol. 75, No. 7
0099-2240/09/$08.00+0 doi:10.1128/AEM.02352-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Ea21-4 and Its Relationship to Salmonella Phage Felix O1
,
Andrew M. Kropinski,2
Alan J. Castle,1 and
Antonet M. Svircev3*
Department of Biological Science, Brock University, 500 Glenridge Avenue, St. Catharines, Ontario L2S 3A1, Canada,1 Public Health Agency of Canada, Laboratory for Foodborne Diseases, 110 Stone Road West, Guelph, Ontario N1G 3W4, Canada,2 Agriculture and Agri-Food Canada, 4902 Victoria Avenue North, P.O. Box 6000, Vineland Station, Ontario L0R 2E0, Canada3
Received 14 October 2008/ Accepted 22 January 2009
Ea21-4, infecting Erwinia amylovora, Erwinia pyrifoliae, and Pantoea agglomerans strains has been determined. The unique sequence of this terminally redundant, circularly permuted genome is 84,576 bp. The
Ea21-4 genome has a GC content of 43.8% and contains 117 putative protein-coding genes and 26 tRNA genes.
Ea21-4 is the first phage in which a precisely conserved rho-independent terminator has been found dispersed throughout the genome, with 24 copies in all. Also notable in the
Ea21-4 genome are the presence of tRNAs with six- and nine-base anticodon loops, the absence of a small packaging terminase subunit, and the presence of nadV, a principal component of the NAD+ salvage pathway, which has been found in only a few phage genomes to date.
Ea21-4 is the first reported Felix O1-like phage genome; 56% of the predicted
Ea21-4 proteins share homology with those of the Salmonella phage. Apart from this similarity to Felix O1, the
Ea21-4 genome appears to be substantially different, both globally and locally, from previously reported sequences. A total of 43 of the 117 genes are unique to
Ea21-4, and 32 of the Felix O1-like genes do not appear in any phage genome sequences other than
Ea21-4 and Felix O1. N-terminal sequencing and matrix-assisted laser desorption ionization-time of flight analysis resulted in the identification of five
Ea21-4 genes coding for virion structural proteins, including the major capsid protein.
In the case of Erwinia amylovora, the causative agent of fire blight, many infecting phages have been described, encompassing all three families of Caudovirales, the tailed phages (13, 18, 33, 54, 59, 74). The diversity of these phages, particularly those described by Gill et al. (18), is extensive, but sequence information is limited to the unpublished genome of Era103 (GenBank accession no. NC_009014) and a 3.3-kb region of the
Ea1h genome (26). Both of these phages are members of the Podoviridae (54, 74). No sequence data have previously been described for E. amylovora phages belonging to the Siphoviridae or the Myoviridae. While the precise nature of their evolutionary and taxonomic relationships has been complicated by processes such as horizontal gene transfer, there are usually substantial differences among the proteomes of phages in different families (49, 56).
To address this lack of information, we sequenced the genome of E. amylovora phage
Ea21-4. This phage was previously isolated from soil beneath a pear tree showing active fire blight symptoms (17, 18; referred to as Eram4 in reference 17). It belongs to the Myoviridae family, and restriction analysis suggested a genome size of
approximately 75 kb (17), which is almost twice the size of the Era103 genome. Like all of the phages isolated by Gill et al. (18),
Ea21-4 also has a broader host range than the E. amylovora phages isolated in other studies. In addition to infecting a genetically diverse range of E. amylovora isolates (17, 33),
Ea21-4 readily infects isolates of the Asian pear pathogen, Erwinia pyrifoliae (A. M. Svircev, W. S. Kim, and H. Bench, unpublished), and many isolates of the related orchard epiphyte, Pantoea agglomerans (33). It has shown great promise as a biological control agent for fire blight but could not be detected with the available PCR tools (33). Sequence analysis of
Ea21-4 should permit the development of new tests for monitoring populations of this and related phages in the orchard.
Here, we report our analysis of the complete genome of E. amylovora phage
Ea21-4, the first sequence information from an E. amylovora phage belonging to the Myoviridae, and only the second complete genome of an E. amylovora phage. We evaluate the genomic and proteomic relationships of
Ea21-4 to previously sequenced phage genomes, particularly Salmonella phage Felix O1, which is its closest known relative at this time.
Ea21-4 and its host, E. amylovora Ea6-4, have been described previously (17, 18, 33). For genomic sequencing,
Ea21-4 lysates were prepared in liquid culture and syringe filtered using 0.2-µm-pore-size surfactant-free cellulose acetate filters (Nalgene, Rochester, NY). Phage Felix O1 was obtained from Nammalwar Sriranganathan (Virginia Polytechnic Institute and State University, Virginia-Maryland Regional College of Veterinary Medicine, Blacksburg, VA).
Transmission electron microscopy (TEM).
Suspensions of at least 10⁹ PFU/ml of
Ea21-4 were prepared by centrifuging 10 ml of filtered lysate at 16,000 x g for 75 min and then resuspending the phage pellet in 1 ml of 5 mM Tris-HCl (pH 8.0) containing 0.1 mM EDTA. Otherwise, microscopy procedures were as previously described (18).
DNA isolation and sequencing.
Genomic phage DNA was isolated by organic extraction. Ten milliliters of lysate was concentrated by centrifugation at 16,000 x g for 45 min at 4°C and then resuspended in 700 µl of SM buffer (50 mM Tris-Cl [pH 7.5], 0.1 M NaCl, 8 mM MgSO4). Bacterial nucleic acids were digested by adding 1 µl each of 1 mg/ml of DNase I and RNase A and then incubating the mixture at 37°C for 30 min. An equal volume of 20% (wt/vol) PEG 8000 in 2.5 M sodium acetate was added, and the mixture was incubated on ice for 2 h. After centrifugation at 15,000 x g for 10 min at 4°C, the phage pellet was resuspended in 500 µl of SM buffer. Phage particles were lysed by adding 5 µl of 10% sodium dodecyl sulfate (SDS) and 500 µl of 0.5 M EDTA (pH 8.0), followed by incubation at 65°C for 15 min. Phage nucleic acids were then purified in a three-stage organic extraction using equal volumes of buffered phenol, then phenol-chloroform (1:1), and finally chloroform. At each stage the aqueous and organic phases were mixed by gentle inversion for 3 min, followed by centrifugation at 13,500 x g for 5 min at room temperature. DNA was precipitated by adding sodium acetate to a final concentration of 0.3 M and adding an equal volume of 100% ethanol, until the phage DNA had just precipitated (more ethanol resulted in the coprecipitation of bacterial polysaccharides). The precipitated DNA was collected by centrifugation at 15,000 x g for 15 min at 4°C, washed with 70% ethanol, and air dried before being resuspended in 10 mM Tris-HCl (pH 7.5) and stored at –20°C. DNA concentration was determined by sample absorbance at 260 nm (DU640 spectrophotometer; Beckman Coulter, Mississauga, Ontario, Canada).
Approximately 100 µg of DNA was sent to Agencourt Bioscience (Beverly, MA) for sequencing and assembly: the shotgun genome library was constructed by mechanically shearing the
Ea21-4 DNA; the resulting 3- to 4-kb fragments were cloned into their proprietary pAGEN vector using the BstXI adaptor; Escherichia coli colonies carrying the cloned vectors were selected; and inserts from the selected clones were sequenced for 10x coverage. The genome was finished by primer walking to a Phrap 40 quality. Sequencing reactions were conducted by using an ABI Prism 3730xl DNA analyzer and BigDye Terminator v3.1 reagents (Applied Biosystems, Carlsbad, CA).
Sequence analysis.
Correspondence between the genome sequence and the original phage was confirmed by comparing virtual digests of the sequence with the restriction patterns produced by digesting phage DNA with MvnI (Roche Diagnostics, Laval, Quebec, Canada), BglII (MBI Fermentas, Hanover, MD), EcoRI (Invitrogen Canada, Burlington, Ontario, Canada), and BamHI (New England Biolabs, Ipswich, MA) according to the manufacturers' instructions. Open reading frames were identified by using Kodon (version 3.1; Applied Maths, Austin, TX). Genes were identified from among the predicted coding sequences (CDSs) based on the presence of an ATG or GTG start codon, followed by at least 30 additional codons, and an upstream sequence resembling the consensus ribosome-binding site, GGAGGT (62, 63). Similarity to described proteins in the global database (available through the National Center for Biotechnology Information [NCBI] at www.ncbi.nlm.nih.gov) was tested by using the BLASTP algorithm. Hits were considered significant only if the e-value was
below 10⁻⁵. Protein domain searches were conducted through BLAST and by using Pfam 21.0, available from the Sanger-Wellcome Trust (15). Biochemical characteristics of proteins (molecular size, pI, etc.) were predicted by using Artemis v.9 (57).
Phage-encoded tRNA genes were identified with tRNAscan-SE v.1.21, using the default parameters (37). Supplementary tRNA classification was conducted with TFAM 1.3, using the default parameters for bacterial TFAM model 0.2 (70).
Promoters were identified based on sequence homology to the extended consensus E. coli promoter, TTGACA(N12-14)TGNTATAAT, immediately upstream of an annotated gene. Rho-independent terminators were identified using Transterm (12).
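As a rough illustration of the promoter screen described above, the extended consensus can be written as a regular expression. This is a minimal sketch that reports exact matches only; the study identified promoters by sequence homology, so near-matches that a homology search would find are missed here. The function name `find_promoters` and the test sequence are hypothetical.

```python
import re

# Extended E. coli promoter consensus from the text:
# TTGACA, a 12- to 14-base spacer, then TGNTATAAT (N = any base).
PROMOTER = re.compile(r"TTGACA[ACGT]{12,14}TG[ACGT]TATAAT")


def find_promoters(seq):
    """Return (0-based start, matched text) for each exact consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group()) for m in PROMOTER.finditer(seq)]


# Hypothetical toy sequence with a 13-base spacer, not phage sequence.
toy = "CC" + "TTGACA" + "A" * 13 + "TGCTATAAT" + "GG"
print(find_promoters(toy))
```

Only the forward strand is scanned; a fuller screen would also scan the reverse complement and require an annotated gene immediately downstream.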
DNAMAN software (Lynnon Corp., Pointe-Claire, Quebec, Canada) was used to identify direct repeats in the
Ea21-4 genome. Sequence logos were created using WebLogo v.2.8.2 (6, 60).
Genomic comparisons at the nucleotide level were made with Mauve 2.2.0 (8), using a progressive alignment with the default settings. Comparisons at the proteomic level were made using CoreGenes (http://binf.gmu.edu:8080/CoreGenes2.0/custdata.html) (78).
Protein alignments and distance phylograms were generated with CLUSTAL W2 (31, 72) using the default parameters with the neighbor-joining method. Prediction of protein transmembrane helices was conducted with TMHMM v2.0 (28). Two-way nucleotide sequence comparisons were conducted using BLAST2 (71).
Identification of major structural proteins.
Intact phage particles were purified by cesium chloride gradient centrifugation, according to standard methods (58). A microcentrifuge tube containing 0.5 ml of purified phage suspension was immersed in liquid nitrogen until frozen (
approximately 30 s) and vacuum dried overnight using a ThermoSavant MicroModulyo (E-C Apparatus, Holbrook, NY). The sample was resuspended in gel loading buffer containing 1% SDS (58) and denatured in a boiling water bath for 5 min.
Proteins were separated by denaturing gel electrophoresis (SDS-polyacrylamide gel electrophoresis) on a one-dimensional 12% gel. For N-terminal sequencing, proteins were transferred to a polyvinylidene difluoride membrane and stained using Coomassie blue R-250 according to standard protocols (58). Stained bands were excised from the membrane using a razor blade and stored at 4°C. N-terminal Edman microsequencing of the first five to six residues per protein was conducted by the Advanced Protein Technology Centre at the Hospital for Sick Children (Toronto, Ontario, Canada), and the resulting sequences were compared to the hypothetical translated sequences of all predicted open reading frames in the
Ea21-4 genome. For matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry, the major proteins isolated from Felix O1 and
Ea21-4 were excised from the Coomassie blue-stained gel and subjected to in-gel trypsin digestion as described by Zachertowska et al. (77), with the following modifications: gel slices were destained with 25 mM ammonium bicarbonate-50% (vol/vol) acetonitrile in water on a benchtop shaker for 20 min, proteins were alkylated for 30 min instead of 1 h, and the concentrated extracted peptides were acidified by adding 4 µl of 10% (vol/vol) trifluoroacetic acid. The extracted peptides were purified by using ZipTip C18 microcolumns (Millipore Corp., Bedford, MA) according to the manufacturer's instructions and spotted with
α-cyano-4-hydroxycinnamic acid matrix on an MTP 384 ground steel MALDI plate (Bruker Daltonics, Billerica, MA). Peptide spectra were acquired using a Bruker Reflex III MALDI-TOF spectrometer in reflectron detection mode with external calibration. Bruker software was used to produce peptide peak lists that were identified based on custom proteome databases constructed from the
Ea21-4 genome annotation produced during this study and the available Felix O1 genome annotation. Protein identifications were accepted only if at least four peptides per protein, giving at least 15% sequence coverage, were matched to the database prediction. One missed tryptic cleavage was allowed per peptide, and cysteine S-carbamidomethylation and methionine oxidation were included as modifications.
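The acceptance rule stated above (at least four matched peptides giving at least 15% sequence coverage) can be sketched as follows. This is an illustration under assumptions, not the procedure performed by the Bruker software: the function name, the representation of peptides as residue ranges, and the example values are all hypothetical.

```python
def accept_identification(matched_peptides, protein_length,
                          min_peptides=4, min_coverage=15.0):
    """Apply the two acceptance thresholds from the text.

    matched_peptides: list of (start, end) residue ranges, 1-based inclusive.
    Returns (accepted, percent_coverage).
    """
    covered = set()
    for start, end in matched_peptides:
        covered.update(range(start, end + 1))  # overlaps are counted once
    coverage = 100.0 * len(covered) / protein_length
    accepted = len(matched_peptides) >= min_peptides and coverage >= min_coverage
    return accepted, coverage


# Hypothetical peptides on a hypothetical 200-residue protein.
peptides = [(1, 10), (20, 30), (25, 40), (50, 60)]
print(accept_identification(peptides, 200))
```

Coverage is computed over the union of the matched ranges, so overlapping peptides do not inflate it.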
Nucleotide sequence accession numbers.
The GenBank accession number for the genome of E. amylovora phage
Ea21-4 is EU710883. The 3.3-kb fragment of
Ea1h and the complete genomes of
Era103 and Felix O1 were downloaded from the NCBI GenBank database (AJ278614, EF160123, and AF320576, respectively).
Ea21-4 genome.
Ea21-4 is a contractile, tailed virus belonging to the Myoviridae family (Fig. 1). The icosahedral head is 60 nm across, and the uncontracted tail is 90 nm long. The single-copy genome is 84,576 bp long. Its G+C content is 43.8%; like most phage genomes (55), it is AT-rich relative to the genome of its bacterial host (see Table 1).
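For readers who want to reproduce a G+C figure such as the one above from a downloaded sequence, a minimal sketch follows. The function name and the toy input are hypothetical; the code simply counts unambiguous bases and is not tied to any particular annotation tool.

```python
def gc_percent(seq):
    """Return the G+C content of a DNA sequence as a percentage."""
    counted = [b for b in seq.upper() if b in "ACGT"]  # skip ambiguous bases
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in counted if b in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical 10-base fragment, not phage sequence.
print(gc_percent("ATGCGCATTA"))
```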
FIG. 1. TEM image of E. amylovora phage Ea21-4 in the uncontracted (left) and contracted (right) states. Bar, 50 nm.
TABLE 1. General features of the genomes of three complete and partially sequenced E. amylovora phages, as well as Salmonella phage Felix O1
Ea21-4 either include substantial direct terminal repeats or are circularly permuted with respect to each other. Restriction fragment length polymorphism data support the former possibility. In general, the fragment sizes predicted by in silico restriction digests of the
Ea21-4 sequence are consistent with the restriction fragment length polymorphism patterns obtained by digesting genomic DNA with MvnI and EcoRI. However, an in silico restriction digest of the circularized genome sequence with MvnI predicts a 5,920-bp fragment that would span the repeated sequence that was obtained during the final round of terminal primer walks. Such a fragment was not seen after in vitro digestion of genomic DNA and would have been easily distinguishable from the next larger and smaller restriction fragments (6,912 and 5,362 bp) if it were present. Similarly, in silico digestion of the circularized genome sequence with EcoRI predicts a 5,688-bp fragment that is not produced during genomic digestion but that would be distinguishable from the 5,946- and 4,851-bp fragments if it were produced. These results are not consistent with a circularly permuted genome. Restriction with BglII and BamHI was also tested; both enzymes failed to digest genomic
Ea21-4 DNA, and their recognition sites are not present in the genome sequence.
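The reasoning above compares fragment sizes predicted for a circularized sequence with those expected from a linear molecule: circularization replaces the two end fragments with a single fragment that spans the junction. The sketch below illustrates that comparison on a toy sequence. It is not the software used in the study; the function names and the toy sequence are hypothetical, and only the well-known EcoRI site (G^AATTC) is used.

```python
def cut_positions(seq, site, offset, circular):
    """Sorted 0-based top-strand cut positions for one recognition site."""
    seq = seq.upper()
    n = len(seq)
    # For a circular molecule, also find sites that straddle the origin.
    text = seq + seq[: len(site) - 1] if circular else seq
    cuts = set()
    start = text.find(site)
    while start != -1 and start < n:
        cuts.add((start + offset) % n)
        start = text.find(site, start + 1)
    return sorted(cuts)


def fragment_sizes(seq, site, offset, circular):
    """Fragment lengths, largest first, for a complete digest."""
    n = len(seq)
    cuts = cut_positions(seq, site, offset, circular)
    if not cuts:
        return [n]
    if circular:
        k = len(cuts)
        # Each fragment runs from one cut to the next, wrapping at the origin.
        sizes = [((cuts[(i + 1) % k] - cuts[i]) % n) or n for i in range(k)]
    else:
        bounds = [0] + cuts + [n]
        sizes = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(sizes, reverse=True)


# Hypothetical 20-base sequence with two EcoRI sites, not phage sequence.
toy = "AAGAATTCAAAAGAATTCAA"
print(fragment_sizes(toy, "GAATTC", 1, circular=False))
print(fragment_sizes(toy, "GAATTC", 1, circular=True))
```

In the toy case the linear digest yields three fragments, whereas the circular digest merges the two end fragments into one. A junction-spanning fragment of this kind is what the MvnI and EcoRI comparisons above looked for and did not find.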
The
Ea21-4 genome contains 117 predicted protein-coding genes (see Table S1 in the supplemental material), 95% of which are initiated with an ATG codon. Putative functions could only be assigned to 23 (20%) of the genes based on protein sequence similarities. None of the predicted proteins shows significant similarity to known bacterial pathogenicity factors or to known lysogeny-related genes. Almost all significant BLASTP hits were to phage proteins. The remaining hits were to unknown proteins encoded within bacterial genomes, where the gene for the matched protein was in close proximity to a putative prophage gene, or the bacterial genome was a very early draft that had been subjected only to automated annotation. These hits therefore likely represent genes derived from phages. A cutoff value of 10⁻⁵ was used when classifying BLASTP hits as significant. Many studies accept hits with e-values up to 10⁻³, but the validity of our more stringent cutoff was borne out by visual inspection of the BLASTP results: no alignments were observed with e-values between 3 x 10⁻³ and 1 x 10⁻⁵, and those with e-values up to 3 x 10⁻³ were invariably of poor quality, encompassing only short stretches of the respective proteins.
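A significance cutoff of this kind is straightforward to apply to tabular BLAST output. The sketch below assumes the standard 12-column tab-separated layout produced by BLAST+ with `-outfmt 6`, in which the e-value is the eleventh column; the study itself does not state which output format was used, and the query and subject names in the example are hypothetical. Because no alignments fell between 3 x 10⁻³ and 1 x 10⁻⁵, treating the cutoff as inclusive or exclusive would not change the result.

```python
import csv
import io

E_CUTOFF = 1e-5  # cutoff value stated in the text


def significant_hits(tabular_text, cutoff=E_CUTOFF):
    """Return (query, subject, e-value) for rows at or below the cutoff."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        evalue = float(row[10])  # 11th column in the assumed layout
        if evalue <= cutoff:
            hits.append((row[0], row[1], evalue))
    return hits


# Two hypothetical rows: one strong hit and one that fails the cutoff.
rows = [
    ["geneA", "subject1", "66.0", "300", "100", "2", "1", "300", "1", "300", "1e-30", "250"],
    ["geneB", "subject2", "30.0", "40", "28", "0", "5", "44", "9", "48", "3e-3", "35"],
]
sample = "\n".join("\t".join(r) for r in rows)
print(significant_hits(sample))
```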
Comparative genomics of
Ea21-4.
Forty-two (36%) of the
Ea21-4 genes code for hypothetical proteins unique to this phage. A further 32 (27%) predicted proteins only show significant sequence similarity to predicted proteins in the genome of Salmonella phage Felix O1, suggesting that these proteins may be unique to a group of related phages that includes Felix O1 and
Ea21-4.
The only significant similarity between
Ea21-4 and any previously described phage genome is with Felix O1. Felix O1 is an obligately lytic phage that infects almost all salmonellae and is used as a diagnostic and typing phage (5, 29). Its genome is 1.5 kb larger than the
Ea21-4 genome and has a lower G+C content, despite the similar G+C contents of the genomes of their host genera, Salmonella and Erwinia (Table 1). The head of Felix O1 has been reported to be either 72 nm (1) or 60 nm (35) across, the latter of which is the same size as the
Ea21-4 head.
Ea21-4 and Felix O1 have very different host ranges, but 65 (56%) of the predicted
Ea21-4 proteins are significantly similar to predicted Felix O1 proteins. Since both phages belong to the Myoviridae family, it is not surprising that almost half of their shared proteins appear to be involved in virion morphogenesis, based on the physical location of the genes. For most of the 65 proteins shared by Felix O1 and
Ea21-4, the BLASTP alignments showed 40 to 86% sequence identity across essentially the entire lengths of the respective proteins (see Table S1 in the supplemental material). Among the 33 shared proteins having additional hits besides Felix O1, the complementary Felix O1 protein was the best match in all but two cases (proteins 12 and 113), and the e-values of the non-Felix O1 matches almost always exceeded those of the Felix O1 match by more than 10 orders of magnitude.
A recent study proposed that phages sharing at least 40% homologous or orthologous proteins be classified as belonging to the same genus (32). A holistic comparison of the Felix O1 and
Ea21-4 proteomes using CoreGenes revealed that the two phages share 52.7% of their proteomes. This result is consistent with the protein-by-protein comparison using BLASTP and confirms that Felix O1 and
Ea21-4 belong to the same genus: the "Felix O1-like viruses." An alignment of the annotated
Ea21-4 and Felix O1 genomes shows that their shared genes are largely colinear, with no significant genome rearrangements (Fig. 2).
FIG. 2. Alignment of the annotated Ea21-4 and Felix O1 genomes using Mauve. Both gene maps begin with the rIIA gene and are measured against a ruler in kilobases. Genes transcribed from the negative strand are displaced downward within each gene map. The degree of sequence similarity between aligned regions is indicated by the height of the similarity profile. The tRNA genes are indicated by the filled gene blocks. In Ea21-4 these span bases 21823 through 25229; the 22 Felix O1 tRNA genes are slightly dispersed, between bases 23699 and 30180.
Ea21-4 genome is not substantially similar to any other phage. The lack of similarity to described E. amylovora phage genomes is expected since the only previously available sequences were from the Podoviridae.
Morphogenesis.
Sequence-based predictions identified five genes as being involved in virion morphogenesis: genes 46 (terminase), 50 (prohead protease), 64 (baseplate assembly), 69 (tail fiber), and 72 (tail protein). Their locations within the genome suggest that the morphogenesis genes of
Ea21-4 are concentrated in a region that stretches from somewhere between gene 37 (endolysin) and gene 46 (terminase), up to and including gene 74, which is the last gene before a region that is transcribed in the opposite direction and contains genes involved in DNA metabolism.
Most double-stranded DNA (dsDNA) viruses use a DNA packaging motor consisting of two nonstructural proteins, the large and small terminase subunits, which are encoded by adjacent genes. A small but growing number of viruses are now known to deviate from this pattern, using only one protein component plus a hexamer of a genomically encoded RNA called pRNA (21, 64). The best studied of these phages is
φ29, but others include Cp-1, Sf6, Nf, B103, and GA-1 (20, 39, 41). The presence of only one putative terminase protein in
Ea21-4 and Felix O1 suggests that they may also use this alternative DNA packaging strategy. Like the
φ29 family of terminase proteins,
Ea21-4 protein 46 contains a terminase-related ATP-binding domain. There is no apparent sequence similarity among identified pRNAs, although their secondary structures are similar (41).
Ea21-4 protein 50 contains a conserved domain with a strong match to the S49 family of serine proteases. Most known phage prohead proteases fall into two classes of serine proteases, which have been designated SH and SK (36). The S49 family falls into the SK class. This group includes proteases in phage, bacterial, archaeal, and eukaryal genomes and may be a derived trait, having replaced SH-type proteases in different phage lineages without affecting the neighboring genes (36).
Large tail fibers were not obvious during TEM examination of
Ea21-4; however, protein 69 contains five regions that are similar to a repeat unit that is common to phage tail fibers (SSAGAHAHSVSGST; Interpro IPR005003). In protein 69 this repeat unit occurs at amino acid residues 227 through 240, 241 through 254, 255 through 268, 293 through 306, and 307 through 320.
Five more structural proteins were identified experimentally. Cesium chloride-purified
Ea21-4 virions were denatured and separated by SDS-PAGE, revealing 10 protein bands after 1 h of destaining. Based on visual comparison to size markers, they were approximately 11, 14, 21, 23, 36, 38, 42, 47, 54, and 120 kDa. The genes encoding four of these proteins were identified based on N-terminal amino acid sequencing (Table 2). In each case, the first six sequenced amino acids (not including formylmethionine) exactly corresponded to the predicted CDS translation. Three of these four proteins had not been identified during searches of the sequence database and were also significantly similar to previously unidentified Felix O1 proteins. The fourth, protein 38, had been tentatively identified as a structural protein because it contains an immunoglobulin (Ig)-like domain from the Big-2 family. At least one protein with an immunoglobulin fold is present in the proteomes of ca. 25% of the fully sequenced Caudovirales phages. Ig-like domains usually appear in tail fiber, baseplate wedge initiator, major tail, major head, or highly immunogenic outer capsid proteins (16). This promiscuity complicates sequence-based annotation since significant local alignments such as those obtained by BLAST searches are more likely to reflect the similarity of Ig-like domain than any true functional homology of the entire proteins.
Ea21-4 protein 38 is most likely a head protein, since the gene encoding this protein is separated from the putative tail morphogenesis genes by gene 50, which encodes the putative prohead protease.
TABLE 2. Identification and characteristics of Ea21-4 structural proteins by SDS-PAGE and N-terminal sequencing
The 42-kDa protein was by far the most abundant of the isolated structural proteins and is therefore expected to be the major capsid protein. Since a clean N-terminal amino acid sequence was not obtained from this protein, it was subjected to MALDI-TOF fingerprint analysis along with the most abundant protein from Felix O1 virions. For
Ea21-4, 13 of 17 trypsin fragments matched virtual digest fragments of protein 52, with the sequences of the matched fragments accounting for 46% of the protein. For the major Felix O1 protein band, 12 of 20 fragment masses matched protein 112 (NP_944891), accounting for 47% of the protein. Most significantly,
Ea21-4 protein 52 and Felix O1 protein 112 are the same size and share 66% sequence identity across the full length of both proteins. These genes should therefore be annotated as coding for the major capsid proteins of their respective phages.
The gene order of the identified prohead maturation genes in
Ea21-4 and Felix O1 suggests the identity of two genes whose predicted products had no notable sequence similarity to other phage proteins. As is normal among the phage genomes studied to date, the terminase gene in each phage (gene 46 or Felix01p105, respectively) is located a few genes upstream of the prohead protease gene (gene 50 or Felix01p109). The portal protein is expected to lie between them. Based on its size (9, 45), the best candidate in
Ea21-4 is gene 47. The presence of the major capsid gene (gene 52 or Felix01p112) just downstream of the prohead protease gene is also highly conserved. Less universal is the presence of a small gene (gene 51 or Felix01p111) between the protease and the major capsid genes. Structural protein analysis of
Ea21-4 confirmed that this gene is expressed (Table 2). These
Ea21-4 and Felix genes are not significantly similar to any known proteins except each other, but gene order comparisons with phages such as T4 and P22 imply that this small gene encodes a prohead scaffold protein. Scaffold proteins are thought to be extensively cleaved and ejected during prohead maturation (45), which is not consistent with the abundance of protein 51 in purified mature
Ea21-4 virions. However, proteins that function first as scaffold proteins in the viral prohead and then as support proteins in the mature virion are not without precedent among the herpesviruses, which share certain head assembly characteristics with dsDNA phages (68).
Lysis.
All studied dsDNA phages use a holin-endolysin lysis system in which precisely timed holin activity permits passage of the endolysins across the cytoplasmic membrane to disrupt the peptidoglycan layer and lyse the cell. Sequence comparisons revealed that gene 37 of
Ea21-4 encodes an endolysin. Five enzymatic activities have been associated with phage endolysins: the N-acetyl muramidases or "true lysozymes" that are homologous to egg white lysozyme (e.g., coliphage T4), the lytic transglycosylases (e.g., coliphage λ), the N-acetylmuramoyl-L-alanine amidases (e.g., coliphage T7), the endo-β-N-acetylglucosaminidases, and the endopeptidases (e.g., Staphylococcus aureus φ11). A distance matrix constructed from the
Ea21-4, Felix O1,
Ea1h, T4, T7, and lambda endolysin proteins grouped the
Ea21-4, Felix O1, and
Ea1h endolysins most closely with the T4 or muramidase-type "true lysozyme" (data not shown). This result is consistent with the previous classification of the
Ea1h endolysin as a lysozyme (27).
No holin gene was conclusively identified in the
Ea21-4 genome, but protein 35 is a reasonable candidate for further analysis. To date, more than 250 putative phage holins comprising more than 50 families have been identified, and the lethality of the protein has limited functional analysis of most of them (76). Although a few phage holin genes have been found more than 12 kb from the endolysin gene (38, 61), most are adjacent to it. They share little sequence similarity and are instead classified based on their secondary structure. A key feature is the presence of transmembrane domains. Of the small genes near gene 37 that are transcribed in the same direction (genes 33, 34, 35, 36, 40, and 41), only the predicted product of gene 35 is expected to possess transmembrane helices. Specifically, two helices are predicted, spanning amino acids 5 through 22 and 29 through 51, with the protein termini on the periplasmic side. This secondary structure is consistent with a class II holin except for the orientation of the termini (76), but since the termini are very short and the local versus overall predictions of those orientations by the TMHMM program were not consistent, this discrepancy could be an error in the prediction.
Regulatory sequences.
Two promoter sequences and 24 rho-independent terminators were predicted in the
Ea21-4 genome. Eleven of the terminators contain an identical 20-bp core, GGACTCTTCGGAGTCCTTTT, with slightly variable T-rich extensions on the 3' end (Fig. 3). One additional terminator differs only by the presence of an additional cytosine residue just prior to the poly(T) 3' tail. This conserved terminator is distributed throughout the genome, with representatives on both strands. Certain terminator sequences and genome locations have been identified that are conserved among related phages (41, 48), but it is rare to see a class I hairpin-type terminator having a conserved sequence present in multiple copies throughout the genome. Terminator families of related sequence were identified in the genome of vibriophage KVP40 (42), but no more than 12% of the predicted terminators fell into any one family and most members of a given family were not completely identical across the hairpin and loop structures. In contrast, 50% of the identified
Ea21-4 rho-independent terminators consist of this one identical DNA sequence. The
Ea21-4 conserved terminator is notably absent from the Felix O1 genome. Of the remaining 12 unique terminators found in
Ea21-4, 11 are unidirectional. One bidirectional terminator lies between two head-to-head oriented genes, gene 74 and the thymidylate synthase gene.
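The conserved 20-bp core quoted above can be located on both strands with a simple exact-match search, and its hairpin potential can be checked directly from the sequence. The sketch below is illustrative only: terminators in the study were predicted with Transterm, the function names are hypothetical, the toy genome is constructed for the example, and the division of the core into a 6-base stem, a 4-base loop, a 6-base stem, and a T tract is our reading of the sequence rather than something stated in the text.

```python
CORE = "GGACTCTTCGGAGTCCTTTT"  # conserved core quoted in the text


def revcomp(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_core(genome, core=CORE):
    """Return sorted (0-based start, strand) for each exact copy of the core."""
    genome = genome.upper()
    hits = []
    for strand, pattern in (("+", core), ("-", revcomp(core))):
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)
    return sorted(hits)


# Assumed split of the core: stem (6), loop (4), stem (6), then TTTT.
stem5, loop, stem3 = CORE[:6], CORE[6:10], CORE[10:16]
print(stem5, loop, stem3, revcomp(stem5) == stem3)

# Hypothetical toy genome carrying one copy on each strand.
toy = "AT" + CORE + "GCGC" + revcomp(CORE) + "TA"
print(find_core(toy))
```

Under that assumed split, the two stem arms are exact reverse complements of each other, which is the pattern expected of a hairpin-type terminator followed by a T-rich tail.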
FIG. 3. Sequence logo for the Ea21-4 conserved terminator sequence.
Ea21-4 genome contains 26 tRNA genes, all of which lie between genes 42 and 43 and are transcribed in the same direction as those genes (Fig. 4). Two of these had low primary sequence component scores and may therefore be pseudogenes (37). Two other predicted tRNA genes could not be identified. Based on secondary structure predictions from tRNAscan-SE, both have abnormal anticodon loops. Instead of the usual seven bases, the tRNA genes beginning at 21823 and 25073 have six- and nine-base loops, respectively. Functional tRNAs with eight- and nine-base anticodon loops have been engineered and shown to cause translational frameshifting (2, 44, 73), a phenomenon that may occur as a programmed regulatory mechanism in phages (14, 75). At least two apparently wild-type tRNAs with nine-base anticodon loops have been found previously: a predicted UAC-decoding tRNAMet in Astasia longa (65) and a predicted UUA-decoding tRNALeu in Schizosaccharomyces pombe (69). The
Ea21-4 tRNA with a nine-base loop would be a UUA-decoding tRNALeu; a BLAST2 comparison shows no significant similarity to the equivalent S. pombe tRNALeu (data not shown), but this is not surprising since the S. pombe gene would have a eukaryotic rather than a eubacterial lineage. The functionality of the tRNA gene with a six-base anticodon loop is less certain. Attempts to engineer functional tRNAs with six-base loops have had mixed results (2, 7). No such tRNAs have been explicitly noted in previously sequenced genomes, although they may exist in GenBank, whether or not they are annotated as such. No tRNAs with unusual anticodon loops are predicted in the Felix O1 genome.
FIG. 4. Ea21-4 tRNA gene cluster. The tRNAscan-SE classifications and predicted tRNA anticodons are indicated. Predicted pseudogenes are marked with an asterisk. The tRNAs with unusual anticodon loops are labeled in red.
Apart from identifying the tRNAfMet gene, TFAM's bacterial tRNA model does not appear to be well suited to phage genomes. TFAM tRNA classifications are based on the entire gene sequence, whereas tRNAscan-SE uses only the anticodon loop. The most recent TFAM models have produced very accurate functional predictions of standard tRNAs, while improving nonstandard tRNA prediction (70). However, excluding the two undetermined tRNA genes and the tRNAfMet gene, TFAM and tRNAscan-SE agreed on the functional classification of only 9 of the 23 remaining tRNA genes in Ea21-4. None of the mismatches involved TFAM's published caveats, and in most mismatch cases the TFAM scores of the TFAM-predicted and tRNAscan-SE-predicted classifications were close to each other and weaker overall than the scores of the agreed-upon predictions (data not shown). It is not known whether the bacterial TFAM model performs poorly with other phage genomes, but it seems plausible that the mismatches could be correlated with the overall base composition bias of phages relative to their bacterial hosts. The variation in tRNA sequence identity rules among bacterial taxa is correlated with differences in base composition (3), but the relative accuracy of TFAM predictions on genomes that are particularly AT- or GC-rich has not been tested.
Finally, all but three of the predicted tRNAs produced by Ea21-4 have a genomically encoded 3' CCA terminus. The other three tRNAs are accommodated by gene 79, which encodes a nucleotidyltransferase capable of attaching the CCA triplet to the 3' end of the tRNA after transcription (11). This 3' CCA is the site of amino acid attachment and of interaction between 23S rRNA and the peptidyl-tRNA during translation (10, 43).
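The distinction between tRNA genes that encode their own 3' CCA and those that would depend on a nucleotidyltransferase can be sketched as a simple sequence check. This is an illustrative sketch only: the function names and the three short sequences are hypothetical placeholders, not Ea21-4 tRNA genes, and the sketch assumes each input has already been trimmed to the predicted mature 3' end.

```python
def has_encoded_cca(trna_gene: str) -> bool:
    """Return True if the gene sequence ends with a genomically encoded CCA."""
    return trna_gene.upper().endswith("CCA")


def split_by_cca(genes: dict[str, str]) -> tuple[list[str], list[str]]:
    """Partition gene names into (CCA encoded, CCA must be added enzymatically)."""
    encoded = [name for name, seq in genes.items() if has_encoded_cca(seq)]
    needs_addition = [name for name, seq in genes.items() if not has_encoded_cca(seq)]
    return encoded, needs_addition


if __name__ == "__main__":
    # Hypothetical toy sequences used only to exercise the check.
    toy_genes = {
        "toy_tRNA_1": "GCGGATTTAGCTCACCA",
        "toy_tRNA_2": "gggcctgtagctcacca",
        "toy_tRNA_3": "GCCGAAATAGCTCAGTT",
    }
    encoded, needs_addition = split_by_cca(toy_genes)
    print("CCA encoded:", encoded)
    print("CCA added after transcription:", needs_addition)
```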
A comparison of the Felix O1 and Ea21-4 tRNA genes reveals that both possess cognate tRNAs for 19 of the same codons, but not in the same order (see Table S2 in the supplemental material). Only two pairs of tRNA genes have significantly similar gene sequences and can be considered homologous: the respective CAC-decoding His tRNAs, and a UUG-decoding Felix O1 tRNA (Felix01t018) with the Ea21-4 Leu1 (data not shown).
To date, phage tRNA genes have only been found in the genomes of the dsDNA phages and are more common in virulent phages such as Ea21-4 than in temperate ones (4). An analysis of 37 phage genomes and their hosts has shown that while phage tRNA genes may be randomly acquired from host genomes, they are selectively retained to compensate for codon usage differences between phage and host (4). Usage differences can exist on a genome-wide basis, as in T4 (30), and perhaps for specific gene clusters, as has been proposed for vibriophage KVP40 (42). Both possibilities should be considered for Ea21-4 once complete genomes of both E. amylovora and P. agglomerans are available.
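Once host genomes are available, a comparison of this kind could begin with relative codon frequencies. The following is a minimal sketch under stated assumptions, not an analysis performed in this study: the function names, the twofold enrichment threshold, and the toy coding sequences are all hypothetical, and a real analysis would pool all annotated coding sequences from each genome.

```python
from collections import Counter


def codon_frequencies(cds: str) -> dict[str, float]:
    """Return the relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def codons_enriched_in_phage(phage_cds: str, host_cds: str,
                             min_ratio: float = 2.0) -> list[str]:
    """List codons used at least min_ratio times more often by the phage.

    Codons absent from the host sequence are also reported, because any
    phage usage of them exceeds the host's.
    """
    phage = codon_frequencies(phage_cds)
    host = codon_frequencies(host_cds)
    enriched = []
    for codon, phage_freq in phage.items():
        host_freq = host.get(codon, 0.0)
        if host_freq == 0.0 or phage_freq / host_freq >= min_ratio:
            enriched.append(codon)
    return sorted(enriched)


if __name__ == "__main__":
    # Hypothetical toy sequences: the "phage" favors TTA, the "host" favors CTG.
    toy_phage = "ATGTTATTATTACACTAA"
    toy_host = "ATGCTGCTGTTACACTAA"
    print(codons_enriched_in_phage(toy_phage, toy_host))
```

Codons reported as enriched could then be checked against the anticodons of the phage-encoded tRNAs, either genome-wide or for a chosen gene cluster.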
Genome replication and nucleotide metabolism.
Several genes encoding proteins directly involved in nucleotide metabolism and DNA replication were identified in the Ea21-4 genome: thymidylate synthase (gene 75), dihydrofolate reductase (gene 76), a DNA ligase (gene 78), a DNA polymerase (gene 89), dNMP kinase (gene 94), a primase/helicase (gene 95), an exodeoxyribonuclease (gene 97), ribonucleoside triphosphate reductase (genes 103 and 104), a glutaredoxin (gene 105), and a phosphoribosyl pyrophosphate synthetase (gene 113). Based on the distribution of BLASTP hits, two of these genes, encoding the DNA polymerase and the exodeoxyribonuclease, are more closely related to their counterparts from Podoviridae genomes than from Myoviridae genomes. Given the deep ancestral relationships of DNA polymerases, the high degree of sequence identity and interchangeable functionality of DNA polymerase enzymes, and the evolutionary patterns of phage genomes, it is likely fruitless to speculate on what this means for the evolution of the Ea21-4 genome.
Ea21-4 also contains a nadV homolog. NadV is a principal component of a pyridine nucleotide (NAD+) salvage pathway that has been identified primarily in bacterial genomes. It has also been identified in a few phage genomes (50, 52), and vibriophage KVP40 encodes a complete NAD+ salvage cycle (42). Nicotinamide phosphoribosyltransferase (NadV) catalyzes the conversion of nicotinamide into nicotinamide mononucleotide, which is then converted into NAD+. An abundance of the latter cofactor is required by phage enzymes such as ribonucleotide reductase and thymidylate synthase. E. amylovora is unusual in that it has a strict requirement for nicotinic acid (67). In light of this, the presence of a nicotinamide-scavenging enzyme in the phage proteome might conceivably reflect a phage adaptation to a nicotinate-limited environment or a means by which Ea21-4 disrupts cellular processes after infection.
Concluding remarks.
The genome of E. amylovora phage Ea21-4 shows the general functional clustering of genes that is common in phage genomes, but transcriptional studies are needed to more clearly define the locations of the early, middle, and late genes. Although 42 of the 117 Ea21-4 gene products are significantly similar to previously predicted proteins in multiple phages, no notable similarity exists between Ea21-4 and the previously available sequence information from E. amylovora genomes. The only phage genome to which Ea21-4 is substantially similar is that of the Salmonella typing phage Felix O1. In fact, Ea21-4 has the first reported Felix O1-like genome. The similarity between these two genomes extends beyond the fact that both are members of the Myoviridae family. Proteomic work did confirm the existence of shared structural genes, but these two phages also share a total of 32 genes that are not shared with other Myoviridae and that may be unique to the Felix O1-like genus.
From an ecological perspective, the role of the nadV gene in modulating NAD+ metabolism after phage infection may shed light on the unique adaptations of E. amylovora and its phages to the nutritional environment of the plants that are susceptible to E. amylovora infection. On the more basic level of phage-host interactions, the work presented here does not address how Ea21-4 specifically recognizes Erwinia spp. and Pantoea agglomerans.
This study was supported by the Agriculture and Agri-Food Canada Pest Management Centre Pesticide Risk Reduction and Minor Use programs' Improved Farming Systems and Practices grant. S.M.L. was supported by a postgraduate scholarship from the Natural Sciences and Engineering Research Council of Canada (NSERC). A.K. is supported by a discovery grant from NSERC.
Published ahead of print on 30 January 2009.
Supplemental material for this article may be found at http://aem.asm.org/.
Present address: Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Mail Stop C-16, 1600 Clifton Rd. NE, Atlanta, GA 30333.
Ea1h lysozyme in Escherichia coli and its activity in growth inhibition of Erwinia amylovora. Microbiology 150:2707-2714.
29 family of phages. Microbiol. Mol. Biol. Rev. 65:261-287.
29 genomic DNA. Virology 309:108-113.[CrossRef][Medline]