Applied and Environmental Microbiology, April 2009, p. 2259-2265, Vol. 75, No. 8
0099-2240/09/$08.00+0 doi:10.1128/AEM.02551-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Department of Plant and Soil Sciences, Delaware Biotechnology Institute, University of Delaware, Newark, Delaware 19711 (1); College of Marine Studies, University of Delaware, Lewes, Delaware 19958 (2)
Received 7 November 2008/ Accepted 5 February 2009
Despite the high abundances of viruses in nature, the lack of a shared genetic marker makes it difficult to examine viral genetic diversity in environmental samples (31). Gene g20 encodes a multifunctional protein within the collar between the capsid and tail in T4-like bacteriophages and has been of significant importance in examining the genetic diversity of cyanomyoviruses (22, 24, 32). Others have evaluated the diversity of unidentified aquatic picornavirus-like viruses using the RNA-dependent RNA polymerase gene (13), and further studies have examined phage genetic diversity based on the DNA polymerase gene (6, 21). Unfortunately, not all known phages contain these specific genes; hence, they are inadequate as universal markers. Thus, molecular methods that do not rely on polymorphism analysis of a single gene product must be used to circumvent these limitations.
Recently, metagenomic approaches (i.e., sequencing of random genomic DNA fragments from whole microbial assemblages) have been used to examine genetic diversity within viral (18) and prokaryotic (10) assemblages. For sediment environments, metagenomic analysis has revealed that the viriobenthos is perhaps the most diverse of all viral assemblages, having been estimated to contain more than 10,000 genotypes per kg of sediment (4). Viral assemblages within a wide range of environments including marine (2, 8) and estuarine (3) waters, soils (20), stromatolites (16), and equine (9) and human feces (5, 40) have been examined. Overall, these studies have shown that a relatively low proportion (ca. 30%) of viral metagenome sequences are similar to sequences found in the nonredundant GenBank database (nr database), but the probability of detecting significant BLAST homologs increases twofold when queries against other viral metagenome sequence libraries are included (3). Thus, the function of most viral genes is currently unknown; however, these genes are broadly distributed among viruses.
While large-scale metagenomics offers unprecedented resolution of the diversity and composition of a viral assemblage, the significant costs and computational requirements preclude routine application in a large collection of environmental samples. Recently, Winget and Wommack (36) introduced a new, low-cost, high-throughput means for genetic analysis of viral diversity utilizing random amplified polymorphic DNA PCR (RAPD-PCR). In this approach, a single 10-bp oligonucleotide serves as both the forward and reverse primers in a single thermocycler reaction. Target sequences in the template DNA are randomly selected; thus, development of a RAPD-PCR assay requires no prior information on the DNA coding content within the sample or organism—a significant advantage considering the largely unknown nature of most viral genes.
In this study, we assess the potential of RAPD-PCR as a tool to examine genotype-scale compositional changes in the Chesapeake Bay viriobenthos and to explore the genetic diversity of viruses within Chesapeake Bay sediments. To our knowledge, this is the first study to use RAPD-PCR for evaluating sediment viral diversity and documenting compositional changes in viriobenthos assemblages over time and geographic location.
Viral extractions.
Sediments were processed for removal of viruses as described by Helton et al. (25). Briefly, 2 ml of surface sediment was placed inside sterile 50-ml centrifuge tubes to which 8 ml of 10 mM disodium pyrophosphate and 5 mM EDTA were added. Samples were vortexed at high speed horizontally for 20 min. Larger particles were removed by centrifugation at 2,000 × g for 25 min postagitation. The supernatant was filtered through a 0.45-µm filter (Sterivex; Millipore Corp.). The filtrate was then passed through a 0.22-µm filter (Sterivex) to remove any remaining particles and bacteria. Viral particles in the 0.22-µm filtrate (ca. 8 ml) were concentrated using Centricon-YM30 filters (30,000-molecular-weight cutoff; Millipore), thrice rinsed with sterile TE (100 mM Tris-HCl, 10 mM EDTA), and filtered again with a 0.22-µm filter (Sterivex) prior to storage at –20°C. To test for the presence or effects of any free DNA remaining in the concentrated filtrates, several viral concentrate samples were treated with DNase alone or with heat plus DNase, as described by Helton et al. (25). Samples from both treatments were used as templates for testing RAPD-PCR amplification of viral DNA.
RAPD-PCR.
Primer OPA-9 (5'-GGGTAACGCC-3') was used in all PCRs. PCR mixtures (25 µl) contained 2.0 mM MgCl2 (included in 10× buffer), 0.8 mM each deoxyribonucleoside triphosphate (TaKaRa Bio Inc.), 4 µM primer, and 2.5 U of TaKaRa ExTaq HotStart Version (TaKaRa Bio Inc.). Template concentrations were standardized by adding 1 µl of a sediment viral extract containing ca. 7 × 10⁷ virus particles to each RAPD-PCR assay mixture. Reactions were carried out in an MJ Research PTC-200 thermocycler using the following parameters: initial denaturation for 10 min at 94°C; 33 cycles of 3 min at 35°C, 1 min at 72°C, and 30 s at 94°C; and a final extension of 10 min at 72°C. All RAPD-PCR products were visualized by gel electrophoresis on a 1.8% Metaphor (FMC BioProducts) agarose gel in 0.5× Tris-borate-EDTA buffer. Gels were stained in a 1× SYBR gold (Invitrogen) bath for 30 min and imaged with a Typhoon 8600 variable-mode imager (Molecular Dynamics, Amersham Pharmacia Biotech) under 560 BP 30/green 532 nm, focal plane plus 3 mm. Resulting band patterns were analyzed using GelCompar II (version 4.50; Applied Maths). The similarity of RAPD-PCR banding patterns was determined using Jaccard's coefficient, and a dendrogram depicting banding pattern similarity was generated using the unweighted pair group method of averages algorithm.
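The banding pattern comparison described above can be illustrated with a minimal Python sketch. It is not the GelCompar II implementation: it assumes each lane has already been reduced to a set of band sizes, whereas gel analysis software matches bands within a position tolerance. The function names (`jaccard`, `upgma`) and the example band sizes are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient between two sets of band positions."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def upgma(profiles):
    """Cluster band profiles by average linkage on Jaccard distance.

    Returns a list of merges (cluster_a, cluster_b, similarity), where each
    cluster is a tuple of sample names.
    """
    names = list(profiles)
    # Pairwise distance between individual samples (1 - similarity).
    dist = {
        frozenset((x, y)): 1.0 - jaccard(profiles[x], profiles[y])
        for x, y in combinations(names, 2)
    }
    clusters = [(n,) for n in names]
    merges = []

    def avg_dist(c1, c2):
        # Unweighted mean over all between-cluster sample pairs.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Join the two closest clusters and record their similarity.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        merges.append((c1, c2, 1.0 - avg_dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band sizes (bp) per sample; not data from this study.
example_profiles = {
    "804_a": {300, 450, 625, 900},
    "804_b": {300, 450, 625, 1100},
    "908_a": {250, 625, 700, 1500},
}

for left, right, sim in upgma(example_profiles):
    print(left, right, round(sim, 3))
```

Each recorded merge corresponds to one node of the dendrogram, with the similarity value giving the node position on the Jaccard scale.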
Cloning and sequencing.
RAPD-PCR amplicons from all three locations (stations 908, 804, and 724) and from three seasons (autumn 2003, spring 2004, and autumn 2004) were used for genetic analyses. In addition, a recurring band of ca. 625 bp (in 12 of 20 samples) was selected from several samples, excised from the agarose gel, and purified with the QIAquick gel purification kit (Qiagen). Entire collections of RAPD-PCR amplicons were purified with the QIAquick PCR purification kit (Qiagen). Both purified products (gel-excised bands and whole collections) were cloned using the pCR8/GW/TOPO TA cloning kit (Invitrogen) in accordance with the manufacturer's protocol. This vector was chosen because it includes terminator sequences on either side of the multiple cloning site to prevent the transcription of insert DNA. Cloned products were transformed into One Shot Mach1-T1 chemically competent Escherichia coli cells (Invitrogen) in accordance with the manufacturer's instructions. Plasmid DNA containing RAPD-PCR inserts was purified from clones using the DirectPrep 96 MiniPrep kit (Qiagen).
A total of 518 clones were sequenced using an ABI Prism 3130XL genetic analyzer (Applied Biosystems). Of these, 448 yielded sequences, which were trimmed of contaminating vector and primer sequence using Sequencher 4.6 (Gene Codes Corp.) software. Sequences from the ca. 625-bp band were aligned and compared using the same software. All sediment viral sequences were subjected to several BLAST analyses (1). A translated query versus protein (BLASTx) was used for homology searches against the GenBank nr and environmental nonredundant (env-nr) databases. A translated query versus the translated database (tBLASTx) was used for homology searches against the GenBank nucleotide (nt) and environmental nucleotide (env-nt) databases as well as databases for three viral metagenomes: (i) Chesapeake Bay virioplankton (CBV) (3), (ii) Delaware soil viruses (DSV) (K. E. Wommack, S. R. Bench, K. E. Williamson, and M. Radosevich, presented at the International Symposium on Microbial Ecology, ISME-10, Cancun, Mexico, 22 to 27 August 2004), and (iii) other viral (OV) databases (4, 5, 8). Only BLAST results showing expectation values (E) of <0.001 were considered for phylogenetic and putative functional identification.
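The E-value cutoff can be sketched as a simple filter over BLAST results. The example below assumes the standard 12-column tabular format (`-outfmt 6` in BLAST+), in which column 1 is the query ID, column 2 the subject ID, and column 11 the E value. The output format actually used in this study is not stated, and the function name `best_hits` and the example records are hypothetical.

```python
E_CUTOFF = 1e-3  # threshold stated in the text (E < 0.001)


def best_hits(tabular_lines, cutoff=E_CUTOFF):
    """Return {query_id: (subject_id, evalue)} for the lowest-E significant hit.

    Assumes BLAST tabular output: column 1 is the query ID, column 2 the
    subject ID, and column 11 the E value.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= cutoff:
            continue  # not significant under the stated cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical records; not data from this study.
example = [
    "clone_001\tsubj_A\t45.2\t110\t60\t0\t1\t330\t12\t121\t2e-21\t98.6",
    "clone_001\tsubj_B\t38.0\t100\t62\t0\t4\t303\t20\t119\t5e-06\t50.1",
    "clone_002\tsubj_C\t30.1\t90\t63\t0\t10\t279\t5\t94\t0.2\t30.0",
]
hits = best_hits(example)
print(hits)
```

This sketch ranks hits by E value only; selecting among top hits by taxonomic appropriateness would require an additional manual or annotated step.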
Nucleotide sequence accession numbers.
All sediment viral sequences were deposited in the GenBank database with accession numbers FJ640107 to FJ640554.
45% similarity in RAPD banding patterns. In contrast, sediments from upper bay station 908 had the greatest variability in viral assemblages (i.e., least overall similarity), with all samples showing slightly less than 20% similarity. Viriobenthos assemblages within sediments from station 724 also had a high degree of variability, with samples from 2003 scattered across several clades and separate from 2004 samples. Within each station, sediment viral assemblages from 2004 samples were typically more similar to one another than they were to 2003 samples, indicating a possible interannual change in the Chesapeake Bay viriobenthos. Samples that did not amplify or showed fewer than four bands per lane were excluded from analyses due to probable degradation of DNA or incomplete amplification (n = 5 of 25). DNase-treated viral concentrates showed no loss of RAPD-PCR bands compared to untreated controls (data not shown) (36). Samples treated with heat plus DNase prior to RAPD-PCR also showed no loss of bands, although a noticeable decrease in band intensity was observed.
FIG. 1. Unweighted pair group method of averages tree of viriobenthos RAPD-PCR banding patterns. Jaccard's similarity scale is listed along the top.
A recurring band of ca. 625 bp was observed within 12 RAPD-PCR banding patterns from the total population of 20 samples (see Fig. S1 in the supplemental material). DNA within this band was cloned from the RAPD-PCR patterns of two samples (one each from stations 908 and 804). Subsequently, two clones from each of these bands were sequenced. Also included in this analysis was a single clone containing a sequence homologous to the ca. 625-bp band that occurred within a clone library of RAPD-PCR amplicons from another station 804 sample. Alignment of these five amplicon sequences showed a 1.6% overall divergence at the nucleotide level, and each sequence gave the same top BLAST homolog when assessed against environmental sequence databases (CBV, E < 10⁻²⁰; env-nt, E < 10⁻⁹; DSV, E < 10⁻⁴). No significant BLAST homology was found in searches against the GenBank nt/nr databases.
All RAPD-PCR sequences were compared to sequences in the GenBank nr and env-nr databases as well as three databases of viral metagenomic sequences. In all cases only those homologs showing a BLAST E value of <0.001 were considered significant. Median expectation values for all BLAST sequence homologs within each database and according to taxonomic classification (e.g., virus, bacteria, or environmental) are listed in Table 1. Of the 448 analyzed sediment viral sequences, 54% showed no significant homology to sequences within any of the subject BLAST databases (i.e., the nr, nt, env-nr, env-nt, and viral metagenome databases). The frequencies of significant homology to sequences in the GenBank nr/nt and environmental databases were similar, 24 and 22%, respectively.
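The overall nucleotide divergence reported for the five aligned amplicon sequences can be illustrated with a short sketch. How Sequencher derives this value is not specified here, so the example assumes one common definition: the mean pairwise proportion of differing sites (p-distance), ignoring gapped columns. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing sites between two aligned sequences.

    Columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        differing += a != b
    return differing / compared if compared else 0.0


def mean_pairwise_divergence(aligned):
    """Average p-distance over all sequence pairs, as a percentage."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return 100.0 * sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment for illustration only; not sequences from this study.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"]
print(round(mean_pairwise_divergence(toy_alignment), 2))
```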
TABLE 1. Log transformed median BLAST E values for RAPD-PCR sequences with significant BLAST homology (E < 0.001)
FIG. 2. Sequence length distribution of RAPD-PCR amplicons versus their log transformed E values (E < 10⁻³). Overall average sequence length was 359 bp. (A) All databases combined; (B) GenBank nr database; (C) GenBank nt database; (D) env-nr database; (E) env-nt database; (F) CBV metagenome; (G) OV metagenome; (H) DSV metagenome.
Taxonomic and functional characterization of RAPD-PCR amplicon sequences.
For each amplicon sequence, the top five hits were analyzed and the best hit was selected based on E value and most appropriate taxonomic identification. Analysis of significant homologs (E < 0.001) within the GenBank nr/nt databases showed that 29% were classified as viral and 70% as bacterial. Among the RAPD-PCR amplicon sequences with a viral best hit, 46% belonged to bacteriophages within the order Caudovirales (i.e., the tailed phages) and another 11% were of algal virus origin (i.e., Phycodnaviridae). Analysis of viral homologous sequences showed that a large proportion of these belonged to prophages (34%), as well as the Podoviridae (24%) or Myoviridae (22%) viral families (see Fig. S3 in the supplemental material). Proteobacteria comprised the majority of bacterial best hits to RAPD-PCR amplicon sequences, of which most were from the class Alphaproteobacteria. Among the bacterial homologs, 57% were listed as prophage/phage related and 23% were classified as hypothetical or conserved hypothetical proteins. Of the hypothetical proteins, 6% were encoded by genes located within 60 kb of phage-related genes (e.g., terminase and integrase genes) and 4% by genes near transposase genes within their host genomes (Fig. 3).
FIG. 3. Comparison of protein homologs to sediment viral concentrate RAPD-PCR sequences (112 of a total of 448 sequenced). Phage, phage-related, or near-phage sequences comprised 69% of the total. Values in parentheses are median log transformed E values. Results are based on tBLASTx E values (E < 0.001). Abbreviations are as follows: Bact, bacteria; Hyp, hypothetical gene; Fxn, gene with defined function; +P, within 60 kb of phage-related gene(s); –P, no close proximity to phage-related gene(s); +T, within 60 kb of transposon-related gene(s).
TABLE 2. Distribution of homologs to RAPD-PCR amplicon sequences according to protein functional groups and taxonomic domain
RAPD-PCR profiling of viral communities in sediments.
Overall, RAPD banding patterns did not remain consistent over time at any given location, indicating that viriobenthic assemblages are temporally and geographically dynamic. The high viral production rates and short turnover times reported for many sediment environments (14, 27) lend further support to the notion that the composition of viriobenthos assemblages can change quickly. In contrast to water column environments where pulsed-field gel electrophoresis and RAPD-PCR analysis indicated seasonally linked changes in Chesapeake Bay virioplankton assemblages (36, 39), viriobenthos assemblages in the bay tended to change with station location. The most likely explanation for the variability in sediments from different stations is the spatial heterogeneity of the bay sediments. Unlike the water column, where a given volume of water is in constant motion, sediments are relatively stable. Even though sediments shift due to movements by benthic organisms (e.g., worms and bivalves) as well as resuspension and sedimentation of particulates, these changes are less pronounced in the sediment environment than in the water column.
RAPD-PCR amplicon sequence analysis.
BLAST sequence homology searches for this study showed that over one-half of the sediment viral RAPD-PCR amplicon sequences were completely novel. This is a high proportion compared to the ca. 30% novel sequences typically reported for environmental double-stranded DNA viral metagenome sequence libraries (3, 4). However, the 24% frequency of BLAST-positive RAPD-PCR amplicon sequences against sequences within the GenBank nt/nr database is similar to that seen in other viral metagenomic studies that have used traditional Sanger sequencing. In many cases, RAPD-PCR amplicons from sediment viruses displayed significant BLAST homology to sequences within more than one database. In those instances, it is clear that frequently the best homologs to sediment amplicon sequences occur in the environmental and metagenomic databases and not in the known organismal databases. Homologs to bacterial functional genes were also found among RAPD-PCR amplicons, a finding echoed in several other metagenomic studies (see the review by Comeau et al. [12]). A large metagenomic study of virioplankton assemblages found a far higher proportion (91%) of unknown sequences (2), a result which was largely attributable to the short read length of sequences (ca. 100 bp) within those shotgun libraries. Recent in silico metagenome simulation experiments clearly demonstrated that short read sequencing technology performs especially poorly at characterizing the functional genetic diversity within virioplankton assemblages (37). Moreover, the low homolog detection frequency of short viral metagenome reads is not alleviated by the increases in sequence coverage newer technologies can provide.
It is likely that the high proportion of unknown sequences seen for this collection of RAPD-PCR amplicons can be explained, in part, by the relatively short average read lengths of these sequences. Nevertheless, examination of the distribution of the sequence length versus significant BLAST expectation values indicated that longer sequence lengths did not necessarily yield lower expectation values (Fig. 2). Thus, even short reads can produce good-quality sequence matches. Other studies which have examined the functional and taxonomic diversity of RAPD-PCR amplicons from viral assemblages through BLAST homology searches reported BLAST-positive and novel sequence frequencies similar to those seen here (35, 36). In the case of viral assemblages at deep-sea hydrothermal vents (35), the frequencies of BLAST-positive sequences with respect to novel sequences were similar for RAPD-PCR amplicon libraries and a small random shotgun sequence library which did not involve a selective PCR step. Overall, the similar behaviors of sediment RAPD-PCR sequences in BLAST analysis indicate that these amplicons were derived from viral assemblages and are a reasonable means to superficially survey the functional and taxonomic diversity within sediment viral assemblages. These findings also reinforce the idea that the majority of extant viral diversity in the biosphere is poorly characterized.
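One way to quantify the relationship between sequence length and log-transformed E value shown in Fig. 2 is a rank correlation. The sketch below is not an analysis performed in this study; it assumes Spearman's coefficient as a reasonable choice for a possibly nonlinear relationship, and the function names and example values are hypothetical.

```python
import math


def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical (length in bp, E value) pairs; not data from this study.
hits = [(180, 1e-12), (250, 3e-5), (359, 2e-9), (420, 1e-4), (610, 5e-20)]
lengths = [n for n, _ in hits]
log_e = [math.log10(e) for _, e in hits]
print(round(spearman(lengths, log_e), 2))
```

A coefficient near zero would be consistent with the observation that longer sequences did not necessarily yield lower expectation values, whereas a strongly negative value would indicate the opposite.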
A preponderance of sequences with homology to genes belonging to members of the Podoviridae and Myoviridae families was seen within a metagenome library of Chesapeake Bay virioplankton (3) as well as in this study. This finding contrasts with previous viral metagenome studies from marine waters (8), coastal sediments (4), and equine feces (9) which found that the majority of viral BLAST hits belonged to phages within the Siphoviridae family (18). Phages within the Podoviridae and Myoviridae families are often considered to be virulent, while phages within the Siphoviridae family have been considered to be more often temperate (4). Thus, the observation of frequent homologs to sequences belonging to members of the Podoviridae and a low frequency of hits to Siphoviridae sequences indicate that the sediments of the Chesapeake Bay do not contain many phage families known to be temperate.
Persistence of identical RAPD-PCR bands indicates that the same viral genes can be found and are maintained within different sediment environments over time. Conserved genes exist among viral groups, and extensive gene transfer can occur among dissimilar viruses in the environment (26, 30). Whether this ca. 625-bp sequence occurred within only one type of virus or within several different viruses cannot be unambiguously ascertained from this study. However, the possibility of finding a common viral gene within a large collection of viral assemblages is not unexpected, as previously reported data have shown that nearly identical T7-like DNA polymerase sequences can be detected in widely divergent environments (6). Moreover, diversity analyses based on metagenomic sequence data have reported that local-scale viral diversity is high but that diversity is more limited on a global scale (7). By extension, this result predicts that specific sequences can be found across a broad range of environments. Thus, there is a good likelihood that this ca. 625-bp sequence is distributed across several different viruses and not limited to a single commonly occurring virus.
In the present study, RAPD-PCR has been successfully used to examine the composition of viriobenthos assemblages in sediment samples from the Chesapeake Bay. Diversity measurements based solely on morphological classification of viral particles or analysis of viral metagenome sequences lack the sensitivity, resolution, and/or sample throughput necessary for determining short-term changes in the composition and diversity of viriobenthos assemblages. As the data show, RAPD-PCR banding patterns can address these shortcomings and thus provide the data needed to assess fine-scale synecological effects of viruses within sediment microbial environments.
We express our sincerest gratitude to S. W. Polson and J. Bhavsar for computational assistance, D. M. Winget and K. E. Williamson for helpful comments, the captain and crew of the R/V Cape Henlopen, Old Dominion University, for use of the multicorer device, and the staff of the DNA Sequencing & Genotyping Center at the Charles C. Allen, Jr., Biotechnology Laboratory at the University of Delaware.
Published ahead of print on 13 February 2009.
Supplemental material for this article may be found at http://aem.asm.org/.