Applied and Environmental Microbiology, July 2006, p. 4899-4906, Vol. 72, No. 7
0099-2240/06/$08.00+0 doi:10.1128/AEM.00354-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Microbiology, University of Georgia, Athens, Georgia,1 U.S. Department of Energy, Joint Genome Institute, Walnut Creek, California2
Received 13 February 2006/ Accepted 3 May 2006
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Despite the recognized importance of these elements, genomic analysis of MGEs has been limited. Whereas the total size of sequenced bacterial genomes is 1.3 Gb, only 61 Mb of plasmid genomes and 30 Mb of phage genomes have been sequenced previously (11). Most MGE sequences have been obtained fortuitously during sequencing of their hosts' genomes, resulting in a bias towards MGEs associated with a limited selection of organisms. Large (>50 kb), conjugative plasmids are especially poorly represented in current sequence databases, constituting only 20% of all plasmid sequences in GenBank at present. Most commercial kits for plasmid DNA preparation are designed for small, high-copy-number plasmids. The traditional methods of high-molecular-weight plasmid isolation, such as cesium chloride density gradient centrifugation (26) and pulsed-field gel electrophoresis, require equipment and expertise that are not widely available. Moreover, these and other techniques, such as Eckhardt in-well lysis (6), are time- and labor-intensive and thus unsuitable for a high-throughput approach.
In the course of large-scale sequencing projects such as the Human Genome Project, magnetic beads were modified for purification of nucleic acids, including bacterial artificial chromosome (BAC) clones (7, 15, 28). The method used here, termed Solid-Phase Reversible Immobilization (SPRI), employs magnetic beads with carboxylated surfaces to bind plasmid DNA under proprietary buffer conditions. Magnetic immobilization of the beads and bound DNA allows removal of cellular debris and chromosomal DNA. We inferred that SPRI should enable isolation of large, natural plasmids similar in size and copy number to BAC clones. The rapidity, ease, and low cost of SPRI BAC purification suggested that it might provide an advantage over traditional costly and laborious methods of high-molecular-weight plasmid isolation. However, there are some important differences between BACs and natural plasmids.
Whereas BACs are maintained individually in laboratory strains of Escherichia coli, wild bacteria typically have several plasmids in a wide range of sizes and copy numbers. Ideally, these should each be recovered separately because the abundance of repetitive elements in plasmids can make computer assembly of libraries constructed from pooled supercoiled DNA, such as obtained from CsCl gradients, difficult or impossible. In addition, it is preferable to recover plasmids from their native hosts (when culturable) rather than having to transfer them to a laboratory strain, which might result in changes (see "Plasmid pLEW517" below). Thus, the ideal plasmid isolation method should be applicable to many types of culturable bacteria and not just E. coli.
We describe here a protocol for the use of SPRI for rapid, efficient, and inexpensive isolation of sequencing-quality DNA of individual large, low-copy-number plasmids from gram-negative and gram-positive bacterial strains. As proof of the efficacy of this method, we report on the completion and closure of full-length sequences of five natural bacterial plasmids isolated using this protocol: the previously sequenced 94-kb Shigella flexneri plasmid NR1; a novel 65-kb E. coli plasmid, pLEW517; a novel 52-kb Staphylococcus plasmid, pLEW6932; and two novel Corynebacterium plasmids, the 35-kb pLEW279a and the 30-kb pLEW279b. We chose these strains and plasmids to answer the following salient questions about plasmid DNA isolation using SPRI technology: (i) does the SPRI isolation method work on gram-negative and gram-positive bacteria? (ii) What is the yield of DNA from a single plasmid band? (iii) What is the best way to remove the plasmid band from the gel? (iv) How much do chromosomal DNA and DNA from other plasmids contaminate the library? We also addressed some of the bioinformatics problems unique to sequencing plasmid DNA: (i) what are the best strategies for vector screening during assembly? (ii) How effective are default BLAST algorithms in identifying DNA and/or protein sequences in plasmids?
| MATERIALS AND METHODS |
|---|
Following cell resuspension, 100 µl CosMCPrep lysis buffer was added. Preparations were mixed gently by inverting five times and held at room temperature for 5 min. All preparations were handled gently during and after lysis to prevent shearing of the supercoiled DNA. CosMCPrep neutralization buffer (100 µl) was added, and preparations were rotated on an orbital shaker for 10 min at 170 rpm to flocculate cellular debris.
Preparations were then centrifuged for 12 min at 16,000 x g at room temperature. The cleared cell lysates (200 µl) were transferred to microcentrifuge tubes and mixed with 139 µl isopropanol and 10 µl CosMCPrep magnetic bead suspension by gently pipetting four to six times. Tubes were placed in a magnetic stand at room temperature for 10 min to trap the beads with their bound DNA against the sides of the tubes. Unless otherwise noted, all subsequent steps were conducted at room temperature with the tubes in the magnetic stand. The lysate was aspirated without disturbing the beads, and the beads were washed three times with 200 µl 70% ethanol-30% autoclaved MilliQ water. Beads were dried at 37°C in a forced air incubator for 2 min. DNA was then eluted by pipetting 40 µl CosMCPrep resuspension buffer over the beads. Preparations were returned to 37°C in closed tubes for 5 min to separate any beads from the eluate and allow complete elution of bound supercoiled DNA. The bead-free supercoiled DNA eluates (40 µl) were transferred to microcentrifuge tubes and stored at −20°C. Electrophoresis and gel extraction of plasmids were usually done within 24 h, but eluates from gram-negative strains were stored for up to 20 days without appreciable loss of supercoiled DNA.
Electrophoresis and gel extraction of plasmids.
All 40 µl of eluate was loaded onto a 0.3% or 0.5% SeaKem Gold agarose (Cambrex BioScience, Walkersville, MD) gel, with 10 µl of High Range MassRuler DNA ladder (Fermentas, Inc., Hanover, MD) as a size and mass standard. The gel was electrophoresed in 1x TAE (40 mM Tris-acetate, 2 mM Na2EDTA·2H2O) at 80 V for 3 to 4 h. Gels were stained with SYBR green I nucleic acid gel stain (Molecular Probes, Inc., Eugene, OR) to increase sensitivity and reduce the DNA damage typically observed with ethidium bromide. Gels were visualized on a Molecular Dynamics FluorImager, and the mass of DNA in each plasmid band was determined by densitometry. To reduce exposure to damaging UV light, gels were placed on a DarkReader Transilluminator (Clare Chemical Research, Denver, CO), and gel slices containing individual plasmid bands were excised with a razor blade. Supercoiled plasmid DNA was extracted from gel slices using either a GeneClean Turbo glass milk spin kit (Qbiogene, Inc., Carlsbad, CA) or dialysis tubing electroelution.
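The densitometric mass estimate described above can be illustrated with a small calculation. The following is a minimal sketch, assuming that band intensity scales linearly with DNA mass over the range of the ladder; the intensity values, ladder masses, and function names are hypothetical and are not taken from the FluorImager software or from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mass_ng(band_intensity, ladder_intensities, ladder_masses_ng):
    """Estimate the DNA mass in a band from a ladder-derived standard curve."""
    slope, intercept = fit_line(ladder_intensities, ladder_masses_ng)
    return slope * band_intensity + intercept


# Hypothetical ladder bands: integrated intensity (arbitrary units) and known mass (ng).
ladder_intensities = [100.0, 200.0, 400.0]
ladder_masses_ng = [10.0, 20.0, 40.0]

# Hypothetical plasmid band intensity read from the same gel image.
plasmid_band_ng = estimate_mass_ng(300.0, ladder_intensities, ladder_masses_ng)
print(f"Estimated plasmid band mass: {plasmid_band_ng:.1f} ng")
```

Supercoiled DNA may bind stain differently from the linear ladder fragments, so an estimate of this kind is best treated as approximate.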
A GeneClean Turbo kit was used following the manufacturer's instructions, except that 30 µl TE buffer (10 mM Tris, 1 mM EDTA, pH 8) was used for elution. Dialysis tubing electroelution was done as previously described (27, 29), using 1-inch-diameter SpectraPor dialysis tubing that had been boiled in 25 mM EDTA for 10 min, rinsed once with water, and stored at 4°C in 30% ethanol. Gel slices with plasmid DNA were placed in dialysis tubing bags with 250 to 500 µl of 0.1x TAE and electrophoresed at 100 V for 2 h in 0.1x TAE. Polarity was reversed for 2 min, and then the electroeluted DNA was transferred by pipette to a clean 1.5 ml microcentrifuge tube.
DNA yield was quantified by A260/280 of a 1:10 dilution in either TE (for GeneClean eluates) or 0.1x TAE (for dialysis tubing electroeluates).
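As a worked illustration of the spectrophotometric quantification, the sketch below converts an A260 reading of a diluted sample into a concentration and total yield. It assumes the common convention that an A260 of 1.0 corresponds to about 50 ng/µl of double-stranded DNA; the absorbance readings and the function names are hypothetical, and only the 1:10 dilution and the 30-µl elution volume come from the text.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # conventional conversion factor for double-stranded DNA


def dsdna_concentration_ng_per_ul(a260, dilution_factor):
    """Concentration of the undiluted sample, from the A260 of a diluted aliquot."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def total_yield_ug(concentration_ng_per_ul, volume_ul):
    """Total DNA mass in micrograms for a given eluate volume."""
    return concentration_ng_per_ul * volume_ul / 1000.0


def purity_ratio(a260, a280):
    """A260/A280 ratio, commonly used as a rough indicator of protein contamination."""
    return a260 / a280


# Hypothetical readings for a 1:10 dilution of a 30-µl GeneClean eluate.
a260, a280 = 0.05, 0.027
conc = dsdna_concentration_ng_per_ul(a260, dilution_factor=10)
print(f"Concentration: {conc:.1f} ng/µl")
print(f"Yield: {total_yield_ug(conc, volume_ul=30):.2f} µg")
print(f"A260/A280: {purity_ratio(a260, a280):.2f}")
```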
Whole-genome shotgun sequencing and assembly.
For each plasmid, four DNA preparations were pooled. Each preparation had been isolated from 1.5 ml of culture using the appropriate CosMCPrep kit method, excised from a gel as a single plasmid band, and eluted using either the GeneClean kit or dialysis tubing electroelution. The pooled DNA was sheared on a HydroShear (Genomic Solutions, Ann Arbor, MI) at setting 9 into 2- to 3-kb fragments, which were blunt-end ligated into pMCL200, a pUC18-based cloning vector (21). Library construction and template preparation were conducted following standard Joint Genome Institute protocols (http://www.jgi.doe.gov/sequencing/protocols/prots_production.html). End sequencing reactions were carried out using a 1/16 dilution of BigDye Terminator v3.1 (Applied Biosystems, Foster City, CA) and resolved on ABI PRISM 3730 sequencers. Electropherograms were analyzed with PHRED basecalling software (8). The average sequencing read length was 689 ± 30 bp. Sequencing reads were screened using Cross-Match SPS-3.57 (Southwest Parallel Software) to identify and remove vector sequence. For the Corynebacterium plasmids pLEW279a and pLEW279b, the entire vector sequence was used for screening. For the E. coli plasmids NR1 and pLEW517 and the Staphylococcus plasmid pLEW6932, screening with the complete cloning vector sequence introduced artificial gaps into the assemblies, possibly due to similarity between the origins of replication and antibiotic resistance genes on the natural plasmids and those on the cloning vector. Consequently, only sequences identical to the cloning vector spanning the insert site to the sequencing primer annealing site were removed from sequencing reads for these plasmids. After the vector sequences were removed, reads were assembled by PHRAP (www.phrap.org), and gap closure was accomplished by directed PCR of library clones or purified plasmid DNA, resulting in a single, circularized contiguous sequence (contig) for each plasmid.
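The narrowed vector screening described above can be sketched in a few lines. This is a toy illustration using exact string matching, not the local alignment that Cross-Match performs; the vector sequence, coordinates, and function names are hypothetical. The idea is that only the short vector segment between the primer annealing site and the insert site is screened, so insert sequence that happens to resemble the vector backbone (for example, an origin of replication or a resistance gene) is left in the read.

```python
def narrowed_screen(vector, primer_end, insert_site):
    """Vector segment between the end of the primer annealing site and the insert site."""
    return vector[primer_end:insert_site]


def trim_leading_vector(read, flank, min_overlap=8):
    """Remove the longest suffix of `flank` (at least min_overlap bases) that starts the read."""
    for k in range(len(flank), min_overlap - 1, -1):
        if read.startswith(flank[-k:]):
            return read[k:]
    return read


# Hypothetical toy vector; real coordinates would come from the cloning vector map.
vector = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGC"
flank = narrowed_screen(vector, primer_end=10, insert_site=30)

insert = "ATGAAACGCATT"
read_with_vector = flank[-12:] + insert       # read begins partway into the flank
read_like_backbone = insert + vector[0:10]    # insert resembling vector backbone

print(trim_leading_vector(read_with_vector, flank))    # flank removed, insert kept
print(trim_leading_vector(read_like_backbone, flank))  # unchanged
```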
Analysis of plasmid sequences.
BLAST (1) analysis used the default parameters for the NCBI BLASTN program (+1/−3 match/mismatch scores; word size 11) and the BLASTX program (word size 3), with the exception that the bacterial translation table was used for BLASTX (www.ncbi.nlm.nih.gov/BLAST/). Pairwise, whole genome alignments of plasmid sequences were generated using the MUMmer software package (19). The "nucmer" script was used to obtain percent identity between the two sequences and to identify the exact locations of disagreements.
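To make the BLASTN parameters concrete, the sketch below scores an ungapped alignment with a match score of +1 and a mismatch score of −3, and checks whether two sequences share an exact word of length 11. It is a simplified illustration with hypothetical function names and sequences, not a reimplementation of BLAST, which also extends seeds, allows gaps, and computes e-values.

```python
def ungapped_raw_score(a, b, match=1, mismatch=-3):
    """Raw score of two equal-length sequences aligned position by position."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def break_even_identity(match=1, mismatch=-3):
    """Fraction of identical positions at which the ungapped score is exactly zero."""
    return -mismatch / (match - mismatch)


def has_seed(query, subject, word_size=11):
    """True if query and subject share at least one exact word of length word_size."""
    words = {subject[i:i + word_size] for i in range(len(subject) - word_size + 1)}
    return any(query[i:i + word_size] in words
               for i in range(len(query) - word_size + 1))


print(ungapped_raw_score("ACGTACGTAC", "ACGTTCGTAC"))  # 9 matches, 1 mismatch -> 6
print(break_even_identity())                            # 0.75
print(has_seed("ACGTACGTACGTTT", "GGACGTACGTACGGG"))    # True
```

With these scores an ungapped alignment needs more than 75% identity to score above zero, which is one reason BLASTX was used alongside BLASTN to look for similarity conserved only at the protein level.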
GenBank accession numbers.
The plasmid sequences determined in this study have been deposited in the GenBank database under the following accession numbers: for NR1, DQ364638; for wild pLEW517, DQ390454; for transconjugant pLEW517, DQ390455; for pLEW6932, DQ390456; for pLEW279a, DQ390458; and for pLEW279b, DQ390457.
| RESULTS |
|---|
Plasmid pLEW517.
To determine whether a single plasmid in a multiplasmid host could be reliably sequenced from supercoiled DNA recovered from a gel, we sequenced plasmid pLEW517, previously shown to confer resistance to ampicillin, streptomycin, sulfonamides, and mercury (32), as derived both from its native host, the multiplasmid primate intestinal E. coli strain 517-2H1 (Fig. 1) (30), and from an otherwise plasmid-free laboratory E. coli strain 690FNR into which it had been transferred by conjugation. Dialysis tubing electroelution, which preserved the supercoiled conformation of extracted DNA better than glass milk extraction (Fig. 2), was used to recover pLEW517 DNA and in all subsequent work. Pooling four preparations yielded 3.5 µg for wild pLEW517 and 4.0 µg for transconjugant pLEW517, amounts which were sufficient for library construction.
Sequencing and assembly of wild pLEW517 yielded a single major contig of 63,946 bp (Table 2), and transconjugant pLEW517 yielded a single major contig of 65,288 bp. These contigs were 100% identical except for a 1,342-bp segment found on transconjugant pLEW517 but not wild pLEW517 (discussed below). Overall, 98% of the sequence of pLEW517 returned significant BLASTN hits (Table 3). BLASTN analysis indicated that pLEW517 is a variant of plasmid R46 (NC_003292), a 50-kb IncN plasmid previously observed in Salmonella enterica serovar Typhimurium. Significant similarity (e-value, 0.0) was detected to regions encoding replication, maintenance, and transfer functions on plasmid R46. However, an 18-kb segment of R46 that included the In1 integron was absent from pLEW517. pLEW517 also had sequences with very significant similarity to those of transposon Tn21, which contains a class 1 integron (In2) and a mercury resistance (mer) operon (20), and of transposon Tn3, which encodes a β-lactamase (18).
Further examination of the pLEW517 region showing similarity to Tn21 revealed that although the hallmarks of Tn21, which include the transposition genes (tnpAR) and the mer operon, were present on the pLEW517 transposon, there were dramatic differences in the content of its version of the integron (Fig. 3). Whereas the class 1 integron In2 of Tn21 has a single cassette encoding an aminoglycoside adenyltransferase (aadA1), the class 1 integron of pLEW517 has three cassettes carrying dfrA12 (previously called dhfrXII), a dihydrofolate reductase; an open reading frame (ORF) of unknown function; and aadA2, an aminoglycoside adenyltransferase. These cassettes were identified by BLASTN analysis based on similarity to an integron found on the 89.5-kb Citrobacter freundii plasmid pCTX-M3 (NC_004464), which contains the three cassettes in the same order. This cassette arrangement was previously reported on a Tn21-like element carried on a 70-kb plasmid from a pathogenic E. coli strain (16). Note that the insertion of the pLEW517 integron into the ancestral mer transposon "backbone" is in exactly the same position as in the prototypical Tn21 of NR1 and R100.
Plasmid pLEW6932.
We then assessed the effectiveness of the SPRI method on low-G+C gram-positive bacteria using the multiplasmid Staphylococcus strain 693-2 obtained from poultry litter (Fig. 1) (22). This strain had nine visible plasmid bands, the largest of which was chosen for sequencing. Four pooled preparations yielded 2.9 µg DNA for library construction. Sequencing and assembly produced a major contig of 51,514 bp, similar to the 51 kb estimated for pLEW6932 from agarose gels. BLASTN identified two small regions of significant similarity (e-value, 0.0): one similar to the arsenic resistance operon of Staphylococcus saprophyticus plasmid pSSP1 (NC_007351) and another to β-lactamase genes from other Staphylococcus chromosomes and plasmids (e.g., BX571857) (Table 3). The arsenic resistance (ars) operon on pLEW6932 may be selected due to the common use of organic arsenic coccidiostats such as roxarsone for growth promotion; roxarsone naturally degrades to inorganic arsenate and arsenite (13), to which this locus confers resistance.
To look for genes conserved only at the amino acid sequence level, we used BLASTX analysis, in which the nucleotide sequence is translated into all six possible reading frames and compared to the protein sequence database. BLASTX identified a region of significant similarity (e-value, e−90) to replication initiation proteins of various Staphylococcus plasmids (e.g., NP_932180, CAA63141). Other regions of similarity included hits to glycine betaine transporters (e.g., ZP_00233406) and cation transport ATPases (e.g., ZP_00063375) found on the chromosomes of a variety of bacterial species. Of the 125 hits identified using default parameters of BLASTX, most were repetitive hits on these and a few other loci, leaving approximately 60% of the pLEW6932 sequence with no known protein or nucleic acid similarities as reported with the default parameters of these BLAST programs.
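The six-frame translation that underlies a BLASTX search can be sketched as follows. This is a minimal illustration rather than the BLASTX implementation; the function names and the example sequence are hypothetical. The bacterial translation table assigns the same amino acids to codons as the standard table and differs in the start codons it permits, so the standard codon-to-amino-acid mapping is used here.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Codon table built in the conventional TCAG order; '*' marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Return translations for frames +1..+3 and -1..-3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Each of the six translated sequences is then compared with the protein database, which is how similarity can be detected even when the nucleotide sequences have diverged too far for BLASTN.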
Plasmids pLEW279a and pLEW279b.
Last, Corynebacterium strain L2-79-05 provided the opportunity to assess the effectiveness of SPRI on high-G+C gram-positive bacteria and also to investigate the occurrence of plasmid band cross-contamination in multiplasmid strains. In contrast to E. coli 517-2H1, the plasmid profile of this strain, which was isolated from poultry litter (22), shows two plasmids of very similar sizes and copy numbers (Fig. 1). Isolation of the larger plasmid pLEW279a by electroelution from the gel yielded 3.1 µg DNA for library construction. Sequencing and assembly produced two major closed circular contigs (Table 2). The 34,606-bp contig corresponded in size to the 35-kb band extracted from agarose gels, and the second contig of 29,854 bp was similar in size to the 30-kb plasmid (named pLEW279b) also observed in the L2-79-05 plasmid profile. This suggests that for plasmids of very similar sizes and copy numbers, gel slices of apparently single-plasmid bands may contain some of the other plasmid's DNA; however, current methods of sequence assembly are capable of resolving the two plasmids into separate contigs. Although the fortuitously sequenced pLEW279b has fewer sequencing reads than pLEW279a (Table 2), it still has very good depth.
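The relationship between read count and sequencing depth for the two contigs can be illustrated with a simple estimate: mean depth is approximately the number of reads multiplied by the mean read length, divided by the contig length. In the sketch below, the contig lengths and the 689-bp mean read length come from the text, whereas the read counts are hypothetical placeholders (the actual counts are given in Table 2) and the function name is illustrative.

```python
def mean_depth(n_reads, mean_read_len_bp, contig_len_bp):
    """Approximate mean sequencing depth across a contig."""
    if contig_len_bp <= 0:
        raise ValueError("contig length must be positive")
    return n_reads * mean_read_len_bp / contig_len_bp


MEAN_READ_LEN_BP = 689  # average read length reported in Materials and Methods

# Contig lengths from the text; read counts are hypothetical placeholders.
contigs = {
    "pLEW279a": {"length_bp": 34606, "reads": 900},
    "pLEW279b": {"length_bp": 29854, "reads": 300},
}

for name, info in contigs.items():
    depth = mean_depth(info["reads"], MEAN_READ_LEN_BP, info["length_bp"])
    print(f"{name}: approximately {depth:.1f}x mean depth")
```

Under these placeholder counts the smaller, co-purified plasmid would still be covered several times over, which is consistent with the observation that a contaminating plasmid with fewer reads can assemble into its own closed contig.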
For pLEW279a, BLASTN identified two regions with significant similarity (e-value, 0.0) to other plasmids. One large region resembles part of the 28-kb Corynebacterium glutamicum plasmid pTET3 (NC_003227) (31), including a tetracycline resistance determinant and a class 1 integron with a truncated integrase and an aminoglycoside adenyltransferase cassette, aadA9 (Table 3). An adjacent, smaller region resembles the 9-kb Arcanobacterium pyogenes plasmid pAP2 (NC_005206) (17), including a macrolide resistance determinant. In addition, BLASTX identified a region of pLEW279a with approximately 60% amino acid identity to RepA replication initiation proteins of C. glutamicum plasmids pTET3 and pCG4 (29 kb; NC_004945) and a region with approximately 50% amino acid identity to TraA transfer proteins of Corynebacterium plasmids pGA2 (19 kb; NC_004535) and pNG2 (15 kb; NC_005001). Of the 148 BLASTX hits, all are repeated hits on these and a few other loci, leaving 40% of the plasmid sequence with no known protein or nucleic acid similarities as reported by the default parameters of the BLAST programs.
For pLEW279b, BLASTN identified a large region of significant similarity (e-value, 0.0) to C. glutamicum ATCC 13032 chromosomal DNA (BA000036) containing genes involved in copper metabolism and genes encoding a two-component-type response regulator and kinase. BLASTN also identified two small regions of similarity to Corynebacterium plasmids: one region of approximately 2 kb similar to a putative ABC transporter found on the 12-kb C. jeikeium plasmid pA505 (NC_004773) and another region of approximately 0.5 kb similar to part of a transposase found on C. glutamicum plasmids pTET3, pAG1 (20 kb; NC_001415), and pGA2. Last, BLASTX identified a region with approximately 69% amino acid identity to the RepA replication protein of C. striatum plasmid pTP10 (51.5 kb; NC_004939) and a region with approximately 31% amino acid identity to the TraA-like transfer protein of Rhodococcus equi virulence plasmid p103 (80.5 kb; NC_002576). The 103 BLASTX hits were repeated instances of these and a few other sequences. Approximately 40% of this plasmid sequence had no known protein or nucleic acid similarities detectable by the default parameters of these two BLAST programs.
| DISCUSSION |
|---|
We found that vector screening during assembly of plasmid sequences is complicated by the presence of similar genes on both the natural plasmid and the cloning vector, resulting in the introduction of artificial gaps in the assembly. Rather than using the entire cloning vector sequence for screening, we screened using only the cloning vector sequence from the insert site to the primer annealing site. This resolved the gaps in the assemblies; however, in one case it resulted in the generation of a small contig identical to the cloning vector's origin of replication and antibiotic resistance marker. Since this contig was easily identified as cloning vector sequence, the strategy of narrowed vector screening was considered suitable for plasmid sequence assembly. As a general rule in plasmid sequencing work, any antibiotic resistance phenotype information available for the plasmid of interest could guide selection of a cloning vector that encodes a different antibiotic resistance than those found on the natural plasmid.
Our preliminary analysis of the four novel plasmid sequences illustrates the variety of genes that can be carried by plasmids as well as the limitations in plasmid genome information currently available. BLAST analysis successfully identified plasmid replication, transfer, and maintenance genes as well as mobile and chromosomal genes with diverse functions in all novel plasmid sequences. The gram-negative plasmid pLEW517, which had high-scoring BLAST hits for all but a small fraction of the genome, was characterized as a variant of plasmid R46. pLEW517 encoded the same replication and transfer genes as R46 but lacked a characteristic R46 integron and possessed two transposons not found on R46. This illustrates how the insertion and removal of accessory elements such as transposons, integrons, and insertion sequences on a plasmid backbone leads to extensive variation in plasmid genomes, similar to the chromosomal variation observed among strains of a bacterial species. Further sequencing of natural plasmids will contribute to our understanding of the degree to which plasmid backbones vary in accessory elements. Sequencing more plasmids will also uncover variation in the accessory elements themselves, such as the novel Tn21-like transposon found on pLEW517.
In contrast to pLEW517, approximately 60% of the pLEW6932 sequence and 40% of the pLEW279a and pLEW279b sequences had no apparent similarity to known sequences in the nucleotide and protein sequence databases. BLAST hits obtained for these genomes were from a variety of gram-positive bacterial chromosomes and plasmids, suggesting that these plasmid genomes are a mosaic of genes and accessory elements acquired from related sources. The underrepresentation of environmental gram-positive plasmids in current sequence databases may account for the observed lack of similarity. Thorough manual annotation of these plasmids is beyond the expertise of a single laboratory; however, the initial annotation accompanying the sequences as deposited in GenBank can provide a starting point for deeper analysis by those with relevant expertise. Much more extensive sequencing of plasmid genomes as well as inclusion of plasmid and other mobile element terms in the genome and sequence ontologies is essential to addressing fundamental questions of prokaryotic evolution and to dealing with critical problems such as the spread of antibiotic resistance. The SPRI plasmid isolation method coupled with simple electrophoretic elution of individual supercoiled plasmids provides a facile and inexpensive approach to obtain DNA of sufficient amount and quality to generate libraries with excellent coverage that result in readily finishable genomic sequences.
| ACKNOWLEDGMENTS |
|---|
This work was supported by the Office of Science (BER), U.S. Department of Energy (grant DE-FG02-04ER63770 to A.O.S.).
| FOOTNOTES |
|---|
Supplemental material for this article may be found at http://aem.asm.org.
| REFERENCES |
|---|