Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, Madison, Wisconsin 53706 (1); Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706 (2); Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706 (3); OpGen, Inc., Madison, Wisconsin 53719 (4); Microbial Genomics, Department of Energy Joint Genome Institute, Walnut Creek, California 94598 (5); Los Alamos National Laboratory Center for Human Genome Studies, Los Alamos, New Mexico 87545 (6); Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706 (7)
Received 9 January 2005/ Accepted 11 April 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
R. rubrum has proven to efficiently convert hydrogen to electrical current in fuel cells (23) and produce novel forms of biodegradable thermoplastics when grown on assorted β-hydroxycarboxylic acids and n-alkanoic acids (7). More recently, Handrick et al. (12) reported on R. rubrum's activator role in the degradation of polyhydroxybutyrate, a polymer of interest for its thermoplasticity and breakdown to water and carbon dioxide. Finally, from a biochemical standpoint, Zhang and others (29) studied the mechanisms behind R. rubrum's posttranslational regulation of nitrogenase activity.
To supplement understanding of R. rubrum's biology, the organism was sequenced by the Department of Energy Joint Genome Institute and finished by the Los Alamos Finishing Group. In order to aid in sequence assembly, three whole-genome restriction endonuclease maps of R. rubrum with resolutions ranging from 11 to 45 kb were created. Physical maps are an excellent means by which to independently validate sequence assembly, close sequence contig gaps, and resolve repeat-rich regions, which consistently confound sequence assembly methods (17, 20-22, 25). In comparison to other physical mapping techniques, the automation and resolution of optical mapping make the system ideal for addressing a wide range of problems in genomics, including the finishing and checking of microbial sequencing projects (4, 17, 30-32). In addition, the use of genomic DNA as the source of single molecules for mapping eliminates the need for libraries, PCR, or separations and makes optical mapping advantageous for whole-genome mapping and sequence assembly.
Optical mapping enables the construction of whole-genome restriction endonuclease maps from ensembles of single DNA molecules that have been elongated and immobilized on positively charged glass surfaces and subsequently cut with a restriction endonuclease (Fig. 1). The resulting single DNA molecule restriction endonuclease maps are stained with a fluorochrome and visualized by fluorescence microscopy. Because the order of restriction endonuclease fragments is retained on the optical mapping surface, there is no need for sorting fragments by size. The mass of each restriction endonuclease fragment is determined by integrated fluorescence intensity measurements. The single-molecule restriction endonuclease maps are assembled into contigs (2, 3, 16, 18), in a process similar to shotgun sequence assembly, that span entire microbial genomes. The depth of coverage minimizes mapping error and the overlapping cascades of optical maps create continuity of coverage across an entire genome.
| MATERIALS AND METHODS |
|---|
Surface preparation.
Glass coverslips (22 by 22 mm, Fisher's Finest; Fisher Scientific, Pittsburgh, PA) were cleaned and derivatized as described previously (30). Surface properties were assayed by digesting lambda DASH II bacteriophage DNA with 40 units of XbaI, HindIII, and NheI diluted in 200 µl of digestion buffer with 0.2% Triton X-100 (Sigma, St. Louis, MO) at 37°C to determine optimal digestion times, which ranged from 30 to 120 min.
DNA mounting, overlay, digestion, and staining.
DNA molecules were mounted on derivatized glass surfaces by capillary action using a microfluidic device (8). Capillary flow elongates the DNA molecules; they are then immobilized by electrostatic interactions between the positively charged glass surface and negatively charged DNA molecules. Following DNA elongation and deposition, a thin layer of acrylamide (3.3% containing 0.02% Triton X-100 [Sigma, Pittsburgh, PA]) was applied to the surface. After polymerization, the surfaces were washed with 400 µl of TE for 2 min, followed by washing with 200 µl of digestion buffer for another 2 min. The digestion was then performed by adding 200 µl of digestion buffer with enzyme (20 µl of 10x buffer 2 [100 mM Tris, 500 mM NaCl, 100 mM MgCl2, 10 mM dithiothreitol, pH 7.9] [New England Biolabs, Beverly, MA], 176 µl of high-purity water, 2 µl of 2% Triton X-100 [Sigma, Pittsburgh, PA], and 4 µl of HindIII [New England Biolabs, Beverly, MA] or EcoRI [New England Biolabs, Beverly, MA] [10 units/µl] or 2 µl of XbaI [New England Biolabs, Beverly, MA] [20 units/µl]) to the surface and incubating in a humidified chamber at 37°C for 30 to 120 min.
Following digestion, the surfaces were washed twice by adding 400 µl of TE, waiting 2 to 5 min, and removing the solution by aspiration. The surfaces were mounted onto a glass slide with 12 µl of 0.2 µM YOYO-1 solution (containing 5 parts YOYO-1 [1,1'-[1,3-propanediylbis[(dimethyliminio)-3,1-propanediyl]]bis[4-[(3-methyl-2(3H)-benzoxazolylidene)-methyl]]-tetraiodide; Molecular Probes, Eugene, OR] in 95 parts of 14.3 M β-mercaptoethanol in 20% TE vol/vol). The edges of the glass surface were sealed to the glass slide with nail polish, and the mounted surfaces were incubated (4°C in the dark) for at least 20 min so the staining dye could diffuse before checking by fluorescence microscopy.
Image acquisition and processing.
The samples were imaged by fluorescence microscopy as previously described (17) using a 63x objective (Zeiss, Thornwood, NY) and a high-resolution digital camera (Princeton Instruments, Trenton, NJ). Single overlapping images, spanning the full length of the microfluidic channels, were collected, flattened, and superimposed by a fully automated image acquisition system, ChannelCollect (8). These flattened and overlapped superimages were then processed through the Pathfinder software (Rod Runnheim, unpublished), which identifies digested molecules to be made into single-molecule maps. Each feature recognized as a DNA molecule is marked and converted into an ordered restriction endonuclease map for that molecule. Comounted lambda DASH II molecules were used to estimate the digestion rate and to provide internal fluorescence standards for accurately sizing the DNA fragments (1, 18, 20). Each digested genomic DNA molecule selected by Pathfinder becomes a single-molecule optical map.
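The use of comounted standards for sizing can be illustrated with a minimal sketch. It assumes a simple linear relationship between integrated fluorescence intensity and DNA mass; the function names, the intensity values, and the 40-kb standard size are hypothetical placeholders, not values or routines from Pathfinder.

```python
def calibrate_kb_per_count(standard_intensities, standard_size_kb):
    """Estimate kb per fluorescence count from comounted standards of known size."""
    if not standard_intensities:
        raise ValueError("need at least one standard molecule")
    mean_intensity = sum(standard_intensities) / len(standard_intensities)
    return standard_size_kb / mean_intensity


def size_fragments(fragment_intensities, kb_per_count):
    """Convert integrated intensities of ordered fragments into sizes in kb."""
    return [round(intensity * kb_per_count, 2) for intensity in fragment_intensities]


# Hypothetical example: two standards of an assumed 40-kb size, then one
# digested molecule whose fragments are listed in their order on the surface.
factor = calibrate_kb_per_count([1000.0, 1000.0], 40.0)
ordered_map_kb = size_fragments([250.0, 500.0, 125.0], factor)
```

Because the fragments keep their order on the surface, the resulting list is already an ordered restriction map for that molecule.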
Optical map assembly.
The custom-written software Gentig (1-3, 16-18) overlapped the single-molecule restriction endonuclease maps by aligning restriction endonuclease sites based on fragment sizes. Gentig assembles the individual molecule restriction endonuclease maps into a contig that spans the entire genome. Bayesian inference estimates the probability that two single-molecule restriction endonuclease maps, while subject to various data errors stemming from sizing, missing restriction endonuclease sites (missing cuts), and spurious restriction endonuclease sites (false cuts), may have been derived from the proposed placement. A known statistical distribution of the error sources is required for the Bayesian approach, as is fine-tuning of parameters such as standard deviation, digestion rate, false cut, and false match probability. These parameters can be reestimated from the data using a limited number of iterations of Bayesian probability density maximization.
Once these parameters have been accurately estimated from the data, an efficient dynamic programming algorithm computes the best offset and alignment between a pair of maps. The accuracy of an optical map as its own entity is estimated by Gentig's ability to assess a set of hypothetical maps against the optical map data set and, using error models, report a false-positive probability (2). For circular genomes, this is reported as the false circularization probability. The value represents the probability that the circular contig created by Gentig is a false positive.
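The idea of aligning two maps by dynamic programming can be sketched as follows. This is not Gentig's algorithm: it is a simplified global alignment with an ad hoc score, in which a block of consecutive fragments in one map may be matched to a block in the other to model missing or false cuts. The function name and the parameters `rel_sigma`, `miss_penalty`, and `max_merge` are assumptions made for illustration.

```python
def align_maps(a, b, rel_sigma=0.1, miss_penalty=1.0, max_merge=3):
    """Globally align two ordered fragment-size lists (kb, all positive).

    Returns (score, blocks); each block is ((i0, i1), (j0, j1)), a pair of
    half-open index ranges whose summed fragment sizes were matched.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for ka in range(1, min(max_merge, i) + 1):
                for kb in range(1, min(max_merge, j) + 1):
                    prev = best[i - ka][j - kb]
                    if prev == neg_inf:
                        continue
                    size_a = sum(a[i - ka:i])
                    size_b = sum(b[j - kb:j])
                    # Sizing disagreement in units of an assumed relative error.
                    z = (size_a - size_b) / (rel_sigma * max(size_a, size_b))
                    # Reward one matched block; penalize each unmatched cut.
                    score = prev + 1.0 - z * z - miss_penalty * (ka + kb - 2)
                    if score > best[i][j]:
                        best[i][j] = score
                        back[i][j] = (i - ka, j - kb)
    if best[n][m] == neg_inf:
        return neg_inf, []
    blocks = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        blocks.append(((pi, i), (pj, j)))
        i, j = pi, pj
    blocks.reverse()
    return best[n][m], blocks


# The second map lacks the cut between the 25-kb and 7-kb fragments.
score, blocks = align_maps([10.0, 25.0, 7.0], [10.0, 32.0])
```

In this toy case the alignment merges the 25-kb and 7-kb fragments of the first map against the 32-kb fragment of the second, at the cost of one missing-cut penalty. A production aligner would additionally search over offsets and orientations and would use error parameters estimated from the data.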
DNA sequencing.
Two randomly sheared libraries of the R. rubrum strain ATCC 11170 genome were produced with 3-kb inserts (plasmids) and 40-kb inserts (fosmids). These libraries were sequenced to a total depth of approximately 11x and all reads were quality assessed and trimmed for vector sequence before being used for assembly.
For the 3-kb DNA shearing and plasmid subcloning, approximately 3 to 5 µg of isolated DNA was randomly sheared to 3- to 4-kb fragments (25 cycles at speed code 12) in a 100-µl volume using a HydroShear (GeneMachines, San Carlos, CA). The sheared DNA was immediately blunt end-repaired at room temperature for 40 min using 6 U of T4 DNA Polymerase (Roche, Basel, Switzerland), 30 U of DNA polymerase I Klenow Fragment (New England Biolabs, Beverly, MA), 10 µl of 10 mM deoxynucleoside triphosphate mix (Amersham Biosciences, Piscataway, NJ), and 13 µl of 10x Klenow buffer in a 130-µl total volume. After incubation the reaction was heat inactivated for 15 min at 70°C, cooled to 4°C for 10 min, and then frozen at -20°C for storage. The end-repaired DNA was run on a 1% TAE (Tris-acetate-EDTA)-agarose gel for 30 to 40 min at 120 V. Using ethidium bromide stain and UV illumination, 3- to 4-kb fragments were extracted from the agarose gel and purified using the QIAquick gel extraction kit (QIAGEN, Valencia, CA). Approximately 200 to 400 ng of purified fragment was blunt-end ligated for 40 min into the SmaI site of 100 ng of pUC18 cloning vector (Roche, Basel, Switzerland) using the Fast-Link DNA ligation kit (Epicentre, Madison, WI).
Following standard protocols, 1 µl of ligation product was electroporated into DH10B Electromax cells (Invitrogen, Carlsbad, CA) using the Gene Pulser II electroporator (Bio-Rad, Hercules, CA). Transformed cells were transferred into 1,000 µl of SOC-medium and incubated at 37°C in a rotating wheel for 1 h. Cells (usually 20 to 50 µl) were spread on LB agar plates, 22 by 22 cm, containing 100 µg/ml of ampicillin, 120 µg/ml of isopropylthiogalactopyranoside (IPTG), and 50 µg/ml of 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal). Colonies were grown for 16 h at 37°C. Individual white recombinant colonies were selected and picked into 384-well microtiter plates containing LB/glycerol (7.5%) medium containing 50 µg/ml of ampicillin using the Q-Bot multitasking robot (Genetix, Dorset, United Kingdom).
To test the quality of the library, 24 colonies were directly PCR amplified with pUC M13 -28 and -40 primers using standard protocols. Libraries were considered high quality if they had >90% 3-kb inserts. For more details see http://www.jgi.doe.gov/sequencing/protocols/General3kbLibraryCreationSOP.doc and http://www.jgi.doe.gov/sequencing/protocols/FosmidLibraryCreationSOP.DOC.
For the plasmid amplification and sequencing steps, 2-µl aliquots of saturated Escherichia coli DH10B cultures containing pUC18 vector with random 3- to 4-kb DNA inserts grown in LB/glycerol (7.5%) medium containing 50 µg/ml of ampicillin were added to 8 µl of a 10 mM Tris-HCl (pH 8.2), 0.1 mM EDTA denaturation buffer. The mixtures were heat lysed at 95°C for 5 min and then placed at 4°C for 5 min. To these denatured products 10 µl of a rolling circle amplification (RCA) reaction mixture (TempliPhi DNA sequencing template amplification kit, Amersham Biosciences, Piscataway, NJ) were added. The amplification reactions were carried out at 30°C for 12 to 18 h. The amplified products were heat inactivated at 65°C for 10 min then placed at 4°C until used as the template for sequencing.
Aliquots of the 20 µl of amplified plasmid RCA products were sequenced with standard M13 -28 or -40 primers. The reactions contained 1 µl of RCA product, 4 pmol of primer, 5 µl of distilled H2O, and 4 µl of DYEnamic ET terminator sequencing kit (Amersham Biosciences, Piscataway, NJ). Cycle sequencing conditions were 30 rounds of 95°C for 25 seconds, 50°C for 10 seconds, 60°C for 2 min, and then held at 4°C. The reactions were then purified by a magnetic bead protocol [for more details see http://www.jgi.doe.gov/sequencing/protocols/DYEnamicET-TerminatorCycleSequencing(10ulrxn)SOP.doc] and run on a MegaBACE 4000 (Amersham Biosciences, Piscataway, NJ). Alternatively, 1 µl of the RCA product was sequenced with 2 pmol of standard M13 -28 or -40 primers, 1 µl of 5x buffer, 0.8 µl of H2O, and 1 µl of BigDye sequencing kit (Applied Biosystems, Foster City, CA) at 1 min denaturation and 25 cycles of 95°C for 30 seconds, 50°C for 20 seconds, 60°C for 4 min, and finally held at 4°C. The reactions were then purified by a magnetic bead protocol and run on an ABI PRISM 3730 (Applied Biosystems, Foster City, CA) capillary DNA sequencer. Detailed protocols for fosmid library creation, fosmid DNA isolation, and the cleanup procedure can be found at http://www.jgi.doe.gov/Internal/protocols/prots_production.html.
In the sequence finishing process, all draft reads were assembled together with SPS Phrap (SPSOFT, Albuquerque, NM). Repetitive regions of the genome were resolved with repFinisher (Cliff S. Han, unpublished). Autofinish (11) was used in the first cycle of finishing to select sequencing reactions. Remaining gaps and low-quality regions were closed by primer walking on subclones or by shattering PCR fragments covering the gaps.
Optical maps versus DNA sequence-based maps.
Alignments between the optical maps and DNA sequence-based maps from the seven finishing-stage sequence contigs were created with the MapViewer software (OpGen, Inc., Madison, WI), a Perl/Tk application that provides an intuitive graphical interface for optical map analysis. In addition to creating and displaying alignments of optical maps, MapViewer allows the user to manipulate the relative positions and orientations as well as the scale of the optical maps to better understand these alignments. The map alignments are generated with a dynamic programming algorithm that finds the optimal alignment of two restriction endonuclease maps according to a scoring model that incorporates fragment sizing errors, false and missing cuts, and missing small fragments. For a given alignment, the score is proportional to the log of the length of the alignment, penalized by the differences between the two maps, such that longer, better-matching alignments will have a higher score.
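A DNA sequence-based map is, in essence, an in silico restriction digest of a sequence contig. The sketch below shows one way such a map could be computed; it is an illustration, not the code used by MapViewer or Gentig. The function name is hypothetical, the recognition site is searched on one strand only (sufficient for palindromic sites such as the HindIII site AAGCTT, which is cut after the first A), and for circular sequences any site spanning the sequence origin is ignored.

```python
def in_silico_digest(sequence, site, cut_offset, circular=False):
    """Return ordered fragment lengths (bp) after cutting at every `site`.

    `cut_offset` is the position of the cut within the recognition site,
    e.g. 1 for HindIII (A^AGCTT).
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    if not cuts:
        return [len(seq)]
    if circular:
        # Fragments between consecutive cuts, plus the one wrapping the origin.
        fragments = [cuts[k + 1] - cuts[k] for k in range(len(cuts) - 1)]
        fragments.append(len(seq) - cuts[-1] + cuts[0])
        return fragments
    bounds = [0] + cuts + [len(seq)]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]


# Toy 20-bp sequence containing two HindIII sites.
toy = "GGAAGCTTCCCCAAGCTTGG"
linear_map = in_silico_digest(toy, "AAGCTT", 1)
circular_map = in_silico_digest(toy, "AAGCTT", 1, circular=True)
```

The linear call models a sequence contig, whose end fragments are bounded by the contig ends rather than by restriction sites; the circular call models a finished chromosome or plasmid.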
Using Gentig, the XbaI, NheI, and HindIII optical maps were each aligned with the corresponding DNA sequence-based map generated from the finished sequence. These initial alignments enabled determination of missing fragments, false cuts, or missing cuts. The relative sizing error for each fragment in the optical maps was calculated from the formula [100% x (optical map fragment size - corresponding DNA sequence-based map fragment size)/corresponding DNA sequence-based map fragment size] and was plotted against the DNA sequence-based map fragment sizes to show the relationship between fragment size and relative error.
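The relative sizing error formula translates directly into code. The following minimal sketch applies it to hypothetical pairs of aligned fragments; the function name and the fragment sizes are illustrative and are not taken from the R. rubrum maps.

```python
def relative_sizing_error(optical_kb, sequence_kb):
    """Relative sizing error (%) of an optical map fragment vs. its sequence-based size."""
    return 100.0 * (optical_kb - sequence_kb) / sequence_kb


# Hypothetical aligned fragment pairs: (optical map size, sequence-based size) in kb.
pairs = [(10.5, 10.0), (19.0, 20.0), (2.4, 2.0)]
errors = [round(relative_sizing_error(opt, seq), 1) for opt, seq in pairs]
```

A positive value means the optical map oversized the fragment relative to the sequence; a negative value means it undersized the fragment.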
| RESULTS |
|---|
The resolution of the optical maps affects the contig rate and average molecule size in the contig (Table 1). For the XbaI map, a total of 405 digested molecules were imaged and processed. Of this total, 204 were included in the whole-genome map contig, giving a contig rate of 50%. This low contig rate can be explained in part by the low resolution of the optical map. A map with an average fragment size of 44.73 kb requires very large genomic DNA molecules for contig assembly, due to the number of fragments in a single-molecule map required for confidence in merging that map into a map contig. The average size of molecules in the XbaI contig was about 900 kb, whereas the average size of collected DNA molecules was 637.69 kb. In addition, the digestion rate was calculated at 76.01%, which is lower than the target digest rate of 80% or higher. While still acceptable, a digest rate of 76.01% will reduce the density of apparent restriction endonuclease sites, or markers, thus increasing the difficulty in creating a contig. However, even with the low rate of contig formation, the total mass of molecules in the contig was 184.23 Mb, which corresponds to about 42-fold coverage.
Finally, 932 molecules were collected for the production of the HindIII map. With an average fragment size of 10.95 kb, this map is the highest resolution of the set of maps presented here. Of the 932 molecules, 623, or 67%, went into the final contig. The smaller average fragment size loosened the requirements for molecule size in the contig; the average size of molecules in the contig was 405.49 kb. The total mass represented by the molecules in the contig was 252.68 Mb, which corresponds to about 57-fold coverage, based on the HindIII contig size of 4,456.35 kb. Again, the size of the HindIII contig was calculated by summing the masses of the restriction endonuclease fragments in the consensus map. The digestion rate of the molecules in the contig was calculated to be 78.9%. An average of about 31 fragments was used to calculate the mass of each fragment in the consensus map. For each fragment in the consensus map, the standard deviation about the mean was 1.16 kb.
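The contig rate and fold coverage reported for the HindIII map follow from simple ratios, as the short sketch below illustrates. The function names are hypothetical; the input numbers are those given in the text.

```python
def contig_rate(molecules_in_contig, molecules_collected):
    """Fraction of collected molecules that were incorporated into the contig."""
    return molecules_in_contig / molecules_collected


def fold_coverage(total_mass_mb, contig_size_kb):
    """Total mass of the molecules in the contig divided by the contig size."""
    return total_mass_mb * 1000.0 / contig_size_kb


# HindIII map values from the text.
hindiii_rate_percent = round(100.0 * contig_rate(623, 932))
hindiii_coverage = round(fold_coverage(252.68, 4456.35))
```

With these inputs the contig rate rounds to 67% and the coverage to 57-fold, in agreement with the values stated above.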
In all of the maps, the high coverage ensured accurate calling of restriction endonuclease sites, fragment sizing, and sizing of the entire circularized genome map. Below, the accuracy of the optical maps compared to the sequence is assessed. The false circularization probability for the XbaI, NheI, and HindIII maps was 0.00738, 0.00329, and 0.00440, respectively. Since the XbaI map had the lowest coverage, contig rate, and digestion rate, it is not surprising that the false circularization probability for this map is the highest. However, for all the maps, the false probability values were well below 0.05, which is considered the upper limit for confident map circularization (30). The restriction endonuclease patterns generated by the XbaI, NheI, and HindIII maps appeared random; no particular restriction endonuclease patterns or structural features were observed.
Use of optical maps in sequence assembly.
All of the optical maps were made in order to guide and verify the R. rubrum genome sequence assembly process. Near the end of the finishing effort, nine sequence contigs were generated ranging in size from 2,311 base pairs to 1,465,886 base pairs. Alignment of the optical maps against the DNA sequence-based maps of the sequence contigs gave three independent indications of sequence contig assembly and order. Two of the sequence contigs did not align against the optical maps. They were contig 84, the plasmid sequence contig, and contig 82c, which, with a size of 2.311 kb, was too small to align with the optical maps. Six of the seven remaining contigs aligned to both the XbaI and NheI maps. Only the HindIII map, with its higher resolution, was able to align all seven sequence contigs, including contig 83 with its small size of 80.404 kb (Fig. 3A).
The finished sequence (GenBank accession number AAAG00000000) contained a 4.4-Mb circular chromosome (contig 94, exact size is 4,352,726 base pairs) and a 54 kb plasmid (contig 93, exact size is 54,412 base pairs). Minor differences have been found between the optical maps and finished sequence and are described below.
Assessment of optical mapping errors.
Comparisons between sequence and optical mapping data were made in order to evaluate the errors and accuracy in the XbaI, NheI, and HindIII maps (Fig. 4). The relative sizing error was calculated by the alignment of optical maps with the DNA sequence-based maps made from the finished sequence (Fig. 4A to F). The error bars in Figs. 4A, 4B, and 4C reflect the standard deviation about the means of the restriction endonuclease fragment sizes used in calculating the consensus map fragments. In general, a high degree of correspondence was evident between the optical map and DNA sequence-based map fragment sizes. The regression values for the trendlines are 0.9985, 0.9995, and 0.9947 for the XbaI, NheI, and HindIII maps, respectively.
Figure 5 shows the cumulative distribution of fragment sizes for the three optical maps. Only the XbaI map has fragments greater than 135 kb, and thus this value was chosen as an endpoint in the figure to facilitate visual comparison of the three maps' fragment size distributions. Each bar represents the cumulative percentage of consensus map fragments in 5-kb intervals. For each of the maps, the distribution is roughly exponential, as expected. One key difference between the HindIII and the lower-resolution XbaI and NheI maps is the proportion of fragments smaller than 5 kb and 10 kb. In the HindIII map, about 35% of all fragments in the consensus map are smaller than 5 kb; 72% of fragments are smaller than 10 kb, and 100% of fragments are under 40 kb (the largest fragment is 37.82 kb). These numbers are in stark contrast to those for the NheI and XbaI maps. In the NheI map, only 12% of fragments are smaller than 5 kb, and 32% of fragments are smaller than 10 kb. Similarly, in the XbaI map, only 13% of fragments are smaller than 5 kb and 25% of fragments are smaller than 10 kb. For the XbaI map, there were only two additional fragments greater than 135 kb: a 242.20-kb fragment and a 256.30-kb fragment (not shown). The increased average relative sizing error for small fragments (Fig. 4F) seen in the HindIII map may be due to the high proportion of fragments 2 kb or smaller, many in tandem with each other, in this high-resolution map.
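A cumulative fragment size distribution of the kind described for Figure 5 can be computed as in the following sketch. The function name and the example fragment sizes are hypothetical; the 5-kb bin width and the 135-kb endpoint follow the description above, and "smaller than" is treated as a strict inequality.

```python
def cumulative_percent(fragment_sizes_kb, bin_width_kb=5.0, max_kb=135.0):
    """Return (upper_edge_kb, percent of fragments smaller than that edge) per bin."""
    if not fragment_sizes_kb:
        raise ValueError("need at least one fragment")
    n_bins = int(max_kb // bin_width_kb)
    total = len(fragment_sizes_kb)
    result = []
    for b in range(1, n_bins + 1):
        edge = b * bin_width_kb
        below = sum(1 for size in fragment_sizes_kb if size < edge)
        result.append((edge, 100.0 * below / total))
    return result


# Hypothetical consensus map fragments (kb), summarized up to 40 kb.
distribution = cumulative_percent([2.0, 4.0, 7.0, 12.0, 30.0], max_kb=40.0)
```

Reading off the first two entries of the returned list gives the percentages of fragments smaller than 5 kb and 10 kb, the two quantities compared among the three maps.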
In comparison to the DNA sequence-based map, the NheI map showed no false cuts and one missing cut out of a total of 145 cuts in the DNA sequence-based map. There were four missing fragments, over 500 bp, in the NheI map. The DNA sequence-based map had no fragments smaller than 500 bp, and three fragments smaller than 1 kb, out of a total of 145 fragments.
Finally, the HindIII map showed no false cuts and five missing cuts in comparison to the 684 cuts in the DNA sequence-based map. Of the 684 fragments, 664 were greater than 500 bp. Of these fragments, 125 were missing in the HindIII optical map. Fifty-eight of the missing fragment loci corresponded to DNA sequence-based fragments >500 bp and ≤1 kb, 59 to fragments >1 kb and ≤2 kb, and the remaining eight to fragments >2 kb and <3 kb.
Comparing the locations of the missing cuts and missing fragments revealed no consistent errors among the three optical maps. Thus, errors appear to be random and not associated with any major discrepancy between the sequence and the optical maps.
| DISCUSSION |
|---|
2-kb contig) generated at the end of the finishing effort without gaps. While the error in contig 90 was evident in the HindIII optical map to sequence alignment, in this case, the lower-resolution XbaI and NheI maps best displayed this error and how it could be corrected. Yet in general, an array of different-resolution optical maps is advantageous for addressing discrepancies in genome sequences. All three maps were used to confirm the final 4.353-Mb sequence contig generated by the Los Alamos finishing group. The finished R. rubrum strain ATCC 11170 sequence size of 4.353 Mb is closest to the estimate of 4.323 Mb provided by the XbaI map. The overall sizing error for the XbaI map is 0.7%, which is smaller than the error associated with other whole-genome physical maps generated by pulsed-field gel electrophoresis (32). The sizing errors for the NheI map and HindIII map were 3% and 2%, respectively. Yet, the alignment of the NheI and HindIII optical maps against the DNA sequence-based maps showed no apparent overall size discrepancies, and thus this error most likely stems from the summation and increased error associated with small fragments. As the number of fragments summed to calculate genome size increases, so does the error associated with this calculation. As such, the low-resolution XbaI map should, and does, give the most accurate estimate of genome size.
The high number of missing fragments in the HindIII map and increased sizing error of small fragments illustrate the challenges optical mapping faces for scoring of small fragments. Of the 125 missing fragments in the HindIII optical map, 116 corresponded to DNA sequence-based map fragments less than or equal to 2 kb. This corresponds to a small fragment loss rate of 75%, as the DNA sequence-based map contained 154 fragments less than or equal to 2 kb. By contrast, the fragment loss rate for fragments >2 and ≤3 kb was 12% (8 out of 69 fragments were missing), and zero for fragments greater than 3 kb.
A key element of the optical mapping system is the elongation and immobilization of single DNA molecules onto glass surfaces. Immobilization via electrostatic interactions between the negatively charged DNA and positively charged glass surface must be subtle enough to enable biochemical reactions, such as a restriction endonuclease digest, yet strong enough to retain the resulting fragments. The loss of fragments 2 kb and smaller reflects the difficulty in retaining small fragments in their exact position on the surface after a restriction endonuclease digest but also in identifying and correctly sizing the fragments during image acquisition and subsequent processing. The error models in the optical map assembly software (Gentig) take into account the likelihood of losing small fragments and enable alignment against the sequence, as seen here in the HindIII map, despite the significant small fragment loss.
An increased positive sizing error is seen in both the HindIII map and NheI map for small fragments. One possible explanation is the likelihood of overestimating the size of small fragments when they are scored. In other words, when a small fragment is marked, it is unlikely that the fragment would be undersized, and thus errors in this size range do not balance each other as well as they do for larger fragments. New efforts in DNA mounting and small-fragment sizing with the Pathfinder software are currently under way in order to improve retention and scoring of small fragments.
With an average fragment size of 44.73 kb, the XbaI map represents the lowest-resolution optical map created. There are significant advantages to a low-resolution map. First, a low-resolution map requires very large single molecules for assembly into a whole-genome contig. As average fragment size increases, so does the molecule size required for achieving a unique pattern of restriction endonuclease fragments for accurate map assembly. Here, the average size of molecules in the XbaI map approached 1 Mb. This scale approaches the lower-resolution limit of more global cytogenetic methods that reveal chromosomal insertions, deletions, rearrangements, etc. (5, 15, 19). With a documented resolution between 6.5 kb (32) and 45 kb (reported here), optical mapping's niche falls between low-resolution, global methods, such as comparative genomic hybridization, and very high-resolution genotyping systems. This "molecular cytogenetics" approach has enormous potential for aiding in large genome (such as mammalian) sequencing projects as well as for identifying genomic variation in the form of insertions, deletions, and repetitive elements, a difficult and often elusive task.
With the ability to qualify conclusions drawn from low-resolution cytogenetic techniques and contextualize the information gleaned from high-resolution genotyping tools, the optical mapping system can be particularly powerful when used in conjunction with other methods. We are currently pursuing these directions with the optical mapping system, as well as working on improvements for larger molecules and improved small fragment retention for the goal of widening the range of optical mapping's resolution.
Here, three optical maps that have aided in sequence assembly and validation of R. rubrum have been shown. In addition, we have widened the resolution range of the optical mapping system and contextualized this contribution to genomic analysis. Continual improvements and new applications of the optical mapping system are under way. For example, in a recent comparative genomics study, optical mapping revealed novel genomic insertions and rearrangements in Shigella flexneri in addition to genomic differences between sequenced strains of Escherichia coli and Yersinia pestis that were aligned as maps (31). Optical mapping's role in sequencing projects has expanded to larger, more complex genomes, such as the 34-Mb genome of the diatom Thalassiosira pseudonana (4). Optical mapping projects will continue to encompass increasingly challenging questions, with the goal of providing new insights on genome structure and organization that will potentiate the capabilities of higher- and lower-resolution genomic analysis systems.
| ACKNOWLEDGMENTS |
|---|
We thank all members of the University of Wisconsin-Madison Laboratory for Molecular and Computational Genomics.
| FOOTNOTES |
|---|
| REFERENCES |
|---|