Applied and Environmental Microbiology, May 2009, p. 2889-2898, Vol. 75, No. 9
0099-2240/09/$08.00+0 doi:10.1128/AEM.01640-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Department of Medical Microbiology, University of Manitoba, 730 William Avenue, Winnipeg, Manitoba R3E 0W3, Canada,1 Agriculture and Agri-Food Canada Saskatoon Research Centre, 107 Science Place, Saskatoon, Saskatchewan S7N 0X2, Canada,2 Department of Veterinary Microbiology, University of Saskatchewan, 52 Campus Drive, Saskatoon, Saskatchewan S7N 5B4, Canada,3 National Microbiology Laboratory, Canadian Science Centre for Human and Animal Health, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada4
Received 16 July 2008/ Accepted 25 February 2009
The advent of next-generation ultra-high-throughput sequencing technologies, in particular, the GS FLX (454 Life Sciences, Branford, CT), has removed an important quantitative barrier in molecular analysis by increasing the number of reads from a gene or genome by orders of magnitude in a single run (20). Unfortunately, the short average length of pyrosequencing reads (200 bp compared to 700 bp using dideoxy sequencing) presents a new set of problems. The results of recent application of this technology to analysis of 16S rRNA gene sequences from microbes in vaginal samples have demonstrated that short reads are more likely to generate matches to multiple sequences in the rRNA sequence database and that taxonomic and phylogenetic resolution was limited due to strong similarities between 16S rRNA sequences from closely related species (32).
An alternative molecular target for microbial identification and phylogenetic analysis is cpn60, a gene that encodes the 60-kDa chaperonin or heat shock protein (HSP60/GroEL) (13). The cpn60 gene is universal in eubacteria and eukaryotes and an extensive, curated reference database is available (13) (http://cpndb.cbr.nrc.ca). The cpn60 universal target (UT) offers key advantages, including short target length (549 to 567 bp), sufficient resolving power to distinguish closely related species and subspecies, and a relatively uniform distribution of variability across the entire length of the target (9, 12). The use of the cpn60 UT has been well established for phylogenetic analysis of complex samples (4, 14) and has recently been applied to vaginal microbial communities (11). In the present study, we examined the feasibility of pyrosequencing for determining the composition of the vaginal microbiota using the cpn60 UT. We compared the microbial community structure generated by pyrosequencing of cpn60 amplicons using the GS FLX with dideoxy sequencing based on clone libraries generated from the same samples. In addition, we evaluated the microbial community profiles generated by pyrosequencing of cpn60 UT amplicons and 16S rRNA amplicons from the same vaginal samples.
TABLE 1. Samples chosen for clone library construction and pyrosequencing
Pyrosequencing of cpn60 UT amplicons.
For high throughput sequencing using the GS FLX platform, cpn60 UT amplicons were prepared using the same primers and templates as for the clone libraries and all amplicons were purified on a 1.5% agarose gel. The sequencing libraries were prepared using the GS DNA library preparation kit and emulsion PCR (emPCR) was performed with a GS emPCR kit I or kit II as suggested by the manufacturer (Roche Diagnostics, Laval, Canada), except that the amplicons were treated as sheared DNA so the nebulization step normally performed in the library procedure was omitted. For the analysis of the 20-pool (Table 1), the GS FLX run was set up in one region of a 16-region gasket. Amplicons from the 20-pool were ligated to the linkers used for emPCR and sequencing. For the analysis of individual samples, a set of cpn60 UT primers containing unique sequence tags was used (see Table S1 in the supplemental material). The set of upstream primers (hybridizing to the 5' end of the cpn60 UT) each contained at their 5' ends the primer used in the subsequent emPCR and sequencing (primer A), followed by a unique four-base sequence tag and the cpn60 UT primer sequence. Each of the upstream primers was paired with a downstream primer that contained the other emPCR primer sequence immediately upstream of the cpn60 UT primer sequence. All primers were HPLC purified. The sequence tags enabled a multiplexed run in which individual samples were each amplified with a primer set containing a unique sequence tag. The resulting sequence data was then sorted according to the tag prior to subsequent analysis.
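The tag-based sorting step described above can be illustrated with a minimal sketch. It assumes that each read begins with the four-base tag (that is, that the primer A sequence is not part of the read) and that unrecognized tags are set aside; the tag sequences, sample names, and the function name `sort_reads_by_tag` are hypothetical and are not taken from Table S1.

```python
from collections import defaultdict

TAG_LENGTH = 4  # the primers carry a unique four-base sequence tag


def sort_reads_by_tag(reads, tag_to_sample):
    """Group reads by their leading tag and strip the tag from assigned reads.

    Assumption: every read starts with the tag. Reads whose first four
    bases are not a known tag are kept whole under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        tag = read[:TAG_LENGTH].upper()
        if tag in tag_to_sample:
            bins[tag_to_sample[tag]].append(read[TAG_LENGTH:])
        else:
            bins["unassigned"].append(read)
    return dict(bins)


# Hypothetical tags and toy reads, for illustration only.
tags = {"ACGT": "individual_001", "TGCA": "individual_006"}
reads = ["ACGTGGTTCCAA", "TGCAAACCGGTT", "acgtTTTT", "GGGGCCCC"]
binned = sort_reads_by_tag(reads, tags)
```

Each per-sample list in `binned` would then be passed separately to the downstream taxonomic assignment.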
Pyrosequencing of 16S rRNA amplicons.
Amplicons were generated from each of four individuals ("4-pool"; Table 1) using the untagged broad-range PCR primers L27F and 355R, which target variable regions V1 and V2 (30). Amplicons from the four individuals were pooled by volume, purified on a 1.5% agarose gel, and ligated to PCR linkers containing a Roche multiplexing ID sequence prior to pyrosequencing. The 16S rRNA amplicons were sequenced on the same picotiter plate used for the individual cpn60 samples and identified after pyrosequencing by the unique multiplexing ID.
Data management and taxonomic assignment.
Clone library and pyrosequencing data were analyzed using a bioinformatic pipeline to evaluate each sequencing read and determine the optimal possible taxonomic label(s). Reads derived from Sanger sequencing were base-called using Phred (6, 7). Cloned insert sequences were identified and vector sequences removed by using Lucy (2). Pyrosequencing data was processed by using the default on-rig procedures from 454/Roche. Filter-passing reads were used in the subsequent analyses for each of the pyrosequencing libraries. All sequencing data was imported and warehoused using the APED software package (http://aped.sourceforge.net).
A combination of BLAST and Smith-Waterman alignments (watered-BLAST) was used to identify the most significant matches between each individual sequence read and an appropriate reference sequence database. As outlined in Fig. 1, each read was initially compared to the reference database using BLAST (nonstandard parameters: -F F) (1). The BLAST hits at the best significance level were examined further by performing a Smith-Waterman alignment of the sequence read and each hit in the reference database that was at the best significance level (29). Smith-Waterman alignments were generated by using the water program from EMBOSS (26). The optimal local alignments produced from water were examined, and those which had a percent identity of <70% or a length of <150 bp were deemed spurious matches and were rejected from further analysis. In order to identify the best putative taxonomic assignment possible for each read, the water alignments were limited to those within 1% identity of the top percent identity match to the reference database. In cases where there were multiple hits to the reference database that were within 1% of the best match to the reference database, a read was annotated as matching each of the reference database sequences. The resulting taxonomic assignments were used to calculate distributions of organism abundance at the genus and species levels. To account for variation between the sizes of pyrosequencing libraries, organism abundance was normalized to the respective library size as follows: normalized abundance = (no. of taxonomic matches/no. of total matches) × 100.
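The filtering and normalization rules in this paragraph can be sketched as follows. This is not the pipeline's own code: it assumes that the Smith-Waterman results are already available as (taxon, percent identity, alignment length) tuples, that "within 1%" means within one percentage point of the top identity, and that a read annotated with several taxa contributes one match to each. The function names are hypothetical.

```python
from collections import Counter

MIN_IDENTITY = 70.0    # percent; alignments below this are rejected
MIN_LENGTH = 150       # bp; shorter alignments are rejected
IDENTITY_WINDOW = 1.0  # assumed: percentage points below the top identity


def assign_labels(alignments):
    """Return the taxa a read is annotated with.

    alignments: list of (taxon, percent_identity, alignment_length).
    """
    kept = [a for a in alignments
            if a[1] >= MIN_IDENTITY and a[2] >= MIN_LENGTH]
    if not kept:
        return []
    best = max(a[1] for a in kept)
    return sorted({a[0] for a in kept if best - a[1] <= IDENTITY_WINDOW})


def normalized_abundance(labels_per_read):
    """(no. of taxonomic matches / no. of total matches) x 100 per taxon."""
    counts = Counter(t for labels in labels_per_read for t in labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Toy alignments for four reads (values are illustrative only).
reads = [
    [("Lactobacillus iners", 99.0, 190),
     ("Lactobacillus crispatus", 98.4, 190),
     ("Gardnerella vaginalis", 75.0, 180)],
    [("Gardnerella vaginalis", 97.0, 140)],  # rejected: shorter than 150 bp
    [("Lactobacillus iners", 96.0, 200)],
    [("Lactobacillus iners", 98.0, 160)],
]
labels = [assign_labels(r) for r in reads]
abundance = normalized_abundance(labels)
```

In this toy input the first read is annotated with two Lactobacillus species because their identities differ by less than one percentage point, which mirrors the multiple-hit rule described in the text.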
FIG. 1. Data analysis flowchart of the watered-BLAST pipeline used to assign a taxonomic label for each sequence from the Sanger and GS FLX datasets.
Reference databases for watered-BLAST matching.
For cpn60 data, a nonredundant, customized database of cpn60 UT sequences was created, comprising a single reference strain for each species (cpnDB_nr). Unique sequences (those that did not match anything in cpnDB_nr) obtained from 25 cultured isolates derived from vaginal swabs of women from the cohort were added to complete the customized database cpnDB_NR_vag (1,373 sequences). For 16S rRNA data, each read was compared by the same watered-BLAST method to a database of 16S rRNA sequences from RDP (19). The 16S rRNA database (RDP_isolates) consisted of 66,304 full-length or nearly full-length (>1,200 bp) sequences from RDP that were annotated as "good quality" data from isolates (metagenomic and uncultured data excluded). To facilitate direct pairwise comparison of 16S rRNA and cpn60 pyrosequencing libraries derived from the same samples, customized reference databases were generated for each gene target. These databases contained 505 nonredundant type strain sequences representing species that are in common between RDP and cpnDB and for which the entire V1 to V8 region of the 16S rRNA gene was available. The V1 to V8 region was defined as the region corresponding to nucleotides 8 to 1406 of E. coli 16S rRNA.
Rarefaction.
Rarefaction curves and richness estimators (Chao1 and ACE) were calculated from the pyrosequencing and Sanger data using EstimateS (version 8.0.0; R. Colwell, University of Connecticut [http://purl.oclc.org/estimates]) as described previously (15, 16). EstimateS analyses were performed on 100 random samplings, without replacement, and where appropriate the classical method for Chao1 calculations was used.
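For orientation, the classical Chao1 estimator can be written as S_obs + F1²/(2·F2), where S_obs is the number of observed OTUs and F1 and F2 are the numbers of OTUs represented by exactly one and exactly two reads. The sketch below implements this standard formula and falls back to the bias-corrected form when no doubletons are present. It is an illustration of the estimator, not a reproduction of the EstimateS implementation, and the function name `chao1` is a hypothetical choice.

```python
def chao1(otu_counts):
    """Chao1 richness estimate from a list of per-OTU read counts."""
    observed = [c for c in otu_counts if c > 0]
    s_obs = len(observed)
    f1 = sum(1 for c in observed if c == 1)  # singletons
    f2 = sum(1 for c in observed if c == 2)  # doubletons
    if f2 > 0:
        # Classical form.
        return s_obs + (f1 * f1) / (2.0 * f2)
    # Bias-corrected form, used here when the classical form is undefined.
    return s_obs + f1 * (f1 - 1) / 2.0


# Toy counts: 7 observed OTUs, 3 singletons, 2 doubletons.
estimate = chao1([10, 5, 1, 1, 1, 2, 2])
```

An estimate well above the observed OTU count, as in this toy case, is the kind of signal that indicates incomplete sampling.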
Pairwise comparison of libraries.
In order to evaluate whether there were differences observed in the taxonomic composition of libraries, the relative abundance for each genus/species in the comparison was calculated as follows: relative abundance = log2 (normalized abundance library A/normalized abundance library B). When species were represented by a single read in one library and by >1 read in the paired library, that species was considered unrepresented in the single-read library.
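A minimal sketch of this comparison is given below. It assumes that library sizes are taken from the raw read counts before the single-read rule is applied, and that taxa left unrepresented in either library are omitted because the log ratio is undefined for them; both points are assumptions, as is the function name.

```python
import math


def relative_abundance(counts_a, counts_b):
    """log2(normalized abundance in A / normalized abundance in B) per taxon.

    counts_a, counts_b: dicts mapping taxon -> read count.
    """
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        return {}
    ratios = {}
    for taxon in set(counts_a) | set(counts_b):
        a, b = counts_a.get(taxon, 0), counts_b.get(taxon, 0)
        # Single-read rule: one read against >1 read counts as unrepresented.
        if a == 1 and b > 1:
            a = 0
        elif b == 1 and a > 1:
            b = 0
        if a == 0 or b == 0:
            continue  # log ratio undefined; taxon is unique to one library
        norm_a = 100.0 * a / total_a
        norm_b = 100.0 * b / total_b
        ratios[taxon] = math.log2(norm_a / norm_b)
    return ratios


# Toy libraries of 100 reads each (values are illustrative only).
lib_a = {"L. iners": 80, "G. vaginalis": 19, "P. bivia": 1}
lib_b = {"L. iners": 40, "G. vaginalis": 55, "P. bivia": 5}
ratios = relative_abundance(lib_a, lib_b)
```

A value of 1 corresponds to a twofold higher normalized abundance in library A, and a value of -1 to a twofold higher abundance in library B.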
Phylogenetic trees.
The phylogenetic tree from the 20-pool (Table 1) was drawn based on a CLUSTALW (33) alignment of the cpn60 UT using PHYLIP (Phylogeny Inference Package) version 3.5c (J. Felsenstein, distributed by the author, Department of Genetics, University of Washington, Seattle). The alignment was sampled using bootstrap, and distances were calculated using the F84 distance method.
Sequence clustering and assembly.
Assemblies of the pyrosequencing data were generated by using gsAssembler/Newbler (454/Roche) with the default parameters. The number of contigs resulting from each assembly was used as an approximation of the number of distinct sequences sampled for a given library.
OTU diversity calculation for Prevotella spp.
For each of the pyrosequencing libraries generated for cpn60 and 16S rRNA, the reads identified by watered-BLAST as Prevotella spp. were processed by using t_coffee (23) (nonstandard parameters: -mode quickaln) to generate a PHYLIP format output file of their multiple sequence alignment. Distance matrix files suitable for input to DOTUR (28) were created with dnadist from the PHYLIP package. The number of operational taxonomic units (OTUs) was calculated at various sampling depths and percent identity cutoffs using the farthest-neighbor algorithm of DOTUR (28).
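To illustrate what farthest-neighbor (complete-linkage) OTU counting does, the sketch below clusters a small distance matrix at a given cutoff. It is a naive illustration written for clarity, not DOTUR itself; DOTUR's handling of distance rounding and ties may differ, and the function name `count_otus` and the toy distances are hypothetical.

```python
def count_otus(dist, cutoff):
    """Count OTUs by complete-linkage clustering.

    dist: symmetric matrix (list of lists) of pairwise distances.
    cutoff: maximum distance allowed between any two members of an OTU,
            e.g. 0.03 for a 3% cutoff.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(c1, c2):
        # Farthest neighbor: distance between the two most distant members.
        return max(dist[i][j] for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > cutoff:
            break  # no remaining pair can be merged within the cutoff
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return len(clusters)


# Toy distance matrix for four sequences (values are illustrative only).
toy = [
    [0.00, 0.01, 0.04, 0.20],
    [0.01, 0.00, 0.02, 0.21],
    [0.04, 0.02, 0.00, 0.19],
    [0.20, 0.21, 0.19, 0.00],
]
otus_3pct = count_otus(toy, 0.03)
otus_5pct = count_otus(toy, 0.05)
```

Because every member of an OTU must lie within the cutoff of every other member, a stricter cutoff can only keep or increase the number of OTUs, as the toy matrix shows.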
The cpn60 GS FLX data set generated from the 20-pool contained 5,938 unassembled individual filter-pass reads in a single run with a mean length of 197 bp. Most of these sequences were categorized as cpn60 (4,410 of 5,938 reads, or 74%), with 1,129 sequences discarded due to insufficient length (<150 bp) and the remainder identified as human non-cpn60, bacterial non-cpn60, or unknown. BLAST comparison of the sequences from the cpn60 GS FLX data set to the Sanger data set revealed that 3,509 of the sequences matched a sequence found in the latter data set (where a match is defined as having 97% identity over 150 nucleotides). The remaining 901 sequences were unique to the cpn60 GS FLX data set and could be reduced to 72 different partial cpn60 sequences (see Table S2 in the supplemental material). Therefore, the total cpn60 GS FLX data set for the 20-pool included 144 partial cpn60 sequences, comprising 72 that were also found in the smaller Sanger data set and an additional 72 sequences that were found only in the larger GS FLX data set.
Overlap of Sanger and GS FLX data for the 20-pool.
A comparison of the paired Sanger and GS FLX datasets from the 20-pool showed that the Sanger data were almost entirely contained within the pyrosequencing data (Fig. 2). Although there were 18 sequences in the Sanger data set that had no identical match in the GS FLX data set, 17 of these sequences were found to reside in the phylogeny closely neighboring sequences from the GS FLX data set. Only one Sanger sequence (FJ594055) was completely distinct from anything found in the GS FLX data set. In contrast, the 901 sequences found only in the GS FLX data set were 80 to 96% identical to their nearest match in the Sanger data (mean, 92%), indicating that the deeper pyrosequencing had resulted in the identification of novel sequences. These pyrosequencing-only reads included sequences representing previously described vaginal organisms such as Lactobacillus delbrueckii, Lactobacillus crispatus, and Mobiluncus curtisii, as well as sequences with weak similarity to anything in the Sanger data set or in the cpnDB reference database (see Table S2 in the supplemental material).
FIG. 2. Phylogenetic tree of sequences found in the Sanger data set generated from the 20-pool sample. The numbers in brackets after each sequence indicate the frequency with which each sequence is represented in the GS FLX data set compared to the Sanger data set, as described in the text (positive numbers indicate relatively greater frequency in the GS FLX data set). Sequences found only in the Sanger data set are indicated. The tree is a consensus of 100 neighbor-joined trees. Numbers at the nodes are bootstrap values out of 100. Sequences are labeled with their GenBank accession numbers.
FIG. 3. Relative representation of bacterial families in the Sanger/pyrosequencing and total datasets for the 20-pool. The proportions of sequences representing each family were compared among the sequences found in both datasets and among the total data set.
Sampling depth.
Rarefaction analysis of the number of OTUs observed in each of the Sanger and GS FLX datasets for the 20-pool showed the increased sampling depth obtained with the pyrosequencing method (see Fig. S1 in the supplemental material). The species accumulation curve for the Sanger data exactly followed the curve generated for the GS FLX data but stopped far short of the depth obtained in the GS FLX data set. However, the sampling was evidently not complete even in the larger GS FLX data set; the Chao1 and ACE richness estimators yielded values of 178.5 and 77.6 OTUs, respectively, for this pooled sample (data not shown). For each of the individual samples, pyrosequencing of the cpn60 UT appeared to result in nearly complete sampling of the taxonomic richness of the samples (see Fig. S1 in the supplemental material).
Paired clone libraries and pyrosequencing of individuals.
To further evaluate the efficacy of metagenomic profiling by pyrosequencing of the cpn60 UT, we prepared larger clone libraries and matching larger GS FLX datasets for four individuals with normal or BV vaginal microbiota (Table 1). The numbers of reads generated for each individual, along with the proportion of reads that were retained after the watered-BLAST analysis (those showing >70% identity to a sequence in the reference database), are shown in Table S3 in the supplemental material. The taxonomic distributions of the sequences identified by both methods were consistent with the clinical diagnosis of the individuals. For example, individuals 001 and 006 both had normal Nugent scores (Table 1), and both sequencing methods showed a predominance of Lactobacillales (Fig. 4). Similarly, individuals 027 and 054 were diagnosed with BV and showed a more diverse vaginal microbiota with Actinobacteria and Bacteroidetes predominant and Lactobacillales being less abundant (Fig. 4). Consistent with observations in the 20-pool (Fig. 3), each individual sample showed essentially the same taxonomic distribution in the paired datasets; however, differences were apparent in the proportions of the different taxa identified. For all four individuals, the majority (81 to 97%) of the sequences identified in the GS FLX datasets were also identified in the corresponding Sanger datasets (Fig. 5). In addition, each individual showed a substantial number of sequences that were unique to the GS FLX data set; although the percentages were relatively small, the very large numbers of reads generated by the pyrosequencing method resulted in a large number of taxa identified uniquely in the GS FLX data set (Fig. 5). For example, individual 006 had only a single taxon (Lactobacillus iners) represented in 862 sequences identified by dideoxy sequencing of clones, while the GS FLX data set revealed an additional 39 taxa in this sample (Fig. 5). Similarly, the vaginal microbiota of individual 001 consisted mostly of L. crispatus in the Sanger data set, and the GS FLX data set revealed an additional 18 taxa in this sample. The two individuals with BV each contained more taxa than the normal individuals in the Sanger datasets and showed a gain of 31 and 33 taxa in the corresponding GS FLX datasets (Fig. 5).
FIG. 4. Proportional representation of taxonomic categories in Sanger, Sanger overlap (sequences common to both datasets), and all sequences for each of four individuals. Individuals 001 and 006 were normal by microscopy, while individuals 027 and 054 were diagnosed with BV (Table 1).
FIG. 5. Taxonomic composition of individual vaginal microbiota as determined by clone libraries and pyrosequencing. The taxonomic assignments of sequences found in the Sanger (A, C, E, and G) and GS FLX (B, D, F, and H) datasets are shown. For B, D, F, and H, additional taxa found in the GS FLX datasets are shown as stacked bar graphs, while the Sanger-overlap data set is shown as a pie chart. Colors are used to indicate bacterial families: yellow, Firmicutes; blue, Actinobacteria; red, Bacteroidetes; green, Proteobacteria. Species abbreviations: Lin, L. iners; Lcr, L. crispatus; L6, Lactobacillus sp. strain L6; Lje, L. jensenii; Pbu, P. buccalis; Gva, G. vaginalis; N156, Nairobi isolate 156 (Actinobacteria spp.); Afa, Acidovorax facilis; Pme, P. melaninogenica; Ava, A. vaginae; Mel, Megasphaera elsdenii; Pin, P. intermedia; N137, Nairobi isolate 137 (Actinobacteria spp.); N160, Nairobi isolate 160 (Actinobacteria spp.); Lsa, Lactobacillus salivarius; Fma, Finegoldia magna; Tca, Thermosinus carboxydivorans.
FIG. 6. Relative abundances of genera and species found in the technical replicates of individual 166 (A) and in the cpn60 and 16S rRNA GS FLX datasets for the four individuals pooled (B). For panel B, the shaded area represents the maximum observed variability expected from technical replicates of the same sample (A). Abbreviations: N137, Actinobacteria sp. strain N137; N156, Actinobacteria sp. strain N156; N160, Actinobacteria sp. strain N160; Gva, G. vaginalis; Ava, A. vaginae; Bov, Bacteroides ovatus; Mhy, Megamonas hypermegale; Pgi, Porphyromonas gingivalis; Pbi, Prevotella bivia; Pbu, P. buccalis; Pco, P. corporis; Pdi, P. disiens; Pin, P. intermedia; Pme, P. melaninogenica; Por, P. oralis; Pru, P. ruminicola; Prevotella sp., all Prevotella species; Fma, Finegoldia magna; Mel, Megasphaera elsdenii; Tca, Thermosinus carboxydivorans; Afa, Acidovorax facilis; Lcr, Lactobacillus crispatus; Lga, L. gasseri; Lin, L. iners; Lre, L. reuteri; Lsa, L. salivarius; Lactobacillus sp., all Lactobacillus species; Ssa, Streptococcus salivarius.
Comparison of 16S rRNA and cpn60 pyrosequencing data.
To compare directly the data obtained by cpn60 pyrosequencing to that obtained for the 16S rRNA target, we aligned by watered-BLAST the 16S rRNA and cpn60 pyrosequencing reads from the same four individuals to the paired databases representing 505 nonredundant type strain reference sequences found in both the RDP and cpnDB. This approach resulted in a taxonomic profile of the matched samples using exactly analogous reference databases. This analysis revealed that 15 taxa were identified in these samples by both targets (Fig. 6B) and that the relative abundances of most taxa were similar in the two datasets. However, the 16S rRNA data set contained a relatively higher abundance of Atopobium vaginae, Prevotella spp., and Lactobacillus gasseri and a lower abundance of Gardnerella spp. relative to the cpn60 data set. Only six species were found uniquely in each of the cpn60 or 16S rRNA pyrosequencing datasets at an abundance of >0.1% and combined were less than 2% of the total data (Table 2).
TABLE 2. Taxa that were uniquely found in 16S rRNA or cpn60 pyrosequencing data and represented at >0.1% of the matches
FIG. 7. Calculation of the number of OTUs for each of the 16S rRNA and cpn60 GS FLX subsets identified as Prevotella spp. The numbers of OTUs calculated by the farthest-neighbor algorithm of DOTUR are reported at various sampling depths for percent identity cutoffs of 3% and 5% for each target.
As a pilot experiment, we compared the taxonomic profile of pooled vaginal microbiota samples (20-pool) generated from a small clone library to that obtained by a small GS FLX data set generated from a single region of a 16-region run. In general, the taxa that were identified were consistent with what would be expected from a human vaginal microbial community (11, 17). In addition, we found that the microbial profiles generated by the two sequencing methods agreed very well, with the smaller Sanger data set virtually entirely contained within the larger GS FLX data set. A substantial amount of taxonomic depth was gained with the pyrosequencing method, essentially doubling the number of taxa identified in the same samples. These results are consistent with those obtained by Edwards et al. (5), who found similar taxonomic distributions of 16S rRNA sequences in clone libraries and GS FLX datasets in samples taken from deep mines. We also found that the proportions of sequences represented in the two datasets were somewhat different, with certain taxa, especially Clostridiales, present in a higher proportion of the reads in the GS FLX data set. The fact that the proportions of the sequences represented were different between the Sanger data and the GS FLX data is not entirely surprising, given the different biases that apply to each of these methods. Although both methods are equally subject to representational biases that can arise in the PCR step, since the same primers, templates, and amplification conditions were used for both methods, the library method has the additional bias of cloning the PCR products that are generated. The cloning step could introduce biases into the Sanger data set; for example, colonies with different inserts may not grow equally well on the selection plates. We have observed in previous work that the frequency with which clones are represented in cpn60 UT libraries does not always reflect the abundance of the organism as measured by methods such as quantitative PCR (4). Therefore, we expect that the pyrosequencing data reflect the composition of the PCR product pool more accurately than the clone libraries do.
Since the number of reads generated in this pilot experiment was low for a 1/16 region on the GS FLX (which typically generates in excess of 12,000 sequences in this format), we generated expanded datasets containing paired Sanger and GS FLX data for four individuals. The results of this larger analysis also showed that the pyrosequencing method consistently revealed a far richer taxonomic composition of the vaginal microbiota in each individual than was shown with the clone library approach. This trend was particularly notable in the samples that were scored as normal by microscopy (individuals 001 and 006), which increased from 1 to 3 taxa in the clone libraries to 21 to 40 taxa in the corresponding GS FLX datasets. Samples from individuals with BV (027 and 054), which were more diverse in composition in their Sanger datasets, showed the same trends (4 to 12 taxa in the Sanger datasets versus 37 to 43 taxa in the GS FLX datasets). In four individual samples, we found that the Sanger data were nearly completely contained within the GS FLX data and that additional taxonomic richness was revealed with the GS FLX sequencing method.
We investigated the reproducibility of the pyrosequencing approach using cpn60 amplicons generated independently from the same sample and analyzed in two separate pyrosequencing reactions. We found that the taxonomic profile generated with this method was highly reproducible, including the proportions of reads represented at the species level. Since the maximal variation between runs for a given species was 2-fold, we suggest that this is the normal range of variation within technical replicates using this method. Although 18 species were found specifically in one of the two repeats, none of these represented more than 0.2% of the total data for a library. We conclude that the taxonomic profile generated using the sequence-tagged GS FLX approach is sufficiently robust that a single sample can be used for community analysis.
We also compared pyrosequencing data obtained from the same samples using the cpn60 UT and the more widely used 16S rRNA. In order to provide a valid, easily interpreted comparison of the taxonomic assignments given by the two targets, we prepared reference databases containing data for 505 isolates for which paired cpn60 UT and (near) full-length, good quality 16S rRNA sequence data are available. Using these databases for taxonomic assignments by watered-BLAST, we found that the profiles generated by the two targets on the same samples were virtually identical. A total of 16 species were found in both datasets, and while a few were represented with different proportional abundances within their respective datasets, the majority were represented with nearly equal abundances. Species with a proportional difference of more than two- to threefold likely represented real differences in their representation in the two datasets, since this is the maximal variation that was seen in the technical replicates. The different representation of some of the targets might be explained by differences in the efficiency with which various species are amplified by the universal primers. We found 12 species that were specifically represented in one or the other of the datasets generated by the 16S rRNA or cpn60 primers. However, none of these were more than ca. 2% of the sequences of the respective datasets, and most (10 of 12) were less than 1%. We can therefore conclude that the taxonomic profiles within the two datasets were essentially in agreement with one another.
It has been noted that protein-encoding genes may provide an increased level of resolution compared to the structural 16S rRNA-encoding gene (13, 28). However, we did not specifically address this question with the approach used above. Therefore, to compare the taxonomic richness of the data generated from the two targets, we used the farthest-neighbor algorithm of DOTUR to calculate the number of OTUs at various sampling depths for a genus whose relative abundance was similar across target libraries. The fact that cpn60 sequences consistently yielded a higher number of OTUs at each cutoff suggests that the sequences identified as Prevotella are more different from one another within the cpn60 data set than are the sequences within the 16S rRNA data set. We could not expand this observation to other genera since the sizes of the datasets made the generation of the distance matrices computationally intractable.
In summary, we found that pyrosequencing of cpn60 UT amplicons compared very favorably to the clone library approach as a method of characterizing a complex microbial system. Moreover, pyrosequencing of cpn60 amplicons yielded a taxonomic profile of a microbial community that was very similar to that generated by the 16S rRNA molecular target but with a higher level of taxonomic resolution. The very high number of reads that are generated by pyrosequencing resulted in a total data set that included essentially all of the sequences that were represented using the library method, along with additional sequences that greatly increased the number of distinct taxa that were identified. Since the pyrosequencing method does not require the cloning of amplicons, it is much less labor-intensive than sequencing of clone libraries, and it avoids the representational biases that can result from the cloning step. Pyrosequencing of cpn60 UT PCR products offers the ability to probe much deeper into the compositions of microbial ecosystems than is feasible using the library approach, making the detection of lower-abundance organisms possible. We conclude that generating microbial community profiles by pyrosequencing of cpn60 UT amplicons results in a reliable, reproducible taxonomic profile of a microbial community that can be used to identify low-abundance organisms that are typically missed by the clone library approach.
Published ahead of print on 6 March 2009.
Supplemental material for this article may be found at http://aem.asm.org/.