Applied and Environmental Microbiology, April 2004, p. 2263-2270, Vol. 70, No. 4
0099-2240/04/$08.00+0 DOI: 10.1128/AEM.70.4.2263-2270.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Microbial Ecology Program, Division of Biological Sciences, The University of Montana, Missoula, Montana 59812,1 Enteromix Research, Danisco Innovation, 02460 Kantvik, Finland2
Received 29 July 2003/ Accepted 31 December 2003
Compilation-based strategies often involve a random ("shotgun") approach, wherein related functional or ribosomal gene sequences from individual community members are PCR amplified and cloned from total community DNA for phylogenetic analysis or comparison to existing databases. Such techniques have proven very powerful and have been widely applied, generating much information on microbial diversity in a variety of systems, especially where ecologically relevant, but as-yet-uncultured, microbial community members are concerned (e.g., see references 5, 7, 8, 13, 15, 25, 28, 34, 36, 37, and 43-45). However, it is becoming clear that compilation-based approaches, which typically analyze 100 to 300 randomly obtained individual sequences (9), are limited in their ability to accurately detect total diversity where communities are complex. Thus, in microbial communities comprised of hundreds to thousands of individual taxa (e.g., soils or the gastrointestinal [GI] tract), individual taxa present in lower abundance (i.e., minority populations) will go undetected.
Some recent studies of microbial diversity have taken a theoretical approach by estimating total community diversity based on mathematical extrapolation from a partial analysis of the total community (6, 9, 10, 20, 25, 26, 41). These approaches, however, provide no specific information regarding the identity of minority populations, since their presence is only inferred and no clones are actually obtained and analyzed.
By contrast, total community analyses typically attempt to capture a sense of total community structure or diversity through a single, more direct analysis of total community DNA. A number of different approaches have been developed, including monitoring DNA reannealing kinetics (40, 42), restriction analysis of PCR amplicons from community DNA (10, 14, 24, 39), denaturing gradient gel electrophoresis (DGGE) of community amplicons (12, 27, 40), and fractionation of total community DNA based on G+C content (17). While these approaches typically probe the entire community, including minority populations, by direct analysis of total community DNA, they generally do not provide high-resolution identification of the populations present and do not focus on minority populations.
The limitations described above suggest that novel approaches are required to more comprehensively assess microbial diversity and enable detection and characterization of taxa that are present in low abundance yet perform important functions in the community. The present study combines two mechanistically different community analysis methods (GC fractionation and DGGE; GC-DGGE) with phylogenetic analysis of DNA sequences to obtain information on minority populations in the GI tract that were not detected by a typical random cloning survey of the same community.
DGGE-based approaches have been widely embraced to provide rapid, comparative analyses of apparent diversity of microbial communities in a variety of environments. Further, individual bands of interest can be excised from the gel for cloning or direct sequence analysis (32). However, because this approach relies on PCR amplification with its potential biases (21, 31, 38) and on visualization of resultant PCR products on gels, it is not quantitative and also likely underestimates true diversity in complex communities where taxa present only in low abundance go undetected.
GC fractionation of total community DNA (17) is independent of PCR amplification and thus provides a sense of relative abundance of bacterial populations, though only at low resolution. The output from this approach is a fractionated profile of the entire community that indicates relative abundance of DNA as a function of G+C content and inferential information regarding the taxa comprising the community. This technique has been successfully employed to study and compare microbial community structures in a variety of environments, including soils and sediments (16, 17), bioreactors (18), and GI tracts of insects and animals (1-3, 33). In addition, this technique physically fractionates total community DNA into aliquots that represent different G+C contents. These highly purified fractions are of high molecular weight and thus are suitable for additional molecular manipulations, including PCR amplification, DGGE analysis, and cloning.
Since each approach to microbial community analysis has its own inherent strengths and limitations, we reasoned that combining mechanistically different approaches should afford better resolution, provide more information, and thus increase the ability to accurately detect and assess total community diversity, including minority populations. We and others have previously shown how GC fractionation combined with 16S rRNA gene sequencing provides a useful method for the directed detection of bacterial populations of interest in the GI tract of humans and animals (2, 4) and in a volcanic soil (30). Herein we report a study that combines GC fractionation, DGGE analysis, and directed cloning and sequencing to assess microbial community diversity and specifically detect minority populations in the community.
This combined approach (GC-DGGE) overcomes both the primary limitation of GC fractionation, low resolution that does not indicate the number or identity of different taxa in a particular G+C fraction, and also the primary limitation of DGGE, the inability to detect populations present in low abundance, to better assess total community diversity. Initial fractionation of total community DNA based on G+C content effectively reduces the complexity of the community DNA mixture being analyzed such that the total diversity within each fraction can be more effectively assessed. The GC-DGGE approach also enables detection of taxa that are present in low abundance, since their DNA is localized into one or a few fractions and thus effectively purified away from the bulk of total community DNA. Additionally, by cloning and sequencing DGGE bands from individual fractions, we can gain insight into the identity of specific taxa of interest.
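The enrichment argument above can be illustrated with a minimal sketch. The community below is entirely hypothetical (taxon names, G+C values, and abundances are invented for illustration), and fractions are modeled as clean 5% G+C bins, which is a simplification of a real density gradient. The sketch only shows why a population that makes up a tiny share of total community DNA can dominate the fraction in which it is localized.

```python
from collections import defaultdict

# Hypothetical community, for illustration only:
# (taxon name, genome G+C %, share of total community DNA).
COMMUNITY = [
    ("taxon_A", 52.0, 0.600),
    ("taxon_B", 50.0, 0.300),
    ("taxon_C", 48.0, 0.095),
    ("taxon_D", 30.0, 0.005),  # minority population with low G+C content
]


def within_fraction_shares(taxa, bin_width=5.0):
    """Group taxa into G+C bins and return each taxon's share within its bin.

    Assumption: every genome falls cleanly into one bin. Real gradients
    spread each genome over neighboring fractions.
    """
    bins = defaultdict(list)
    for name, gc, share in taxa:
        bins[int(gc // bin_width)].append((name, share))

    result = {}
    for index, members in sorted(bins.items()):
        total = sum(share for _, share in members)
        bounds = (index * bin_width, (index + 1) * bin_width)
        result[bounds] = {name: share / total for name, share in members}
    return result


if __name__ == "__main__":
    for bounds, shares in within_fraction_shares(COMMUNITY).items():
        print(bounds, shares)
```

In this toy case, taxon_D represents 0.5% of total DNA but all of the DNA in the 30 to 35% G+C bin, so an equal mass of amplicons loaded from that fraction would no longer be swamped by the dominant taxa.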
GC fractionation of bacterial community DNA.
Purified
enterome super-pool DNA was subjected to equilibrium density
centrifugation for fractionation based on G+C content
essentially as described previously
(17), except that the
cesium chloride solution was buffered with 5 mM Tris (pH 8.0). This
approach fractionates the genomic DNA of the component taxa of the
community as a function of its characteristic G+C content. This
separation is based on differential density imposed by the AT-dependent
DNA-binding dye bis-benzimidazole
(17). Following
ultracentrifugation, a Brandel model SYR-94 syringe pump (Brandel,
Inc., Gaithersburg, Md.) was used to pass the formed gradients through
an ISCO UA-5 UV absorbance detector (ISCO, Inc., Lincoln, Nebr.) set to
280 nm (to minimize background absorbance due to the cesium chloride
gradient) and then to a fraction collector. The G+C content
represented by each gradient fraction was determined by linear
regression analysis (r² > 0.99) of data
obtained from control gradients containing standard DNA samples of
known G+C composition as described elsewhere
(17).
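The calibration step described above can be sketched as an ordinary least-squares fit. The standards below are hypothetical placeholders (the fraction positions and G+C values are not taken from this study), and the function names are illustrative only. The sketch fits G+C content against fraction number for the standards, reports r², and then converts a fraction number into an estimated G+C content.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, with r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical control-gradient data: fraction number at which each standard
# DNA peaked, and the known G+C content (%) of that standard.
STANDARD_FRACTIONS = [3.0, 8.0, 13.0]
STANDARD_GC = [30.0, 50.0, 70.0]

SLOPE, INTERCEPT, R_SQUARED = fit_line(STANDARD_FRACTIONS, STANDARD_GC)


def fraction_to_gc(fraction_number):
    """Estimate the G+C content (%) represented by a gradient fraction."""
    return SLOPE * fraction_number + INTERCEPT


if __name__ == "__main__":
    print(f"r^2 = {R_SQUARED:.3f}")
    print(f"fraction 5 -> {fraction_to_gc(5.0):.1f}% G+C")
```

With real standards the fit would not be exact, and r² would be used, as in the text, to check that the gradient behaved linearly before assigning G+C values to sample fractions.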
PCR amplification of 16S rRNA gene sequences.
Fractions from the region of the
gradient containing DNA were subsequently desalted using PD-10 columns
(Amersham Pharmacia Biotech, Piscataway, N.J.) and the
manufacturer's recommended protocol. bis-Benzimidazole is
presumably also removed in this process or at least does not interfere
with subsequent PCRs. Partial 16S rRNA gene sequences representing
the organisms in the super-pool sample and from individual gradient
fractions were amplified for DGGE analysis and cloning by PCR as
described previously (1,
19). For direct random
cloning from super-pool bacterial community DNA, primers 536f
(5'-CAGCMGCCGCGGTAATWC-3') and
907r (5'-CCGTCAATTCMTTTRAGTTT-3')
were used. To amplify sequences from individual G+C
fractions for DGGE analysis, a 40-base GC clamp was added to the 536f
primer to generate primer 536fC
(11) and used in
conjunction with the 907r primer. PCR conditions were as previously
described (1).
These primers were derived from generally conserved sequences that have previously been shown to be present in the greatest percentage of eubacteria (and also Archaea and Eucarya) (35) and were predicted to capture >75% of all eubacterial sequences based on those present in the database at the time of that study (>10,000). However, greater recovery of total diversity was expected in the present study because primers 536f, 536fC, and 907r include degenerate bases at the positions having mismatches with known rRNA sequences in the Schmalenberger study. These degeneracies overcome inefficient binding of primers with mismatches by including appropriate primer variations and were thus expected to help minimize PCR bias in the present study.
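To make the role of the degenerate positions concrete, the following sketch expands each primer into the set of explicit sequences it represents, using the standard IUPAC nucleotide ambiguity codes. The function name is illustrative; the primer sequences are those given above.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

PRIMER_536F = "CAGCMGCCGCGGTAATWC"
PRIMER_907R = "CCGTCAATTCMTTTRAGTTT"


def expand_degenerate(primer):
    """Return every explicit sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, primer in (("536f", PRIMER_536F), ("907r", PRIMER_907R)):
        variants = expand_degenerate(primer)
        print(name, len(variants), "variants")
        for variant in variants:
            print("  ", variant)
```

Each primer contains two twofold-degenerate positions (M and W in 536f, M and R in 907r), so each is in effect a mixture of four oligonucleotides.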
DGGE analysis.
DGGE was
performed to compare the banding patterns obtained from individual
fractions obtained by GC profiling to the pattern obtained from total
cecal enterome DNA. Procedures and conditions for DGGE were essentially
as described previously
(11), except that 750 ng
of PCR amplicons from each G+C fraction and from the
unfractionated super-pool sample was loaded into each lane. Following
staining with SYBR Green I (BioWhittaker Molecular Applications,
Rockland, Maine) using the manufacturer's recommendations, gel
banding patterns were visualized and captured using a Bio-Rad Gel Doc
1000 and Molecular Analyst software (Bio-Rad Laboratories, Hercules,
Calif.). GelCompar version 4.0 software (Applied Maths, Kortrijk,
Belgium) was then used to normalize the positions of sample bands
(i.e., remove "smiling") based on band positions in
multiple known marker lanes to facilitate comparison of sample band
patterns.
Cloning, sequencing, and phylogenetic analysis.
Amplified bacterial 16S rRNA PCR
products (approximately 400 bp) from super-pool DNA or from bands excised
from the DGGE gel were subsequently cloned into the EcoRV site of the pT7Blue-3 plasmid vector using the Perfectly Blunt cloning
kit (Novagen, Madison, Wis.). Plasmid clones were identified based on
blue-white screening and then grown overnight in Luria-Bertani medium
amended with ampicillin (300 µg/ml) and tetracycline (15
µg/ml). Plasmid DNA was subsequently purified using Qiagen
mini-prep kits (Qiagen, Valencia, Calif.) according to the
manufacturer's specifications. The insert size of individual
clones was confirmed by restriction fragment analysis using
EcoRI.
Selected DGGE bands were excised from the gel, cloned, and sequenced as described in detail previously (11). To ensure that individual clones corresponded to the bands of interest, purified plasmid DNA from putative clones was used as template for PCR using the 536fC and 907r primers, and the products were analyzed via DGGE alongside PCR products from the super-pool DNA.
All confirmed
clones from the super-pool survey and from DGGE bands were subjected to
double-stranded DNA sequence analysis (MWG Biotech, High Point, N.C.)
and sequence comparison to determine the best match to known sequences
using the Ribosomal Database Project II (RDP II) website
(http://www.cme.msu.edu/RDP/html/index.html)
(23). Screening for
potential chimeric products was performed using the Chimera Check
software at the RDP II site, and potential chimeras were not considered
further. The species number designations and phylotype number
designations given below in the tables are arbitrary and are based on a
best-match analysis that included the sequences from this work, such
that any two clones given the same species number or phylotype number
are related to each other at an Sab of ≥0.95.
Nucleotide sequence accession number.
Sequences
obtained by random cloning from unfractionated super-pool DNA were
deposited in GenBank under accession numbers AY574393 to AY574431,
while sequences obtained from excised bands in the DGGE gel
were deposited under accession numbers AY574432 to AY574568.
FIG. 1. GC
profile of pooled cecal community DNA (super-pool DNA) from
500 individual birds from the United States, United Kingdom,
Australia, France, and Finland. Numbers indicate locations of
individual fractions subjected to DGGE
analysis.
FIG. 2. DGGE
analysis of partial 16S rRNA gene sequences from super-pool DNA and
individual gradient fractions (image normalized to remove smiling,
using GelCompar software). SP indicates DGGE patterns from
unfractionated super-pool DNA, and lane numbers indicate DGGE patterns
from individual GC gradient fractions (Fig.
1). The lettered circles
indicate individual bands excised from the gel for cloning and DNA
sequence
analysis.
Sequencing and phylogenetic analysis of DGGE bands from individual fractions.
Individual DGGE bands of interest were
excised from the gel, cloned, sequenced, and subjected to phylogenetic
analysis to determine the closest known relatives to these cecal
inhabitants. The criteria for selection of bands (Fig.
2) were that (i) they
represent examples of bands that have migrated similarly in widely
separated individual fractions such that they appear as a single band
in the super-pool lanes of Fig.
2 (e.g., lane 3, band c,
and lane 9, band a; lane 3, band d, and lane 5, band a); (ii) they
represent bands in high abundance in the community (e.g., lane 5, band
a; lane 9, band a; lane 10, band a); or (iii) they were undetected or
in very low relative abundance in the DGGE pattern from the total
super-pool sample (e.g., lane 1, band a; lane 3, bands a and b; lane 6,
band a; lane 8, band b). For several of the excised bands, multiple
clones were generated to assess the degree to which single DGGE bands
harbored heterogeneous mixtures of rRNA gene sequences from more than
one population.
In all, 17 different bands were excised from the
DGGE gel, from which 39 independent clones were generated and subjected
to DNA sequence analysis. Based on prior experience comparing partial
16S sequences from type strains to newly cloned sequences
(2), we made the following
assignments: when the Sab score of a cloned sequence
was ≥0.95 in relation to a type strain of a known species, the
cloned sequence was assigned to that species. When the
Sab score of a cloned sequence was greater than
0.70 but less than 0.95 in relation to a known sequence, the clone
sequence was assigned to that genus. Where the Sab
score of a cloned sequence was less than 0.70 in relation to any known
sequence, that clone was labeled an unidentified phylotype. Based on
the criteria described above, five of the clones were assigned to the
species level: Bifidobacterium saeculare (two clones),
Lactobacillus crispatus (two clones), and Lactobacillus
reuteri (one clone) (Table
1). An additional 17 clones were assigned to genera including
Atopobium, Bacteroides, Butyrivibrio,
Clostridium, Eubacterium, Fusobacterium,
Pectinatus, and Ruminococcus. The remaining 17 of the
clones represented 14 different unidentified phylotypes (Table
1). It is interesting that
the majority (13 of 14) of unknown phylotypes were recovered from
fractions 1 to 3, which represent regions in the total community
profile where the relative abundance of DNA, and hence organisms in the
community, is low (Fig.
1). This likely represents
a manifestation of the aforementioned phenomenon where, based on the
generally used random cloning approaches for phylogenetic surveys, the
majority of sequences obtained and deposited in the databases to date
come from the most abundant organisms in the community.
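The assignment rule described above can be written as a small function. This is a sketch under stated assumptions: the species threshold is taken to be inclusive (an Sab of at least 0.95), and a score of exactly 0.70, which the text does not address, is treated here as unidentified. The function and constant names are illustrative.

```python
SPECIES_THRESHOLD = 0.95  # assumed inclusive
GENUS_THRESHOLD = 0.70    # text specifies "greater than 0.70" for genus


def assign_taxonomic_level(sab):
    """Map a best-match Sab score to the assignment level used in the text."""
    if not 0.0 <= sab <= 1.0:
        raise ValueError("Sab scores are expected to lie between 0 and 1")
    if sab >= SPECIES_THRESHOLD:
        return "species"
    if sab > GENUS_THRESHOLD:
        return "genus"
    # Includes the boundary case sab == 0.70, which is an assumption.
    return "unidentified phylotype"


if __name__ == "__main__":
    for score in (1.00, 0.95, 0.82, 0.70, 0.55):
        print(f"Sab = {score:.2f}: {assign_taxonomic_level(score)}")
```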
TABLE 1. Best-match
identification of phylotypes of clones from excised DGGE bands
Two sets of bands in Fig. 2 (lane 3, band c and lane 9, band a; and lane 3, band d and lane 5, band a) are from regions that have intense bands in the unfractionated super-pool sample and also have bands in essentially all of the G+C fractions. These represent striking examples of where a single DGGE band from total community DNA can harbor multiple phylotypes, as evidenced by the various identities of the clones obtained from these bands (Table 1). However, GC fractionation generally separated these phylotypes to different fractions.
Random cloning of unfractionated super-pool DNA.
A random cloning-based 16S rRNA
phylogenetic survey was performed on the same super-pool DNA sample
that was analyzed by GC-DGGE to compare the relative efficiencies of
the two approaches for detecting less-abundant members of the bacterial
community. In all, 136 randomly selected, confirmed clones were
analyzed. This general approach and the number of clones studied are
similar in size and scope to numerous other phylogenetic surveys of
microbial communities from a variety of environments, as observed by
Dunbar et al. (9). As in
many, perhaps most, reports of phylogenetic analysis of 16S rRNA gene
sequences obtained from environmental samples, the majority of
sequences obtained did not exactly match any known organisms in the RDP
II database. Indeed, only 1 sequence of the 136 analyzed was an exact
match (Sab = 1.00) to a known organism,
Lactobacillus salivarius subsp. salicinius (strain
H0268 ATCC 11742T) (Table
2).
TABLE 2. Best-match
identification of phylotypes from shotgun cloning of unfractionated
super-pool DNA
Thirty-seven of the sequences produced Sab scores of ≥0.95 and represented
Bacteroides fragilis (3 clones), Bacteroides merdae
(1 clone), B. saeculare (10 clones), Enterococcus
mundti (1 clone), Escherichia coli (1 clone),
Klebsiella pneumoniae (1 clone), L. crispatus (10
clones), Lactobacillus gasseri (2 clones), Lactobacillus
pontis (1 clone), L. salivarius (1 clone), and
Streptococcus bovis (6 clones) (Table
2). Seventy-one of the
sequences produced Sab scores between 0.70 and
0.95, often representing several different phylotypes within a given
genus. For example, there were 4 clones representing 4 different
phylotypes within the genus Bacteroides, 15 clones
representing 11 different phylotypes within the genus
Clostridium, and 31 clones representing 25 different
phylotypes within the genus Ruminococcus (Table
2). The remaining 28
sequences produced Sab scores below 0.70 and were
distributed into 21 different unidentified phylotypes (Table
2). These findings are
generally consistent with prior reports on chicken cecal microflora
composition using phylogenetic approaches by this and other groups
(1,
3,
14,
29). Collectively, these
data indicate that there are many as-yet-unknown microbes inhabiting
the chicken GI tract, at least in the context of phylogenetic
characterization. The most abundant species, based on frequency of cloning, were B. saeculare (7% of total) and L. crispatus (7% of total). At the genus level, the most abundant microbes were Ruminococcus (23% of total), Clostridium (11% of total), Lactobacillus (10% of total), and Bifidobacterium (8% of total). It is important to note that there were 87 unique phylotypes represented in this set of 136 clones, and the number of representatives of individual phylotypes in this study ranged between 1 (for 71 phylotypes) and 10 (for 2 phylotypes). Based on simple probability estimates and ignoring any potential cloning bias, a given phylotype would need to represent at least 2.2% of the total to be detected in this study with 95% confidence and at least 3.4% of the total to be detected with 99% confidence. Since more than half of the phylotypes detected in this survey were represented only once (0.7% of total), it can be assumed that there were many more phylotypes present in low abundance that went undetected. If we theoretically consider a system where there are 200 different phylotypes with relative abundances normally distributed across an order of magnitude from 0.1 to 1% of the total, 2,000 sequences (=10 times the number of phylotypes) would be required to detect all 200 phylotypes with 95% confidence. This is likely a conservative estimate, since others have noted that complex microbial communities often exhibit a log-normal distribution and thus contain a higher proportion of rare populations (6, 9). Alternatively, an approach that can dissect or fractionate the complexity of the entire community, allow directed recovery of unique phylotypes, and facilitate detection of populations present in low abundance could be employed to enhance detection of the diversity present in the community. Preliminary fractionation of total community DNA based on G+C content, as described here, represents one such approach.
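The detection thresholds quoted above follow from a simple sampling model, sketched below. The model assumes independent, unbiased draws of clones, so a phylotype at relative abundance p is seen at least once among n clones with probability 1 - (1 - p)^n. Which exact formula was used for the quoted figures is not stated; the exact inversion gives about 2.2% and 3.3% for n = 136, while the common approximation p ≈ -ln(1 - confidence)/n gives about 2.2% and 3.4%. Function names are illustrative.

```python
import math


def detection_probability(abundance, n_clones):
    """Probability of sampling a phylotype at least once among n_clones."""
    return 1.0 - (1.0 - abundance) ** n_clones


def min_detectable_abundance(n_clones, confidence):
    """Exact abundance at which detection probability equals confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_clones)


def min_detectable_abundance_approx(n_clones, confidence):
    """Small-abundance approximation: p is about -ln(1 - confidence) / n."""
    return -math.log(1.0 - confidence) / n_clones


if __name__ == "__main__":
    n = 136  # clones analyzed in the random cloning survey
    for confidence in (0.95, 0.99):
        exact = 100 * min_detectable_abundance(n, confidence)
        approx = 100 * min_detectable_abundance_approx(n, confidence)
        print(f"{confidence:.0%}: exact {exact:.1f}%, approx {approx:.1f}%")
    singleton = detection_probability(1 / n, n)
    print(f"P(detect a phylotype at 1/{n} abundance) = {singleton:.2f}")
```

Under the same model, a phylotype present at the abundance of a single clone (1 in 136, about 0.7%) would be detected only about 63% of the time, which is consistent with the argument that many low-abundance phylotypes went unsampled.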
In the context of the present study, it is most important to compare the suites of clones obtained by random cloning and the combined GC-DGGE approach. While it is difficult to apply statistical approaches to directly compare these disparate data sets, one from random sampling and the other from directed sampling, chi-square analysis indicated that the distribution of clones in the data sets obtained by the two approaches was very different (P < 0.01). The results were also quite striking in a qualitative sense. Where DGGE bands were selected for cloning because they were abundantly represented in the unfractionated super-pool sample (e.g., lane 5, band a; lane 9, band a; lane 10, band a), the corresponding phylotypes were well-represented in the random cloning survey (Table 3). Where DGGE bands from individual fractions were of intermediate abundance in the lanes from the unfractionated super-pool sample (e.g., lane 2, bands a and b; lane 3, band e; lane 8, bands b, c and d), the corresponding phylotypes from DNA fractions representing high relative abundance in the total community (e.g., fraction 8) were generally detected by the random cloning approach, whereas those from fractions of low relative abundance (lanes 2 and 3) were not (Table 3). In cases where DGGE bands were selected because they were undetected or in very low abundance in the super-pool lanes (e.g., lane 1, band a; lane 3, bands a and b; lane 6, band a), the corresponding phylotypes were not recovered by the random cloning approach. Overall, 22 of the 28 phylotypes recovered from excised DGGE bands (79%) were not detected by the random cloning approach. If one excludes clones from bands selected because they were abundant in the unfractionated super-pool sample (lane 5, band a; lane 9, band a; lane 10, band a), 20 of the 23 different phylotypes recovered from the remaining targeted DGGE bands (87%) were not represented in the pool of phylotypes recovered by random cloning.
TABLE 3. Overlapping
phylotype clones between random cloning and GC-DGGE approaches
The power of combined approaches.
In this paper we
report a novel strategy, GC-DGGE, which combines mechanistically
different microbial community analysis approaches to facilitate
enhanced assessment of microbial community diversity and detection of
minority populations of microbes. The underlying basis for this
strategy is to employ GC fractionation as an initial step to divide
total community DNA into fractions based on G+C content,
thereby effectively reducing the complexity of the community in each
fraction. This reduced complexity facilitates detection of diversity
based on DGGE analysis and directed cloning and sequencing of
individual bands from DGGE lanes corresponding to individual fractions.
This combined approach (GC-DGGE) overcomes both the primary limitation
of GC fractionation, low resolution that does not indicate the number
or identity of different taxa in a particular G+C fraction, and
also the primary limitation of DGGE, the inability to detect
populations present in low abundance, to allow enhanced detection of
total microbial community diversity.
This work was funded in part by the National Technology Agency of Finland, Tekes.
Present address: Alimetrics Ltd., 07900 Helsinki, Finland.