Previous Article | Next Article ![]()
Applied and Environmental Microbiology, April 2009, p. 2495-2505, Vol. 75, No. 8
0099-2240/09/$08.00+0 doi:10.1128/AEM.02409-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.

Microbial Habitat Group, Max Planck Institute for Marine Microbiology, Celsiusstrasse 1, 28359 Bremen, Germany
Received 21 October 2008/ Accepted 2 February 2009
|
|
|---|
|
|
|---|
When using community fingerprinting methods, the sizes and the fluorescence intensities of labeled DNA fragments can be analyzed by using an automated sequencer (e.g., in T-RFLP, SSCP, and ARISA). DNA fragments of different sizes are thereafter commonly classified into OTUs (see, for example, references 1, 29, and 44). Often the relative peak areas, i.e., relative to the sample total peak area (54), are used as an indicator of OTU abundance, although it is known that PCR-based fingerprinting methods cannot yield a true representation of the original DNA ratios because of preferential and nonspecific amplifications (11, 46, 51) and of heterogeneity in rrn copy number (7). However, it is generally accepted that the intersample variation in relative peak areas should not be affected by the biased PCR conditions because they similarly apply to all OTUs and samples (2, 42). However, using peak areas as an indicator for absolute abundance has been confirmed to be a fruitless exercise, even when using simplified microbial communities (4, 10, 55). However, when experiments are performed under well-defined laboratory conditions or on simplified communities whose rrs copy numbers per genome are known, quantitative estimates of each target organism may be successfully obtained via community fingerprinting methods (27, 41). A corollary of the lack of quantitativeness in community fingerprinting methods is that they may not be appropriate to assess the richness of microbial communities or to be used with diversity metrics because of their underestimation of the actual richness and because of their limited detection threshold and dynamic range (reviewed in reference 1). These methods are merely strategies designed to screen samples for OTU presence, in contrast to rrs sequencing approaches that are meant to actually sample microbial diversity.
The goal of the present study was to describe a new strategy, called qfingerprinting (for "quantitative fingerprinting"), which provides quantitative estimates of each detected OTU over a scale of several orders of magnitude. In brief, the strategy may be understood as a combination of two techniques, i.e., dilution endpoint PCR (a variant of most-probable-number-PCR [MPN-PCR]) (38) and traditional community fingerprinting analysis. First, the development and subsequent validation of the strategy were performed on a model community composed of a mixture of known sequences. Second, the approach was applied to environmental microbial communities from deep-sea sediment samples for which quantitative estimates of microbial communities were already available (20). Changes in OTU abundance as a function of depth, as well as rare, dominant, and common OTUs, were then determined. Standard ARISA and quantitative ARISA (qARISA) were compared regarding both overall community patterns and the distribution of individual OTUs. Further ecological applications of the new strategy are also proposed.
|
|
|---|
Binning.
From GeneMapper output tables, conversions to sample-by-fragment tables and subsequently to sample-by-binned-OTU tables were performed by using custom R binning scripts. The algorithm rearranges the data and calculates the relative fluorescence intensity (RFI) of each peak by dividing individual peak areas by the total peak area for the respective sample. All peaks with RFI values of <0.09% were not included in further analyses since they consisted of background peaks. To include the maximum number of peaks while excluding background fluorescence, only fragments above a threshold of 50 fluorescence units and ranging between 100 and 1,000 bp were taken into consideration.
In order to take into account the technical variability in peak size calling (due to computer interpolation of peak sizes) (12), dye migration discrepancies (48), and run-to-run variations (24, 31), all peaks within a range of sizes (a window) can be combined (i.e., binned) into bin window frames, where the window is as wide as the imprecision of the OTU size calling. The window size (WS) must be estimated empirically for each fingerprinting pipeline by running appropriate controls at different times, for instance. Binning has a significant effect on the obtained sample similarities and must thus be accounted for to avoid falsely describing ecological differences that would be only due to technical variability (12). Besides, binning is important because fragment sizes produced by the fingerprinting strategy may later be compared to databases or to sequences from clone libraries (3, 18, 28). In our case, the ARISA imprecision was ±1 bp (i.e., WS = 2 bp) as determined by running known target sequences at different times on the same sequencer under the same analytical conditions (data not shown).
A shifting window size binning strategy was used since it offers the possibility to optimally aligned electrophoretic profiles and to deal with different window starting positions (12). The binning frame that offers the highest similarity among samples is identified out of all binning frames starting at a given position. The distance between two consecutive binning frames is defined as the shift (Sh) value. In the original description of the binning approach (12), only integer values were considered as "true" values to which peaks needed to be assigned to by taking the appropriate technical inaccuracy, resulting in Sh = 1 bp. This would mean that the two bin frames (WS = 2) would start, in our case, at 100 and at 101, respectively. Due to dye migration discrepancies, however, the actual, true (but unknown) size value may be different from integer values (i.e., a decimal value could be also representing a "true" size) and may actually change over the range of sizes being examined due to sequence-specific migration discrepancies (17). For this reason, Sh values must also be variable and a good, but the computing-demanding value may be 0.1 bp. For WS = 2 bp, there would therefore be 20 bin window frames to be calculated and evaluated. The current implementation of the window shifting algorithm enables a user-defined choice of the WS and Sh values, as well as the range size and RFI cutoff value, to calculate the best binning frame for a given data set. The interactive script then reports the best frame among all calculated as the one that maximizes sample similarities.
Another script allows for an automatic calculation of a series of WS values (e.g., 0.5, 1, 2, 3, 4, and 5 bp) for a given Sh value (e.g., 0.1 bp). This enables an optimal determination of the best binning strategy for a data set without a priori knowing the ideal WS value. A compromise between high resolution (low WS) and high similarity among samples (high WS) must be found based on the output of the scanning mode script. The interactive and automatic binning algorithms are implemented in the free, R programming language (The R Foundation for Statistical Computing [http://cran.r-project.org/]) and are available with their respective manuals and examples online (http://www.ecology-research.com). After proper, platform-specific installation of the R base program, the scripts run on Windows, Mac OS, Solaris, and Linux platforms.
Description of the qARISA strategy.
Each DNA sample was serially diluted with sterile PCR-grade water in triplicates, and this is done by transferring few microliters of DNA (typically 1 or 2 µl) from one well to the next by using a single pipette with a new tip each time to avoid carrying DNA over (Fig. 1). Dilutions were directly performed in PCR microtubes each containing, for instance, 9 or 18 µl (in accordance with the volume of DNA transferred) of sterile PCR-grade water if decimal dilutions are chosen. Using a multichannel pipette instead of a single pipette led to more systematic variation due, most likely, to variation in individual volumes that are more difficult to detect when 8 or 12 pipette tips are simultaneously used (unpublished data). Negative controls consisted of PCR-grade water instead of DNA template. The standard ARISA protocol was then applied to a common volume (e.g., 2 µl) from each dilution well, and all reactions were processed at the same time to avoid introducing additional experimental variability between dilution series and replicates. After capillary electrophoresis and peak sizing, binning was also performed on all replicates and dilution series together. Using a combination of the automatic and the interactive binning algorithms was then necessary to obtain the best-fitting binning approach for the data set at hand. The binned table that contained RFI values for the dilution series and for the sample replicates was then converted to a sample-by-OTU table, where OTU abundance values were calculated based on the dilution factor for each sample and on two rules (i.e., the consensus and continuity rules) to determine whether a binned OTU was reliably present. With the consensus rule, a peak is considered present if it appears at least twice among three replicates of the sample for a given dilution level. The continuity rule states that the highest dilution level recorded is the highest one for a continuous series of successful amplifications from undiluted to more diluted DNA preparations. The abundance was then expressed as log10 values for a decimal dilution series. The table conversion that takes into account the dilution levels and the continuity and consensus rules is performed automatically with the qfingerprinting algorithm (provided at http://www.ecology-research.com).
![]() View larger version (26K): [in a new window] |
FIG. 1. qfingerprinting strategy. See the text for details.
|
Model community experiments.
The qARISA strategy was applied to a model community consisting of known DNA sequences. Those sequences originate from a sequencing project from deep sea sediments to be reported elsewhere (sequences available upon request). Four plasmids containing cloned partial 16S and ITS sequences were chosen for yielding defined peak sizes of 601, 654, 504, and 681 bp when analyzed by ARISA. They were prepared to obtain the respective decreasing concentrations of 0.2, 0.02, 0.02, and 0.002 ng/µl when mixed together in a total volume of 12 µl. Decimal dilutions were done in triplicates by transferring 1 µl of DNA solution into 9 µl of sterile water until the decimal dilution 10–6 was reached. From each DNA preparation (undiluted and diluted), 2 µl was then used per 50-µl ARISA reaction. After this procedure, the initial amount of DNA for peak 601 bp was 0.4 ng at the undiluted level. The subsequent qARISA steps were as described above. The model community experiment was repeated at three independent times and yielded highly consistent results. Here, the results of only one experiment are presented.
Environmental sample analysis.
Microbiological and geochemical data obtained from the sediment samples of reference core 51-1 used in the present study have been described previously (20). The core was sampled on 1 August 2000 by deploying a video-guided multicorer at the crest of southern Hydrate Ridge (44°34.198'N, 125°08.858'W, 775-m depth) of the Cascadia convergent margin off the coast of Oregon. The core was not enriched in methane in the surface sediments, but parallel cores contained specimens of the chemosynthetic bivalve Acharax, indicating that it was sampled from a site influenced by low seepage below the 15-cm sediment depth (20). DNA was extracted from 1 g of homogenized sediment sample by using an UltraClean soil DNA isolation kit (MO BIO Laboratories, Inc., Carlsbad, CA) according to the manufacturer's instructions. The qARISA was performed as indicated above in triplicate decimal dilution series with a starting DNA concentration of 20 ng per reaction (undiluted level). A total of eight negative controls (PCR water was used instead of DNA) were included in the 2.5 96-well plates processed by the automated sequencer in order to guarantee that no contamination occurred. All samples were analyzed together from DNA extraction to the final capillary electrophoresis separation, so as to limit additional sources of experimental variation.
Statistical analyses.
The nonparametric Kolmogorov-Smirnov two-sample test was used to compare the abundance distributions derived from ARISA and qARISA. The significance of the two-sided test was determined by performing 1,000 permutations of the non-null values of the data set. The Mantel's test was used to determine the level of correlation between the Bray-Curtis similarity matrices among samples inferred from the binned table derived from ARISA and from qARISA. The significance of the correlation value was assessed by 1,000 permutations. Overall, patterns in sample dissimilarity were explored via nonmetric multidimensional scaling (NMDS). NMDS places the samples in a two-dimensional coordinate system so that ranked dissimilarities between the samples are preserved. A stress function assesses the goodness-of-fit of the ordination compared to the original sample ranking (23). Procrustes analysis was used to determine how similar the final sample ordinations deriving from ARISA versus qARISA could be (23). The method estimates the concordance of scores between two ordinations after rotating, translating, and dilating them in order to obtain the best fit. A permutation procedure is then used to test for the significance of the concordance (34).
Detrended correspondence analysis (14) was applied to determine whether linear or unimodal OTU models better fitted the current data set (37). Principal component analysis (PCA) was done on ARISA tables that were first Hellinger transformed (2, 22, 37). To determine whether depth as an environmental parameter could significantly explain the variation in bacterial community structure, a canonical redundancy analysis (RDA) was used, and its significance was assessed by 999 Monte Carlo permutation tests. After a significant RDA, nonparametric Spearman tests were used to detect significant positive or negative correlations of each OTU abundance distribution with depth. All statistical tests and graphics (e.g., heatmap) were produced by using R packages (Stats, Vegan, MASS) and CANOCO for Windows 4.5 (47).
|
|
|---|
![]() View larger version (21K): [in a new window] |
FIG. 2. Electropherograms obtained for the qARISA approach performed on a model community. The decimal dilution factor is indicated in each panel by a figure (0 for undiluted, –1 for 0.1, –2 for 0.02, etc.) and the letter K (no DNA control). Only one replicate electropherogram per dilution level is shown (see Fig. 3 for a complete overview). The initial DNA concentrations of individual fragments 504, 601, 654, and 681 bp were 0.02, 0.2, 0.02, and 0.002 ng/µl, respectively.
|
![]() View larger version (18K): [in a new window] |
FIG. 3. Calculations of abundance values for each OTU present in the model community. After binning, the RFI (in percentages of total sample intensity) for each OTU is indicated when present at the corresponding dilution and replicate levels. Using the qARISA strategy, the abundance of each OTU was then calculated by taking the consensus and continuity rules into consideration, as well as the information from all replicates and dilution levels (see text for details). The resulting abundance values for the model community are then expressed in log10 (bottom of the figure).
|
Interestingly, the RFI values did not proportionally change with the dilution factor (Fig. 3). At the undiluted level (i.e., standard ARISA conditions), the RFI values ranging from 13 to 31% did not even reflect the fact that the concentration of each target sequence was initially consisting of 10-fold differences. Indeed, RFI values are obtained by standardizing the individual peak areas per sample (see, for example, references 44 and 54). It is thus not surprising that these values do not follow the dilution trend imposed in the study because their magnitude is directly related to the behavior of the other peaks present in the given sample (see also reference 4). Consequently, the final calculations of abundance values from the dilution series should take the occurrence of the OTUs into account, instead of their RFI values.
On the electropherograms (Fig. 2) and in the resulting sample-by-OTU table (Fig. 3), several OTUs were associated with small peaks, some of which satisfied both the consensus and the continuity rules (e.g., OTUs 522, 525, and 534). Peaks that did not correspond to one of the target sequences in the mixture were always found at low abundances. This may be due to the fact that because few targets were present in the artificial mixture, the background level of false-positive peaks was higher. Thus, unspecific, but weak amplifications were more likely to be detected. When more sequences are present in the DNA mixture, as with environment samples, the small, inconsistent peaks may be diluted out by the presence of more numerous and highly dominant peaks. In a standard ARISA, however, these small peaks would have been considered as real OTUs, whereas here the difference in magnitude reveals that they may most likely be considered PCR artifacts.
By taking the initial insert concentrations and the respective dilution factors into consideration, the theoretical detection limit of the technique can be estimated. By taking the most abundant OTU (601 bp) at the initial concentration of 0.4 ng per reaction at the undiluted level, and assuming 660 g/mol per bp (thus, 6.59 x 10–19 g per fragment), a total of
6 x 108 fragment copies of the target sequence were then initially present in the reaction. The highest dilutions where amplifications still occurred were between 10–5 and 10–6 (Fig. 2 and 3), indicating that a minimum of 600 to 6,000 target sequences were needed for a successful amplification to occur under our experimental conditions. This is in accordance with the detection thresholds of 103 cells per ml of sample generally reached when molecular community fingerprinting methods are used (6).
Overall, the qfingerprinting concept and analysis pipeline worked well on the mixture of known ITS sequences. The difference in the magnitude of the initial target ratios was maintained because the derived abundances obtained via qfingerprinting are not based on the final, biased quantitative estimate of PCR success, but rather on the qualitative outcome of the PCR amplification at different dilution levels. The test on plasmids also illustrates the need to apply advanced numerical methods to identify the best binning frames so as to identify the most likely dilution levels for OTU occurrence. Indeed, misclassification of OTUs between replicates at a given dilution level could lead to a false rejection or acceptance of the consensus or continuity rules by altering the continuity of the series of positive amplifications.
Application to environmental samples.
Samples originating from deep marine sediments forming a natural depth gradient were chosen to illustrate the advantages of the new method. A total of nine successive 1-cm sediment layers were subjected to qARISA, thus producing a total of 189 reactions (nine layers x seven dilution levels x three replicates). A total of 15,361 peaks were retained after capillary electrophoresis, which ranged from 100 to 1,000 bp and whose absolute peak areas were over 50 fluorescence units. No peaks satisfying the aforementioned criteria were observed for the eight negative controls. Due to the large number of peaks, using an automatic binning technique was very important because OTU identification and quantification had to satisfy both the consensus and the continuity rules, which could not be performed manually.
After the scanning mode of the binning algorithm was applied, the correlation coefficients between samples were all 0.46 for window sizes 2, 3, 4, and 5 bp and corresponded to total OTU numbers of 438, 298, 225, and 181, respectively. For window sizes of 0.5 and 1 bp, the correlation coefficients were smaller (r = 0.32 and 0.38, respectively), but the OTU numbers were much higher, with 789 and 815 OTUs identified, respectively. These findings are not surprising since increasing the window size leads to lumping more peaks together, hence to a smaller number of OTUs finally identified. As a compromise between high correlation among samples and high OTU number, the binning results of window size 2 were chosen for subsequent analyses. A total of 437 bins (OTUs) were thus obtained out of the 15,361 initial peaks detected when all replicated samples were considered. After the qfingerprinting function was applied to convert the binned table into a log10 abundance table (i.e., by taking the dilution series into account, as well as the consensus and continuity rules), a total of 332 OTUs were finally obtained for the nine environmental samples. For each OTU, an abundance value that ranged from 0 (absent) to 7 (present at abundance levels 7 orders of magnitude higher than at the undiluted level) was obtained, and this produced on average 173 ± 20 (standard deviation) OTUs per sample.
The variability in the quantitative estimates obtained via qfingerprinting was already taken into consideration at the levels of peak size calling and interexperimental variability by using the appropriate binning method and consensus rules. However, the current implementation does not enable the calculation of confidence intervals for those estimates. Alternatively, the MPN (5) approach could be used to analyze the data in a probabilistic framework. MPN-PCR and direct dilution calculations provide slightly different estimations of abundances, but they are mostly linearly related to each other (see, for example, references 35 and 38). The direct dilution calculation was chosen here for its ease of implementation when simultaneously dealing with hundreds of OTUs. It should be noted that the accuracy of the quantification depends on the number of replicates and dilution rates, with more accuracy obtained when more replicates are used per dilution level. The average accuracy has been shown to be almost identical for dilution rates between 2 and 10, but coefficients of variation are more stable and slightly lower for twofold dilution assays (5). The dilution rate may obviously be adapted by taking previous knowledge, specific needs, or available resources into account.
Based on the detection limit calculated in the model community experiment (see above), a specific OTU may be detected if its number of copies is at least in the range 600 to 6,000 copies at the undiluted level, under ideal amplification conditions (i.e., if we assume that DNA from the synthetic community has the same properties as that from environmental communities). This would be indicated by an abundance value of 1 log10 in the result table. Consequently, when values of 7 log10 abundance are obtained (see Fig. 5), this corresponds to at least 6 x 108 target copies detected for the specific OTUs. Noticeably, this number is very close to the reported estimates of highest cell number found in those samples, which ranged from 1.0 x 109 to 6.0 x 109 cells/cm3 (20). Despite this interesting correspondence in abundance estimates, it should be noted that the abundance values obtained via qfingerprinting are difficult to convert to cell densities or biomass because such conversion would require knowledge of the copy number per genome and the genome size, and both numbers can be very variable (9, 19).
![]() View larger version (47K): [in a new window] |
FIG. 5. Two-way clustering of OTU abundances obtained by standard ARISA and by qARISA. A complete linkage algorithm was used to perform a cluster analysis of the Euclidean-based dissimilarity matrices inferred from RFI values (standard ARISA) and log10 values (qARISA) for the same set of 332 OTUs. Major clusters are indicated by filled circles in the qARISA dendrogram and designated by the letters A to D (see the text for a description of the clusters). This arbitrary cutoff roughly corresponds to 65% similarity.
|
![]() View larger version (12K): [in a new window] |
FIG. 4. Comparison of the distribution of abundance values and of their effects on sample dissimilarities when ARISA versus qARISA is used on a set of environmental samples. (A) Comparison of all non-null values (1,560 values) from the two distributions (standard versus qARISA) when the two fingerprinting strategies were applied to a gradient of nine environmental samples. (B) Comparison of Bray-Curtis dissimilarities obtained for pairs of samples based either on qARISA (log10 abundance) values or on RFI values obtained from a standard ARISA approach (n = 36 pairwise comparisons). The dotted line corresponds to the case where both approaches provide the same dissimilarities among samples (y = x). (C) Comparison of the ordination results obtained with the two strategies by Procrustean superimposition. Open circles correspond to samples in the ordination derived from the standard ARISA, and the tips of the arrows correspond to the positions of the samples in the ordination when the qARISA procedure is used. The dotted line indicates the axis rotation needed to best superimpose the two ordinations.
|
In conclusion, using qARISA versus ARISA led to various levels of discrepancies for data distribution, estimation of sample dissimilarities, and ordination results. The significant correlation between sample ordinations implies that the standard ARISA would still be robust for inferring changes in community structure, despite its inability to accurately estimate OTU abundance. This may originate from the DNA normalization steps common to both procedures, i.e., before the initial PCR, but also before separating DNA fragments by capillary electrophoresis. In addition, it has been shown that normalized ARISA peak area may relate well to quantitative relative abundance, as determined by flow cytometry (3), and this also explains why the standard ARISA has been successfully used thus far to describe community dynamics over spatial and temporal gradients in a variety of ecosystems (13, 44, 54). Despite the apparent limited differences between the two approaches, it is not yet known whether the two techniques would (i) still yield comparable sample ordinations when more samples are included in the analyses and (ii) offer the same ecological conclusions when contextual interpretations (i.e., when additional environmental, spatial, or temporal variables are available to explain diversity patterns) of the two sample ordinations are undertaken. Those important points will need particular attention in future qfingerprinting applications.
Validity of the continuity rule.
One of the main assumptions used in the qfingerprinting approach is that, for each OTU, PCR success is monotonically related to the concentration of the OTU target sequence in the sample. It is, however, known that PCR yield may be optimal (i) at high dilution levels of the DNA solution because impurities or PCR inhibitors may be diluted out or (ii) in the middle of a target DNA concentration range because a trade-off between DNA concentration and a low level of contaminants must be found to obtain a successful PCR amplification, thus producing a unimodal distribution of amplification success for a given OTU. The first consideration should not be a major issue if we assume that, in the case of impurities, PCR inhibition would apply to all OTUs at the same time because it is the efficiency of the DNA polymerase that is mostly affected and not a specific OTU amplification. Note that this point could be further debated because it is highly likely that impurities selectively trap DNA sequences based on sequence composition (see, for example, references 45 and 49), but this goes beyond the scope of the present study.
The second hypothesis deserves more attention because it will directly entail a modification of how the algorithm currently assesses OTU presence and calculates the final OTU abundance table. By examining the raw data for each OTU before applying the qfingerprinting calculations, on average 86% ± 10% of the OTUs were present at dilution 100, and for the existing OTUs whose amplification patterns did not start at the undiluted level (i.e., presenting unimodal rather than linear patterns) they consisted more than 61% of the time of OTUs appearing only once at one dilution level, suggesting that they may be considered as unreliable peaks. Thus, it seems that the continuity rule used in OTU quantification represents a valid hypothesis to describe the behavior of OTU amplification across the dilution series. It must be noted that this conclusion, although being valid for this data set, could be inappropriate if more impurities are present (e.g., if different samples or extraction protocols are used) or if particular PCR conditions are more sensitive to suboptimal conditions.
Relationship between OTU distribution patterns and depth.
The environmental samples were coming from nine 1-cm-thick layers from the same core, and it was particularly interesting to look at the OTU distribution patterns and abundance changes across this natural depth gradient. The overall community behavior with respect to depth when determined by detrended correspondence analysis revealed that a short gradient length of 1.74 was associated with the qARISA data set, suggesting that most OTUs were linearly responding to environmental gradients (37). This was confirmed by RDA, indicating that depth significantly explained the variation in the data set (P < 0.001, based on 1,000 permutations of the multivariate model).
A detailed analysis of individual OTU abundance was performed by using two-way cluster analysis and heatmap plot representations (Fig. 5), which allow a display of OTU abundance relationships as a function of depth (the sample order with depth was kept fixed).
To allow for comparisons of abundance classes between RFI and log10 values, RFI values that ranged from 0 to 13% were first converted into a discrete 0 to 7 scale. Therefore, although the ranges of color codes (0 to 7) were the same for standard and quantitative ARISAs, their meaning was different: for standard ARISA they corresponded to fluctuations of
1 log10 value difference [log10(13) = 1.11], whereas for qARISA the variation corresponds to 7-order-of-magnitude changes (decimal logarithmic scale). Clustering RFI values from standard ARISA produced overall a very different picture of inter-OTU relationships and of abundance changes with depth compared to the outcome of qARISA log10 abundance values (Fig. 5). Noticeably, OTUs in standard ARISA seemed to fall into two main clusters, whereas four, more variable clusters (designated by the letters A to D) could be identified with qARISA data for a similar cutoff level (Fig. 5). It is important to note that clustering RFI values from standard ARISA could lead to very unpredictable patterns because, as mentioned above, these values cannot truly represent OTU abundances from a sample since they are derived from the overall peak abundances from a given electrophoretic profile. For instance, the two most abundant OTUs in standard ARISA were located in cluster A or B when qARISA was used, i.e., clusters associated with low abundant OTUs.
The four main clusters obtained with qARISA data reflected different patterns of OTU occurrence among samples and different OTU abundance categories, i.e., high, average, or low abundance. Nonparametric Spearman correlation tests were subsequently used to determine significant OTU relationships with depth. Among the 199 OTUs (60% of total OTUs) of cluster A, the abundances of 17 and 2 OTUs significantly decreased and increased with depth, respectively, and contributed to 9.5% of all OTUs in the cluster. Since the highest OTU abundance was 4 log10, cluster A may be qualified as composed of "rare OTUs with low abundance." Cluster B consisted of 84 OTUs (25% of the whole data set). Among them, two and one OTU, respectively, significantly decreased and increased with depth, representing 3.5% of the OTUs in the cluster. As such, the OTUs in cluster B may be designated as "common, but generally not affected by depth." Cluster C consisted of 40 OTUs (12% of all OTUs), with 24 OTUs significantly increasing with depth (i.e., 60% of the OTUs of the cluster), leading to the most "depth-associated" cluster. Finally, cluster D with only nine OTUs (3% of all OTUs) consisted of OTUs that were present in several samples and at high abundance (ranging from 5 to 7 log10) but whose abundance overall displayed no trend with depth. This cluster may be designated as that of "common and dominant" OTUs.
Only 27 OTUs (8% of the total OTUs) were present in all nine samples with an average abundance over 2 log10 (mean abundance for all nonzero values), and 20 OTUs (6%) were present in all samples, but with abundances lower than 2. A total of 57 (17%) rare OTUs (i.e., occurring in one sample only) were all found with an average abundance below 2 log10. The large majority (69% or 228 OTUs) of all OTUs were thus not classified as either dominant or rare. It may thus be concluded that qARISA did not only amplify the most common OTUs present in the samples, since the most abundant only consisted of 8% of all OTUs detected. The large majority (85%) was found to be composed of rare or average OTUs with generally low abundance and with weak relationships with depth. It must be noted that the finding of well-represented OTU categories from our environmental samples has a lot to do with the choice of the primer pairs. The ARISA primers in the present study were chosen for their ability to evenly amplify different OTUs in the same sample compared to other primer sets (4; unpublished results). Therefore, the finding of various OTU categories should not directly be attributed to the qfingerprinting approach but to the choice of PCR primers and amplification conditions.
The finding of a substantial proportion of rare or average OTUs is contrary to the common belief that fingerprinting methods only detect the most abundant microbes in a sample (reviewed in reference 1) and suggests that qfingerprinting may also be useful for studying the dynamics of the "rare biosphere" (16, 33). It is, however, difficult to attribute the very rare types either to PCR artifacts, as could be suggested by the low-abundance peaks detected in our model community experiments, or to a true representation of the community composition. It has previously been suggested that rare OTUs (often associated with small peak sizes) may be artifacts of fingerprinting methods (2), thus resulting in fallacious description of microbial community structure (see, for example, reference 55). Using the consensus rule as a conservative way to validate the presence of a given OTU may be one step toward rejecting trivial PCR artifacts. Caution is therefore required at this stage to avoid overinterpreting the presence of rare types in the data, especially when additional molecular proofs are not available, at least for some of the rare OTUs. Alternatively, cutoff abundance values may also be used to remove the low-abundant OTUs if one considers their presence as artifactual.
Inter-OTU relationships.
In most microbial diversity analyses, biotic interactions between components of the microbial community are overlooked, despite the fact that they may be extremely important in determining community dynamics and changes in ecosystem processes, as evidenced from macroecological studies (53). Here, the qfingerprinting approach enabled interrelationships between OTUs to be further examined in contrast to the traditional fingerprinting approaches that only offer a qualitative description of OTU presence or absence. To facilitate the depiction of the relationships, PCA was chosen since it helps identify OTUs whose abundance profiles covary positively, negatively, or neutrally with respect to each other (Fig. 6), insights that cannot be easily deduced from Fig. 5, for instance. In the ordination plot, OTU vectors that are orthogonal to each other may be considered as behaving independently, whereas the ones that are colinear may be seen as positively or negatively covarying, depending on the angles between the vectors being compared (see, for example, references 23 and 37 for further interpretation of PCA ordination plots). Noticeably, individual OTU patterns were clearly spanning all directions in the PCA ordination plot (Fig. 6) and therefore revealed the existence of much more individual variability than what could have been assumed within a single sediment core.
![]() View larger version (57K): [in a new window] |
FIG. 6. PCA of OTU distribution patterns. PCA with scaling focusing on inter-OTU distances for the 332 OTU distribution patterns was chosen to illustrate how inter-OTU relationships could be deduced from qARISA-derived abundance values. The percentages of explained variations for the first two axes were 36 and 20%, respectively (gray panel). Depth is represented as a supplementary variable in the PCA plot. Panels A, B, C, and D represent separate projections of the specific OTUs defined in the four clusters of Fig. 5, respectively. Arrowheads depicted as empty circles indicate OTU distribution patterns that are shared by multiple OTUs (i.e., 22 patterns in panel A, 1 pattern in panel B, and 3 patterns in panel C).
|
Advantages and limitations of the quantitative approach.
The present study demonstrates the usefulness of using the qfingerprinting strategy to simultaneously determine the relative abundance of hundreds of OTUs, and this could readily be applied to tens to hundreds of samples. This quantitative approach represents a major improvement for the understanding of microbial diversity patterns at the level of individual OTUs in contrast to standard community fingerprinting methods that merely qualitatively screen for OTU presence or absence in samples. At the level of whole community ordination, i.e., not comparing individual OTUs but sample similarity based on OTU abundances, patterns retrieved by the two approaches were slightly different but still consistent with each other. Hence, if detailed information about each OTU abundance is required in a given study, the new, quantitative strategy should be preferred over the standard strategy. If the study merely focuses on overall patterns of sample similarities and accurate determination of OTU abundance is not needed, then the standard fingerprinting method with the necessary DNA standardization and normalization steps described in the present study should be favored. However, caution must be taken when interpreting diversity patterns based on qualitative measures of β diversity since this has been shown to lead to different ecological insights concerning the factors that structure microbial communities (26).
An additional advantage of the new strategy is the fact that there is no need to design individual PCR primers for each OTU, as would be required for a difficult, multi-OTU quantitative PCR approach. Because it is straightforward, the technique will easily complement the existing applications of ARISA, T-RFLP, and SSCP, and any other fingerprinting methods based on ribosomal or functional genes if the latter are based on high-throughput sorting of the fragments (e.g., via capillary electrophoresis) and if internal size standards are included in every sample. These last conditions are important in order to make sure that a high reliability is associated with peak size calling and that further correction of size calling inaccuracy is possible without losing sensitivity in signal detection.
Despite the considerable advantages of the new strategy, the same weaknesses and limitations as found with fingerprinting methods and more generally with any PCR-based methods may be predicted. For instance, the method also relies on correct DNA extraction methods to obtain a good representation of the original sample diversity and is affected by primer biases, the generation of chimeric sequences, etc., and all of those issues have been extensively reviewed (50). The specificity of the strategy is directly related to the choice of the PCR primers, and this fact may be used to narrow or to expand the range of targeted OTUs (this may be thus seen as a strength or a weakness). Diversity under- or overestimation common to many fragment-based fingerprinting techniques may also be problematic (for a review, see references 1 and 8). Finally, because many PCRs are done in parallel for the dilution series, higher risk of contaminations and cost increase have to be considered. However, because dilution rates can be adapted to the specific needs of the study and as access to sequencing facilities and automatic pipetting devices becomes easier and cheaper, large-scale, high-throughput applications of the technique should become more feasible and affordable in the future.
Foreseen applications.
Future applications of the qfingerprinting strategy should provide datasets that will help establish better mathematical models of intracommunity dynamics and more generally datasets on which microbial ecological theories and community assembly rules can be further developed (15, 21, 39). Detailed analyses of OTU distribution and abundance may also be foreseen when additional contextual parameters are available. For instance, OTU indicator analysis could be performed to determine whether various microbial OTUs are influenced by the same environmental factors or are indicative of specific community status, since microbes may play a major role in detecting and characterizing changes in environmental conditions (32). In addition, density-dependent processes such as quorum-sensing mediated microbial responses (52) may be monitored using qfingerprinting, and this will help us to better understand the relationships between community structure and function in complex ecosystems. Other examples of application should be found in the description of the allelic diversity associated with functional genes so as to quantify specific groups of microorganisms involved in biogeochemical processes, bioremediation, or plant protection and to identify their interactions and behavior under specific environmental conditions.
Financial support was provided by the Max Planck Society.
Published ahead of print on 6 February 2009. ![]()
|
|
|---|
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Copyright © 2009 by the American Society for Microbiology. For an alternate route to Journals.ASM.org, visit: http://intl-journals.asm.org | More Info»