Applied and Environmental Microbiology, October 2003, p. 6342-6343, Vol. 69, No. 10
0099-2240/03/$08.00+0 DOI: 10.1128/AEM.69.10.6342-6343.2003
LETTER TO THE EDITOR
Redundancy analysis may be appropriate for the analysis of data from designed experiments or where there is a strong environmental gradient that is expected to have a large influence on microbial ecology. Where the data have no strong structure defined a priori, however, similarities between samples are more sensibly explored by ordination methods such as principal component analysis or multidimensional scaling. The resulting visual displays give powerful insights into the data (see reference 3 for examples).
There is often no reason to expect samples to fall into discrete groups, yet many clustering methods will identify apparently well-defined clusters in data where there are no natural groups (2). Ward's method is particularly prone to this problem. Cluster analysis is best viewed as a way of dividing samples into convenient but arbitrary groups and should not be the only exploratory data analysis method used.
Using peak heights will downweight longer fragments because of diffusion during electrophoresis. It is therefore preferable to use peak areas (4).
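The relationship between peak height and peak area can be sketched numerically. The example below is only an illustration under stated assumptions: peaks are treated as Gaussian, the widths `short_sigma` and `long_sigma` are made-up values, and the longer fragment is simply assumed to give the broader peak. For a Gaussian peak, height equals area divided by (sigma times the square root of 2π), so two fragments with the same area differ in height when their widths differ.

```python
import math


def gaussian_peak_height(area, sigma):
    """Height of a Gaussian peak with the given area and standard deviation."""
    return area / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical values: both fragments carry the same total signal (area),
# but the longer fragment is assumed to produce a peak twice as broad.
area = 1000.0
short_sigma = 0.5
long_sigma = 1.0

short_height = gaussian_peak_height(area, short_sigma)
long_height = gaussian_peak_height(area, long_sigma)

print(f"short fragment: area = {area:.0f}, height = {short_height:.1f}")
print(f"long fragment:  area = {area:.0f}, height = {long_height:.1f}")
```

Under these assumptions the two fragments are equally abundant by area, while the broader peak is only half as high.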
On the basis of which similarity measure gave "the right answer" for their data, Blackwood et al. (1) recommend using Euclidean distance on square-root-transformed peak heights (Hellinger distances). Euclidean distances take absences of a species from two samples as a sign of their similarity. There is, therefore, a strong argument for preferring Bray-Curtis (Czekanowski) similarities, which are not affected by the number of joint absences (3).
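A small sketch may make the point about joint absences concrete. The profiles below are invented, the function names are ours, and the comparison uses untransformed Euclidean distance rather than the Hellinger variant. One extra OTU is appended to two samples, either absent from both or present in both at the same height. Bray-Curtis is computed here as a dissimilarity (one minus the similarity).

```python
import math


def euclidean(x, y):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x - y| / sum(x + y)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator


# Made-up peak data for two samples and three OTUs.
sample_a = [4, 0, 3]
sample_b = [0, 5, 3]

cases = {
    "original profiles": (sample_a, sample_b),
    "extra OTU absent from both": (sample_a + [0], sample_b + [0]),
    "extra OTU present in both": (sample_a + [6], sample_b + [6]),
}

for label, (x, y) in cases.items():
    print(f"{label}: Euclidean = {euclidean(x, y):.3f}, "
          f"Bray-Curtis = {bray_curtis(x, y):.3f}")
```

In this sketch the Euclidean distance is identical in all three cases, so a shared absence is treated exactly like a shared presence. The Bray-Curtis dissimilarity is unchanged by the joint absence but falls when the extra OTU is present in both samples.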
It is also helpful to consider more explicitly what represents an important difference between samples. Analyses based on raw data will be dominated by variations in the abundance of a small number of common operational taxonomic units (OTUs). A log or square root transformation reduces the influence of the commoner OTUs. Jaccard distances and other methods based on presence/absence data give equal weighting to rare and abundant OTUs. Studies in eukaryote ecology have often found that log or square root transformations yield the most informative analysis. But analysis of either raw or binary data may be appropriate if one is interested in common or rare species, respectively.
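The effect of each transformation on the weighting of common and rare OTUs can be illustrated with a toy calculation. The profiles are invented and `first_otu_share` is a hypothetical helper: it reports what fraction of the squared Euclidean distance between two samples is contributed by the single most abundant OTU after each transformation. A Jaccard distance on the same data is included for comparison.

```python
import math


def first_otu_share(x, y, transform):
    """Fraction of the squared Euclidean distance contributed by the first OTU."""
    squared = [(transform(a) - transform(b)) ** 2 for a, b in zip(x, y)]
    return squared[0] / sum(squared)


def jaccard_distance(x, y):
    """Jaccard distance on presence/absence: 1 - shared OTUs / OTUs in either."""
    both = sum(1 for a, b in zip(x, y) if a > 0 and b > 0)
    either = sum(1 for a, b in zip(x, y) if a > 0 or b > 0)
    return 1.0 - both / either


# Made-up profiles: the first OTU is dominant, the last two are rare.
sample_a = [900, 10, 5, 0]
sample_b = [400, 40, 0, 5]

transforms = {
    "raw": float,
    "square root": math.sqrt,
    "log(x + 1)": math.log1p,
    "presence/absence": lambda v: 1.0 if v > 0 else 0.0,
}

shares = {
    name: first_otu_share(sample_a, sample_b, transform)
    for name, transform in transforms.items()
}

for name, share in shares.items():
    print(f"{name}: dominant OTU contributes {share:.1%} of the squared distance")

print(f"Jaccard distance: {jaccard_distance(sample_a, sample_b):.2f}")
```

With these numbers the dominant OTU accounts for nearly all of the raw distance, progressively less after square root and log transformation, and nothing at all on presence/absence data, where only the two rare OTUs separate the samples.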
Alastair Grant*
Lesley A. Ogilvie
Centre for Ecology, Evolution and Conservation
School of Environmental Sciences
University of East Anglia
Norwich NR4 7TJ, United Kingdom
* Phone: 44 1603 592 537 Fax: 44 1603 507 719 E-mail: A.Grant@uea.ac.uk
Redundancy analysis cannot be used when no information other than that from T-RFLP profiles is available. We agree with Grant and Ogilvie that in such situations it would be useful to apply, in addition to cluster analysis, an ordination technique. One method can complement the other. It may also be prudent to apply the lessons from our study when performing ordinations. Cluster analysis can be useful because it can summarize in one dendrogram the information of several ordination plots. We did not observe that cluster analysis identified well-defined groups when no groups were visible in ordination plots (comparisons not discussed in our paper), although this is possible. When there were no natural groups of profiles, dendrograms had little heterogeneity in stem lengths and ordination plots presented an undifferentiated data cloud.
There is no consensus on whether T-RFLP peak height or area should be analyzed. Grant and Ogilvie recommend analysis of peak area, which we avoided because overlapping peaks are not deconvoluted by Genescan, resulting in an artificial alteration of area based on proximity to other peaks. For comparison between communities, the downweighting of larger fragments using peak height may not be overly detrimental since this effect will be constant across profiles and could be dealt with analytically if necessary.
As we stated (1), future evaluations of T-RFLP data analysis could include other distance metrics such as the Bray-Curtis similarity mentioned by Grant and Ogilvie. We recommended either Hellinger or Jaccard distance since they performed equally well in general, and the properties of one metric may be preferred in individual circumstances. Redundancy analysis using Bray-Curtis similarity, like Jaccard distance, does not result in scores for evaluation of the effects of particular T-RFs. In the study of Legendre and Gallagher (2), its performance was very good but not equal to that of Hellinger distance.
While Grant and Ogilvie mention that analysis of raw T-RFLP data may be desirable in some situations, we observed that raw T-RFLP data are influenced by analytical noise. Profiles should at least be transformed to relative peak height, unless very different laboratory methods are used. Also, if one is interested in heavily weighting small peaks, then care must be taken to have uniform total fluorescence among profiles.
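Both points can be sketched with a toy model. Everything below is assumed for illustration: the true proportions, the two total fluorescence values, and the detection threshold `THRESHOLD` are invented, and `detected` is a hypothetical helper that zeroes any peak falling below the threshold. The same community is "run" twice at different total fluorescence.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def relative(profile):
    """Convert peak heights to relative peak heights (proportions of the total)."""
    total = sum(profile)
    return [value / total for value in profile]


def detected(proportions, total_fluorescence, threshold):
    """Peak heights for a run, with peaks below the threshold set to zero."""
    heights = [p * total_fluorescence for p in proportions]
    return [h if h >= threshold else 0.0 for h in heights]


# Hypothetical community and run conditions.
proportions = [0.6, 0.3, 0.08, 0.02]
THRESHOLD = 50.0  # assumed detection threshold in fluorescence units

high_run = detected(proportions, 5000.0, THRESHOLD)
low_run = detected(proportions, 1000.0, THRESHOLD)

raw_distance = euclidean(high_run, low_run)
relative_distance = euclidean(relative(high_run), relative(low_run))
peaks_high = sum(1 for h in high_run if h > 0)
peaks_low = sum(1 for h in low_run if h > 0)

print(f"raw peak heights:      distance = {raw_distance:.1f}")
print(f"relative peak heights: distance = {relative_distance:.4f}")
print(f"peaks detected: {peaks_high} (high run) vs {peaks_low} (low run)")
```

In this sketch, conversion to relative peak height removes most of the difference caused by unequal total fluorescence. The smallest peak is still lost from the low-fluorescence run, which matters for any analysis that weights small peaks heavily.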
We hope that these observations and the others in our paper will serve as a good starting point for future efforts to analyze T-RFLP data.
Christopher B. Blackwood*
Sustainable Agricultural Systems Laboratory
Beltsville Agricultural Research Center
USDA-ARS
Beltsville, MD 20705
Terry Marsh
Eldor A. Paul
* Phone: (301) 504-8022 Fax: (301) 504-8370 E-mail: blackwoc@ba.ars.usda.gov
Copyright © 2009 by the American Society for Microbiology.