Applied and Environmental Microbiology, March 2008, p. 1876-1885, Vol. 74, No. 6
0099-2240/08/$08.00+0 doi:10.1128/AEM.01722-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.

Genomic Research Laboratory, Infectious Diseases Service, University of Geneva Hospitals, Geneva, Switzerland,1 The Forsyth Institute, Boston, Massachusetts,2 Institut für Angewandte Immunologie, Zuchwil, Switzerland,3 Reconstructive and Plastic Surgery, Faculty of Medicine, Geneva, Switzerland,4 Infection Control Program, Faculty of Medicine, Geneva, Switzerland5
Received 26 July 2007/ Accepted 10 January 2008
9,500 feature elements targeting 16S rRNA gene-specific regions. Probe design was performed by selecting oligonucleotide sequences specific to each node of the seven levels of the bacterial phylogenetic tree (domain, phylum, class, order, family, genus, and species). This approach, based on sequence information, allows analysis of the bacterial contents of complex bacterial mixtures to detect both known and unknown microorganisms. The presence of unknown organisms can be suspected and mapped on the phylogenetic tree, indicating where to refine analysis. Initial proof-of-concept experiments were performed on oral bacterial communities. Our results show that this hierarchical approach can reveal minor changes (1%) in gingival flora content when samples collected in individuals from similar geographical origins are compared.
A classical way to characterize members of complex bacterial communities relies on 16S rRNA gene sequence analysis. This target is particularly adapted to phylogenic studies since it contains highly conserved and variable moieties permitting reliable and detailed bacterial classification (12, 25, 37). In this approach, nucleic acids are directly extracted from samples without any prior cultivation; amplification is then performed using universal primers targeting conserved stretches of the 16S rRNA gene, and identification is based on similarity with sequences deposited in public ribosomal gene databases. Since the rate of nucleotide sequence change correlates with the evolutionary distance between organisms (8), sequence relatedness can be used for a phylogenic approach. To analyze complex bacterial flora, large-scale cloning and sequencing of 16S rRNA gene targets can provide a detailed catalogue of bacterial flora from a representative sample. However, despite its accuracy and potential for bacterial quantitation, this approach cannot be applied to compare large groups of samples exhibiting intra- and intervariability due to the very large number of experiments that would have to be performed.
Microarrays are frequently used to monitor gene regulation and expression on a genome-wide scale (6). However, development of this technology for the comprehensive characterization of complex microbial communities is still under way, and different approaches have been proposed. Functional gene arrays (49) target enzymes involved in a particular metabolic process such as methane oxidation (2), nitrogen fixation (32, 43), and sulfur metabolism or iron metabolism (50). While allowing characterization of key bacteria involved in a given metabolic pathway, this approach is also useful for monitoring the functional state of a microbial community under different environmental conditions. Alternatively, profiling of prokaryotic populations has been achieved by using 16S rRNA gene microarrays with probes targeting bacterial groups such as Cyanobacteria (5), Rhodocyclales (31), or Alphaproteobacteria (40). Development of high-density microarrays allowed extending the scope of phylogenetic oligoarrays to the whole bacterial kingdom (3, 33, 48). Although 16S rRNA microarrays do not appear to be optimal for discovering new taxa, this approach has permitted the detection of a broader bacterial diversity than the use of clone libraries (10).
The utilization of microarrays is appealing for evaluating samples containing complex flora. Oligonucleotide probes appear to be required for resolving point sequence differences, since PCR product probes have been shown to perform poorly for single-nucleotide polymorphism or point mutation analysis (19, 27, 29, 38). Moreover, oligonucleotide probes are more flexible and can be tailored to meet critical criteria such as sequence specificity and physicochemical properties.
To enable large-scale studies of complex bacterial flora composition in collections of samples, we developed an original oligonucleotide microarray design based on a phylogenic approach. Microarray design was performed by selecting 9,500 25-nucleotide probes recognizing 16S rRNA gene targets that were specific to nodes matching the seven levels of the bacterial phylogenetic tree (domain, phylum, class, order, family, genus, and species). While providing information on the taxonomic composition of microbial communities, this approach should also prove useful for detecting uncharacterized species or over- or under-representation of specific bacterial groups leading to imbalanced flora content. Our study shows that this hierarchical approach can reveal minor changes in microflora composition, as low as 1% of the global composition, when two complex, but related, bacterial populations are compared.
First, each SSU rRNA gene sequence was scanned from its 5' to 3' end in order to extract all possible 25-nucleotide subsequences as a pool of candidate probes. Melting temperatures (Tm) were determined for each candidate probe by using thermodynamic parameters based on a nearest-neighbor model (41). Candidate probes with a predicted Tm of 60 ± 5°C were stored in a hash table structure to eliminate duplicate sequences and to permit rapid data processing.
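The candidate-probe stage described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and `gc_tm` is a crude GC-content estimate used only as a stand-in for the nearest-neighbor thermodynamic model (41) that the study actually applied. A Python dictionary plays the role of the hash table, so duplicate 25-mers collapse onto a single key.

```python
from typing import Callable, Dict, Set

PROBE_LEN = 25  # probe length stated in the text


def gc_tm(seq: str) -> float:
    """Crude GC-content Tm estimate (stand-in, NOT the nearest-neighbor model)."""
    gc = sum(base in "GC" for base in seq)
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def candidate_probes(
    sequences: Dict[str, str],
    tm: Callable[[str], float] = gc_tm,
    target: float = 60.0,
    tolerance: float = 5.0,
) -> Dict[str, Set[str]]:
    """Map each retained 25-mer to the set of sequence IDs that contain it."""
    probes: Dict[str, Set[str]] = {}
    for seq_id, seq in sequences.items():
        seq = seq.upper()
        # Slide a 25-nt window from the 5' end to the 3' end.
        for start in range(len(seq) - PROBE_LEN + 1):
            kmer = seq[start:start + PROBE_LEN]
            # Keep only candidates with a predicted Tm of 60 +/- 5 degrees C.
            if abs(tm(kmer) - target) <= tolerance:
                probes.setdefault(kmer, set()).add(seq_id)
    return probes
```

Because the Tm function is passed as a parameter, a proper nearest-neighbor implementation can be substituted without changing the scanning or deduplication logic.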
The next stage involved assigning a node to each candidate probe. Probes matching several SSU rRNA gene sequences were assigned to the nearest parent node common to the referred sequences. For example, a probe matching both Streptococcus and Lactococcus genera was assigned to the Streptococcaceae family (see probe C in Fig. 1).
FIG. 1. Schematic representation of the probe selection process on a small subset of 16S rRNA gene sequences. Candidate probes are compared to the library of 16S rRNA gene sequences and assigned to the most distal common node. For example, probe A, which is common to the Lactobacillus, Paralactobacillus, Pediococcus, and Streptococcus genera, is assigned to the Lactobacillales order level. Probe C is assigned to the family level of the Streptococcaceae since it detects both Streptococcus and Lactococcus spp. In contrast, probe B is specific to some Pediococcus species, and it is therefore assigned to the genus level.
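The node assignment illustrated in Fig. 1 amounts to finding the deepest taxon shared by every sequence a probe matches. The sketch below assumes each sequence is annotated with a lineage tuple ordered from domain to species; the function name, the data layout, and the example lineages are illustrative assumptions rather than the authors' code.

```python
from typing import Optional, Sequence, Tuple

RANKS = ("domain", "phylum", "class", "order", "family", "genus", "species")


def common_node(lineages: Sequence[Tuple[str, ...]]) -> Optional[Tuple[str, str]]:
    """Return (rank, taxon) of the deepest node shared by all lineages."""
    shared = []
    # Walk the ranks from domain downward until the lineages diverge.
    for names in zip(*lineages):
        if len(set(names)) != 1:
            break
        shared.append(names[0])
    if not shared:
        return None
    return RANKS[len(shared) - 1], shared[-1]


# Illustrative lineages (domain ... genus) for the genera named in Fig. 1.
BASE = ("Bacteria", "Firmicutes", "Bacilli", "Lactobacillales")
LACTOBACILLUS = BASE + ("Lactobacillaceae", "Lactobacillus")
PEDIOCOCCUS = BASE + ("Lactobacillaceae", "Pediococcus")
STREPTOCOCCUS = BASE + ("Streptococcaceae", "Streptococcus")
LACTOCOCCUS = BASE + ("Streptococcaceae", "Lactococcus")

probe_a = common_node([LACTOBACILLUS, PEDIOCOCCUS, STREPTOCOCCUS])
probe_c = common_node([STREPTOCOCCUS, LACTOCOCCUS])
```

With these inputs, `probe_a` resolves to the Lactobacillales order and `probe_c` to the Streptococcaceae family, matching the examples given in the figure legend.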
1.2% coverage of all sequences of a given node, as determined empirically). Stages 1 to 3 provided a set of 8,195 probes describing phylogenetic classes from the domain to the species level. About 2.3% of these probes presented one or more ambiguous nucleotides. Since most of these polymorphisms contributed significantly to phylogenic coverage, we decided to consider as a fourth stage all possible degenerate positions for this subset of targets, yielding a final probe set of 9,477 probes resulting in a global coverage of 78.3%. To minimize steric hindrance, all 25-mer probes were poly(T)-tailed to reach an overall length of 60 nucleotides. Microarrays were manufactured by in situ synthesis (Agilent Technologies, Palo Alto, CA).
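The fourth stage and the tailing step can be sketched as below. The expansion uses the standard IUPAC nucleotide ambiguity codes. The function names are hypothetical, and placing the poly(T) tail at the 3' end is an assumption, since the text does not state which end was extended.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str) -> List[str]:
    """List every unambiguous sequence encoded by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]


def poly_t_tail(probe: str, total_len: int = 60) -> str:
    """Pad a probe with T residues up to total_len (3' placement is assumed)."""
    if len(probe) > total_len:
        raise ValueError("probe is longer than the requested total length")
    return probe + "T" * (total_len - len(probe))
```

For example, `expand_degenerate("ARY")` returns `["AAC", "AAT", "AGC", "AGT"]`, and a 25-mer passed to `poly_t_tail` gains 35 T residues.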
Biological samples.
Two distinct bacterial mixtures were used to validate our microarray approach. We first generated a defined artificial sample (sample A) using equal amounts of cRNA originating from three different organisms: Streptococcus pyogenes ATCC 12344, Fusobacterium necrogenes ATCC 25556, and Chromobacterium violaceum ATCC 12472. Artificial sample A (200 ng of total cRNA) was compared to the same bacterial mixture previously spiked with 25% (i.e., 50 ng of total RNA) Escherichia coli ATCC 25922, yielding another 200-ng cRNA sample (spiked sample A).
Sterile endodontic paper points were used to collect gingival fluid from the dentogingival sulcus of two healthy European male subjects aged 28 and 41 years (samples B1 and B2, respectively). Samples were stored in RLT buffer (RNeasy Minikit; Qiagen, Basel, Switzerland) at –80°C for subsequent analyses. Next, 2 µg of cRNA from each of gingival samples B1 and B2 was compared against its equivalent previously spiked with a lower concentration (1%) of F. necrogenes ATCC 25556 (spiked sample B1 and spiked sample B2, respectively).
RNA extraction and quantification.
To lyse cells, 100 mg of glass beads (diameter, 100 µm; Schieritz & Hauenstein AG, Arlesheim, Switzerland) were added to the samples. Volume was adjusted to 350 µl with RLT buffer, and samples were vortex mixed for 1 min. Total RNA was isolated and purified by using the RNeasy Micro kit (Qiagen) according to the manufacturer's instructions. Samples were lyophilized and dissolved in 5 µl of sterile water. Total RNA quality was assessed by using RNA Picochips on a BioAnalyzer 2100 (Agilent). The RNA quantity was assessed by one-step reverse transcription-quantitative PCR using 0.2 µM concentrations of two primers (forward, GGCAAGCGTTATCCGGAATT; reverse, GTTTCCAATGACCCTCCACG; Invitrogen, Basel, Switzerland) and a 0.1 µM concentration of probe (CCTACGCGCGCTTTACGCCCA, 5'-end coupled to FAM and 3'-end coupled to TAMRA; Eurogentec, Seraing, Belgium) designed in a highly conserved region of bacterial 16S rRNA gene, allowing amplification of most of the bacterial 16S rRNA sequence. One-step reverse transcription-quantitative PCR amplification (final volume of 15 µl; Invitrogen) was performed on an SDS 7700 (PE Biosystems, Santa Clara, CA) using the following cycling procedure: t1, 20 min at 50°C; t2, 10 min at 94°C; t3, 15 s at 94°C; and t4, 1 min at 60°C (t3 and t4 were each repeated 40 times). Using this strategy, a positive fluorescent signal was obtained between cycles 21 and 30.
RNA amplification.
All of the purified RNA was subjected to in vitro transcription using a MessageAmp II-Bacteria kit (Ambion, Austin, TX) according to the manufacturer's instructions. Amplified RNA was labeled during in vitro transcription in the presence of Cy3 or Cy5 cyanine dyes (Perkin-Elmer, Boston, MA). Quality, quantity, amplification efficiency, and dye incorporation were evaluated using the NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Inc., Rockland, DE) and a BioAnalyzer 2100 on RNA Nano 6000 chips (Agilent).
Microarray hybridization, scanning, and analysis.
All samples were hybridized in duplicate. Cy5- and Cy3-labeled cRNAs were diluted in a total of 250 µl of Agilent hybridization buffer and hybridized at 60°C for 17 h in a dedicated hybridization oven (Robbins Scientific, Sunnyvale, CA). Slides were washed, dried under nitrogen flow, and scanned (Agilent) using 100% PMT power for both wavelengths.
Image analysis and signal quantification were achieved by using Feature Extraction software (version 6; Agilent). Probes exhibiting a nonuniform signal (i.e., pixel noise exceeding an established threshold) or mean signal values lower than the corresponding background plus 2.6 standard deviations were excluded from subsequent analyses.
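The exclusion rule can be written as a small predicate. This is a sketch under stated assumptions: the `Feature` record and its field names are hypothetical and do not reflect the Feature Extraction output format, and the standard deviation is assumed to be that of the background pixels.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    probe_id: str
    mean_signal: float
    background_mean: float
    background_sd: float      # assumed: SD of the background pixels
    nonuniform: bool = False  # flag set by the image-analysis step


def passes_filter(feature: Feature, n_sd: float = 2.6) -> bool:
    """Keep a feature only if it is uniform and clearly above background."""
    if feature.nonuniform:
        return False
    threshold = feature.background_mean + n_sd * feature.background_sd
    return feature.mean_signal >= threshold
```

With a background of 100 and a standard deviation of 50, the threshold is 230, so a feature with a mean signal of 200 is discarded and one with a mean signal of 500 is retained.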
For spiking experiments, LOWESS (locally weighted linear regression) transformation was used to correct background subtracted signals for unequal dye incorporation, and a geometric mean was applied to average signals between duplicates. Statistical analysis consisted of a two-tailed Student t test with a P value tailored according to the relative spike abundance (P < 0.01 for sample A, 25% spiking; P < 0.05 for samples B1 and B2, 1% spiking).
For assessing the bacterial subgingival flora on nonspiked samples, the local background was subtracted from the raw mean signal. Subsequently, probe signals were averaged among duplicates by using a geometric mean.
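For nonspiked samples, the two steps described above (local background subtraction followed by a geometric mean across duplicates) can be sketched as follows. The function name and the choice to return `None` when a corrected signal is not positive are illustrative assumptions, since the text does not say how such values were handled.

```python
import math
from typing import Optional, Sequence


def averaged_signal(
    raw_means: Sequence[float],
    local_backgrounds: Sequence[float],
) -> Optional[float]:
    """Background-subtract each replicate, then take the geometric mean."""
    if len(raw_means) != len(local_backgrounds) or not raw_means:
        raise ValueError("need one background value per replicate signal")
    corrected = [raw - bg for raw, bg in zip(raw_means, local_backgrounds)]
    # The geometric mean is undefined for non-positive values.
    if any(value <= 0 for value in corrected):
        return None
    log_sum = sum(math.log(value) for value in corrected)
    return math.exp(log_sum / len(corrected))
```

The geometric mean is less sensitive than the arithmetic mean to one replicate being much brighter than the other, which suits intensity data spanning several orders of magnitude.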
Comparison with 16S rRNA gene sequencing.
To compare obtained results with data generated by large-scale cloning and sequencing approaches in different populations of healthy volunteers, we used published data of Kroes et al. (26) and Paster et al. (35). Briefly, Paster et al. used clone library sequencing to investigate the bacterial diversity in the subgingival plaque of healthy subjects and subjects affected by various kinds of periodontitis and acute necrotizing ulcerative gingivitis. From the five libraries matching healthy subjects, we chose three libraries showing the widest diversity in species and phylotypes in order to perform a direct comparison with our data. In the same way we used the data from the study of Kroes et al. describing the bacterial diversity found in a single healthy human volunteer, likewise using clone library sequencing.
FIG. 2. Microarray coverage for the 32 phyla described in the RDP (see Materials and Methods). Bars represent the percentage of strains detected within each phylum. The number of selected probes for providing this phylum coverage is specified to the right of the figure.
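A per-phylum coverage figure of this kind can be computed from the probe-to-sequence matches. The sketch below assumes that a sequence counts as detected when at least one selected probe matches it. That criterion, the function name, and the input layout are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Set


def phylum_coverage(
    seq_phylum: Dict[str, str],
    probe_hits: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Percentage of sequences in each phylum matched by at least one probe."""
    detected: Set[str] = set().union(*probe_hits.values())
    totals: Dict[str, int] = defaultdict(int)
    covered: Dict[str, int] = defaultdict(int)
    for seq_id, phylum in seq_phylum.items():
        totals[phylum] += 1
        if seq_id in detected:
            covered[phylum] += 1
    return {p: 100.0 * covered[p] / totals[p] for p in totals}
```

The `probe_hits` mapping has the same shape as the output of a candidate-probe scan, that is, each probe sequence keyed to the set of sequence identifiers it matches.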
109 (P = 0.0004) and would not be expected, based on in silico predictions. Partial cross-hybridization could also explain why probes that do not directly match E. coli ATCC 25922, but other nodes, such as Shigella, are revealed by our analysis. Note, however, that these two species displayed strongly homologous ribosomal gene sequences.
FIG. 3. Volcano plots display a summary of test statistics as a function of fluorescence intensity ratios (i.e., fold change). Each plot compares spiked versus nonspiked samples, using material of increasing bacterial complexity. Red dots in the upper left or upper right corners depict significant targets in the nonspiked sample or in the spiked sample, respectively. (A) Comparison of one defined bacterial mixture containing three bacterial species with or without a spike of 25% of the nucleic acid amounts (significance is defined as a fold change of 2 and P < 0.01). (B and C) Comparisons of gingival flora from two healthy volunteers (B1 [B] and B2 [C]) with or without a spike of 1% of the nucleic acid amounts (significance is defined as a fold change of 2 and P < 0.05).
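The significance rule behind these volcano plots combines a fold-change cutoff with a P-value cutoff. The sketch below classifies one probe from a precomputed spiked/nonspiked ratio and t-test P value. The function name and the returned labels are hypothetical, and the strict inequality on P follows the thresholds given in Materials and Methods.

```python
import math


def classify_probe(
    ratio: float,
    p_value: float,
    fold: float = 2.0,
    alpha: float = 0.05,
) -> str:
    """Place a probe in the upper right, upper left, or elsewhere on the plot."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if p_value >= alpha:
        return "not significant"
    log_ratio = math.log2(ratio)
    cutoff = math.log2(fold)
    if log_ratio >= cutoff:
        return "higher in spiked sample"      # upper right corner
    if log_ratio <= -cutoff:
        return "higher in nonspiked sample"   # upper left corner
    return "not significant"
```

For the 25% spike of sample A, the same function would be called with `alpha=0.01`.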
106-fold for sample B1 and 45-fold factor for sample B2. In contrast, no probes yielded statistically significant signals in the nonspiked samples, as illustrated by the upper left corners of the volcano plots (Fig. 3B and C).
TABLE 1. Nodes identified as statistically significant in spiked samples B1 and B2
FIG. 4. Phylogenic tree representation of all phyla detected by statistical analysis (P < 0.05) comparing a gingival sample (B2) to its replicate spiked with 1% F. necrogenes ATCC 25556. Red dots and lines depict nodes and branches where at least 1 probe yielded a statistically significant signal. For better readability, only the Alphaproteobacteria class is fully represented among the Proteobacteria phylum.
FIG. 5. Cumulative prevalence of phylotypes identified using microarrays (samples B1 and B2) compared to previously published analyses of gingival flora. The figure depicts all phylotypes identified by any of these studies. Lib1, Lib2, and Lib3 describe libraries generated for three healthy subjects as described by Paster et al. (35; B. Paster, unpublished data).
9,500 probe set on our oligoarray covers 78.3% of the 194,696 bacterial SSU rRNA gene sequences described in release 9.34 of RDP. Although phylum coverage ranges from 21% (Chloroflexi) to 100% (e.g., Chlamydiae), the design process was not intended to be limited to the study of specific phyla and thus should prove useful for studying the whole eubacterial domain, which was the starting point of our strategy. This last point is of crucial importance since no specific phylum should have a priori more weight during the profiling of any altered oral flora. Potential applications of such a microarray consist of detecting bacterial composition changes over time or across many related samples (e.g., healthy versus diseased sites in the study of noma).
In the present study, the limited sensitivity of microarray techniques required amplification of the starting nucleic acids. In addition, the amplification strategy proposed here appears to be robust and is potentially utilizable for other samples for which the amounts of starting material are strictly limited. Our strategy proved to be efficient in detecting minor differences (as low as 1% of F. necrogenes) in flora composition in controlled mixtures and, more importantly, in complex natural samples. Our approach was able to characterize the diversity of the gingival flora down to the genus level in two healthy European subjects. The results showed good congruence with previous studies performed in American subjects (26, 35) using a cloning-sequencing strategy. This observation is noteworthy because the comparison involved volunteers originating from two distinct continents and two markedly different methods. Our results suggest that different social and dietary habits have limited influence on gingival flora composition, as defined by our microarray strategy. This microarray approach revealed a broader diversity of microorganisms at the genus level than traditional clone library methods, a finding that has already been reported by DeSantis et al. (10). The underestimation of microbial diversity by clone libraries may be explained either by a cloning bias or by the paucity of clones sequenced. Moreover, since our strategy is based on 16S rRNA sequences, we can hypothesize an additional bias due to sequence over-representation in public sequence databases (23). This bias can be expected for important human or animal pathogens, as well as organisms of economic interest.
Nonetheless, this phylogenetic approach not only allows the monitoring of bacterial population dynamics in different ecosystems but also permits the detection of potential pathogenic bacteria or bacterial taxa, as recently illustrated in a study of airborne bacterial composition (4). In addition, the same strategy, as well as functional gene arrays, can be implemented to evaluate temporal evolution of complex bacterial communities such as in agricultural, environmental, or human commensal ecosystems. Such applications include evaluating the impact of ecosystem modification on environmental flora (3) or the identification of a specific bacterial population showing particular metabolic capacities, such as the metabolism of toxic compounds (21, 39). In human medicine, a potential use might be for monitoring alterations or biases in flora composition due to chemical or antimicrobial treatments (15) or to metabolic dysfunction, or for understanding its natural evolution during the life cycle (34). This type of strategy offers clear advantages over culture or even other non-culture-based methodologies (22). The characteristics of sensitivity and specificity, as well as the actual throughput, are now compatible with real-time monitoring or analysis (17).
We have found that the efficiency of our microarray can be improved in several ways. We could rely on novel 16S sequences retrieved from the latest release of RDP, as well as data emerging from various metagenomic projects (16, 45). In addition, it would be useful to add probes targeting the Archaebacteria domain, since methanogenic Archaea have been recently reported in subgingival sites of patients suffering from periodontitis (28). Obviously, our design is dependent on the quality of the sequences retrieved from RDP. In this regard, short 16S rRNA gene sequences may be prone to misclassification at the genus level due to anomalies in the taxonomy and the lack of data on short sequences (46). In addition, close sequence homology across different genera (e.g., Escherichia and Shigella) explains why specific probes cannot always be designed. The potential for cross-hybridization by the probes should not be minimized, especially in the highly conserved 16S rRNA gene sequences. Whereas careful assessment of such cross-hybridization potential appears warranted for the "de novo" characterization of a microbial community, this is clearly less of a concern here because our approach was developed and tested to detect differences between related samples.
Finally, the arbitrary twofold change cutoff was selected as an empirical compromise between sensitivity and specificity. Further experimental determinations are now warranted to more precisely define this cutoff value and determine whether it can be applied to the whole range of fluorescence signals. It remains to be determined whether statistical approaches would prove more reliable in the long run.
Preliminary steps to the identification of uncharacterized bacterial species can be performed using our hierarchical approach, providing an alternative to classical sequencing techniques. However, this point remains to be experimentally proven. Initially intended for the study of bacterial diversity in oral samples, our approach may be useful for the study of various bacterial communities associated either with medical (intestinal or skin flora), environmental (soil, sludge, wastewaters, etc.), or food industry samples. In addition, the same approach could be used to monitor the evolution of bacterial communities over time, replacing laborious techniques such as library cloning and sequencing.
This approach can be conveniently implemented to perform large-scale profiling studies of the oral bacterial flora. Experiments are under way to monitor gingival flora in noma lesions from individual patients and compare these samples to matched healthy controls from the same geographical origin. Again, implementing such novel microarray strategies for flora composition analyses requires careful validation.
Published ahead of print on 18 January 2008.