Applied and Environmental Microbiology, April 2005, p. 2086-2094, Vol. 71, No. 4
0099-2240/05/$08.00+0 doi:10.1128/AEM.71.4.2086-2094.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Jason Hinds,4
Achim Kohler,2
Brendan W. Wren,3 and
Knut Rudi2*
Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences,1 Matforsk, Norwegian Food Research Institute, Ås, Norway,2 Department of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine,3 Department of Cellular and Molecular Medicine, St. George's Hospital Medical School, London, United Kingdom4
Received 15 July 2004/ Accepted 19 October 2004
In this work we set up a complex experimental design in which DNA microarrays were used to investigate global gene expression patterns and Fourier transform infrared (FT-IR) spectroscopy allowed the analysis of the total biomolecule composition in the cell (11, 12). Interacting effects of experimental factors were investigated by using a novel multivariate analysis of variance (MANOVA) approach (50-50 MANOVA) (20) and rotation testing (21). 50-50 MANOVA was developed to handle several colinear responses in combination with general linear modeling (analysis of variance and regression). This is not possible with traditional MANOVA analysis. Rotation testing is advantageous compared to the commonly used permutation testing, since permutation testing on multifactor experiments is technically difficult and exact permutation analysis is limited to relatively simple models. In combination with rotation testing, a novel approach for determining the false-discovery rate (FDR) was developed for better significance testing of interacting parameters (see Appendix). This new FDR method is more reliable than other FDR methods when there is high correlation between responses.
The aim of this work was to investigate the global survival mechanisms of Campylobacter jejuni in the environment by using the novel analytical concepts described above. C. jejuni is the leading cause of food-borne bacterial diarrheal disease throughout the world (2). The four major sources of infection are undercooked poultry meat, untreated water, raw milk, and pets (19). Although the vast majority of infections are self-limiting, severe sequelae can occur in the form of a neuromuscular paralysis such as Guillain-Barré syndrome (24). C. jejuni does not normally grow outside the host, unlike most other bacterial food-borne pathogens, but it appears to have the ability to survive for extended periods in the environment (25). Nevertheless, C. jejuni is very sensitive to environmental stress and appears to lack most of the stress response factors necessary for survival of other bacteria in adverse environments (25, 26). Additionally, it does not normally grow aerobically and is not readily transmitted between humans, yet it is the most frequent food-borne pathogen. This is often referred to as the Campylobacter conundrum (15) and underlines the fact that the multiple survival mechanisms in the environment of this important human pathogen are poorly characterized. Understanding these mechanisms is crucial to the design of intervention strategies to reduce C. jejuni in the food chain and to reduce the burden of C. jejuni-associated disease.
We focused our study on nongrowth survival conditions and emphasized the interaction between temperature and oxygen tension in our experimental design. C. jejuni encounters a wide range of temperature and oxygen concentrations in its contamination cycle and must therefore respond to these conditions. Our novel analytical approaches were used to develop a model for the potential mechanisms for survival of C. jejuni in the environment.
We based our experimental design on an initial screening to get an overview of the survival range in C. jejuni. This screening was performed by using FT-IR spectroscopy, viable-dead staining (BacLight), and plate counts. Samples from all of the experimental combinations were analyzed after 0 (reference), 1, 2, 3, and 4 weeks. In addition, plate counts and BacLight analyses were performed on samples stored for up to 11 weeks for the 5°C anaerobic condition.
In the final experimental design, DNA microarray analyses were performed in addition to FT-IR spectroscopy, BacLight analysis, and plate counts. Samples were analyzed after 0 (reference), 2, 4, and 7 days by using all of the analysis techniques described above. DNA microarray analyses were not performed on the 5°C aerobic day 7 samples or the 25°C aerobic day 4 and day 7 samples, as more than 10% of the bacterial cells were nonviable (determined by BacLight analysis). All analyses were based on three biological replicates, and in addition, microarray hybridizations were performed in duplicate for each total RNA sample, giving a total of six hybridizations per sample. RNA from day 0 was used as a reference in all hybridizations.
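The microarray part of this design can be enumerated in a few lines. The sketch below assumes a full factorial of two temperatures and three atmospheres (consistent with the six flasks described under growth conditions) crossed with the three nonreference days; the variable names are illustrative, and the resulting counts are derived from the stated design rather than reported figures.

```python
from itertools import product

TEMPERATURES = (5, 25)                                   # degrees C
ATMOSPHERES = ("aerobic", "microaerobic", "anaerobic")
DAYS = (2, 4, 7)                                         # day 0 is the reference

# Combinations not arrayed because more than 10% of the cells were nonviable.
EXCLUDED = {(5, "aerobic", 7), (25, "aerobic", 4), (25, "aerobic", 7)}

# Three biological replicates, each hybridized in duplicate.
HYBRIDIZATIONS_PER_SAMPLE = 3 * 2

arrayed = [c for c in product(TEMPERATURES, ATMOSPHERES, DAYS)
           if c not in EXCLUDED]
total_hybridizations = len(arrayed) * HYBRIDIZATIONS_PER_SAMPLE

print(len(arrayed), total_hybridizations)
```

Under these assumptions, 15 of the 18 condition-by-day combinations are arrayed, which corresponds to 90 hybridizations. The three missing combinations are the source of the unbalanced design that the statistical analysis later has to adjust for.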
Strain and growth conditions.
The genome-sequenced C. jejuni NCTC 11168 strain was kindly provided by Brendan Wren. This clone was used for the Sanger Centre C. jejuni sequencing project and was previously obtained from the National Collection of Type Cultures. The subculture history of this variant is unknown.
Bacteria were grown on selective blood agar plates (Oxoid Ltd.) in a microaerobic atmosphere for 48 h at 42°C. One colony was used to inoculate 50 ml of MH broth (Oxoid Ltd.), which was then incubated microaerobically for 48 h at 42°C. This culture was used to inoculate two 1-liter volumes of MH broth (prewarmed to 42°C under microaerophilic conditions) to a 1:100 dilution. The cultures were incubated microaerobically at 42°C with shaking at 75 rpm for 12 h to yield a cell count of approximately 10⁸ CFU/ml. The reasons for using cells in exponential growth are that C. jejuni lacks a stationary-phase response (17) and that the responses are complex and difficult to define when the cultures reach high cell densities. The two cultures were mixed, day 0 samples were collected, and the culture was then divided into six flasks, each containing 300 ml. The flasks were placed at their respective temperatures and atmospheres, with shaking at 75 rpm. Microaerobic and severely oxygen-limited (anaerobic) conditions were established by using the Oxoid CampyGen and AnaeroGen systems, respectively. All plate counts were done on MH agar plates.
BacLight live-dead fluorescence microscopy.
One milliliter of bacterial culture was centrifuged at 10,000 x g and 4°C in a microcentrifuge for 10 min. The supernatant was removed, and the cells were resuspended in 1 ml of filter-sterilized peptone water. This suspension was diluted to give approximately 10⁷ CFU/ml, and 0.5 ml of this dilution was mixed with 0.5 ml of the BacLight (Molecular Probes) mixture (BacLight ampoules mixed in 5 ml of filter-sterilized water). The BacLight-cell mixture was incubated for 15 min in the dark at 4°C before being filtered on a 0.22-µm-pore-size black polycarbonate filter (Osmonics Inc.). The filter was washed two times with 1 ml of filter-sterilized water and then viewed by fluorescence microscopy (Leica DMLB microscope) with an RT color spot camera and Spot Advanced software (Diagnostic Instruments, Inc.).
FT-IR sample preparation and measurements.
The samples (30 ml) for FT-IR measurements were centrifuged at 4°C and 10,000 x g for 30 min. The supernatant was removed before the cells were washed twice with 1 ml of saline solution with centrifugation at 4°C and 16,000 x g for 15 min. After being washed, the cells were suspended in 40 µl of distilled water before 35 µl of each suspension was transferred to an IR-transparent crystal (ZnSe). Before FT-IR analysis, the samples were dried in a desiccator at 10⁴ Pa with anhydrous silica gel (Prolabo, Paris, France). To acquire FT-IR spectra of the samples, a Bio-module (Bruker Optics, Ettlingen, Germany), which is especially designed to measure microorganisms, coupled to an Equinox 55 spectrometer (Bruker Optics) was used. The spectra were recorded in the region between 4,000 and 500 cm⁻¹.
FT-IR spectra of microorganisms are usually divided into five regions that contain information from different cell components (24), as follows: 3,000 to 2,800 cm⁻¹, fatty acids in the bacterial cell membrane; 1,800 to 1,500 cm⁻¹, amide bands from proteins and peptides; 1,500 to 1,200 cm⁻¹, mixed region (proteins and fatty acids); 1,200 to 900 cm⁻¹, polysaccharides within the cell wall; and 900 to 700 cm⁻¹, "true" fingerprint region containing bands which cannot be assigned to specific functional groups. Before data analysis, the spectra were preprocessed. In the region from 2,300 to 720 cm⁻¹ we used extended multiplicative signal correction (23). In the region from 3,000 to 2,800 cm⁻¹, the second derivative was calculated prior to extended multiplicative signal correction.
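The five spectral regions listed above can be held in a small lookup table, which is convenient when a band position has to be related to a class of biomolecules. This is only an illustrative sketch: the names `REGIONS` and `regions_for` and the shortened labels are assumptions made here, and boundaries are treated as inclusive, so a wavenumber on a shared boundary is reported for both neighboring regions.

```python
from typing import List, Tuple

# (lower bound, upper bound, assignment), wavenumbers in cm^-1.
REGIONS: List[Tuple[float, float, str]] = [
    (2800.0, 3000.0, "fatty acids (cell membrane)"),
    (1500.0, 1800.0, "amide bands (proteins and peptides)"),
    (1200.0, 1500.0, "mixed region (proteins and fatty acids)"),
    (900.0, 1200.0, "polysaccharides (cell wall)"),
    (700.0, 900.0, "fingerprint region"),
]


def regions_for(wavenumber: float) -> List[str]:
    """Return every region whose inclusive bounds contain the wavenumber."""
    return [name for low, high, name in REGIONS if low <= wavenumber <= high]


print(regions_for(1000.0))
print(regions_for(2000.0))
```

A wavenumber such as 2,000 cm⁻¹ falls in none of the five regions, so the function returns an empty list for it.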
RNA extraction.
Total RNA was extracted from C. jejuni by using the RNeasy Protect Bacteria Mini Prep kit (Qiagen). Briefly, 20 ml of RNA Protect Bacteria reagent (Qiagen) was added directly to 10 ml of bacterial culture, vortexed for 5 s, incubated at room temperature for 5 min, and then centrifuged at 10,000 x g for 30 min at room temperature. The supernatant was discarded, and the pellet was frozen at −20°C for later RNA purification. The pellet was thawed, resuspended in 200 µl of Tris-EDTA buffer containing 1 mg of lysozyme per ml, and incubated at room temperature for 10 min with vortexing every 2 min. The manufacturer's protocol was followed from this point, including the "on-column" DNase treatment. RNase-free water was added to the membrane, and after 1 min the spin column was centrifuged at 10,000 x g for 1 min. The concentration and purity of the total RNA were analyzed with an Ultrospec 3000 spectrophotometer (Pharmacia Biotech) and the RNA 6000 Nano LabChip system (Agilent Technologies).
Labeling of total RNA and microarray hybridization.
Total RNA was reverse transcribed by using random hexamers in the presence of aminoallyl-dUTP, followed by labeling with Cy3 and Cy5 monoreactive dyes (The Institute for Genomic Research [TIGR] protocol, standard operating procedure no. M007). Briefly, 2 µg of total RNA and 6 µg of random hexamers (Invitrogen) in a reaction volume of 18.5 µl were denatured at 70°C for 10 min, snap cooled on ice, and centrifuged briefly. Then, 6 µl of First Strand buffer (Invitrogen), 3 µl of 0.1 M dithiothreitol, 1.2 µl of 12.5 mM deoxynucleoside triphosphate-aminoallyl-dUTP labeling mix (aa-dUTP [Ambion]-dTTP [Invitrogen], 2:3), and 400 U of SuperScript II reverse transcriptase (Invitrogen) were added. The labeling reaction mixture was incubated at 42°C overnight (~16 h). The first-strand synthesis reaction was stopped by adding 10 µl of 0.5 M EDTA and 10 µl of 1 M NaOH. The reaction mixture was incubated at 65°C for 15 min, and then 25 µl of 1 M Tris-HCl (pH 7.0) was added. The removal of unincorporated aa-dUTP and free amines was performed according to the TIGR protocol with the Qiagen QIAquick PCR purification kit. After being dried in a Speed Vac, the samples were stored at −20°C. Coupling of aminoallyl-labeled cDNA to Cy Dye Ester (Amersham Biosciences) was done according to the TIGR protocol. The eluted Cy3- and Cy5-labeled samples were mixed and dried in a Speed Vac.
The C. jejuni slides were developed and printed by the BµG@S Group (Bacterial Microarray Group at St. George's) in the United Kingdom. All procedures in the development of these arrays have been described previously (14). Microarray slides were incubated in a prehybridization buffer (3.5x SSC [1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate], 0.1% sodium dodecyl sulfate [SDS], 10 mg of bovine serum albumin per ml) at 65°C for 20 min. After prehybridization, the slides were washed in distilled water for 1 min, followed by a 1-min wash in isopropanol. The dried Cy3- and Cy5-labeled cDNA was resuspended in 30 µl of hybridization solution (4x SSC, 0.3% SDS), denatured at 95°C for 5 min, vortexed at low speed, and heated again at 95°C for 5 min. The sample was centrifuged briefly and applied to the prehybridized microarray underneath a 22- by 25-mm LifterSlip coverslip (Erie Scientific Company). The slides were placed in a waterproof hybridization chamber for hybridization in a 65°C water bath overnight (~16 h). After hybridization, the slides were washed in 1x SSC buffer with 0.05% SDS at 65°C for 2 min, followed by two washes in 0.06x SSC buffer, each for 2 min at room temperature. The slides were dried by centrifugation (90 x g for 12 min).
Data acquisition and analysis.
Slides were scanned with the ScanArray Express 1.0 scanner (Packard BioScience), following the manufacturer's guidelines. The intensities of the fluorescent spots were quantified with ImaGene 5 (BioDiscovery Inc.) software. Background subtraction and normalization were performed in GeneSpring 6.1 (Silicon Genetics).
Statistical analyses were performed according to the approach implemented in the 50-50 MANOVA software (20; www.matforsk.no/ola). Global significance was tested by 50-50 MANOVA, and corresponding explained variances were calculated based on sums of squares summed over all responses. Significance analysis for each individual response was performed with P value adjustment calculated according to the rotation testing principles described by Langsrud (21). However, a modified procedure that adjusts P values according to an FDR criterion was used (see Appendix). The main effect of day (time) was illustrated by using means adjusted for the unbalanced design caused by missing observations. These means were shown directly as curves (FT-IR spectroscopy). A principal-component score plot (22) was used to illustrate the microarray data.
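To illustrate what "explained variance based on sums of squares summed over all responses" means, the following sketch handles the simplest possible case, a single categorical factor. It is not the 50-50 MANOVA software, which fits general linear models with several factors and interactions; the function name and the toy numbers are assumptions introduced only for this example.

```python
from typing import Dict, List


def explained_variance(groups: Dict[str, List[List[float]]]) -> float:
    """Factor sum of squares over total sum of squares, pooled over responses.

    `groups` maps a factor level to its observations; each observation is a
    list holding one value per response (gene or wavenumber).
    """
    all_rows = [row for rows in groups.values() for row in rows]
    n_responses = len(all_rows[0])
    ss_factor = 0.0
    ss_total = 0.0
    for r in range(n_responses):
        grand_mean = sum(row[r] for row in all_rows) / len(all_rows)
        ss_total += sum((row[r] - grand_mean) ** 2 for row in all_rows)
        for rows in groups.values():
            level_mean = sum(row[r] for row in rows) / len(rows)
            ss_factor += len(rows) * (level_mean - grand_mean) ** 2
    return ss_factor / ss_total


# Invented toy data: two temperature levels, two responses each.
toy = {
    "5C": [[1.0, 0.0], [3.0, 0.0]],
    "25C": [[5.0, 1.0], [7.0, -1.0]],
}
print(round(explained_variance(toy), 4))
```

In the toy data the first response separates the two levels while the second does not, so the pooled value (16/22) lies below the value for the first response alone (16/20). Pooling across responses in this way gives one summary per model term rather than one per gene or wavenumber.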
The analyses were performed in two steps to handle the split-plot structure (4) of this experiment. The first step used a model that was saturated in the whole-plot factors (biological replicate, temperature, and atmosphere). In such a model the effects of day (time) and interactions with day could be tested the usual way. Only data for day 2 were used to analyze the whole-plot factors to simplify the analyses. Then, the biological replicate was considered as a blocking factor (i.e., no interactions with biological replicate). Ordinary second-order models were used, and the factors day and atmosphere were modeled as both categorical and continuous variables. Reformulated models that include the effects of day and atmosphere separately for the two temperatures were also utilized.
The input data for the 50-50 MANOVA analyses were the normalized log of ratio data (normalized to 16S and 23S rRNAs). The data did not include any missing values, resulting in the analysis of 1,169 genes out of a total of 1,654 C. jejuni NCTC 11168 genes.
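The restriction to genes without missing values can be sketched as a simple filter applied before the log transform. The base of the logarithm is not stated in the text, so base 2 is assumed here; the function name and the ratio values are invented for illustration, and only the gene identifiers are taken from this article.

```python
import math
from typing import Dict, List, Optional


def complete_log_ratios(
    ratios: Dict[str, List[Optional[float]]]
) -> Dict[str, List[float]]:
    """Keep genes with a positive ratio on every array and log2-transform them."""
    kept: Dict[str, List[float]] = {}
    for gene, values in ratios.items():
        if all(v is not None and v > 0 for v in values):
            kept[gene] = [math.log2(v) for v in values]
    return kept


# Invented ratios; None marks a missing spot on one array.
example = {
    "Cj0169": [2.0, 0.5],
    "Cj0759": [1.0, None],
}
print(complete_log_ratios(example))
```

Applied to the real data, a filter of this kind would retain a gene only if it was measured on every array, which is how 1,169 of the 1,654 genes remain in the analysis.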
The microarray data are available through BµG@Sbase and ArrayExpress. The accession number for BµG@Sbase is E-BUGS-19 (http://bugs.sghms.ac.uk/E-BUGS-19), and the accession number for ArrayExpress is E-BUGS-19.
Based on this initial investigation, time points up to 7 days were chosen for further investigation of the survival mechanisms of C. jejuni. This was a balance between maintaining at least 90% viable cells as assessed by BacLight staining and ensuring that the experiment should cover the main changes in the FT-IR spectrum.
Investigation of the survival mechanism. (i) Global macromolecular changes.
FT-IR spectroscopy was used to investigate the global changes in the cell biomolecule composition. The FT-IR spectra showed an overall decrease in the protein region (data not shown) and an increase over time in the polysaccharide-oligosaccharide region (1,200 to 900 cm⁻¹) for the conditions tested (Fig. 1). Table 1 shows that both day (time) and temperature influenced the polysaccharide-oligosaccharide region significantly. The influence of the temperature is seen as slight pattern differences in some areas in the spectra between the two temperatures (based on day 2 only) (data not shown). The effect of atmosphere, however, did not influence the polysaccharide-oligosaccharide region. The lipid region of the spectrum (3,000 to 2,800 cm⁻¹) was influenced mainly by the incubation temperature (Table 2) and to some degree by time (lower explained variance than temperature). The effect of atmosphere was not significant (Table 2).
FIG. 1. Difference spectra (FT-IR) of the polysaccharide-oligosaccharide region. Difference spectra between the average of the day 0 samples and the average of the samples from days 2, 4, and 7 are shown. The horizontal bars indicate the areas of the spectra with significant changes (P < 0.05 by FDR).
TABLE 1. Explained variance and significance (by 50-50 MANOVA) of the polysaccharide-oligosaccharide region of the FT-IR spectra
TABLE 2. Explained variance and significance (by 50-50 MANOVA) of the lipid region of the FT-IR spectra
The overall biological information in the data was evaluated by using principal-component analyses prior to the in-depth statistical analyses. A principal-component score plot of all the data showed that the biological information in the data is high compared to the variation between the biological replicates (Fig. 2). The major biological trends were that most of the genes were either down regulated or unchanged under nongrowth survival conditions and that the 25°C anaerobic condition was distinct from the other conditions tested (Fig. 3). In this condition, the cells had even more genes down regulated than in the other conditions. Although they were down regulated compared to day 0, most genes increased their expression over time in this condition (Table 3). To identify up-regulated genes, significance (5% FDR) and corresponding t values were tested for all genes. The analysis was based on day 2 values only and identified 102 genes with an average log of expression ratio greater than zero over all experimental levels.
FIG. 2. Principal-component score plot of the microarray data. The different symbols represent the different atmospheric conditions ( , aerobic; *, microaerobic; , anaerobic), and the lines between the symbols connect the different biological replicates.
FIG. 3. Gene and condition cluster. The yellow lines represent samples from 5°C, and the red lines represent samples from 25°C. The brackets show areas with up-regulated genes and the corresponding pathways. Similarity measure: change correlation.
TABLE 3. Significance and corresponding t values of the gene expression data according to 5% FDR
TABLE 4. Explained variance and significance (by 50-50 MANOVA) of the gene expression data
FIG. 4. Plot of interaction between day and temperature (5 and 25°C). The solid line represents the 5% FDR, and the dotted line represents the 1% FDR. The t value indicates whether the genes have an increasing ratio over time (positive t value) or a decreasing ratio over time (negative t value). Genes: Cj0025c, putative transmembrane symporter gene; Cj0062c, putative integral membrane protein gene; Cj0150c, aminotransferase gene; Cj0169, sodB; Cj0313, putative integral membrane protein gene; Cj0404, putative transmembrane protein gene; Cj0428, hypothetical protein Cj0428 gene; Cj0520, putative membrane protein gene; Cj0528, flgB; Cj0687c, flgH; Cj0702, purE; Cj0759, dnaK; Cj0774c, ABC transport system ATP-binding protein gene; Cj0987c, putative integral membrane protein gene; Cj1001, rpoD; Cj1122c, wlaJ; Cj1209, hypothetical protein Cj1209 gene; Cj1220, groES; Cj1221, groEL; Cj1271, tyrS; Cj1357c, putative periplasmic cytochrome c gene; Cj1360c, putative proteolysis tag for 10Sa_RNA; Cj1389, pseudogene (transmembrane transport protein); Cj1423c, putative sugar-phosphate nucleotidyltransferase gene; Cj1497c, hypothetical protein Cj1497c gene; Cj1523c, hypothetical protein Cj1523c gene; Cj1572c, nuoH; Cj1650, hypothetical protein Cj1650 gene.
FIG. 5. Plot of interaction between atmosphere and temperature (5 and 25°C). The solid line represents the 5% FDR, and the dotted line represents the 1% FDR. The t value indicates whether the genes have an increasing ratio with increasing oxygen concentration (positive t value) or a decreasing ratio (negative t value). Genes: Cj0025c, putative transmembrane symporter gene; Cj0129c, outer membrane protein gene; Cj0239c, NifU protein homolog gene; Cj0240c, putative aminotransferase (NifS protein homolog gene); Cj0264c, molybdopterin-containing oxidoreductase gene; Cj0265c, putative cytochrome c-type heme-binding periplasmic protein gene; Cj0343c, putative integral membrane protein gene; Cj0467, amino acid ABC transporter integral membrane protein gene; Cj0491, rpsL; Cj0492, rpsG; Cj0605, putative amidohydrolase gene; Cj0633, putative periplasmic protein gene; Cj0641, hypothetical protein Cj0641 gene; Cj0702, purE; Cj0709, ffh; Cj0734c, hisJ; Cj0802, cysS; Cj0820c, fliP; Cj0883c, hypothetical protein Cj0883c gene; Cj0940c, glnP; Cj0952c, putative membrane protein gene; Cj1030c, lepA; Cj1148, waaF; Cj1150c, waaE; Cj1190c, putative methyl-accepting chemotaxis protein domain signal transduction protein gene; Cj1270c, hypothetical protein Cj1270c gene; Cj1327, neuB2; Cj1356c, putative integral membrane protein gene; Cj1374c, hypothetical protein Cj1374c gene; Cj1402c, pgk; Cj1410c, putative membrane protein gene; Cj1493c, putative integral membrane protein gene; Cj1497c, hypothetical protein Cj1497c gene; Cj1711c, ksgA; Cj1726c, metA; Cj1727c, metY.
28 factor gene flgM (6, 13, 31) was up regulated, while the alternative sigma factor genes rpoN and fliA were down regulated in all conditions except the 25°C aerobic condition. The flagellin genes flaA and flaB were down regulated. Among the genes that were expressed at a higher level at the lower temperature were genes involved in the tricarboxylic acid cycle, oxidative phosphorylation, and glycolysis and gluconeogenesis.
Permutation testing (1, 8, 30) is an alternative approach, which avoids the need for data to fit the normal distribution. However, to detect differences, such methods need larger experiments than the corresponding normal-distribution-based methods. Permutation testing in multifactor experiments is also technically very difficult. To analyze each factor, the permutation process has to be done within level combinations of other factors. Models with interactions make such analysis even harder. Finally, exact permutation analysis is limited to relatively simple models. Nevertheless, complex models could be analyzed by certain permutation methods (e.g., permutation of residuals) and various bootstrap techniques (30). However, these methods assume that the data follow the normal distribution, and they are only approximately correct. Therefore, rotation testing is preferable, since this method is known to be correct when a normal distribution is assumed.
P values adjusted according to the false-discovery rate, so-called q values, could alternatively be calculated by using the methodology described by Storey and Tibshirani (28). This means that the q values are calculated purely from the unadjusted P values, without allowing for dependence among the responses. It has, however, been shown that such q values are conservative under so-called weak dependence (28). It is furthermore hypothesized that the weak dependence requirement is likely to be met in a genome-wide data set. Whether the requirement is met in a microarray data set depends not only on the nature of the actual genes; the external sources of variation present in a specific experiment will also influence the dependence structure. Note that the dependence among the wavelengths of the described FT-IR data is definitely not "weak."
The suggested false-discovery rate method makes use of the actual dependence between the responses, and it can safely be applied to data with any kind of dependence. The results are more conservative than the q values described above. This false-discovery rate approach can be combined with both rotation testing and permutation testing.
Survival model.
It is not yet established in the literature whether the survival of C. jejuni in the environment is an active or passive process (18, 19). We have shown that it is likely that C. jejuni has active mechanisms for survival in the environment under nongrowth conditions. We have identified global changes in both the lipid and polysaccharide-oligosaccharide composition. Furthermore, several genes are also up regulated under these conditions.
The general trend for the gene expression is that there is a massive down regulation under the different survival conditions compared to the reference (day 0). This reflects the fact that the cells were still actively metabolizing in the reference condition. Nevertheless, many genes are up regulated compared to the reference, and analyses indicate that most of the expression is stress related. More genes are expressed under high oxygen tension and temperature than under low oxygen and temperature. These are also the conditions with the lowest survival frequency. A possible survival mechanism may be to down regulate as many genes as possible to save energy and to up regulate genes involved in energy metabolism and modification of the cell wall components. Even though C. jejuni does not grow below 30°C, it has previously been shown that C. jejuni has considerable electron transfer chain activity even at temperatures below the minimal growth temperature (10). Interestingly, a significant number of genes involved in energy metabolism had a higher ratio at 5°C than at 25°C. This could indicate that C. jejuni actually has a greater need for energy at lower temperatures. The 25°C anaerobic condition is distinct from the other conditions tested in that more genes are down regulated than in the other conditions. Anaerobic conditions have previously been suggested to be important in the pathogenesis of C. jejuni. Gaynor et al. showed that exposing C. jejuni NCTC 11168 to anaerobic conditions at 37°C for 24 h prior to inoculation increased its colonization potential (9).
Many of the up-regulated surface structure genes are involved in flagellar assembly. The flagellum is important for both motility and pathogenicity (29). In this pathway most flg genes (rod-ring-hook) were up regulated, while most fli genes were either unchanged or down regulated. The flagellin genes flaA and flaB were also down regulated. The fact that most flagellar assembly genes are up regulated, except for the flagellin genes and the fli genes, could mean that the cells have most of the flagellar assembly ready but that specific environmental signals may be required for the production of the flagellum.
It should, however, be emphasized that the genome-sequenced strain used in this study has undergone a series of phenotypic changes over time and may not be completely representative of the wild type. Although this strain and the original clinical isolate appear to be clonal, it has been shown that these two isolates differ in colonization, virulence, and expression of genes associated with respiration and metabolism (9). However, Jones et al. (16) have shown that NCTC 11168 is adaptable to chicken colonization and concluded that this strain is a suitable model strain. Also, due to the diversity among C. jejuni strains and the fact that potential survival mechanisms of C. jejuni still are poorly characterized, the hypotheses presented represent a novel insight into survival mechanisms of C. jejuni. Further investigation using several different C. jejuni strains is needed to test the generality of these hypotheses. The same will also hold true for other C. jejuni model strains, since the genomic stability and phenotypic diversity within this species are not well defined.
The observation that the amount of polysaccharides and oligosaccharides increases over time in all nongrowth conditions may indicate that these are important for survival. However, since none of the known polysaccharide genes are up regulated, we do not know if these polysaccharides correspond to changes in capsular polysaccharide. Recently, a new glycan that is susceptible to changes in growth conditions has been identified (C. Szymanski, L. Fiori, and H. Jarrell, personal communication). This glycan may correspond to osmoregulated periplasmic glucans (OPGs), which are important for survival in extreme conditions (3). The increase of polysaccharides and oligosaccharides observed in the FT-IR spectrum may be related to the expression of these OPGs. Future research is needed to confirm that the increase in the FT-IR spectrum is caused by OPGs and potentially to elucidate the genes involved. There is also a need to gain more general knowledge about polysaccharides and oligosaccharides and their potential role in the survival mechanisms of C. jejuni.
When an ordinary significance test is performed in a situation with no real effect, the probability of incorrectly obtaining a significant result (type I error) equals the significance level (usually 5%). When several responses (genes) are analyzed by individual significance tests, we will expect many type I errors, and it is questionable whether significant results can be interpreted as real effects. To avoid this problem, an alternative is to adjust the P values according to the family-wise error rate (FWE) criterion, which means that the probability of at least one type I error among all responses does not exceed the significance level. A classical method for such P value adjustment is Bonferroni's correction; that is, the P values are multiplied by the number of responses. But this method is extremely conservative and becomes useless in cases with a large number of responses. This problem can, however, be handled by using modern FWE methods based on permutation testing (30) or rotation testing (21). These methods take the correlations among responses into account.
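The conservativeness of Bonferroni's correction is easy to see in a small worked example. The sketch below multiplies each raw P value by the number of responses and caps the result at 1 so that it remains a valid probability; the cap and the function name are conventions assumed here, not details given in the text.

```python
from typing import List, Sequence


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each P value by the number of responses, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


print(bonferroni([0.01, 0.5, 0.001]))
```

With the 1,169 genes analyzed in this study, a raw P value of 0.0001 would be adjusted to roughly 0.117, so even strong single-gene effects would fail a 5% threshold.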
A common viewpoint is that such P value adjustment is too strict. Instead of considering the probability of at least one type I error, an alternative is to estimate the false-discovery rate, which is the (expected) proportion of type I errors among all responses reported as significant. When using adjusted P values according to FDR, one accepts that 5% of the responses reported as significant at the 5% level are type I errors. The common FDR methods do not make use of the correlation structure among the responses and cannot be trusted if the weak dependence requirement is not met (28). Below we describe a new method that handles this problem. This method is obtained by modifying (or generalizing) the permutation and rotation methods for FWE adjustment.
Assuming a data set of N responses, we consider a particular model term and the corresponding univariate F statistics, $F_1, F_2, \ldots, F_N$. We want to compare these observed statistics to statistics that are simulated under the complete null distribution. As discussed by Langsrud (21), such simulations can be performed by permutations (assuming independent observations) or by rotations (assuming multivariate normality). Then, a set of simulated statistics, $F_1^{(m)}, F_2^{(m)}, \ldots, F_N^{(m)}$, is generated M times (where m = 1, 2, ..., M).
We denote the ordered observed F statistics by $F_{(1)} \ge F_{(2)} \ge \cdots \ge F_{(N)}$, and we compute their corresponding adjusted P values, $P_{(1)}, P_{(2)}, \ldots, P_{(N)}$. Similarly, the simulated statistics are ordered as $F_{(1)}^{(m)} \ge F_{(2)}^{(m)} \ge \cdots \ge F_{(N)}^{(m)}$. For n = 1, 2, ..., N, a preliminary P value is computed as
$$\tilde{p}_{(n)} = \frac{1 + \sum_{m=1}^{M} \frac{1}{n} \sum_{i=1}^{n} I\left(F_{(i)}^{(m)} \ge F_{(n)}\right)}{1 + M}$$
where $I(\cdot)$ is the indicator function. $\tilde{p}_{(n)}$ is the estimated expected proportion of the n largest F statistics that exceed $F_{(n)}$. Accordingly, this is one way to estimate the false-discovery rate. Note that this definition makes active use of the dependence structure. To make the expression in accordance with exact Monte Carlo P values, "1 +" is added in the numerator and denominator.
We can see that $\tilde{p}_{(1)}$ is identical to an adjusted P value according to the FWE criterion. For n > 1, $\tilde{p}_{(n)}$ will be less than (or equal to) the FWE-adjusted P value. On the other hand, when simulations are performed by rotations, $\tilde{p}_{(N)}$ is an estimate of the ordinary raw P value. For n < N, $\tilde{p}_{(n)}$ is greater than (or equal to) the raw P value. This means that our new P values are something between FWE-adjusted P values and raw P values. When the method is based on permutations, the relation to raw P values is more complicated, since the test statistics and the raw-permutation P values can be differentially ordered.
Our P values are not necessarily increasing, and we enforce monotonicity by
$$P_{(n)} = \min\left\{\tilde{p}_{(n)}, \tilde{p}_{(n+1)}, \ldots, \tilde{p}_{(N)}\right\}$$
for n = 1, 2, ..., N. In other words, we apply a step-up procedure.
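The procedure can be sketched in a few lines of Python. This is a reading of the description rather than the authors' implementation: it assumes that the preliminary P value for rank n averages, over the M simulations, the proportion of the n largest simulated statistics that reach the n-th largest observed statistic, with "1 +" added to numerator and denominator, and that the step-up rule is a running minimum taken from the smallest statistic upward. The function name and the input layout (one list of simulated statistics per simulation) are hypothetical, and generating the simulated statistics by rotation or permutation is left outside the sketch.

```python
from typing import List, Sequence


def fdr_adjusted_p(observed: Sequence[float],
                   simulated: Sequence[Sequence[float]]) -> List[float]:
    """Return FDR-type adjusted P values in the original response order."""
    n_resp = len(observed)
    n_sim = len(simulated)
    # Response indices ordered by observed statistic, largest first.
    order = sorted(range(n_resp), key=lambda j: observed[j], reverse=True)
    # Each simulated set is ordered on its own, largest first.
    sim_sorted = [sorted(row, reverse=True) for row in simulated]

    prelim = []
    for n in range(1, n_resp + 1):
        f_n = observed[order[n - 1]]
        total = 0.0
        for row in sim_sorted:
            exceed = sum(1 for value in row[:n] if value >= f_n)
            total += exceed / n
        prelim.append((1.0 + total) / (1.0 + n_sim))

    # Step-up: running minimum from the smallest statistic toward the largest.
    adjusted_sorted = prelim[:]
    for k in range(n_resp - 2, -1, -1):
        adjusted_sorted[k] = min(adjusted_sorted[k], adjusted_sorted[k + 1])

    # Map the results back to the original order of the responses.
    adjusted = [0.0] * n_resp
    for rank, j in enumerate(order):
        adjusted[j] = adjusted_sorted[rank]
    return adjusted


# Perfectly correlated toy case: three identical responses, four simulations.
print(fdr_adjusted_p([3.0, 3.0, 3.0],
                     [[1.0, 1.0, 1.0], [4.0, 4.0, 4.0],
                      [2.0, 2.0, 2.0], [5.0, 5.0, 5.0]]))
```

In the toy case all three adjusted values equal (1 + 2)/(1 + 4) = 0.6, which is also the raw Monte Carlo P value of each response, matching the statement that under complete correlation the raw P value is left unchanged.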
Unlike for several other FDR methods, the above description does not involve an estimate of the amount of responses with true null hypotheses. Instead, the calculations are directly based on the complete null distribution, and the results are therefore conservative. This choice is made to allow any kind of dependence among the responses. In fact, the procedure is valid in the extreme case where the correlation between all responses is 1. In that case, the raw P value (equal for all responses) is unchanged.
The microarrays were kindly provided by the BµG@S facility. We thank the Wellcome Trust for support of the BµG@S facility.
* Corresponding author: Mailing address: Matforsk, Norwegian Food Research Institute, Osloveien 1, N-1430 Ås, Norway. Phone: 47 64 97 01 00. Fax: 47 64 97 03 33. E-mail for Birgitte Moen: birgitte.moen{at}matforsk.no. E-mail for Knut Rudi: knut.rudi{at}matforsk.no.
Present address: Department of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom.