Department of Public Health, Comparative Pathology and Veterinary Hygiene, University of Padova, Agripolis, Viale dell'Università 16, 35020 Legnaro, Italy
Received 4 July 2007/ Accepted 25 November 2007
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
Comprehensive studies using a large range of biochemical and molecular biology techniques have been carried out to test the validity of the species status of the various isolates and to elucidate the relationships among them (7, 8, 28, 29, 31, 36, 37, 52, 58, 60). All of these studies, together with complete genome comparisons (25), have revealed extensive genomic similarity with only sets of genes being identified as species specific, indicating that the genetic content of the genomic backbone alone does not adequately define the different virulence and ecological characteristics of the species. The population structure within the B. cereus group has been addressed in several studies, but their results were inconclusive. Some studies provided evidence for widespread horizontal gene transfer (28, 37), while other analyses suggested more typical clonal behavior (29-31, 52, 58, 60). Such marked differences could be due to the different approaches used, as well as to differences in taxon sampling.
Several B. cereus strains have been identified as the cause of two kinds of gastrointestinal disease associated with very different types of toxins, namely, emetic syndrome and diarrheal poisoning (57). Emetic syndrome is caused by a heat-stable peptide toxin named cereulide (13), which is preformed in food. Recently, the peptide-synthetase genes responsible for the nonribosomal production of cereulide (ces genes) were identified and characterized. A molecular assay for the detection of emetic toxin producers was also developed (14, 16, 17). The other gastrointestinal disease caused by B. cereus is diarrheal poisoning. This is due to heat-labile enterotoxins produced during vegetative growth of B. cereus in the small intestine. At present, three different toxins have been associated with food poisoning outbreaks: the protein complex hemolysin BL (HBL [6]), the nonhemolytic enterotoxin (NHE [23]), and the single protein cytotoxin CytK (41). Two additional single cytotoxin proteins produced by B. cereus, bceT and entFM, have been described (1, 4, 27). Immunological assays are commercially available for the detection of NHE and HBL, as well as monoclonal antibodies targeting these enterotoxin complexes (10, 11), whereas no such tools are yet available for CytK or cereulide. This could be because studies on the B. cereus toxins have been focused on the nhe and hbl genes (20, 24, 26, 49, 53). Despite the recognition of B. cereus as a food-borne pathogen over 50 years ago and the identification of several enterotoxin genes, its virulence mechanisms have still not been fully elucidated. The in vivo roles of the putative virulence factors produced by B. cereus have not been characterized, except for HBL, which is associated with diarrheal food poisoning (6). Based on their biological activities, these factors are likely to be involved in B. cereus infection and illnesses. 
NHE, a homolog of HBL, likely possesses biological activities similar to those of HBL and could be a factor in the diarrheal syndrome; however, this hypothesis has not been tested (41).
Multilocus sequence typing (MLST) studies have previously been used to examine the phylogeny of the B. cereus complex (5, 30, 31), identifying three distinct lineages. These lineages largely correspond to the distribution of the species. The isolates used in these studies (5, 30), however, were largely clinical isolates from soft tissue infections. Therefore, the relationship of food-borne isolates to the inferred phylogeny is not known. In the present study, we determined the MLST profiles of 47 strains isolated from different types of food and analyzed them in combination with others available in the public database in order to elucidate the phylogenetic relationships among the different species included in the B. cereus complex. The presence or absence of toxin genes was also studied to better understand the evolutionary history of this group of proteins, which has received only limited attention thus far. The population structure of the newly characterized isolates, together with strains available in the literature, was surveyed. Finally, we investigated the occurrence of horizontal gene transfer (HGT) in the B. cereus complex and its effect on the evolution of this group of bacteria.
| MATERIALS AND METHODS |
|---|
|
|
|---|
|
New allele sequences were submitted to GenBank and to the B. cereus MLST database. Accession numbers, allele codes, and ST numbers are provided in Table 1.
Molecular phylogenetic analysis.
Publicly available MLST sequences (downloaded from http://pubmlst.org/bcereus/version on 11 October 2006) were combined with new data obtained in the present study in order to produce the multiple alignment used for the phylogenetic analysis. Different sequence types (STs) were numbered according to the MLST scheme developed for B. cereus (33). The multiple alignment of the concatenated sequences (referred to hereafter as the MLST data set) was straightforward and was built according to the MLST scheme (33). All analyzed MLST sequences were the same length. The gene order in the MLST data set was as follows: glpF, gmk, ilvD, pta, pur, pycA, and tpi (33). This order differs from the common gene order of glpF, ilvD, pur, gmk, pycA, tpi, and pta observed in all complete genomes (available in GenBank) sequenced thus far for species of the B. cereus complex.
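The concatenation step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the locus order is taken from the text, while the function names and the dictionary input format are assumptions made for the example.

```python
# Locus order of the concatenated MLST data set, as stated in the text.
LOCUS_ORDER = ["glpF", "gmk", "ilvD", "pta", "pur", "pycA", "tpi"]


def concatenate_profile(allele_seqs):
    """Join the seven allele sequences of one ST in the MLST scheme order.

    allele_seqs maps locus name -> nucleotide sequence (hypothetical format).
    """
    missing = [locus for locus in LOCUS_ORDER if locus not in allele_seqs]
    if missing:
        raise ValueError(f"missing loci: {missing}")
    return "".join(allele_seqs[locus] for locus in LOCUS_ORDER)


def build_dataset(profiles):
    """Concatenate every ST and check that all sequences have the same length."""
    dataset = {st: concatenate_profile(seqs) for st, seqs in profiles.items()}
    lengths = {len(seq) for seq in dataset.values()}
    if len(lengths) > 1:
        raise ValueError(f"concatenated sequences differ in length: {sorted(lengths)}")
    return dataset
```

Because every allele at a given locus has the same length, the concatenated sequences are already aligned, which is why the length check is the only validation in this sketch.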
A phylogenetic tree was inferred by using the neighbor-joining (NJ) method (55). The NJ analysis was performed by using the MEGA 3.1 program (38) applying the Tamura and Nei evolutionary model (46). Nonparametric bootstrap resampling (BT) (19) was performed to test the robustness of the tree topology (1,000 replicates).
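The idea behind nonparametric bootstrap resampling can be illustrated with a short sketch. It is not a reimplementation of MEGA: for brevity, the uncorrected proportion of differing sites (p-distance) stands in for the Tamura and Nei model, no tree is built, and all function names are assumptions made for the example.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


def bootstrap_distances(alignment, replicates=1000, seed=0):
    """Return one pairwise-distance dictionary per bootstrap pseudo-replicate."""
    rng = random.Random(seed)
    names = sorted(alignment)
    results = []
    for _ in range(replicates):
        rep = bootstrap_alignment(alignment, rng)
        results.append({
            (a, b): p_distance(rep[a], rep[b])
            for i, a in enumerate(names)
            for b in names[i + 1:]
        })
    return results
```

In a full analysis, a tree would be inferred from each replicate's distance matrix, and the support for a branch would be the fraction of replicate trees that contain it.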
Toxin genes analysis.
Portions of the hblA, hblC, hblD, nheA, nheB, nheC, entFM, bceT, and cytK genes and the emetic-specific sequence (em) were amplified using the set of primers listed in Yang et al. (61) with the 16S to 23S rRNA internal transcribed sequence (ITS) used as an internal control. The amplification was performed in quadruplex (mix A) or pentaplex (mix B and mix C) reactions with the primer pairs reported in Table 2. Amplifications were performed on a Primus thermal cycler (Euroclone, Celbio, Milan, Italy) under the following conditions: an initial denaturation for 2 min at 94°C and 30 s at 94°C, followed by 30 s at 56°C and 45 s at 72°C, in a final volume of 20 µl. The amplification mix contained 1 U of GoTaq polymerase (Promega, Madison, WI), 1x GoTaq buffer, 1.5 mM MgCl2, 200 µM concentrations of each deoxynucleoside triphosphate, and variable numbers and quantities of primers (Table 2). The result of each amplification mix was visualized on a 1.8% agarose gel stained with ethidium bromide. All samples were analyzed with two independent PCRs.
Initially, the eBurst algorithm subdivides STs into groups. eBurst analysis was performed by either (i) using the default, very stringent parameters (http://eburst.mlst.net/), in which STs are assigned to the same group only if six of seven alleles in the MLST loci are identical or (ii) relaxing the default parameters to allow groups of STs sharing five or more identical alleles. In our analyses, we considered 449 isolates belonging to 297 distinct MLST profiles. The data set analyzed with eBurst was produced by combining the ST profiles determined in the present study with those available at the B. cereus group MLST database (http://pubmlst.org/bcereus/), which implements the MLSTdbnet software (33).
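The grouping rule described above can be sketched as a single-linkage clustering of allelic profiles. This is an illustrative simplification and not the eBurst program itself: it only forms groups and singletons and does not predict founders. The function names and the profile format (a tuple of seven allele numbers per ST) are assumptions.

```python
from itertools import combinations


def shared_alleles(p, q):
    """Number of loci at which two allelic profiles carry the same allele."""
    return sum(a == b for a, b in zip(p, q))


def eburst_groups(profiles, min_shared=6):
    """Group STs that share at least min_shared alleles with another group member.

    profiles maps ST label -> tuple of allele numbers (hypothetical format).
    Returns (groups, singletons).
    """
    parent = {st: st for st in profiles}

    def find(st):
        # Union-find lookup with path halving.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        if shared_alleles(profiles[a], profiles[b]) >= min_shared:
            parent[find(a)] = find(b)

    members = {}
    for st in profiles:
        members.setdefault(find(st), []).append(st)
    groups = sorted(sorted(m) for m in members.values() if len(m) > 1)
    singletons = sorted(m[0] for m in members.values() if len(m) == 1)
    return groups, singletons
```

Calling `eburst_groups(profiles, min_shared=5)` corresponds to the relaxed setting (ii), in which STs sharing five or more identical alleles are placed in the same group.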
HGT detection.
HGT was investigated by applying multiple approaches, which limits the risk of identifying false events of recombination or overlooking the occurrence of true recombinations (50, 51). We performed recombination analyses by applying the following algorithms: BooTScan (56), CHIMAERA (50), GENECONV (48), MAXIMUM χ2 (44), RDP (42), and SiScan (21), as implemented in the RDP2 program (43).
The MLST data set was used in all of the analyses performed with the RDP2 program.
| RESULTS |
|---|
|
|
|---|
Isolates, including the strains studied here, listed as food-borne in the MLST database corresponded to 18.7% of the whole currently available set (84 of 449). Food-borne strains were included in 50 different ST profiles, 45 of which were isolated exclusively from foodstuff, while 5 were collected from other sources.
Phylogenetic analysis.
The MLST multiple alignment was 2,829 bases long and contained 296 unique MLST sequences. The phylogenetic tree obtained from the analysis performed on the MLST data set is presented in Fig. 1. Three main lineages, hereafter named I, II, and III, were recognized, all receiving strong BT support. Lineage I included B. anthracis, many B. cereus strains, and most of the variously named B. thuringiensis serovars. Lineage I was the largest cluster and was subdivided into two major groups, Ia and Ib, both receiving statistical support. Group Ia included 102 STs mostly belonging to B. cereus and also contained all STs used to define clade 1 (B. cereus) of Priest and coworkers (52). Clade 1 was named the B. cereus clade by Priest et al. since that was the predominant organism of the cluster (52). Group Ia contained all B. anthracis isolates, which were tightly linked and formed a monophyletic taxon. Finally, group Ia included ST profiles identifying isolates for which full-length genome sequences are available in GenBank (Fig. 1). Group Ib included STs belonging to diverse B. thuringiensis serovars and unknown B. cereus complex isolates, as well as B. cereus strains. Group Ib also contained the STs used to delimit the B. thuringiensis clade 2 by Priest et al. (52). Lineage I included all new food-borne isolates investigated in the present study and nearly all of the food-borne isolates contained in the MLST database. The only noteworthy exceptions were ST 21 and ST 41, which were placed within lineage II. New food-borne profiles identified in the present study were scattered all over lineage I. Among the 50 known food-borne ST profiles, 58% were included within group Ia. Lineage II contained B. mycoides, B. weihenstephanensis, and B. cereus isolates, as well as unknown isolates. Lineage III included only six isolates. Notably, different serovars of B. thuringiensis and B. cereus isolates showed identical MLST sequences (e.g., ST 12) (Fig. 1).
Toxin genes exhibited a scattered distribution over all of lineage I (Fig. 1). However, distribution patterns varied among different genes. The nhe genes were present in almost all STs sampled, with the exception of ST 38. The hbl and entFM genes also exhibited a broad distribution. Conversely, the bceT gene was limited mainly to the cluster ranging from ST 24 to ST 15, with the only exception being ST 160. A similar pattern was observed for the cytK gene, with the exception in this case being ST 371. The em sequence showed the narrowest distribution, being restricted to ST 26. The maximum number of co-occurring toxin genes was 5. The number of toxin genes shared among different STs varied from 0 to 5, with the most widespread combination being hbl plus nhe plus entFM. Current limits on the knowledge of toxin distribution notwithstanding, some patterns can be identified. It is plausible that the hbl, nhe, entFM, bceT, and cytK genes were already present in the common ancestor of lineage I. The loss of genes followed a different pattern in groups Ia and Ib. In fact, all five types of genes have been retained (coinherited) in several STs belonging to the Ib group (e.g., ST 142 and ST 376), while a more marked reduction occurred in the Ia group, where the loss particularly affected the bceT and cytK genes. Note that gene loss is not linked to a particular species. The pX01- and pX02-encoded toxins appeared during evolution just once, in the B. anthracis taxon. The most widespread coinheritance occurred for three groups of genes: hbl and nhe; hbl, nhe, and entFM; and nhe and entFM. Nothing can be said for lineages II and III because no data on toxin distribution in these lineages are currently available.
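Summaries of this kind (how many genes co-occur, and which combination is most common) can be derived from a presence/absence table. The sketch below shows one way to do this; the function names and input format are assumptions, and any data passed to it in an example would be hypothetical rather than the results reported in Fig. 1.

```python
from collections import Counter


def combination_counts(presence):
    """Count how many STs carry each exact combination of toxin genes.

    presence maps ST label -> iterable of gene names detected in that ST.
    """
    return Counter(frozenset(genes) for genes in presence.values())


def gene_counts(presence):
    """Count in how many STs each individual toxin gene was detected."""
    return Counter(gene for genes in presence.values() for gene in set(genes))


def max_cooccurring(presence):
    """Largest number of toxin genes found together in a single ST."""
    return max((len(set(genes)) for genes in presence.values()), default=0)
```

With real PCR results as input, `combination_counts(presence).most_common(1)` would return the most frequent gene combination together with the number of STs carrying it.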
eBurst analysis.
Population structure analysis performed using the eBurst algorithm identified 35 potential clonal complexes (CCs) (Fig. 2 and Fig. 3). These complexes included 210 isolates belonging to 115 ST allelic profiles. The remaining 239 isolates, included in 181 ST allelic profiles, did not cluster with any other ST profile and were taken as singletons (18). Only four complexes included five or more STs, seven contained four STs, one had three STs, and each of the remaining 23 complexes included only two STs (see additional information in the supplemental material).
Several CCs identified by eBurst analysis were paraphyletic (e.g., CC111) or polyphyletic (e.g., CC73 and CC182) in the phylogenetic tree (Fig. 3). This discrepancy between the phylogenetic reconstruction and eBurst output suggests an extensive occurrence of HGT among various isolates (see also paragraph below).
HGT detection.
The main outputs of the HGT analyses performed by using the RDP2 package are summarized in Table 3 and the HGT figure (see the supplemental material). Only HGT events with both major and minor sequence parents identified in the MLST data set were considered. Major and minor sequence parents (43) are defined as the two sequences that produced each recombined ST allelic profile through the HGT process. The occurrence of a potential HGT event was accepted only if validated by at least three distinct methods and sustained by strong statistical support. HGT events with only one potential parent identified were not further investigated even if statistically supported. As a consequence, Table 3 provides a conservative estimate of HGT events and does not fully cover the whole range of HGT events affecting the MLST data set. Nevertheless, this conservative approach revealed at least 25 HGT events, involving 85 STs. Some of the HGTs shared the same major and minor sequence parents, and this applied both to HGTs spanning two or more genes (see, for example, references 14 to 17) and to HGTs restricted to a single gene (15, 21). All seven gene fragments concatenated in MLST profiles were affected by HGT events. In 14 HGTs, event breakpoints were located within a single gene, and the pur locus was particularly affected by this phenomenon (Table 3). In three HGT events, breakpoints were limited to pairs of genes arranged consecutively, but not adjacent to one another, in genomes sequenced thus far (ilvD versus pur, pycA versus tpi). The remaining eight HGT breakpoints involved nonconsecutive genes (glpF versus gmk, ilvD versus pta, and pur versus pta). Each identified HGT spanning more than a single gene could actually be the result of two independent events that were not detectable with currently available algorithms.
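The acceptance criteria described above (both parents identified, support from at least three methods) can be expressed as a simple filter. This sketch does not reproduce RDP2 output or its file format. The record layout and function names are hypothetical, and the significance threshold `alpha=0.05` is an assumption, since the text specifies only "strong statistical support".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CandidateEvent:
    """One candidate HGT event (hypothetical record layout)."""
    recombinant: str
    major_parent: Optional[str]
    minor_parent: Optional[str]
    p_values: Dict[str, float] = field(default_factory=dict)  # method -> P value


def supporting_methods(event, alpha=0.05):
    """Methods whose P value falls below alpha (threshold is an assumption)."""
    return sorted(m for m, p in event.p_values.items() if p < alpha)


def is_accepted(event, min_methods=3, alpha=0.05):
    """Accept only events with both parents and support from >= min_methods methods."""
    has_both_parents = event.major_parent is not None and event.minor_parent is not None
    return has_both_parents and len(supporting_methods(event, alpha)) >= min_methods
```

Events with a single identified parent are rejected even when many methods support them, which mirrors the conservative choice described in the text.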
STs involved in HGT events are scattered all over the tree (see the HGT figure in the supplemental material), and no clear bias was detected that linked particular species and/or serovars to HGT occurrences. An analysis of alleles involved in single HGT events did not identify alleles clearly favoring HGT occurrence. For example, the five HGTs (HGTs 3, 4, 5, 6, and 7) restricted to the ilvD gene showed that 11 different ilvD alleles were involved. Most of them were restricted to a single HGT event, with a partial overlap in HGTs 5 and 6. Five of the newly described food-borne strains (ST 369, ST 371, ST 373, ST 374, and ST 378) were also involved in HGT events. For most STs involved in HGT events the toxin distribution is not known (Table 3). Thus, it was impossible to investigate toxin transmission patterns in relation to HGT occurrence.
| DISCUSSION |
|---|
|
|
|---|
The majority of STs associated with food-borne strains (58%) were assigned to group Ia, which comprised 34.4% of the total ST profiles. Group Ia contains mostly B. cereus isolates and can be viewed as an expansion of the B. cereus clade 1 defined by Priest et al. (52). The presence of food-borne isolates mainly within group Ia suggests that strains belonging to this group could preferentially grow on some foodstuff matrices. The B. cereus species has long been recognized as a food-borne microbe that plays a key role in food poisoning. The inclusion of food-borne isolates within a group dominated by B. cereus strains is in agreement with microbiological evidence. In the present study, however, a food matrix preference (i.e., egg products, milk products, or other) for different B. cereus isolates was not found, in contrast to a previous report (22). Because the isolate sampling was limited, additional studies on a larger number of food-borne isolates are needed to better address this issue. In addition, experiments based on artificial contamination of food and analysis of growth will be necessary in order to elucidate the features of different strains characterized through MLST or other genomic approaches.
As a final remark to this paragraph, it must be noted that the broad occurrence of HGT within the B. cereus complex may affect the phylogenetic relationships described here, as well as in previous analyses (7, 8, 28, 29, 31, 36, 37, 52, 58, 60) (see below for further discussion of this point).
Evolution of toxin genes in B. cereus group.
All toxin genes are contained in lineage I, which includes most of the strains considered in the present study (Fig. 1). Lineages II and III contain the B. mycoides and B. weihenstephanensis species, which are currently not known as pathogens. Thus, the absence of toxin genes could explain why these species are not isolated from human clinical samples and are not associated with disease. The broad distribution of the nhe, hbl, and entFM genes makes it plausible that they are widespread or even ubiquitous within lineage I. A narrower distribution characterizes the bceT gene, since it seems restricted mostly to a subgroup of lineage I, with the only exception being ST 160. This latter occurrence could be due to a recent event of HGT (see below), or it could be the result of a generalized loss of the bceT gene in other STs. Current evidence does not help to discriminate between these opposing hypotheses, and further research is needed to settle this point. We used the same set of primers for all sampled strains; thus, the lack of a specific toxin gene should be evidence of the true absence of that toxin in the analyzed strain. However, we cannot exclude the possibility that the regions on which PCR primers were designed underwent some rearrangement, thus preventing us from amplifying the specific toxin gene. Distribution of the em sequence is restricted solely to ST 26. The em sequence is located on a plasmid; thus, its sporadic occurrence appears to be tightly linked to the presence of this plasmid (14, 15). Genes encoding lethal toxins carried by pX01 and pX02 plasmids are currently known only for B. anthracis strains (25, 34, 35). Both phylogeny and eBurst analysis recovered B. anthracis as a monophyletic taxon; thus, it is plausible to suggest that pX01 and pX02 plasmids were acquired just once in this species.
The presence of toxin-encoding genes does not automatically equate with pathogenicity. Regulation of enterotoxin expression is complex, and the current available knowledge is incomplete. Furthermore, the enterotoxin production (12, 47, 62) alone does not fully explain the food poisoning phenomenon that is regulated by other key factors, such as bacterial concentration. Identification of a toxin gene in the bacterial genome indicates only the potential pathogenicity of the strain, while the true poisoning activity may be absent due to the lack of genes regulating toxin expression. All of the food-borne isolates described here are potential causes of food poisoning, since all possess at least one enterotoxin gene. Quite remarkably, isolates with the highest number of enterotoxin genes (ST 142, ST 370, ST 372, and ST 376) are all located in the Ib group.
A possible mechanism of transmission of toxin genes between two bacterial strains was recently identified by Han et al. (25). These researchers found transposable elements in the genomic portion encoding toxin genes in pathogenic strains of B. cereus and B. thuringiensis, which could be a potential mechanism for gene transmission. Thus, HGT of toxin-encoding genes mediated by transposable elements could be the general mechanism shaping the evolution and distribution of toxin genes within the B. cereus complex. The present analysis supports the view that HGT is a major player in the evolution of the B. cereus complex.
Population structure of B. cereus group and HGT.
The population structure of B. cereus cannot be discussed without considering the effect of HGT. This genomic mechanism is pervasive in B. cereus strains. Our analyses identified at least 25 independent events of HGT involving more than one-fourth (28.72%) of all analyzed ST profiles. Our estimate of HGT was very conservative (see above), suggesting that HGT is a very widespread phenomenon in the B. cereus group. This finding was corroborated by several independent methods developed to detect HGT; therefore, we believe that misidentification of HGTs is unlikely. The occurrence of HGT in the B. cereus complex has not been seen consistently in the literature, since some reports cite strong evidence of HGT (28, 37), whereas other analyses showed evidence for clonal behavior (29-31, 52, 58, 60).
In the present study we benefit from an unparalleled number of ST profiles (n = 296) which can be used to check for the occurrence of HGT events. The large size of MLST profiles drastically improved the statistical power of detection by the suite of methods applied here (50, 51). We were able to provide evidence that HGT is a general phenomenon characterizing the evolution of the B. cereus group.
The broad occurrence of HGT is reflected in the marked discrepancies observed between phylogenetic analysis and eBurst population outputs. Several of the potential CCs did not appear monophyletic in the phylogenetic tree, and even the relaxation of the number of identical alleles shared among members of the same complex did not modify the pattern of clear differences between phylogeny and eBurst analysis. HGT occurrence, however, affects both eBurst and phylogenetic outputs. Thus, an integrated approach is necessary to fully investigate the evolutionary history of the B. cereus complex. A better instantaneous representation of the relationships among strains of the B. cereus complex is provided by eBurst analysis, which was specifically developed for this task (18). The algorithm has proven to be efficient with other bacterial species exhibiting high levels of recombination (e.g., Campylobacter jejuni) (18). Conversely, when investigation of species relationships is extended throughout their evolutionary history, a phylogenetic approach is preferable (46). Routinely applied phylogenetic algorithms are not able to cope with the occurrence of HGT, and thus new algorithms are necessary to correctly represent the partly reticulated and partly dichotomous phylogenetic relationships among B. cereus complex species. HGT may also explain the current difficulty in assigning a single strain to a monophyletic group based on the serovar concept (e.g., see the scattered distribution of B. thuringiensis serovar tolworthi in lineage Ib). Another important effect of HGT is that formally defining subspecies or species within the B. cereus complex becomes a very difficult task. In fact, the use of different names for identifying serovars with a particular pathogenic behavior or restricted to a particular host and/or geographic area is, in most cases, not supported by biological evidence, even though it may have a practical utility.
More generally, our study provides evidence that the evolution of the B. cereus complex is characterized by reticulate evolution with continuous and extensive exchange of genomic portions among different strains.
In conclusion, traditional phylogenetic analysis of MLST sequences might not fully resolve the taxonomy of B. cereus isolates due to the presence of extensive HGT. An eBurst analysis and HGT detection are also necessary to deal with this topic. Prediction of pathogenicity in particular strains is not a trivial task with the currently available evidence, since isolates bearing enterotoxin genes do not belong to a single phylogenetic group.
The development of phylogenetic algorithms that take HGT into account, as well as provide a more accurate identification of HGT events, will greatly improve our ability to understand the evolutionary history of the B. cereus complex. Limits of currently available analytical methods notwithstanding, the MLST approach proved to be a highly reproducible, fast, and accurate method for strain genetic typing. International food trade requires methods suitable for tracking the spread of disease-causing organisms and MLST analysis of B. cereus isolates might become the method of choice for the epidemiology of food-borne diseases produced by this bacterium. Our analysis demonstrated the presence of toxin genes in many strains of the B. cereus complex that are potentially pathogenic. However, further studies are required to characterize the key factors that regulate toxin expression and to determine the different levels of pathogenicity among the various strains.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Published ahead of print on 14 December 2007.
Supplemental material for this article may be found at http://aem.asm.org/.
| REFERENCES |
|---|
|
|
|---|