Applied and Environmental Microbiology, March 2009, p. 1291-1300, Vol. 75, No. 5
0099-2240/09/$08.00+0 doi:10.1128/AEM.02563-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
UMR 1219, Université de Bordeaux, INRA, ISVV, Talence 33405, France
Received 10 November 2008/ Accepted 21 December 2008
Since the first description of the species by Garvie in 1967 (12), O. oeni, formerly Leuconostoc oenos, has been studied not only to characterize isolates from various origins and eventually to identify candidates for commercial purposes but also to decipher the evolution and population structure of this singular species. Yang and Woese deduced from the analysis of 16S rRNA genes of LAB that O. oeni is a tachytelic (fast-evolving) species (47). Although this conclusion led to controversy (4, 34), it was well supported by a phylogeny derived from the genome sequences of 12 LAB (27, 33). O. oeni lacks the ubiquitous mismatch repair system genes mutS and mutL, which normally contribute to lowering the rates of spontaneous mutations and recombination (28). The absence of these genes was recently correlated to the hypermutability of the species and thus provides a rational basis for the tachytelic status of O. oeni (29). At the infraspecies level, 16S, 23S, and 16S-23S spacer sequences proved to be highly conserved among O. oeni strains, suggesting that the species is genetically homogeneous (23, 31, 32, 48). This was supported by the levels of DNA-DNA homology and by similarities of the genetic maps of diverse strains (7, 50). In addition, differentiation of O. oeni strains can only be achieved by using methods targeting minor genotypic differences: random amplification of polymorphic DNA (RAPD) (36, 48, 49), DNA fingerprinting (46), amplified fragment length polymorphism (3), or pulsed-field gel electrophoresis (PFGE) patterns of low-frequency restricted genomes (14, 20-22, 38, 45). However, the concept of genetic homogeneity was repeatedly challenged by the detection of groups of strains (often two major groups) distinguished on the basis of molecular and/or phenotypic/metabolic characteristics, leading authors to suggest that O. oeni might be divided into two subspecies (35, 45).
Analysis of O. oeni strains by multilocus sequence typing (MLST) has provided a new picture of the diversity and population structure of the species (5). MLST is a strategy based on the sequence polymorphism of a set of genes, usually 7 to 10. It has the advantages of being robust (based on genetic data) and electronically portable, and it generates data that can be used not only for strain differentiation but also for evolutionary and population studies (25). Although it was originally developed for pathogenic bacteria, MLST became the gold standard for studying lineages and population structures of all kinds of microorganisms (26). The only MLST analysis of O. oeni reported to date has examined variations at five genetic loci among 18 strains (5). The authors of that study detected a high allelic diversity and evidence that recombination events contributed to the dissemination of alleles. They concluded that the O. oeni population is panmictic (no line of clonal descent is easily discernible) and suggested that frequent recombination events and horizontal gene transfers (HGTs) occurred between strains. This hypothesis was well supported by the comparative analysis of LAB genomes, which argued in favor of a substantial number of gene losses and acquisitions during the evolution of LAB (27, 28). In addition, HGTs may be particularly favored in O. oeni, which lacks the mutS and mutL genes (29). However, the hypothesis of a panmictic population contrasted with both the genetic homogeneity and the existence of subspecies suggested by other typing methods.
The present study was undertaken to develop a new MLST scheme targeting eight housekeeping genes and to compare its discriminatory potential with that of PFGE, which is generally recognized as the most efficient typing method for O. oeni strains. Comparing the two methods was also of interest because they target different genetic variations: MLST reveals point mutations in a few genes, whereas PFGE is more sensitive to large-scale genomic rearrangements due to the presence of genomic islands, insertion sequences, or mobile elements. The discriminatory ability of both methods was compared by testing a collection of 43 O. oeni strains, and the resulting data were also used to revisit the population structure of O. oeni.
TABLE 1. Origins and typing data of O. oeni strains analyzed in this study
DNA extraction, amplification, and sequencing.
The genomic DNA of bacteria was extracted by using a Wizard genomic DNA purification kit according to the manufacturer's instructions (Promega) with minor modifications, including a 1-h pretreatment at 37°C of bacterial cells in EDTA (50 mM, pH 8.0) containing 10 mg of lysozyme ml⁻¹. DNA preparations were resuspended in sterile water to 10 ng µl⁻¹ and stored at −20°C. The primers used for PCR amplifications and sequencing reactions are listed in Table 2. PCR amplifications were performed in 25-µl reactions containing a custom-made PCR master mix (MP-Biomedicals), 10 ng of template genomic DNA, and 10 pmol of each primer. The PCR program was 95°C for 2 min; followed by 30 cycles of 95°C for 30 s, 58°C for 30 s, and 72°C for 30 s; followed in turn by a final extension of 10 min at 72°C. PCR products were purified by using magnetic beads (Agencourt Ampure; Beckman Coulter), and their purity was controlled by analyzing aliquots on 1% agarose gels. The purified PCR products were sequenced on both DNA strands. Sequencing reactions were performed by using a BigDye Terminator v1.1 cycle sequencing kit (Applied Biosystems). The obtained products were purified and sequenced at the Genotyping and Sequencing Facility of Bordeaux.
TABLE 2. Genes and primers used for MLST
MLST data treatment and bioinformatic analyses.
Analysis, editing, and comparison of the 688 chromatograms and sequences obtained for the eight genes of the MLST scheme and the 43 bacterial strains were performed by using Bionumerics 5.1. Each distinct gene sequence was assigned an allele number, and each unique combination of eight allele numbers was assigned a sequence type (ST). Strains that shared the same allelic profile were assigned the same ST.
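The allele and ST assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Bionumerics procedure: the function name `assign_sequence_types` is hypothetical, and numbering alleles and STs in order of first appearance is an arbitrary choice made here.

```python
from typing import Dict, List, Tuple


def assign_sequence_types(
    strains: Dict[str, Dict[str, str]], loci: List[str]
) -> Tuple[Dict[str, Tuple[int, ...]], Dict[str, int]]:
    """Return (allelic profile per strain, ST per strain).

    Assumption: alleles and STs are numbered in order of first appearance.
    """
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    st_ids: Dict[Tuple[int, ...], int] = {}
    profiles: Dict[str, Tuple[int, ...]] = {}
    sts: Dict[str, int] = {}
    for strain, seqs in strains.items():
        profile = []
        for locus in loci:
            table = allele_ids[locus]
            # A sequence not yet seen at this locus gets the next allele number.
            profile.append(table.setdefault(seqs[locus], len(table) + 1))
        key = tuple(profile)
        profiles[strain] = key
        # Identical allelic profiles map to the same ST.
        sts[strain] = st_ids.setdefault(key, len(st_ids) + 1)
    return profiles, sts


# Toy data with two loci only (hypothetical sequences, not from the study).
toy = {
    "s1": {"gyrB": "ACGT", "recP": "TTGA"},
    "s2": {"gyrB": "ACGT", "recP": "TTGA"},
    "s3": {"gyrB": "ACGA", "recP": "TTGA"},
}
toy_profiles, toy_sts = assign_sequence_types(toy, ["gyrB", "recP"])
```

In the toy data, strains `s1` and `s2` carry identical sequences at both loci and therefore share an ST, whereas `s3` differs at one locus and receives a new one.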
Descriptive analyses of the genetic variability at MLST loci, such as the determination of the mean G+C content, the number of polymorphic sites, and the nucleotide diversity, were performed using DnaSP 4.5 (37). The same software was also used to calculate the dN/dS ratio (where dN is the number of nonsynonymous substitutions per nonsynonymous site and dS is the number of synonymous substitutions per synonymous site) and to perform Tajima's D neutrality test (43).
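To make these descriptive statistics concrete, the sketch below computes the number of polymorphic sites, the nucleotide diversity, and Tajima's D from a small alignment. It is not the DnaSP implementation: it assumes aligned, gap-free sequences of equal length, the function names are hypothetical, and the toy alignment is invented for illustration.

```python
import math
from itertools import combinations
from typing import List


def polymorphic_sites(seqs: List[str]) -> int:
    """Count alignment columns that contain more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs: List[str]) -> float:
    """Average number of differences per site between two sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * len(seqs[0]))


def tajimas_d(seqs: List[str]) -> float:
    """Tajima's D from the standard formula (requires at least 4 sequences)."""
    n = len(seqs)
    s = polymorphic_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 polymorphic site")
    k = nucleotide_diversity(seqs) * len(seqs[0])  # mean pairwise differences
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical four-sequence alignment, for illustration only.
toy_alignment = ["AAAA", "AAAT", "AATT", "AATT"]
toy_s = polymorphic_sites(toy_alignment)
toy_pi = nucleotide_diversity(toy_alignment)
toy_d = tajimas_d(toy_alignment)
```

A positive D, as in this toy case, arises when intermediate-frequency variants inflate the pairwise diversity relative to the number of polymorphic sites.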
The phylogenetic trees were constructed by the neighbor-joining method with a Kimura two-parameter distance model using MEGA 4 (44) or by the maximum-likelihood method with the HKY 85 model using Tree-Puzzle 5.2 (39). Bootstrap values were obtained after 1,000 replicates. Congruencies between maximum-likelihood trees were statistically assessed by performing the Shimodaira-Hasegawa test implemented in the CONSEL software (42).
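For readers unfamiliar with the Kimura two-parameter model, the pairwise distance it assigns can be sketched as follows. This is a minimal illustration, not the MEGA 4 code: the function name is hypothetical, and the sketch assumes two aligned sequences of equal length containing only A, C, G, and T.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    Assumption: no gaps or ambiguity codes; any other character would be
    miscounted as a transversion.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    p = transitions / len(seq1)
    q = transversions / len(seq1)
    # d = -1/2 * ln[(1 - 2P - Q) * sqrt(1 - 2Q)]
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites (hypothetical sequences).
toy_distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

The corrected distance is slightly larger than the raw proportion of differing sites because the model accounts for multiple substitutions at the same position.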
Clonal complexes of strains were investigated by a minimum spanning tree analysis, based on the number of mutations between concatenated sequences, using Bionumerics 5.1. Strain relationships were also analyzed using the eBURST program, which focuses on the number of variable loci between allelic profiles (11).
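The idea of grouping STs by the number of variable loci can be illustrated with a simplified sketch. It is not the eBURST algorithm itself: it merely links STs whose allelic profiles differ at exactly one locus and reports the resulting connected groups. The function names and the three-locus toy profiles are hypothetical.

```python
from typing import Dict, List, Tuple


def locus_differences(p: Tuple[int, ...], q: Tuple[int, ...]) -> int:
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def single_locus_groups(profiles: Dict[int, Tuple[int, ...]]) -> List[List[int]]:
    """Group STs connected through chains of single-locus variants."""
    unvisited = set(profiles)
    groups: List[List[int]] = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        group, frontier = [start], [start]
        while frontier:
            current = frontier.pop()
            linked = [
                st for st in unvisited
                if locus_differences(profiles[current], profiles[st]) == 1
            ]
            for st in linked:
                unvisited.remove(st)
            group.extend(linked)
            frontier.extend(linked)
        groups.append(sorted(group))
    return groups


# Hypothetical profiles: ST-1, ST-2, and ST-3 form a chain; ST-4 is a singleton.
toy_st_profiles = {1: (1, 1, 1), 2: (1, 1, 2), 3: (1, 2, 2), 4: (3, 3, 3)}
toy_groups = single_locus_groups(toy_st_profiles)
```

STs left alone in their group correspond to what the text calls single STs, while multi-member groups play the role of clonal complexes in this simplified picture.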
The standardized index of association (IsA) was calculated to determine the degree of linkage disequilibrium between alleles using START 2 (19). The split decomposition method was used to assess the degree of tree-like structure for alleles of each locus or for concatenated sequences using SPLITSTREE 4.1 (17). A compatibility matrix of all of the informative sites identified in MLST loci was generated by using the RETICULATE program (18). Recombination events were searched between sequences of single and concatenated loci using seven algorithms (RDP, Geneconv, BootScan, Maximum χ2, 3Seq, Chimaera, and Sister Scanning) implemented in the RDP 3.27 software (30). Only recombination events detected by at least three methods and involving parental sequences present in the MLST data set were considered.
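The standardized index of association compares the observed variance of pairwise allelic distances with the variance expected under linkage equilibrium. The sketch below follows one commonly used form of that definition, IsA = (V_observed / V_expected − 1) / (L − 1) for L loci. It is an assumption that this matches the START 2 calculation in every detail (for example, no small-sample correction is applied to the gene diversity here), and the function name and toy profiles are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List, Sequence


def standardized_index_of_association(profiles: List[Sequence[int]]) -> float:
    """IsA from allelic profiles (one sequence of allele numbers per strain)."""
    n_loci = len(profiles[0])
    # Pairwise distance = number of loci at which two profiles differ.
    distances = [
        sum(a != b for a, b in zip(p, q)) for p, q in combinations(profiles, 2)
    ]
    mean = sum(distances) / len(distances)
    observed = sum((d - mean) ** 2 for d in distances) / len(distances)
    expected = 0.0
    for locus in range(n_loci):
        counts = Counter(profile[locus] for profile in profiles)
        # Gene diversity h at this locus (uncorrected form, an assumption).
        h = 1.0 - sum((c / len(profiles)) ** 2 for c in counts.values())
        expected += h * (1.0 - h)
    return (observed / expected - 1.0) / (n_loci - 1)


# Hypothetical two-locus examples: fully linked versus freely assorted alleles.
linked = standardized_index_of_association([(1, 1), (1, 1), (2, 2), (2, 2)])
shuffled = standardized_index_of_association([(1, 1), (1, 2), (2, 1), (2, 2)])
```

Alleles that always travel together give a clearly positive value, whereas freely assorted alleles give a value near or below zero; with so few strains the estimate is biased, so the toy numbers only show the direction of the effect.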
Nucleotide sequence accession numbers.
Nucleotide sequences of MLST loci were deposited in GenBank under accession numbers FJ392687 to FJ392689 (ddl), FJ392690 to FJ392698 (dnaE), FJ403333 to FJ403343 (g6pd), FJ403344 to FJ403350 (pgm), FJ403351 to FJ403361 (purK), FJ403362 to FJ403370 (recP), FJ403371 to FJ403373 (rpoB), and FJ413033 to FJ413040 (gyrB).
All strains were analyzed by PFGE using NotI for macrodigestion of genomic DNAs. The 43 O. oeni strains displayed 30 well-discriminated DNA patterns composed of 6 to 11 fragments of 24 to 300 kb in size and 1 or 2 fragments greater than 300 kb (Fig. 1). The similarities between DNA patterns did not always reflect the proximity of the strain isolation sites. The DNA pattern of the O. kitaharae strain was strikingly different since it comprised only small fragments not exceeding 170 kb. A UPGMA tree was constructed from PFGE data and rooted with O. kitaharae (Fig. 1). In this tree the 43 O. oeni strains were clearly subdivided into two clusters, named A and B, comprising 28 and 15 strains whose PFGE patterns shared more than 67.5 and 64.8% similarity, respectively. Interestingly, eight of the nine commercial strains were grouped together in cluster A, along with all strains isolated from fortified wines (Pineau, Banyuls, and Floc) and two strains from champagne. In addition, at least 11 of the 15 strains of cluster B were isolated before 1993 (the isolation date is missing for four strains; Table 1), whereas cluster A contained only three strains obtained before this date.
FIG. 1. UPGMA tree based on NotI-PFGE macrorestriction patterns of 43 O. oeni isolates.
The eight genes were successfully amplified and sequenced for all 43 O. oeni strains but not for O. kitaharae, which therefore was not further considered in the MLST analysis. Nucleotide sequences of 496 bp (ddl) to 665 bp (dnaE) were determined (Table 3). Their mean G+C content varied from 36.6% (ddl) to 46.0% (recP), while it is 38% in the whole O. oeni genome (33). The level of nucleotide variation differed greatly among genes (Table 3). Only two polymorphic sites were detected in ddl compared to 48 sites in rpoB. Similarly, the nucleotide diversity (the average number of nucleotide differences per site between two randomly selected sequences) was very low at two loci (ddl, 0.0011; pgm, 0.0014) but high in recP and rpoB (0.0204 and 0.0370, respectively). Taken together, rpoB and recP accounted for more than half of all polymorphic sites (81 of the 156 sites). Despite the high diversity of these two genes, the data demonstrated that the overall nucleotide diversity of the 43 strains at the eight loci was low. The dN/dS ratios were calculated to estimate the level of selection applied to each gene (the value obtained for ddl was unreliable since it was based on only two polymorphic sites; Table 3). None of the values exceeded 0.5, indicating that there was a strong selective pressure against amino acid changes, as typically observed for housekeeping genes. Besides nucleotide variations, we identified a one-base deletion at position 381 in the recP sequence of O. oeni IOEB-SARCO 422 and an 860-bp transposon inserted at position 456 in the purK sequences of strains IOEB-SARCO 422 and 444. Since these modifications disrupted or dramatically changed the open reading frames, there is no doubt that the encoded proteins were not functional. The genetic equilibrium of alleles was analyzed by using Tajima's D neutrality test (43). D values obtained for gyrB, g6pd, pgm, ddl, dnaE, purK, and recP did not deviate significantly from zero, supporting a neutral selection of the alleles of these genes (Table 3). In contrast, the positive D value (3.5) measured for rpoB denoted an important balancing selection, which may explain why this gene presents the highest number of polymorphic sites (n = 48), along with the lowest number of alleles (n = 3).
TABLE 3. Genetic variability at O. oeni loci
Phylogeny based on MLST data.
The phylogeny of the 43 O. oeni isolates was analyzed by constructing a neighbor-joining tree from the 4,581-bp concatenated sequence of the eight loci (Fig. 2). The tree revealed two major phylogroups, named A and B, strongly supported by bootstrap values, and two descents were also detected within each phylogroup: A1 and A2, and B1 and B2. Phylogroup A included all strains of ST-1 to ST-22, and phylogroup B included the strains of ST-23 to ST-34. Interestingly, there was a perfect correlation between these two phylogroups and the two PFGE clusters (Fig. 1). To determine whether one of the eight genes used in the concatenated sequence influenced this tree topology, the tree was compared to the topologies of the eight trees constructed independently from each gene. Six trees based on gyrB, g6pd, ddl, dnaE, purK, and rpoB showed the same topology supporting the two phylogroups, while only the two trees derived from pgm and recP showed different topologies (see Fig. S1 in the supplemental material). Therefore, the distribution of O. oeni strains in two distinct groups did not result from the allelic diversity of a single gene but more likely from a general tendency of whole genomes. A statistical comparison of tree topologies performed by the Shimodaira-Hasegawa test (41) revealed a significant lack of congruence in many pairwise comparisons and particularly for the concatenated sequence (see Table S1 in the supplemental material), indicating that independent evolutionary mechanisms affected the MLST genes and that the phylogeny of O. oeni strains cannot be inferred accurately from only one or a few genes.
FIG. 2. Neighbor-joining phylogenetic tree constructed from 34 concatenated nucleotide sequences of eight loci. Bootstrap values above 80% are indicated. Two major phylogroups, designated A and B, and their respective descents, A1, A2, B1, and B2, are mentioned.
FIG. 3. Minimum spanning tree analysis of 43 O. oeni strains based on the concatenated sequences of eight loci. Each circle corresponds to an ST. Different shadings were used for phylogenetic groups A1 (black), A2 (dark gray), B1 (clear), and B2 (light gray). Circle sizes denote the number of strains sharing the same ST (1, 2, or 3). The number of mutations between STs is indicated. STs that belong to the same clonal complex (CC) are shown as circles grouped in a gray area.
The split decomposition method (17) was used to examine the impact of recombination in the eight loci and in the concatenated sequence. Split graphs of each locus showed treelike structures, except for purK and recP, where some networks were detected, indicating that most of the genes were not significantly affected by intragenic recombination (see Fig. S2 in the supplemental material). In contrast, the split graph of the concatenated sequence had a "rectangular" network shape in which the two subpopulations A and B and their descents were clearly disconnected (Fig. 4), which implies that intergenic recombination events were important during O. oeni evolution and for the emergence of lineages.
FIG. 4. Split graph deduced from the concatenated sequences of the eight loci for the 34 STs. Circles indicate the positions of the groups A1, A2, B1, and B2.
FIG. 5. Compatibility matrix of the 137 informative sites of the eight loci. Highly incompatible sites are indicated by black squares.
TABLE 4. Detection of possible recombination events among O. oeni strains
Two subpopulations in the species O. oeni.
A phylogenetic tree constructed from the concatenated sequences of the eight loci showed that the 43 O. oeni strains form two distinct phylogroups of 28 and 15 strains that we have designated subpopulations A and B. The detection of two subpopulations was not the result of an evolutionary distortion due to one or a few genes, given that it was supported by the topologies of six independent trees constructed from gyrB, g6pd, ddl, dnaE, purK, and rpoB sequences. More likely, it denotes a global evolutionary tendency of bacterial genomes. In agreement with this hypothesis, IsA values calculated from the whole O. oeni population (0.197) and from each subpopulation (0.026 and 0.027) support the existence of two groups of strains carrying their own allelic contents. PFGE analysis of the 43 strains also provided strong support for this possibility since it disclosed two groups of strains that are perfectly correlated with the two subpopulations A and B. It is noteworthy that PFGE and MLST analyses target different variations of the genome: large-scale modifications due to genomic rearrangements or insertions or deletions of mobile DNA elements and point mutations in a few genes, respectively. Therefore, strains that belong to the same subpopulation share not only similarities of allelic content but also similarities in genome organization. Interestingly, eight of the nine commercial strains, which are expected to have optimal abilities for performing the malolactic fermentation of wine, were grouped together in subpopulation A. In future studies, it will be useful to examine possible relationships between phenotypic and genotypic traits of strains since this could help to select industrial strains and also to decide if the nontaxonomic but informative concept of subspecies can be used instead of subpopulation.
The detection of two subpopulations differs from a previous MLST analysis of 18 O. oeni strains that revealed only one group (5). However, the number of loci analyzed in this former study was probably insufficient to accurately disclose phylogenetic groups, given that the same 18 strains formed two distinct groups by ribotyping analysis (5). Although only two subpopulations were identified in the set of 43 strains analyzed here, we cannot exclude the possibility that more would have been detected by analyzing a larger collection of strains. However, previous studies often reported two groups of strains. The most recent is the typing of 67 isolates from Germany that were classified in two major groups based on PFGE patterns (22). Similarly, two groups of strains were detected in other studies based on PFGE analysis (45), RAPD analysis (49), ribotyping (5, 49), and metabolic characterization (35). Unfortunately, the data reported in these works were produced by different techniques or even by PFGE analyses, which are difficult to compare. Therefore, it is not yet possible to precisely determine whether the species O. oeni contains several subpopulations or only two that were repeatedly detected in independent works. However, in favor of the latter possibility, it is important to note that the 43 strains that we analyzed were collected from many different sources.
Mutability and evolution of O. oeni genome.
The absence of mutS and mutL genes in O. oeni was correlated to the hypermutability of its genome (29). However, the nucleotide diversity measured at the eight loci of the 43 strains analyzed here was rather low and not as high as expected for a hypermutable genome. Only 3.4% of variable sites were detected in the 4,581-bp concatenated sequence. This is in the same range as values obtained from other LAB: 1 to 7.7% in Lactobacillus plantarum (6), 0 to 2.67% in Pediococcus parvulus (2), and 1.4 to 7.8% in Lactobacillus casei (1). The corresponding allelic diversity reaches only 7.6 alleles per locus, whereas higher values are usually measured. For instance, an analysis of six housekeeping genes in 40 L. casei strains showed a mean value of 13.8 alleles per locus (1). Therefore, it is possible that the hypermutable status concerns only some genes of O. oeni. In the species Mycobacterium tuberculosis, which also lacks mut genes (9), housekeeping genes are extremely well conserved, whereas a significant mutation rate was measured for genes involved in DNA repair, replication, and recombination (8). The authors of that study have suggested that the lack of fidelity in genome maintenance would be the starting point of evolution in stressful conditions, such as antibiotic resistance or adaptation to a particular niche. A similar situation could be responsible for the adaptation of O. oeni to the wine environment. Consistent with the low genetic diversity at most MLST loci, the pictures of the O. oeni population structure obtained by eBURST and the minimum spanning tree methods revealed only a few small clonal complexes and a large number of single STs. During the evolution of O. oeni the emergence of clonal descents by accumulation of point mutations was limited, while the impact of recombination events was probably much more important and produced many strains with remote genotypes.
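The two summary figures quoted in this paragraph can be cross-checked from numbers given elsewhere in the article. The sketch below does so under one explicit assumption: that each GenBank accession number in the deposited ranges corresponds to exactly one allele. This is not stated in the text, although it is consistent with the three rpoB alleles reported in the Results. The variable names are hypothetical.

```python
# Accession ranges per locus (numeric part of the FJ accession numbers).
# Assumption: one accession number was deposited per allele.
accession_ranges = {
    "ddl": (392687, 392689),
    "dnaE": (392690, 392698),
    "g6pd": (403333, 403343),
    "pgm": (403344, 403350),
    "purK": (403351, 403361),
    "recP": (403362, 403370),
    "rpoB": (403371, 403373),
    "gyrB": (413033, 413040),
}

alleles_per_locus = {
    locus: last - first + 1 for locus, (first, last) in accession_ranges.items()
}
mean_alleles = sum(alleles_per_locus.values()) / len(alleles_per_locus)

# 156 polymorphic sites in the 4,581-bp concatenated sequence.
variable_fraction = 156 / 4581

print(f"mean alleles per locus: {mean_alleles:.3f}")
print(f"variable sites: {variable_fraction:.2%}")
```

Under that assumption, the mean number of alleles per locus and the fraction of variable sites both agree with the rounded values quoted in the text.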
Intergenic recombination shapes the O. oeni population.
A critical role of recombination in O. oeni evolution was pointed out in a former MLST analysis (5). These authors suggested that the impact of recombination was so important in this species that it could be an example of a panmictic population, i.e., a population where clonal descents are hardly detectable. This conclusion contrasted with other results obtained by DNA-DNA hybridizations, 16S-23S ISR sequences, ribotyping, or RAPD patterns (7, 23, 48, 49) that suggested a genetically homogeneous O. oeni population and a clonal mode of evolution. According to our results, the apparent genetic homogeneity and clonality were most likely due to the limited genetic diversity in O. oeni and the presence of two subpopulations. Although these two subpopulations exist, our data confirmed the importance of recombination in O. oeni evolution. As described above, the two subpopulations represent strains sharing genomic and allelic similarities, but their alleles are widely disseminated and close to linkage equilibrium (IsA = 0.026 and 0.027 in subpopulations A and B, respectively). A split graph representation of the concatenated sequence of the eight loci has clearly shown that the two subpopulations and their descents originated from intergenic recombination events. According to analyses made with the RDP3 package, it is possible that the descents A2, B1, and B2 result from events that occurred between strains of each subpopulation involving DNA regions that include the recP gene. To better understand how recombination has shaped the O. oeni genome, it will be interesting to compare the full genomes of O. oeni PSU1 (33) and ATCC BAA 1163 (unfinished genome), which belong to each subpopulation.
We thank C. Miot-Sertier for technical assistance and the Genotyping and Sequencing facility of Bordeaux for performing the sequencing reactions (grants from the Conseil Régional d'Aquitaine [20030304002FA and 20040305003FA] and from the European Union [FEDER 2003227]).
Published ahead of print on 29 December 2008.
Supplemental material for this article may be found at http://aem.asm.org/.
guez, W. Zhang, J. R. Broadbent, and J. L. Steele. 2007. Genotypic and phenotypic characterization of Lactobacillus casei strains isolated from different ecological niches suggests frequent recombination and niche specificity. Microbiology 153:2655-2665.