Characterization of ISRgn1, a Novel Insertion Sequence of the IS3 Family Isolated from a Bacteriocin-Negative Mutant of Ruminococcus gnavus E1

ABSTRACT ISRgn1, an insertion sequence of the IS3 family, has been identified in the genome of a bacteriocin-negative mutant of Ruminococcus gnavus E1. The copy number of ISRgn1 in R. gnavus E1, as well as its distribution among phylogenetically E1-related strains, has been determined. Results obtained suggest that ISRgn1 is not indigenous to the R. gnavus phylogenetic group but that it can transpose in this bacterium.

Insertion sequences (ISs) are mobile DNA elements capable of mediating various types of genome rearrangements. They also play an important role in evolution by facilitating horizontal gene transfer (for a review on insertion sequences, see reference 11). One of the largest families of ISs is the IS3 family, which is widely distributed in both gram-negative and gram-positive bacteria. Members of this group range in size from 1.2 to 1.5 kb. They often possess imperfect terminal inverted repeats and generate a target site duplication of 3 to 5 bp. The majority of the IS3 insertion sequences carry the terminal dinucleotides 5′-TG...CA-3′. They contain two consecutive and partially overlapping open reading frames (ORFs), with the second or downstream ORF (orfB) at phase −1 relative to the first or upstream ORF (orfA). The functional transposase is proposed to be a fusion protein, OrfAB, generated by programmed ribosomal frameshifting between the two ORFs.
Ruminococcus gnavus E1 is a gram-positive, strictly anaerobic strain, isolated from human intestinal microbiota, that produces ruminococcin A (RumA), a trypsin-dependent lantibiotic (5). Lantibiotics are posttranslationally modified peptides characterized by the presence of lanthionines, β-methyllanthionines, and didehydrated amino acid residues (15). Recently, a 12.8-kb region from the genome of R. gnavus E1 containing the genetic determinants for RumA production has been characterized (9). The rumA gene cluster is organized in three operons with predicted functions in lantibiotic biosynthesis and secretion (rumA1A2A3MTX), signal-transduction regulation (rumRK), and producer self-immunity (rumFEGHR2) (see Fig. 1A). Transcription of the rum operons has been shown to be induced in the presence of trypsin, probably through the two-component system RumRK (9).
In this work, we report on the identification and characterization of a novel insertion sequence present in the genome of a spontaneous E1 bacteriocin-negative mutant (E1-Bac−). This element, designated ISRgn1, showed the typical features of the IS3 family and to our knowledge is the first insertion sequence isolated from R. gnavus.
Identification of ISRgn1 in a spontaneous bacteriocin-negative mutant of R. gnavus E1. A mutant of the RumA producer R. gnavus E1 unable to produce bacteriocin activity in a trypsin-supplemented medium was isolated under standard laboratory culture conditions. To test for the presence of the genes (rumA1A2A3) encoding the RumA precursor, total DNA of E1-Bac− was digested with different restriction enzymes and subsequently subjected to Southern hybridization analysis using the rumA-specific primer ol30 (5) as a probe. Restriction DNA fragments hybridizing to the probe were obtained, but the hybridization pattern differed from that of the original producer strain used as a control (data not shown).
Based on these data, a genomic library of E1-Bac− was constructed in E. coli DH5α cells by using established protocols for molecular biology techniques (16). Total DNA was isolated using the NucleoSpin C+T kit (Macherey-Nagel GmbH & Co., Düren, Germany), digested with PstI and SstI, and subsequently cloned into the corresponding sites of pBluescript SK(+). After electroporation of E. coli DH5α cells, clones harboring the plasmid pLEM501 (Fig. 1) were identified by colony blot hybridization with ol30. The DNA sequence of the fragment cloned in pLEM501 comprised 5,451 bp, with an overall G+C content of 32%.
Database similarity searches allowed the identification of eight ORFs, each of them preceded by potential ribosome binding sites (RBSs). Comparison of the sequenced fragment of the E1-Bac− DNA (Fig. 1B) with the RumA locus of the E1 wild-type strain (Fig. 1A) showed that pLEM501 contained rumA1A2A3 as well as a 3′-truncated rumM gene, the gene presumably encoding the pre-RumA modifying enzyme (9).
However, upstream of rumA1, the DNA sequence of the pLEM501 insert differed from that of the corresponding region in the wild-type RumA locus, suggesting that a genetic rearrangement occurred in the E1-Bac− mutant. Starting at 1,147 bp upstream of the start codon of rumA1, the gene rumB encoded a 66-amino-acid peptide with a theoretical molecular mass of 7.39 kDa, showing similarity (33% identity, 53% similarity) with the precursor of mersacidin, a globular type B lantibiotic produced by Bacillus sp. strain HIL Y-85,54728 (1). The following ORF, rumN, encoded a putative protein of 248 amino acids (29.2 kDa) exhibiting significant sequence similarity (25% identity, 43% similarity) with the N-terminal region of MrsM, the modifying enzyme of the mersacidin precursor (1), and with other LanM proteins thought to catalyze dehydration and subsequent thioether ring formation in lantibiotic prepeptides (17,21). Both rumB and rumN were named on the basis of database similarity searches and in accordance with the conventional nomenclature for lantibiotic gene clusters (6).
Upstream of rumB, two ORFs, orfA and orfB, and their neighboring sequences displayed the typical features of an IS of the IS3 family. This element was named ISRgn1, and it is described below.
To complete the comparison between the RumA loci of the E1 wild-type strain and the E1-Bac− mutant, PCR amplifications were performed using primer sets based on the nucleotide sequences of genes surrounding the rumA copies on the RumA wild-type locus. The primers utilized and their corresponding target sites within the R. gnavus E1 RumA locus (EMBL accession no. The results obtained demonstrated that E1-Bac− conserved the genes rumMTX belonging to the hypothetical biosynthetic operon, whereas the operons presumably involved in regulation (rumRK) and immunity (rumFEGHR2) were absent (Fig. 1B).
Taken together, the data obtained by DNA sequencing and PCR amplification showed that the genetic rearrangement that occurred in E1-Bac− resulted in the deletion of at least 5 kb at the 5′ end of the RumA locus. On the other hand, the rearrangement could also have affected the hypothetical rumB gene cluster. First, sequence comparisons suggested that RumN could be truncated. It is 248 amino acids long, whereas typical LanM proteins consist of over 900 residues. Furthermore, RumN lacks the C-terminal region, which is thought to be involved in the formation of thioether bonds from previously didehydrated residues in lantibiotic prepeptides (17,21). Second, no other genes typically associated with bacteriocin production, such as that encoding the specific transporter, have been found in the vicinity of rumN. Further studies are now necessary to analyze the chromosomal region surrounding rumB in the wild-type strain and to determine whether a type B lantibiotic is actually produced.
Sequence analysis and features of ISRgn1. Similarities found after database comparisons of nucleotide and amino acid sequences suggested that ISRgn1 belonged to the IS3 subgroup of the IS3 family (11). ISRgn1 was 1,278 bp long, with a GC content of 36%. It carried 38-bp imperfect inverted repeats at the ends and was flanked by identical 3-bp direct repeats, probably generated by duplication of the target insertion site. Another IS3-conserved feature was the presence of the terminal dinucleotides TG (5′) and CA (3′). ISRgn1 contained two overlapping ORFs, orfA and orfB, in the reading phases 0 and −1, respectively. The putative proteins deduced from these ORFs exhibited a high degree of similarity to transposases of the IS3 family (between 30% and 56% identity). The predicted product of orfA was a protein of 97 amino acids possibly starting at an ATG initiation codon. ISRgn1-OrfA contained a putative DNA-binding helix-turn-helix motif, which could be involved in the sequence-specific recognition of the terminal inverted repeats by the transposase (20). orfB encoded a hypothetical protein of 303 amino acids containing a highly conserved DDE motif, which has been proposed to be the catalytic site of IS3 transposases and retroviral integrases (3,13).
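The structural hallmarks listed above (terminal TG...CA dinucleotides, imperfect terminal inverted repeats, and GC content) can be checked computationally. The following is a minimal sketch, not the analysis actually performed in this work. The sequence `TOY_ELEMENT` and all function names are hypothetical and chosen for illustration only; the real ISRgn1 sequence is available under the accession number given at the end of this note.

```python
# Illustrative sketch only: TOY_ELEMENT is an invented 36-nt sequence,
# NOT the ISRgn1 sequence. Function names are hypothetical.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def gc_content(seq: str) -> float:
    """Return the GC content of seq as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def has_is3_termini(seq: str) -> bool:
    """True if the element starts with 5'-TG and ends with CA-3'."""
    seq = seq.upper()
    return seq.startswith("TG") and seq.endswith("CA")


def inverted_repeat_matches(seq: str, ir_len: int) -> int:
    """Count positions at which the left end matches the reverse
    complement of the right end over ir_len nucleotides."""
    left = seq[:ir_len].upper()
    right_rc = reverse_complement(seq[-ir_len:])
    return sum(a == b for a, b in zip(left, right_rc))


# Toy element: 10-nt left repeat, 16-nt interior, and a right repeat
# carrying one deliberate mismatch (an "imperfect" inverted repeat).
TOY_ELEMENT = "TGTACCGATA" + "GGATCCTTAAGCTTAA" + "TAACGGTACA"

if __name__ == "__main__":
    print("TG...CA termini:", has_is3_termini(TOY_ELEMENT))
    print("IR matches (of 10):", inverted_repeat_matches(TOY_ELEMENT, 10))
    print("GC content (%):", round(gc_content(TOY_ELEMENT), 1))
```

For a real element, `ir_len` would be set to the repeat length under study (38 bp in the case of ISRgn1), and the number of matching positions gives a simple measure of how imperfect the repeats are.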
Like other IS3 members, the nucleotide sequence of ISRgn1 contained a conserved frameshift motif (2, 4), 5′-AAAAAAA-3′, in the 46-bp region of overlap between orfA and orfB. For some IS3 elements, it has been demonstrated that the functional transposase is an OrfAB fusion protein generated by a −1 translational frameshift as a consequence of ribosome slippage in this motif (14,22). The efficiency of the frameshifting depends also on the presence of an RBS and a secondary structure upstream and downstream, respectively, of the frameshift window (4). As is shown in Fig. 2, these features were present in the orfA-B overlapping region of ISRgn1.
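The effect of a −1 frameshift on the reading frame can be illustrated with a short sketch. It locates the A7 motif in a sequence and builds the nucleotide string that a ribosome would decode if it slipped back by one nucleotide after a given position. The sequence `TOY_OVERLAP`, the function names, and the exact slip position are assumptions made for illustration; the source does not state at which codon within the motif the slip occurs in ISRgn1.

```python
# Illustrative sketch only: TOY_OVERLAP is an invented sequence and the
# slip position is an assumption, not data from ISRgn1.

SLIPPERY_MOTIF = "AAAAAAA"  # the A7 frameshift motif named in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_slippery_sites(seq: str, motif: str = SLIPPERY_MOTIF) -> list:
    """Return the 0-based start index of every occurrence of motif."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq.startswith(motif, i)]


def minus_one_frameshift(seq: str, slip_after: int) -> str:
    """Return the nucleotides as decoded by a ribosome that reads
    seq[:slip_after] in frame 0 and then slips back one nucleotide,
    so that the base at slip_after - 1 is read twice."""
    if slip_after < 3 or slip_after % 3 != 0:
        raise ValueError("slip_after must be a positive multiple of 3")
    return seq[:slip_after] + seq[slip_after - 1:]


def codons(seq: str) -> list:
    """Split seq into complete codons, dropping trailing nucleotides."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Frame 0 reads ATG GCA AAA AAA TAG ... and stops at TAG.
TOY_OVERLAP = "ATGGCAAAAAAATAGCCGTTC"

if __name__ == "__main__":
    print("A7 motif at:", find_slippery_sites(TOY_OVERLAP))
    print("frame 0 codons:", codons(TOY_OVERLAP))
    fused = minus_one_frameshift(TOY_OVERLAP, slip_after=12)
    print("after -1 slip:", codons(fused))
```

In the toy sequence, frame 0 terminates at a stop codon shortly after the motif, whereas the frameshifted reading continues into the −1 frame. This mirrors how an OrfAB fusion could arise from two overlapping ORFs.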
Genomic copy number and distribution of ISRgn1. To determine the genomic copy number of ISRgn1, total DNA was obtained from the E1-Bac− mutant and seven independent isolates of R. gnavus E1 exhibiting a wild-type phenotype.
DNAs were digested with EcoRI and subsequently hybridized with an internal EcoRV-EcoRI fragment of 0.9 kb from ISRgn1. Since one cleavage site for EcoRI is present in the cloned ISRgn1 but not in the probe (Fig. 1B), the number of hybridization bands was assumed to be the number of the IS copies.
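The reasoning behind this band-counting assumption can be made explicit with a small in silico digest. Because the probe lies entirely on one side of the single EcoRI site within the IS, each IS copy contributes exactly one probe-containing EcoRI fragment. The sketch below uses invented sequences (`PROBE`, `TOY_IS`, `TOY_GENOME`) and hypothetical function names. It treats the genome as a linear, single-stranded string, which is a simplification of a real Southern hybridization.

```python
# Illustrative sketch only: all sequences are invented, and the genome
# is modeled as a linear single strand for simplicity.

ECORI_SITE = "GAATTC"  # EcoRI cleaves G^AATTC


def digest(genome: str, site: str = ECORI_SITE, cut_offset: int = 1) -> list:
    """Cut genome at every occurrence of site and return the fragments
    in order; cut_offset is the cut position within the site."""
    fragments, start = [], 0
    pos = genome.find(site)
    while pos != -1:
        fragments.append(genome[start:pos + cut_offset])
        start = pos + cut_offset
        pos = genome.find(site, pos + 1)
    fragments.append(genome[start:])
    return fragments


def count_hybridizing_fragments(genome: str, probe: str) -> int:
    """Count restriction fragments that contain the full probe."""
    return sum(probe in fragment for fragment in digest(genome))


# Toy IS: the probe region sits upstream of the element's only EcoRI site.
PROBE = "CCATGGTACCGGTTAGC"
TOY_IS = "TG" + PROBE + ECORI_SITE + "TTGGCA"

# Toy genome carrying two IS copies plus two unrelated EcoRI sites.
TOY_GENOME = ("ACGTACGTAC" + ECORI_SITE + "TTTTGGGGCC" + TOY_IS +
              "CACACAGTGT" + ECORI_SITE + "GGGCCCAAAT" + TOY_IS +
              "ATATCGCGTA")

if __name__ == "__main__":
    print("fragments:", len(digest(TOY_GENOME)))
    print("hybridizing bands:", count_hybridizing_fragments(TOY_GENOME, PROBE))
```

In this model the number of hybridizing fragments equals the number of IS copies regardless of where the flanking EcoRI sites fall. Had the probe spanned the internal EcoRI site, each copy would instead have been split across two fragments.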
As is shown in Fig. 3, two hybridizing fragments, with molecular sizes of 6.5 and 1.7 kb, were detected in the genome of the E1-Bac− mutant (lane 8) as well as in six other E1 isolates. These results suggested that chromosomal copies of the IS have the same localization in the E1 wild-type strain and in the nonproducer mutant and, consequently, that ISRgn1 has not transposed in E1-Bac−. However, the presence of two additional copies of ISRgn1 in the genome of one E1 isolate (Fig. 3, lane 3) strongly suggested that ISRgn1 is an active insertion sequence that could transpose in R. gnavus E1. Nevertheless, the mechanism by which ISRgn1 transposes is unclear. The fact that the copy number of the IS increases suggests a replicative mechanism of transposition, but these data are also consistent with a nonreplicative process followed by segregation of the donor molecule (11).
The presence of ISRgn1 in the genomes of 17 strains phylogenetically related to R. gnavus E1 was also investigated. All belonged to the Clostridium coccoides phylogenetically defined group (RDP registration no. 2.30.4.1) (12), which includes some of the predominant bacterial genera found in the human large bowel (18). Fourteen of these strains had been isolated from human fecal samples according to their capability to produce a trypsin-dependent antimicrobial substance active against Clostridium perfringens (12a). Strains were cultivated as described by Dabard et al. (5), and EcoRI-digested total DNAs were hybridized with the EcoRV-EcoRI fragment of ISRgn1 (Fig. 4). The results obtained demonstrated that ISRgn1 was present in a single copy in the genome of only one tested strain, R. gnavus LEMB53 (Fig. 4, lane 10).
The restricted distribution of ISRgn1 among strains closely related to E1, along with the discrepancy between the GC content of the sequenced copy of the IS (36%) and that of the R. gnavus E1 genome (43%; 12a), suggests that ISRgn1 is not indigenous to R. gnavus but could have been acquired by horizontal transfer from a microorganism with a lower GC content.
In fact, results reported by Marcille et al. strongly suggest that genes encoding RumA-like bacteriocins might be located in a mobile genetic element able to transfer among bacteria belonging to the C. coccoides phylogenetic group (12a). Several lantibiotics, such as nisin, lacticin 3147, and lacticin 481, are encoded by gene clusters located in transposons (10) or composite transposons (7,8).
The characterization of such a mobile structure and its putative participation in the dissemination of adaptive functions among bacteria inhabiting the digestive ecosystem will be the subject of future research.
Nucleotide sequence accession number. The sequence reported here has been deposited in the GenBank Sequence Database from the National Center for Biotechnology Information (Bethesda, Md.) under accession number AF320327.
This work was supported by the grants "Actions Concertées Coordonnées Sciences du Vivant" from the French Ministry for Research and Technology and "FAIR CT 95-0433" from the European Community.