Multilocus Variable-Number Tandem-Repeat Analysis of Yersinia ruckeri Confirms the Existence of Host Specificity, Geographic Endemism, and Anthropogenic Dissemination of Virulent Clones

This comprehensive population study substantially improves our understanding of the epizootiological history and nature of an internationally important fish-pathogenic bacterium. The MLVA assay developed and presented represents a high-resolution typing tool particularly well suited for Yersinia ruckeri infection tracing, selection of strains for vaccine inclusion, and risk assessment. The ability of the assay to separate isolates into geographically linked and/or possibly host-specific clusters reflects its potential utility for maintenance of national biosecurity. The MLVA is internationally applicable and robust, and it provides clear, unambiguous, and easily interpreted results. Typing is reasonably inexpensive, with a moderate technological requirement, and may be completed from a harvested colony within a single working day. As the resulting MLVA profiles are readily portable, any Y. ruckeri strain may rapidly be placed in a global epizootiological context.


RESULTS
MLVA development and deployment. Following identification and verification of putative VNTR regions in the Y. ruckeri genome, 10 informative loci were selected for inclusion in the present MLVA and divided equally among two multiplex PCR assays (Table 1). Capillary electrophoresis (CE) performed on PCR products detected peaks corresponding to all 10 VNTR loci in 480 of the 484 Y. ruckeri isolates examined, with the four remaining isolates apparently lacking 1 to 3 loci (see Tables S1 and S2 in the supplemental material). Amplicons were easily distinguished based on fluorescent labeling and size (see Fig. S1 in the supplemental material). Within each of the two respective multiplex assays developed, no overlap in size was observed between identically labeled loci. Split peaks in electropherograms separated by a single base pair, a common CE artifact due to incomplete nontemplated 3= adenosine addition, were occasionally observed despite an extended final extension period (60 min). In such cases, the longer fragment was consistently selected for downstream analysis.
Due to the predictable nature of discrepancies identified between PCR fragment size as called by CE and Sanger sequencing, locus-specific correctional values were calculated (see Fig. S2 in the supplemental material) and employed for improved precision of CE-based VNTR fragment size calling. Single-base-pair deletions identified in a few strains in the downstream flank of VNTR loci YR3168 and YR1070 did not affect the number of predicted repeats. A total of 329 unique MLVA profiles were detected among the 484 Y. ruckeri isolates typed.
Allelic diversity and statistical evaluation. The allelic diversity within individual VNTR loci varied between 9 and 32 alleles (not counting missing amplicons), with Simpson's index of diversity (SID) values ranging from 0.64 to 0.87 ( Table 1). The SID for all 10 loci combined was Ͼ0.99, indicating the very high probability of separating nonclonal isolates. LIAN analysis resulted in a standardized index of association (I A s ) of 0.2772, which differs significantly from zero (P Monte Carlo Ͻ 0.0001). This confirms linkage disequilibrium, reflecting a low rate of recombination and the clonal nature of the investigated population.
MLVA cluster analysis. Minimum-spanning-tree (MST) cluster analysis of MLVA profiles, utilizing a relatively stringent cluster partitioning threshold (Յ4/10 nonidentical loci), placed 83% of the studied isolates within either of nine major clonal complexes (CC), each comprising five or more isolates. These clonal complexes were strongly biased toward one or more epizootiological attributes, e.g., host fish species, geographic origin, and/or serotype ( Fig. 1 and Tables 2 and S1). The remaining isolates represented either singletons or minor clusters. While extensive allele variation was evident in all 10 loci (Table 1), all 484 studied isolates could be linked to at least one other isolate by three or more common VNTR alleles.
c MLST sequence types following some modifications to the initial scheme (see Results). d Ten of 11 isolates in CC 7 were recovered from biofilm in a single freshwater salmon farm (no clinical symptoms were or had been reported). e All isolates in CC 8 and 9 were recovered from the egg fluid of clinically healthy brood stock in a single salmon farm.
f NA, not available.
within the 10 VNTR loci. A single strain acquired one additional repeat copy in YR1899 between passages 0 and 10 and subsequently two more copies in this locus between passages 30 and 40, while another single strain acquired one additional repeat copy in YR1070 between passages 30 and 40. MLVA typing of 19 Y. ruckeri isolates recovered from 13 fish during a single yersiniosis outbreak in a commercial salmon farm revealed a single extra repeat copy in YR3750 in one isolate compared to the remaining 18. Separate MST analysis involving four collections of epizootiologically related isolates verified farm-specific subclustering within CC 1 (Fig. 2). Comparative resolution of MLST versus MLVA. Sequence inconsistencies were discovered in two loci (thrA and recA) for identical strains in two previously published MLST studies (15,16). The sequence differences, which occurred throughout the respective data sets, were situated in the 6 to 13 terminal base pairs (both termini) of thrA and in the final base pair of recA (see Fig. S3 in the supplemental material). BLAST searches of available Y. ruckeri whole-genome sequences consistently identified sequences in agreement with the latest study (16). To obtain uniformity, the ambiguous sequence termini were removed prior to MLST meta-analysis in the current study, resulting in locus sizes of 286 and 471 bp for thrA and recA, respectively, versus 303 to 305 bp (thrA) and 472 bp (recA) in previous publications (15,16).
Comparison of MST diagrams based on MLST and MLVA data from 134 Y. ruckeri isolates revealed largely consistent clustering patterns and verified the considerably higher resolution of MLVA, with the 35 MLST sequence types included discriminated further into 123 distinct MLVA profiles (see Fig. S4 in the supplemental material). Multilocus sequence analysis (MLSA) based on concatenation of the truncated housekeeping gene sequences further highlighted the phylogenetic distances between the investigated MLST sequence types (see Fig. S5 in the supplemental material).

DISCUSSION
Infectious bacterial diseases of humans, plants, and animals are commonly caused by the emergence and spread of host-adapted, virulent endemic clones (see, e.g., references 23, 24, and 25). Characterization of clinical isolates to the clonal and subclonal levels is, therefore, essential to better understand the underlying epizootiology of any particular disease toward development of avoidance strategies and to aid selection of relevant strains for vaccine development. MLVA has been successfully deployed to describe the epidemiology/epizootiology at various scales for a number of bacterial pathogens of plants (26), mammals (27), and fish (20)(21)(22). We have developed a ten-locus MLVA assay for the fish pathogen Yersinia ruckeri and employed it to characterize the population structure within a collection of 484 isolates derived from highly diverse spatiotemporal and biological origins. Our findings support the previous contention (2) that this bacterium has an almost pan-global endemic distribution comprising, with the exception of some anthropogenically transported strains, geographically distinct and host-specific populations.
The MLVA is robust and internationally applicable, as proven by the detection of all 10 VNTR loci in Ͼ99% of the examined isolates. The minor variability following serial in vitro passage and low intra-outbreak variability, together with a combined SID of Ͼ0.99, suggest that the assay combines levels of both stability and variability suitable for epizootiological use. Largely consistent minimum-spanning-tree (MST) clustering of MLST and MLVA data underpins the suitability of both methods for inferring the Y. ruckeri population structure. The considerably higher level of strain differentiation provided by MLVA relative to MLST illustrates, however, the greater utility of MLVA for epizootiological study of "local" strains of the same MLST sequence type (see Fig. S4 in the supplemental material). MLST, which relies on sequence variability in evolutionarily conserved genes, may well provide a more unified picture of the overall population structure, simply due to its lower resolution.
Internationally, yersiniosis is most economically important in rainbow trout farming, and disease in this fish species is most commonly associated with Y. ruckeri serotype O1 strains of both biotypes 1 and 2. MLVA allocated 83% of the 81 serotype O1 isolates from rainbow trout examined to a single large clonal complex (CC 2), with the remaining isolates primarily occupying minor clonal complexes or appearing as singletons (Fig. 1a). CC 2 could be further subdivided into three main subpopulations (denoted a, b, and c), of which CC 2a and CC 2b comprise isolates originating almost exclusively from the United Kingdom and the United States, respectively, whereas CC 2c contains isolates primarily from continental Europe but also from Ireland, North America, and Peru (Fig. 1b). The presence of three geographically biased subpopulations within CC 2, with the "central" positioning of the U.S. cluster, is consistent with a situation in which CC 2 strains have spread from North America on at least two separate occasions: once to the United Kingdom and once to continental Europe and/or South America. The presumed direction of spread is further supported by the fact that the first detection of CC 2b (United States) predates that of the two other CC 2 subpopulations by 17 years or more. Due to the proximity within CC 2c of the relatively recent Peruvian isolates (recovered in 2008) to those from continental Europe, it appears likely that they both descend from the same North American lineage. Combined, these findings are supportive of the reported existence of geographically confined subpopulations of rainbow trout-associated Y. ruckeri serotype O1 in the mentioned regions (8).
Nonmotile Y. ruckeri biotype 2 strains increasingly dominate the disease situation in rainbow trout farming in many countries, reflecting independent and parallel evolution, presumably provoked by vaccines targeting flagellar antigens (8,9,28). In this regard, the proportion of rainbow trout isolates classified in the present study as biotype 2 increases from 33% to 77% for those recovered before and after the turn of the century, respectively. Clonal expansion of these mutants in the post-vaccination era is visualized in Fig. 1d, in which groups of biotype 2 isolates from rainbow trout form defined sublineages within CC 2. In the case of the geographically widely distributed CC 2c, MLVA clustering indicates emergence of biotype 2 in continental Europe after introduction of biotype 1 from North America (9) and is also consistent with an independent biotype shift in South America. This is in contrast to the biotype 2 phenotype of the United Kingdom-associated CC 2a, which appears to have been introduced from North America as biotype 2 (9). If this is in fact the case, however, it is hard to explain the occurrence of a single biotype 1 isolate, recovered in England 9 years after the first detection of CC 2a in the country, relatively deep within this clonal complex, unless it represents reversion from biotype 2 to biotype 1.
Two Scottish rainbow trout isolates described as serotype O8, a serotype most commonly associated with Atlantic salmon (7), were found within CC 2a (Fig. 1c), a situation which could be explained by recombinational events involving the lipopolysaccharide (LPS) biosynthesis cascade. The importance of LPS has recently been verified in elicitation of a protective immune response against Y. ruckeri in rainbow trout (29), and, conceivably, vaccine-related evolutionary pressures similar to those associated with the independent emergences of the various Y. ruckeri biotype 2 lineages may have prompted this putative serotype shift.
Yersiniosis in farmed Atlantic salmon is a significant problem in Norway, Scotland, Australia (Tasmania), and Chile and may be associated with various Y. ruckeri serotypes, although serotype O1 is generally considered to be the most virulent (2,4,7). In contrast to the situation with rainbow trout, most Y. ruckeri isolates from Atlantic salmon were separated by MLVA into discrete (unlinked) clonal complexes specific to particular geographic regions ( Fig. 1a and b and Table 2). The clonal complex currently dominating in Australian salmon farming (CC 5) includes two isolates recovered in 1959 in Victoria, Australia. This predates the import of Atlantic salmon to Tasmania between 1984 and 1986 from a landlocked population in New South Wales (established with Canadian stock in 1965) and therefore verifies the native status of this clone in Australia (2). Geographically biased clustering was also identified among isolates from Norwegian (CC 1 and 3) (see below) and Scottish (CC 4 and 6) Atlantic salmon ( Table 2). The high degree of diversity and spatially linked clustering among clinical Y. ruckeri isolates from Atlantic salmon supports, therefore, the previously proposed geographic endemism in this bacterium (2,7,8). Unfortunately, too few isolates from Chilean and North American salmon were examined to corroborate the existence/absence of salmonspecific clones in these areas.
While Y. ruckeri serotype O2 is sporadically detected in Norway, almost all major yersiniosis outbreaks in modern Norwegian salmon farming have been associated with serotype O1. Knowledge of the genetic diversity of Norwegian strains is scarce, however, and very few isolates had been characterized prior to the present study. Of the 286 isolates examined from Atlantic salmon in Norway, recovered between 1985 and 2017, we found 77% to belong to CC 1 (exclusively serotype O1) and 12% to CC 3 (exclusively serotype O2). Although CC 1 contains only Norwegian isolates, this clone may share a relatively recent ancestry with the "Scottish" CC 4, as they belong to the same MLST sequence type, in addition to three VNTR loci being entirely conserved across both of these clonal complexes. This association could conceivably be explained by geographic proximity, as may the presence of a single English isolate peripherally in the predominantly "Norwegian" CC 3 (Fig. 1b). In contrast, the appearance of a Chilean isolate (from 2008) relatively deep within this clonal complex is more likely to reflect anthropogenic spread. Despite the evident existence of various serotype O1 clones in Norway (Fig. 1b and c and Table 2), all serotype O1 isolates recovered from clinical yersiniosis cases in Norwegian Atlantic salmon since 1995 to date belong to CC 1, indicating the relatively high virulence of this clonal complex toward this fish species. MLVA clustering of isolates associated with individual freshwater farms over several years further verifies persistent colonization of these farms by individual CC 1 strains (Fig. 2).
As with most previous investigations involving Y. ruckeri, the collection examined in the present study is dominated by clinical isolates from diseased fish. As such, the scrutinized material provides a poor basis for investigation of the genetic structure within the overall population of what may well be an essentially environmental bacterial species. The disproportionate frequency of isolation of certain genotypes from diseased fish against a background of other genotypes does, however, provide support for increased host specificity/virulence in particular strains. There is also increasing evidence that avirulent strains of Y. ruckeri exist (30). In the present study, none of the Norwegian serotype O1 isolates cultured from the egg fluid of otherwise healthy brood stock salmon, or from biofilm within farming sites with no recorded history of yersiniosis, fell within the disease-associated CC 1 by MLVA. Instead, such isolates appeared entirely as singletons or formed distinct clonal complexes. In particular, CC 7 and CC 8 and 9, respectively, consist of nonclinical serotype O1 isolates recovered primarily from two separate yersiniosis-free freshwater salmon farms (Table 2). Conceivably, clonal expansion of host-adapted, virulent Y. ruckeri strains, from essentially environmental and/or commensal background populations, may have occurred independently in several salmon-producing countries and resulted in the observed geographic endemism.
In conclusion, this broad population study of Yersinia ruckeri substantially expands on the existing epizootiological history of this important fish pathogen and supports or verifies previous notions of host specificity, geographic endemism, and anthropogenic dissemination. Particularly, we verify by MLVA that yersiniosis in international rainbow trout farming is dominated almost entirely by a clonal strain of Y. ruckeri serotype O1 (CC 2) which appears to have been spread on separate occasions from North America to the United Kingdom and continental Europe, respectively. In contrast, we find that yersiniosis in international salmon farming is dominated mainly by geographically restricted and presumably native clones for which a recent common ancestry has not yet been clearly established. We show that a single, exclusively Norwegian, Y. ruckeri serotype O1 clone (CC 1) dominates the disease situation in Norwegian salmon farming. The MLVA assay further enables separation of putatively virulent and avirulent serotype O1 strains in Norway and indicates long-standing colonization of freshwater farms with specific Y. ruckeri strains. The scheme thus offers an extremely sensitive epizootiological tool that yields easily interpretable data. The entire procedure from agar plate to inclusion in an MST cluster analysis may be completed in less than a working day.

MATERIALS AND METHODS
Bacterial strains. A total of 484 Y. ruckeri isolates, including reference strains of serotypes O1, O2, O5, O6, and O7 (7,12), covering 19 species of fish/animals or environmental sources, 19 countries (from four continents), and 7 decades (1959 to 2017), were included in this study (see Table S1 in the supplemental material). Stock cultures, cryopreserved at Ϫ80°C, were revived on 5% bovine blood agar (BA) and incubated at 22°C for 1 to 2 days prior to further processing. All Norwegian isolates were serotyped by slide agglutination as previously described (31), using antisera raised against Y. ruckeri serotypes O1 (NCTC 12266), O2 (NCTC 12267), and O5 (NCTC 12268). Isolates previously serotyped in other laboratories were not re-serotyped in the present study. Biotyping of selected isolates was conducted as previously described (32). For PCR, genomic DNA was extracted by boiling bacterial cells from a single colony in 50 l Milli-Q water for 7 min, followed by centrifugation and use of the supernatant as template.
Identification of informative VNTR loci. Y. ruckeri genome assemblies retrieved from NCBI GenBank and/or generated in-house in participating laboratories (unpublished), representing a broad range of serotypes and spatiotemporal origins, were subjected to analysis with Tandem Repeats Finder v4 (33) in combination with BLAST searches. Of over one hundred putatively repetitive loci identified, 10 variable loci (Table 1), confirmed by singleplex PCR and Sanger sequencing (not shown), were selected for further MLVA development. VNTR locus selection criteria were (i) ubiquitous occurrence in Y. ruckeri, (ii) repeat unit size uniformity, (iii) extensive inter-strain copy number variation, and (iv) sufficiently conserved flanking regions. While minor repeat sequence heterogeneity was accepted, 100% conservation of repeat unit size was set as a requirement to allow precise calling of repeat numbers by CE. In accordance with suggested guidelines (34), the selected VNTR loci were annotated according to their position (closest kbp) within the PacBio-generated and circularized genome of Y. ruckeri strain CSF007-82 (accession no. LN681231).
Multiplex PCR and capillary electrophoresis (CE). Two multiplex PCR assays (I and II) ( Table 3) were established, each containing five primer pairs designed using MPprimer software (35) to provide an appropriate intra-assay amplicon size range. Forward primers (Applied Biosystems) were 5= labeled with either of three fluorescent dyes (6-carboxyfluorescein [6FAM], VIC, or NED). For each assay, care was taken to avoid amplicon size overlap between loci labeled with identical dyes.
Multiplex PCR mixtures (i.e., two per isolate tested) contained 12.5 l 2ϫ Multiplex PCR master mix (Qiagen), 0.1 to 0.2 M each appropriate primer pair (Table 3), 2 l DNA template, and a volume of RNase-free water amounting to a total reaction volume of 25 l. Subsequent PCRs involved, for both assays, (i) 5 min at 95°C (ii) 30 cycles of 0.5 min at 95°C, 1.5 min at 60°C, and 1 min at 72°C, and (iii) 60 min at 68°C, followed by cooling to 4°C indefinitely. PCR products were verified by gel electrophoresis and then diluted 1:10 (vol/vol) in Milli-Q water. From the diluted samples, 0.5 l was added to 9 l Hi-Di formamide (Applied Biosystems) and 0.5 l GeneScan 600 LIZ dye size standard v2.0 (Applied Biosystems). Samples were denatured for 3 min at 95°C prior to CE on an Avant 3500xl Genetic Analyser (Applied Biosystems) utilizing POP-7 polymer (Applied Biosystems) and the following settings: 5-s injections at 1.6 kV (32 V/cm) and a 32-min run time at 15 kV (300 V/cm) and 60°C.
VNTR fragment size calling and MLVA profiling. Electrophoretic peaks were identified and size-called in GeneMapper 5 (Applied Biosystems). Differences in VNTR fragment sizes as called by CE and Sanger sequencing were identified, a phenomenon previously attributed to biased amplicon mobility patterns in CE machines (36,37). These discrepancies were stable in relation to allele size, and size calls were subjected to locus-specific correction and converted to VNTR repeat counts according to the following formula (sizes in base pairs): VNTR repeat count ϭ (CE size call ϫ s ϩ i Ϫ amplified flank size)/VNTR repeat size, where s and i represent the slope and intersection point, respectively, for individual VNTR loci identified by plotting accurate PCR fragment sizes (as determined by Sanger sequencing) against corresponding fragment sizes called by CE. The line-of-best-fit equation for each locus was identified by linear regression utilizing data from 15 to 19 strains displaying various alleles and representative for the size spans observed (see Fig. S2 in the supplemental material). Each isolate was thus assigned a ten-digit integer string (MLVA profile) representing the number of whole repeats identified at each VNTR locus. Absent CE peaks were assigned a repeat count of 0.
Allelic diversity and statistical evaluation. Based on the observed allelic diversity, the discriminatory capacity of the studied VNTR loci, both individually and in combination, was evaluated by calculating SID values (38). Possible linkage disequilibrium among the loci was investigated using LIAN version 3.7 (39), employing the Monte Carlo model with 10 000 iterations. Only single representatives of each MLVA profile were included for LIAN analysis.
MLVA cluster analysis. MLVA profiles for all 484 isolates were imported into BioNumerics v6.6 (Applied Maths NV, Sint-Martens-Latem, Belgium), and MST cluster analysis performed with default settings. In the resulting MST diagram, a cluster (clonal complex) partitioning threshold of Յ4/10 nonidentical loci was employed, and branches representing Ͼ5/10 nonidentical loci were hidden. VNTR stability. Six Y. ruckeri isolates, representing various MLVA clonal complexes, were subjected to 40 serial passages at 1-to 2-day intervals to assess the in vitro stability of the VNTR loci. For each passage, single colonies were resown onto fresh BA plates and incubated at 22°C. MLVA profiles were obtained as described above from colonies harvested after 0, 10, 20, 30, and 40 passages. To assess the short-term in vivo stability of the VNTR loci, MLVA typing was also performed on 19 Y. ruckeri isolates recovered from 13 Atlantic salmon sampled simultaneously during an ongoing yersiniosis outbreak in a Norwegian freshwater facility. The longer-term environmental stability of the VNTR loci under industrial aquaculture production conditions was assessed by MLVA typing performed on putative "house strain" isolates associated with four freshwater production sites for Atlantic salmon smolts with a history of recurring yersiniosis. For each site, 4 to 33 isolates, recovered over a period of 2 to 6 years, were examined.
Comparative resolution of MLST versus MLVA. DNA sequences for the six previously published MLST loci (15) were extracted from available Y. ruckeri genome assemblies (NCBI GenBank or unpublished). Published MLST sequences (15,16) were downloaded from www.pubmlst.org/yruckeri and NCBI GenBank, respectively. AT and ST designations were assigned in accordance with, or as a continuation of, previous MLST studies (15,16). Necessary trimming of two loci (see Results) resulted in reclassification of several previously published AT/ST profiles.
AT profiles from 134 isolates (involving 35 ST) for which both MLST and MLVA information was available (see Table S1 in the supplemental material) were subsequently imported into BioNumerics v6.6. The epizootiological resolution of the two methods was then compared by MST cluster analyses (default settings) based on MLVA and MLST data, respectively. The truncated DNA sequences underlying the modified MLST were also subjected to MLSA. As such, the concatenated gene sequences were aligned with ClustalX (40) and used for constructing a maximum-likelihood tree in MEGA6 (41) with default settings employed.
Accession number(s). DNA sequences corresponding to the four novel Y. ruckeri MLST allele types identified during the current study were submitted to NCBI GenBank under accession no. MH156841 to MH156844 (see Table S3 in the supplemental material).

SUPPLEMENTAL MATERIAL
Supplemental material for this article may be found at https://doi.org/10.1128/AEM .00730-18.