Applied and Environmental Microbiology
Food Microbiology

Implications of Mobile Genetic Elements for Salmonella enterica Single-Nucleotide Polymorphism Subtyping and Source Tracking Investigations

Shaoting Li, Shaokang Zhang, Leen Baert, Balamurugan Jagadeesan, Catherine Ngom-Bru, Taylor Griswold, Lee S. Katz, Heather A. Carleton, Xiangyu Deng
Edward G. Dudley, Editor
Shaoting Li
aCenter for Food Safety, Department of Food Science and Technology, University of Georgia, Griffin, Georgia, USA
Shaokang Zhang
aCenter for Food Safety, Department of Food Science and Technology, University of Georgia, Griffin, Georgia, USA
Leen Baert
bNestlé Institute of Food Safety and Analytical Sciences, Nestlé Research, Vers-chez-les-Blanc, Lausanne, Switzerland
Balamurugan Jagadeesan
bNestlé Institute of Food Safety and Analytical Sciences, Nestlé Research, Vers-chez-les-Blanc, Lausanne, Switzerland
Catherine Ngom-Bru
bNestlé Institute of Food Safety and Analytical Sciences, Nestlé Research, Vers-chez-les-Blanc, Lausanne, Switzerland
Taylor Griswold
cEnteric Diseases Laboratory Branch, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
Lee S. Katz
aCenter for Food Safety, Department of Food Science and Technology, University of Georgia, Griffin, Georgia, USA
cEnteric Diseases Laboratory Branch, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
Heather A. Carleton
cEnteric Diseases Laboratory Branch, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
Xiangyu Deng
aCenter for Food Safety, Department of Food Science and Technology, University of Georgia, Griffin, Georgia, USA
Edward G. Dudley
The Pennsylvania State University
Roles: Editor
DOI: 10.1128/AEM.01985-19

ABSTRACT

Single-nucleotide polymorphisms (SNPs) are widely used for whole-genome sequencing (WGS)-based subtyping of foodborne pathogens in outbreak and source tracking investigations. Mobile genetic elements (MGEs) are commonly present in bacterial genomes and may affect SNP subtyping results if their evolutionary history and dynamics differ from those of the bacterial chromosomes. Using Salmonella enterica as a model organism, we surveyed major categories of MGEs, including plasmids, phages, insertion sequences, integrons, and integrative and conjugative elements (ICEs), in 990 genomes representing 21 major serotypes of S. enterica. We evaluated whether plasmids and chromosomal MGEs affect SNP subtyping with 9 outbreak clusters of different serotypes found in the United States in 2018. The median total length of chromosomal MGEs accounted for 2.5% of a typical S. enterica chromosome. Of the 990 analyzed S. enterica isolates, 68.9% contained at least one assembled plasmid sequence. The median total length of assembled plasmids in these isolates was 93,671 bp. Plasmids that carry high densities of SNPs were found to substantially affect both SNP phylogenies and SNP distances among closely related isolates if they were present in the reference genome for SNP subtyping. In comparison, chromosomal MGEs were found to have limited impact on SNP subtyping. We recommend the identification of plasmid sequences in the reference genome and the exclusion of plasmid-borne SNPs from SNP subtyping analysis.

IMPORTANCE Despite increasingly routine use of WGS and SNP subtyping in outbreak and source tracking investigations, whether and how MGEs affect SNP subtyping has not been thoroughly investigated. Besides chromosomal MGEs, plasmids are frequently entangled in draft genome assemblies and have yet to be assessed for their impact on SNP subtyping. This study provides evidence-based guidance on the treatment of MGEs in SNP analysis for Salmonella to infer phylogenetic relationships and SNP distances between isolates.

INTRODUCTION

Foodborne illness continues to be a major public health challenge. Each year an estimated total of 9.4 million episodes of foodborne illness occur in the United States, resulting in 128,000 hospitalizations and 3,000 deaths (1). Subtyping and differentiation of closely related strains of foodborne infectious disease agents are important for timely detection of foodborne outbreaks, accurate identification of transmission vehicles, and successful trace-backs to contamination sources (2). The increasingly routine use of whole-genome sequencing (WGS) for subtyping of foodborne pathogens has been transforming foodborne illness surveillance and outbreak investigation (2–4), as well as pathogen source tracking in food production environments (5).

Mobile genetic elements (MGEs) are commonly present in bacterial genomes. Major categories of MGEs in bacteria include plasmids, phages, insertion sequences (IS), integrons, and integrative and conjugative elements (ICEs) (6). Some plasmids carry virulence and antibiotic resistance genes of clinical and epidemiological importance (7–9); their identification and characterization may help outbreak and source tracking investigations. Besides plasmids, other MGEs, such as phages, IS, integrons, and ICEs, play important roles in bacterial virulence, antibiotic resistance, niche adaptation, and evolution, which have been reviewed elsewhere (6, 10–12). We speculate that MGEs may affect single-nucleotide polymorphism (SNP) subtyping results if their evolutionary history and dynamics differ from those of the bacterial chromosomes.

SNP identification from WGS data is one of the most widely used methods for WGS-aided subtyping of foodborne bacterial pathogens (2). Several factors, according to our observations, contribute to the popularity of using SNPs for foodborne pathogen subtyping. First, determination of a single-base-pair difference is algorithmically easier than that of other sequence polymorphisms such as insertions, deletions, and repetitions, especially from short sequencing reads. Second, the common prerequisite for SNP subtyping is a single reference genome for SNP identification in genomes to be investigated (query genomes). In comparison, whole-genome multilocus sequence typing (wgMLST), the other major WGS-based subtyping method, relies on a database of many allelic loci, which are available for several foodborne pathogens, but not all pathogens (13–15). Finally, SNPs are useful phylogenetic markers to infer evolutionary relationships among taxa and organisms (16). Both SNP subtyping and wgMLST can produce hierarchical clustering of isolates that can indicate evolutionary relationships between isolates. However, in wgMLST, multiple SNPs clustered on the same allele lead to only a single allelic difference, so neither the extent of intragenic variation nor intergenic differences are captured. Therefore, SNP phylogenies may give a higher-resolution view of ancestral relationships and phylogenetic distances (17).

In the most common practice of SNP subtyping, a bacterial genome is designated as a reference to which raw sequencing reads of a query genome are aligned to identify SNPs that meet certain quality criteria (17, 18). SNPs derived from pairwise comparisons between every query and the reference are combined to identify all SNP positions along the reference genome, where at least one query has a SNP. The number of SNPs between two isolates, regularly referred to as the SNP distance, is often used as a measure of phylogenetic distance between isolates (17).
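As a minimal sketch of how a pairwise SNP distance can be computed once SNP positions have been combined across queries, the following Python snippet counts differing bases between isolates over a shared SNP matrix. The isolate names and sequences are hypothetical, and skipping ambiguous calls ("N" or "-") is an assumption made here for illustration, not the documented behavior of any particular pipeline.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, missing=("N", "-")):
    """Count positions where two aligned SNP strings differ, ignoring missing calls."""
    if len(seq_a) != len(seq_b):
        raise ValueError("SNP strings must have equal length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in missing and b not in missing
    )


def pairwise_snp_distances(snp_matrix):
    """Return {(name1, name2): distance} for every unordered pair of isolates."""
    return {
        (n1, n2): snp_distance(snp_matrix[n1], snp_matrix[n2])
        for n1, n2 in combinations(sorted(snp_matrix), 2)
    }


# Hypothetical SNP matrix: one string per isolate, one column per SNP site
# on the reference genome.
snp_matrix = {
    "isolate_1": "ACGTA",
    "isolate_2": "ACGTC",
    "isolate_3": "TCNTC",
}
distances = pairwise_snp_distances(snp_matrix)
```

Because only columns present in the matrix are compared, any SNP site that the reference cannot reveal contributes nothing to the distance.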

In outbreak and source tracking investigations, isolates of interest are often clonally related, differing by small numbers of SNPs (e.g., <100 [19]). If the reference is distantly related to the clonal group, these SNPs may not be identified because genome regions harboring the SNP positions may be absent from the reference. Consequently, the phylogenetic structure among query isolates can be collapsed due to lack of differentiation among the queries. Therefore, the choice of reference genome is critical to accurately infer the relatedness of the isolates and identify an outbreak cluster. It is desirable to use a reference that is phylogenetically closely related to the queries, and a common strategy is to designate a query genome within the clonal group as the reference (20, 21).

When using a query genome as the reference for SNP typing, the query is often not fully assembled into a circular chromosome, especially in routine surveillance and investigations where short sequencing reads (150 to 300 bp) are most commonly used (22, 23). Draft assemblies from short reads typically contain multiple assembled genome fragments of various lengths called contigs (24). Because plasmids are common in bacteria, they are often cosequenced and coassembled, along with bacterial chromosomes, and consequently mixed with chromosomal contigs in draft assemblies (25). We reason that if a plasmid-containing draft assembly is used as the reference for SNP subtyping, it is possible that the plasmid can affect the inference of phylogeny and SNP distances among query isolates. The impact can be substantial if the interfering plasmid has a different evolutionary history than the chromosome of the pathogen. For example, plasmids can be involved in horizontal gene transfer (HGT), which may confound a phylogenetic inference that is based on vertical transmission of genetic information from ancestors to progenies (26). In addition to plasmids, other types of MGEs involved in HGT may also cause phylogenetic interference (27). Since the determination of the phylogenetic relationship between isolates derived from SNP subtyping plays an important role in outbreak and source tracking investigations, it is important to know the impact of MGEs on the phylogeny.

For SNP typing, regions in the reference genome that are involved in HGT and/or part of MGEs may be masked in order to exclude these regions from subtyping and phylogeny building (28–30). As elevated SNP densities may signal HGT and a mosaic genome composition (31), the Lyve-SET pipeline used in outbreak surveillance and investigation at the Centers for Disease Control and Prevention (CDC) excludes clustered SNPs by removing SNP positions that are near each other (typically ≤5 bp for Salmonella) (17). Lyve-SET also typically excludes SNPs in phage regions by comparing the reference genome assembly to the PHAST database (32). Similarly, the CFSAN SNP Pipeline developed by the U.S. Food and Drug Administration allows default removal of SNPs that occur in high densities over short stretches of genomes (>3 SNPs per 1,000 bp) (18, 33). However, the actual impact of various categories of MGEs on SNP subtyping and source tracking investigation of foodborne pathogens has not been evaluated.
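The proximity-based exclusion of clustered SNPs can be illustrated with a short sketch. The function below drops every SNP position that lies within 5 bp of a neighboring SNP on the same contig. It is a simplified illustration only: the function name, the example coordinates, and the choice to drop all members of a close group are assumptions, not the Lyve-SET implementation.

```python
def drop_nearby_snps(positions, max_gap=5):
    """Drop every SNP position that lies within max_gap bp of a neighboring SNP."""
    ordered = sorted(set(positions))
    kept = []
    for i, pos in enumerate(ordered):
        close_left = i > 0 and pos - ordered[i - 1] <= max_gap
        close_right = i + 1 < len(ordered) and ordered[i + 1] - pos <= max_gap
        if not (close_left or close_right):
            kept.append(pos)
    return kept


# Hypothetical SNP coordinates on one reference contig.
positions = [100, 103, 250, 900, 905, 906, 2000]
filtered = drop_nearby_snps(positions)
```

In this example only the isolated positions 250 and 2000 survive; the pairs and triplets separated by 5 bp or less are removed.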

S. enterica is one of the most prevalent foodborne pathogens worldwide (34). It has a highly diverse population structure, including more than 2,600 reported serotypes (35). Some serotypes are genetically homogenous, and their subtyping has been substantially improved by WGS and SNP analysis (36). Using S. enterica as a model organism, the objectives of this study were to survey major categories of MGEs, including plasmids, in major S. enterica serotypes and evaluate the impact (or lack thereof) of MGEs on SNP subtyping by retrospectively investigating nine outbreak clusters.

RESULTS

Overview of detected chromosomal MGEs in 21 major S. enterica serotypes. We surveyed prophages, IS, integrons, and ICEs located on bacterial chromosomes (chromosomal MGEs) in 21 major serotypes of S. enterica. The majority of the serotypes analyzed here are among the most prevalent serotypes in human infections in the United States. From 2007 to 2016, these serotypes accounted for about 70% of laboratory-confirmed Salmonella isolates from human sources reported to the CDC (37). Overall abundances of chromosomal MGEs by total MGE lengths vary among serotypes, from 60,212 bp in Kentucky to 189,210 bp in Weltevreden (Fig. 1A). The median total length of chromosomal MGEs of all surveyed genomes (n = 990) was 120,625 bp, about 2.5% of an average S. enterica chromosome (4,800,000 bp). The median chromosomal MGE count of surveyed genomes was 14, with serotype Weltevreden having the highest median count (n = 21) and serotype Javiana having the lowest median count (n = 10) (Fig. 1B). Prophages and IS accounted for almost all the chromosomal MGEs. ICEs and integrons were rarely found on S. enterica chromosomes (Fig. 1) but were occasionally found on S. enterica plasmids (data not shown). Integrons were most consistently found in serotype Typhimurium, although they were sporadically found in the 20 other serotypes (Fig. 1B). The majority of isolates from serotypes Dublin, Oranienburg, and Senftenberg were found to carry one ICE, while ICEs were rarely detected in the seventeen other serotypes (Fig. 1B).

FIG 1

Abundance of detected chromosomal MGEs in 21 major S. enterica serotypes. Red lines indicate the median lengths or counts for individual serotypes and all serotypes. (A) Distribution of cumulative MGE length (bp) by serotype. (B) Distribution of MGE counts by serotype.

Overview of detected plasmids in 21 major S. enterica serotypes. At least one plasmid contig was detected in 682 of the 990 (68.9%) surveyed S. enterica genomes. For serotypes Berta, Dublin, Enteritidis, Heidelberg, Kentucky, Saintpaul, Typhimurium, and Weltevreden, more than 80% of analyzed genomes were found to have plasmid(s). In comparison, less than 30% of the genomes of serotypes Javiana, Paratyphi B (the gastrointestinal pathotype), and Thompson contained plasmids (Fig. 2A). The median total length of plasmid sequences of the 682 plasmid-carrying genomes was 93,671 bp. Serotype Infantis had the longest median total length per genome (309,201 bp) (Fig. 2B), likely due to a megaplasmid (pESI, ∼280,000 bp) associated with this serotype (38). This megaplasmid was found in 19 of 45 Infantis genomes analyzed in this study (data not shown). The median plasmid count of the 682 genomes was two per genome, with serotypes Anatum, Berta, Braenderup, Dublin, Heidelberg, Infantis, Kentucky, Muenchen, Newport, Saintpaul, and Senftenberg exceeding a median of two plasmids per genome. Serotypes Agona, Bareilly, Enteritidis, Javiana, Montevideo, Oranienburg, Paratyphi B, Thompson, and Weltevreden have a median of one plasmid per genome (Fig. 2C). Serotypes Enteritidis, Dublin, and Typhimurium are known to harbor the Salmonella virulence plasmid that carries the spv virulence locus (8, 39). Three spv loci (spvB, spvC, and spvR) were found on identified plasmids from most of the Enteritidis (44 of 50) and Dublin (46 of 47) genomes and more than half of the Typhimurium genomes (26 of 48) analyzed in this study (data not shown), confirming the common presence of the virulence plasmid in the strains of these serotypes.

FIG 2

Abundance of detected plasmids in 21 major S. enterica serotypes. Red lines indicate the median lengths or counts for individual serotypes and all serotypes. (A) Percentages of isolates that have plasmid(s) detected in individual serotypes and all serotypes. (B) Distribution of cumulative plasmid length (bp) within plasmid-positive isolates. (C) Distribution of plasmid counts within plasmid-positive isolates.

Impact of plasmids in reference genomes on SNP subtyping. We evaluated whether plasmid sequences in the reference genome, often unidentified and entangled with chromosome contigs in draft genome assemblies, had any impact on SNP subtyping. For this evaluation, we retrospectively investigated nine outbreak clusters of 461 S. enterica isolates detected by PulseNet USA in 2018. An outbreak cluster is defined as a set of epidemiologically related isolates with the source of infection either confirmed or suspected. Each cluster represented a specific Salmonella serotype and contained isolates carrying zero, one, or multiple plasmids. A total of 12 major plasmids (i.e., plasmid clusters, see Materials and Methods) were detected in 168 isolates of eight outbreak clusters (Table 1). A serotype Chailey cluster had no detectable plasmid (data not shown). A serotype Newport cluster had plasmids with no SNPs detected among outbreak isolates (Table 1). Phylogenies and pairwise SNP distances of the other seven outbreak clusters were affected by plasmid-containing reference genomes to various degrees (Table 1 and Fig. 3; see also Fig. S1 in the supplemental material). SNP analyses of outbreak clusters by the CFSAN SNP Pipeline and Lyve-SET generated similar results (Fig. S1 and S2). Only the CFSAN SNP Pipeline results are shown in Table 1.

TABLE 1

Major plasmids detected in outbreak clusters

FIG 3

Impact of plasmids on SNP subtyping of a S. Typhimurium outbreak cluster of 17 isolates. Phylogenies were inferred from SNPs and rooted at midpoint. Pairwise SNP distances (0 to 524 SNPs) are indicated by heat maps. A total of 891 SNP sites were identified (Table 2). Colors in the heat maps indicate the log-transformed pairwise SNP distances between isolates, with blue being the lowest and light orange being the highest. The scale bars measure the numbers of substitutions per site. Ratios on certain tree branches indicate branch support by the default approximate likelihood ratio test (aLRT) in PhyML. Only support greater than 0.8 is shown. (A) Genome SRR6107509, which contained plasmid sequences, was used as the reference for SNP analysis. Clustered SNPs were included. (B) Genome SRR6317337, which did not contain any detectable plasmid, was used as the reference for SNP analysis. Clustered SNPs were included. (C) Plasmid-containing genome SRR6107509 was used as the reference for SNP analysis. Clustered SNPs were excluded using the default CFSAN SNP Pipeline setting. (D) Plasmid-containing genome SRR6107509 was used as the reference with plasmids masked from read mapping and SNP calling. Clustered SNPs were included.

Taking a serotype Typhimurium outbreak as an example, a plasmid contig with an assembled length between 42,139 bp and 96,288 bp was found in 10 of the 17 isolates in the cluster (Table 1, c476). When clustered SNPs were included (see Materials and Methods), the use of an in-cluster reference genome (SRR6107509) that contained the plasmid led to the identification of two major clades separated by substantial SNP distances (Fig. 3A). Switching to a different in-cluster reference genome (SRR6317337) that did not contain any detectable plasmid revealed phylogenetic polytomies (nodes not resolved into bifurcating branches because the branching order is unknown) with reduced pairwise SNP distances among most of the isolates in the cluster (Fig. 3B). The interfering plasmid contig (Table 1, c476) featured high SNP densities (clustered SNPs), especially in ICE- and IS-associated regions (Fig. 4). A similar impact of plasmids on SNP phylogeny was observed in outbreak clusters of other serotypes analyzed in this study (Fig. S1).

FIG 4

SNP density of the identified plasmid contig (c476) on the serotype Typhimurium reference genome (SRR6107509). (A) A SNP density curve that shows the number of SNP loci within a sliding window of 1,000 bp across the whole plasmid contig. (B) SNP loci on the plasmid contig. Each vertical line represents a SNP locus. The orange bar indicates an ICE-associated region. The blue bar indicates an IS-associated region detected on the plasmid contig.
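A density curve of this kind can be sketched as a count of SNP loci inside a window that slides along the contig. The snippet below is a minimal illustration under stated assumptions: positions are 1-based, the step size is a free parameter, and the SNP coordinates and contig length are hypothetical rather than taken from contig c476.

```python
from bisect import bisect_left, bisect_right


def snp_density_curve(snp_positions, contig_length, window=1000, step=1):
    """Return (window_start, snp_count) for each sliding window along a contig.

    Positions are 1-based; each window covers [start, start + window - 1].
    """
    ordered = sorted(snp_positions)
    curve = []
    last_start = max(1, contig_length - window + 1)
    for start in range(1, last_start + 1, step):
        end = start + window - 1
        # Binary search gives the number of SNPs with start <= pos <= end.
        count = bisect_right(ordered, end) - bisect_left(ordered, start)
        curve.append((start, count))
    return curve


# Hypothetical SNP loci on a 5,000-bp contig, sampled every 500 bp.
snps = [120, 130, 400, 990, 2500, 4800]
curve = snp_density_curve(snps, contig_length=5000, window=1000, step=500)
```

Peaks in the resulting curve mark stretches where SNPs are clustered, such as the ICE- and IS-associated regions described in the legend.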

The CFSAN SNP Pipeline has a filter to remove clustered SNPs. By default, this filter permits a maximum of three SNPs per 1,000 bp. To evaluate whether the default filter could offset the impact of the high-SNP-density plasmid on SNP subtyping, the serotype Typhimurium cluster was reanalyzed by using the plasmid-containing genome (SRR6107509) as the reference and applying the default filter to remove clustered SNPs. While the impact of the plasmid was apparently alleviated, delineation of two major clades and occurrences of increased SNP distances (maximum of 14 SNPs) were still observed (although not with high branch support) (Fig. 3C). However, if the plasmid was identified in the reference genome (SRR6107509) and masked from read mapping and SNP calling, the polytomic phylogeny and smaller SNP distances (maximum of 8 SNPs) among the outbreak isolates were recovered even with the clustered SNP filter disabled (Fig. 3D).
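The difference between density filtering and plasmid masking can be illustrated with a small sketch. The density filter below is a simplified stand-in (more than three SNPs within any 1,000-bp stretch are dropped) and is not the CFSAN SNP Pipeline implementation; the contig names, coordinates, and function names are hypothetical.

```python
def mask_plasmid_snps(snps, plasmid_contigs):
    """Keep only SNPs whose contig is not flagged as plasmid derived."""
    return [(contig, pos) for contig, pos in snps if contig not in plasmid_contigs]


def drop_dense_snps(snps, window=1000, max_snps=3):
    """Drop SNPs in any stretch shorter than `window` bp holding more than max_snps SNPs."""
    by_contig = {}
    for contig, pos in snps:
        by_contig.setdefault(contig, []).append(pos)
    dense = set()
    for contig, positions in by_contig.items():
        positions.sort()
        left = 0
        for right, pos in enumerate(positions):
            # Advance the left edge until the stretch spans fewer than `window` bp.
            while pos - positions[left] >= window:
                left += 1
            if right - left + 1 > max_snps:
                dense.update((contig, p) for p in positions[left:right + 1])
    return [snp for snp in snps if snp not in dense]


# Hypothetical SNP calls as (contig, position) pairs.
snps = [
    ("chrom_1", 5000), ("chrom_1", 90000),
    ("plasmid_1", 100), ("plasmid_1", 150), ("plasmid_1", 300), ("plasmid_1", 700),
    ("plasmid_1", 20000), ("plasmid_1", 45000),
]
after_density_filter = drop_dense_snps(snps)
after_masking = mask_plasmid_snps(snps, plasmid_contigs={"plasmid_1"})
```

In this toy input, the density filter removes the four clustered plasmid SNPs but leaves two dispersed plasmid SNPs in place, whereas masking removes every plasmid-borne SNP regardless of spacing.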

SNP positions on chromosomal MGEs. In addition to plasmids, we assessed whether chromosomal MGEs, including prophages, IS, integrons, and ICEs, affected SNP subtyping. For this assessment, we analyzed the nine outbreak clusters and searched for SNP loci that were located in chromosomal MGEs. Overall, isolates within each cluster shared similar chromosomal MGE contents (Fig. S3), suggesting that the choice of in-cluster reference genome had limited impact on the discovery of chromosomal MGE-borne SNPs. For consistency with the aforementioned SNP subtyping investigations, a plasmid-containing reference genome specified in Table 1 for each cluster was used as the reference for this analysis. When clustered SNPs were excluded according to the default setting of the CFSAN SNP Pipeline, not a single SNP locus was found on chromosomal MGEs in the Chailey, Paratyphi B, and Typhimurium clusters. Similarly, only a small fraction of SNP positions (from 1 to 7%) were detected on chromosomal MGEs in each of the other six clusters; almost all of these SNPs were located on prophages (Table 2). When clustered SNPs were included, SNPs on chromosomal MGEs still had no or limited impact on total SNP counts, contributing no (Chailey, Paratyphi B, and Typhimurium clusters), <2% (Agbeni, Litchfield, Newport, and Thompson clusters), or <10% (Heidelberg and Saintpaul clusters) of all SNPs (Table 2). Almost all the SNPs on chromosomal MGEs, clustered or not, were located on prophages (Table 2).
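The search for SNP loci located in chromosomal MGEs amounts to an interval lookup. The sketch below counts SNP positions that fall inside annotated MGE regions on a single contig and reports the fraction; the region coordinates, SNP positions, and function name are hypothetical and serve only to illustrate the calculation.

```python
def count_snps_in_regions(snp_positions, regions):
    """Count SNP positions inside any (start, end) region (1-based, inclusive)."""
    return sum(
        1
        for pos in snp_positions
        if any(start <= pos <= end for start, end in regions)
    )


# Hypothetical annotations on one chromosomal contig.
prophage_regions = [(10_000, 52_000)]
is_regions = [(80_000, 81_300)]
snp_positions = [1_200, 15_000, 30_500, 80_500, 120_000, 250_000, 400_000, 900_000]

mge_regions = prophage_regions + is_regions
on_mge = count_snps_in_regions(snp_positions, mge_regions)
on_prophage = count_snps_in_regions(snp_positions, prophage_regions)
fraction_on_mge = on_mge / len(snp_positions)
```

For real assemblies the same lookup would be repeated per contig, restricted to contigs not identified as plasmid derived.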

TABLE 2

SNP loci on chromosomal MGEs in nine outbreak clusters

Excluding clustered SNPs substantially reduced the numbers of SNP positions on chromosomal MGEs in the Heidelberg (from 170 to 7) and the Saintpaul (from 299 to 7) clusters but had only a minor impact (reduction of 1 SNP in the Litchfield cluster) or no impact on the identification of SNPs on chromosomal MGEs in the other seven clusters (Table 2). Both the Heidelberg and the Saintpaul clusters appeared to be polyclonal, involving multiple strains, as indicated by larger numbers of SNP loci compared to other clusters (Table 2). Prophages contributed nearly all the chromosomal MGE-borne SNPs in the two clusters (Table 2).

DISCUSSION

Our study demonstrates the importance of disentangling plasmids from bacterial genomes prior to SNP subtyping. The involvement of plasmids as part of the reference for SNP typing was shown to distort the inference of phylogeny and SNP distance among closely related isolates. The distortion can occur if the implicated plasmid has a different evolutionary history than the bacterial chromosome (40, 41). Plasmids can shuttle among bacteria through conjugation-mediated HGT and frequently carry cargo such as antibiotic resistance genes, virulence factors, and other MGEs (42). These plasmid-borne genes and elements may evolve under different selective pressures than the chromosome (43). The resulting SNP phylogeny contains convoluted phylogenetic signals from plasmids and chromosomes, which may lead to misinterpretation of the population structure of the outbreak cluster. Phylogenetic polytomies were frequently observed in analyzed Salmonella outbreak clusters when a plasmid-free reference genome was used for SNP typing (Fig. 3B; Fig. S1). Unresolved branching order may better reflect the limited degree of chromosomal diversification, if any, among closely related outbreak isolates during the span of the outbreak. In comparison, some dichotomic splits of outbreak isolates were caused by signals of plasmid origin. Epidemiological and source tracking investigations are performed under the assumption that phylogenetic relatedness between case-defining isolates is proportional to epidemiological association between cases. It should be noted that the eight plasmid-containing outbreak clusters consisted of a mixed set of isolates, i.e., some isolates did not have any plasmid, while others had different plasmids. When all isolates from one outbreak cluster contained the same plasmid, using a reference genome with this plasmid had no impact on SNP subtyping (the Newport outbreak cluster in this study).

Numerous tools have been developed for plasmid analysis using WGS data. While reconstruction of plasmids is known to be challenging from short read WGS data due to frequent occurrences of repetitive elements on plasmids, identification of sequences of plasmid origin has been shown to be tractable and effective (44, 45). Even though the identified plasmid sequences might be incomplete, fragmented, or incorrectly assembled, their identification could help prevent plasmids from confounding SNP subtyping. In this study, we used MOB-suite to identify plasmids from the draft genome assemblies. MOB-suite has been reported to deliver high sensitivity and specificity in plasmid identification (44).

A common approach to prevent MGEs from affecting SNP typing is the exclusion of clustered SNPs. However, our retrospective investigation of a serotype Typhimurium outbreak cluster suggests that the plasmid SNPs remaining after clustered SNPs were excluded were still enough to affect the inference of phylogeny and SNP distances (Fig. 3C). In addition, some plasmids may have SNPs that are not clustered or are below the threshold of a clustered SNP filter. The filter will not act on these SNPs, which may still affect inferred phylogeny and SNP distance.

In the polyclonal outbreak clusters investigated in this study, where multiple strains were involved (the Heidelberg and the Saintpaul clusters), chromosomal MGEs, in particular prophages, were found to carry numerous SNPs. However, these SNPs accounted for only a fraction of all SNPs (<10%), and the common practice of excluding clustered SNPs effectively eliminated most of these SNPs.

While we found plasmids to interfere with SNP subtyping analysis of closely related isolates and chromosomal MGEs to be less consequential, these elements should not be categorically disregarded in outbreak and source tracking investigations. Plasmids can provide clinically relevant information if they carry virulence factors and antibiotic resistance genes in outbreak isolates. Sequence markers related to plasmids and phages were recently discovered as highly informative predictors of livestock sources of serotype Typhimurium and may be used for zoonotic source attribution of the pathogen (46). Our study showed that plasmids in reference genomes can distort the phylogenetic signal when carrying out SNP subtyping. Therefore, excluding SNPs originating from these MGEs is required specifically for constructing SNP phylogenies and measuring pairwise SNP distances, both of which are routinely used to generate evidence and clues for outbreak and source tracking investigations. Plasmid identification tools, such as the MOB-suite used in this study, may be incorporated into existing SNP typing pipelines to allow appropriate treatment of plasmid sequences in subtyping and source tracking investigations.

MATERIALS AND METHODS

Genomes. A total of 1,511 Salmonella enterica genomes were analyzed (Table S1). The serotypes of the genomes were confirmed by SeqSero (47).

To obtain an overview of MGE abundance in S. enterica, we sampled 50 genomes from each of the following 21 serotypes that are common in human infections according to U.S. surveillance data (37): Agona, Anatum, Bareilly, Berta, Braenderup, Dublin, Enteritidis, Heidelberg, Infantis, Javiana, Kentucky, Montevideo, Muenchen, Newport, Oranienburg, Paratyphi B, Saintpaul, Senftenberg, Thompson, Typhimurium, and Weltevreden. To represent the phylogenetic diversity of a selected serotype, available genomes of the serotype at GenomeTrakr (48) (NCBI BioProject PRJNA183844) as of September 2018 were collected. Pairwise mutation distances among these genomes were estimated by Mash using raw sequencing reads (49). A neighbor-joining tree was constructed from the distances using Mashtree (50) to approximate the phylogeny of the analyzed genomes. From each tree, 50 genomes were selected to evenly span the tree and represent all major clusters. After draft genome assemblies were generated as detailed below, a total of 60 genomes were excluded from further analyses due to suboptimal assembly quality as evaluated by N50 (N50 < 120,000). The final data set included 990 genomes representing 21 major Salmonella serotypes.
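The N50-based quality screen can be sketched as follows. N50 is the contig length at which contigs of that length or longer cover at least half of the assembly. The contig lengths and genome names below are hypothetical, and keeping an assembly whose N50 equals the threshold exactly is an assumption, since the text excludes assemblies with N50 < 120,000.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths in bp."""
    ordered = sorted(contig_lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")


MIN_N50 = 120_000  # threshold stated in the text

# Hypothetical draft assemblies described by their contig lengths.
assemblies = {
    "genome_A": [400_000, 300_000, 150_000, 50_000],
    "genome_B": [100_000, 90_000, 80_000, 70_000, 60_000],
}
passing = {name for name, lengths in assemblies.items() if n50(lengths) >= MIN_N50}
```

Here genome_A has an N50 of 300,000 bp and is retained, while genome_B has an N50 of 80,000 bp and would be excluded.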

To evaluate how MGEs affect SNP subtyping, we collected 461 genomes from nine outbreak clusters detected by PulseNet USA in 2018. These clusters were selected because of the suspected presence of plasmids in outbreak isolates. Each cluster contained genomes of a specific serotype, including Agbeni (n = 45), Chailey (n = 14), Heidelberg (n = 94), Litchfield (n = 8), Newport (n = 29), Paratyphi B (n = 25), Saintpaul (n = 110), Thompson (n = 119), and Typhimurium (n = 17).

Sequencing read trimming, filtering, and de novo assembly. Raw sequencing reads were trimmed and filtered by Trimmomatic v0.36 (51). The leading three and the trailing three nucleotides were removed from the reads, and a 4-nucleotide sliding window was used to remove nucleotides from the 3′ ends when the average Phred score dropped below 20. Reads shorter than 50 bp were discarded. Trimmed and filtered reads were assembled into draft genomes (contigs) using SPAdes v3.9.0 (52) with the “--careful” option. The quality of the draft genome assembly was evaluated with QUAST v4.5 (53). Only assemblies with an N50 contig size of >120,000 bp were used for further analysis.
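The trimming rules above can be sketched in a few lines of Python. This is a simplified illustration that follows the description in the text literally; it is not Trimmomatic's implementation and may differ from it in edge cases. The function names and default arguments are hypothetical apart from the values quoted in the text (three nucleotides per end, a 4-nucleotide window, a mean Phred score of 20, and a 50-bp minimum length).

```python
def sliding_window_trim(quals, window=4, min_mean_q=20):
    """Return how many bases to keep, counted from the 5' end.

    Windows are scanned from the 5' end; the read is cut at the start of
    the first window whose mean Phred score is below min_mean_q."""
    n = len(quals)
    if n < window:
        return n  # too short to evaluate; left untouched in this sketch
    for start in range(n - window + 1):
        if sum(quals[start:start + window]) < min_mean_q * window:
            return start
    return n


def trim_read(seq, quals, clip=3, window=4, min_mean_q=20, min_len=50):
    """Clip both ends, apply the sliding window, then filter by length.
    Returns (seq, quals), or None if the read is discarded."""
    seq = seq[clip:len(seq) - clip]
    quals = quals[clip:len(quals) - clip]
    keep = sliding_window_trim(quals, window, min_mean_q)
    seq, quals = seq[:keep], quals[:keep]
    if len(seq) < min_len:
        return None
    return seq, quals
```

A read of 80 bases whose last 20 bases have Phred score 10 (and the rest 30) is cut where the window mean first falls below 20, while a uniformly high-quality read loses only the clipped ends.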

SNP calling and construction of phylogenetic trees. High-quality SNPs were identified using Lyve-SET (17) with default settings and CFSAN SNP Pipeline v2.0.2 (18) with modifications to default settings to include clustered SNPs when needed. Specifically, the maximum number of SNPs allowed in a window (window size, 1,000) was set to 1,000 to allow the detection of clustered SNPs. SNPs identified by the CFSAN SNP pipeline were used to build maximum-likelihood phylogenetic trees using PhyML (54). For SNP subtyping, a draft genome assembly of an isolate from each cluster was used as the reference for read mapping and SNP calling. To study the impact of plasmids on SNP typing of an outbreak cluster, the in-cluster genomes with the most and the least detected plasmid content, as measured by total plasmid length, were used as reference genomes.
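To make concrete how SNPs on plasmid contigs of the reference can change pairwise SNP distances, the following is a minimal sketch that computes distances with and without a set of excluded contigs. It is an illustration only, not the behavior of Lyve-SET or the CFSAN SNP Pipeline; the data layout, the function name `pairwise_snp_distances`, and the contig and isolate names in the usage example are all hypothetical.

```python
from itertools import combinations


def pairwise_snp_distances(snp_sites, alleles, excluded_contigs=frozenset()):
    """Count pairwise allele differences over SNP columns.

    snp_sites: list of (contig_id, position), one entry per SNP column.
    alleles:   dict mapping isolate -> string of base calls, one per column.
    Columns on excluded contigs are skipped, as are columns where either
    isolate has a missing call ('-' or 'N')."""
    keep = [i for i, (contig, _pos) in enumerate(snp_sites)
            if contig not in excluded_contigs]
    dist = {}
    for a, b in combinations(sorted(alleles), 2):
        d = 0
        for i in keep:
            x, y = alleles[a][i], alleles[b][i]
            if x in "-N" or y in "-N":
                continue
            if x != y:
                d += 1
        dist[(a, b)] = d
    return dist


# Hypothetical example: two chromosomal and two plasmid-borne SNP sites.
sites = [("chr1", 100), ("chr1", 2500), ("plasmid1", 40), ("plasmid1", 90)]
calls = {"iso1": "ACGT", "iso2": "ACTA", "iso3": "GCGT"}
all_sites = pairwise_snp_distances(sites, calls)
masked = pairwise_snp_distances(sites, calls, excluded_contigs={"plasmid1"})
```

In this toy input, iso1 and iso2 differ only at the two plasmid-borne sites, so their distance drops from 2 to 0 once the plasmid contig is excluded.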

MGE identification. Five major categories of MGEs, including prophages, IS, integrons, ICEs, and plasmids, were identified from draft genome assemblies. Intact prophages were identified by PHASTER (32). Insertion sequences were identified by ISEScan v1.5.4 (55). Integrons were identified by Integron Finder v2.0 with the --local_max option (56). ICEs were detected by using CONJScan of MacSyFinder (57). Default settings were used for these programs. Contigs of plasmid origin were identified, clustered, and typed by MOB-suite using default settings (44). The plasmid counts are based on the number of plasmid clusters, and the plasmid lengths are based on the summed length of plasmid-like contigs. For example, if three contigs were identified as plasmid sequences and they all belonged to the same plasmid cluster, the number of plasmids would be counted as one. Prophages, IS, integrons, and ICEs that were not on plasmid contigs were assigned as chromosomal MGEs.
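The counting rule for plasmids can be written down as a short sketch. The input layout (one tuple per contig with a cluster identifier, or `None` for contigs not assigned to a plasmid cluster) and the function name `summarize_plasmids` are assumptions made for illustration; they do not reflect the MOB-suite output format.

```python
from collections import defaultdict


def summarize_plasmids(contigs):
    """Summarize plasmid content of one genome.

    contigs: iterable of (contig_id, length_bp, cluster_id), where
             cluster_id is None for contigs not assigned to a plasmid.
    Returns (plasmid_count, total_plasmid_length_bp): the number of
    distinct plasmid clusters and the summed length of their contigs."""
    per_cluster = defaultdict(int)
    for _contig_id, length_bp, cluster_id in contigs:
        if cluster_id is not None:
            per_cluster[cluster_id] += length_bp
    return len(per_cluster), sum(per_cluster.values())
```

With three plasmid-like contigs that all belong to the same cluster, the function returns a count of one and a length equal to the sum of the three contigs, matching the example in the text.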

MGE clustering. Predicted prophage, IS, integron, and ICE sequences were grouped into a single cluster if they shared >75% sequence identity as determined by VSEARCH analysis (58). Clustering analysis of plasmids was performed as part of the plasmid characterization analysis by MOB-suite. Single-linkage clustering was performed using default distance thresholds that had been heavily optimized for publicly available Enterobacteriaceae plasmids (44). Each cluster was considered to be a distinct MGE.
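Single-linkage clustering, as mentioned above for plasmids, can be sketched with a union-find structure: two sequences fall into the same cluster whenever a chain of pairwise distances at or below the threshold connects them. This is a generic illustration, not MOB-suite's code; the function name, the input format, and the threshold of 0.05 in the usage example are hypothetical and are not the defaults used in the study.

```python
def single_linkage_clusters(items, distances, threshold):
    """Group items by single linkage.

    items:     list of hashable, sortable ids.
    distances: dict mapping (a, b) -> distance for the pairs that were compared.
    Returns a sorted list of sorted clusters."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for x in items:
        clusters.setdefault(find(x), []).append(x)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical distances between four plasmid-like sequences.
example = single_linkage_clusters(
    ["p1", "p2", "p3", "p4"],
    {("p1", "p2"): 0.02, ("p2", "p3"): 0.04,
     ("p1", "p3"): 0.09, ("p3", "p4"): 0.30},
    threshold=0.05,
)
```

Note that p1 and p3 end up together even though their direct distance exceeds the threshold, because both are close to p2; this chaining is the defining property of single linkage.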

Data availability. Accession numbers of genomes used in this study are summarized in Table S1.

ACKNOWLEDGMENT

This study was funded by Nestlé Research.

FOOTNOTES

    • Received 28 August 2019.
    • Accepted 30 September 2019.
    • Accepted manuscript posted online 4 October 2019.
  • Supplemental material for this article may be found at https://doi.org/10.1128/AEM.01985-19.

  • Copyright © 2019 American Society for Microbiology.

All Rights Reserved.

REFERENCES

  1. Scallan E, Hoekstra RM, Angulo FJ, Tauxe RV, Widdowson MA, Roy SL, Jones JL, Griffin PM. 2011. Foodborne illness acquired in the United States: major pathogens. Emerg Infect Dis 17:7–15. doi:10.3201/eid1701.p11101.
  2. Deng X, den Bakker HC, Hendriksen RS. 2016. Genomic epidemiology: whole-genome-sequencing-powered surveillance and outbreak investigation of foodborne bacterial pathogens. Annu Rev Food Sci Technol 7:353–374. doi:10.1146/annurev-food-041715-033259.
  3. Jackson BR, Tarr C, Strain E, Jackson KA, Conrad A, Carleton H, Katz LS, Stroika S, Gould LH, Mody RK, Silk BJ, Beal J, Chen Y, Timme R, Doyle M, Fields A, Wise M, Tillman G, Defibaugh-Chavez S, Kucerova Z, Sabol A, Roache K, Trees E, Simmons M, Wasilenko J, Kubota K, Pouseele H, Klimke W, Besser J, Brown E, Allard M, Gerner-Smidt P. 2016. Implementation of nationwide real-time whole-genome sequencing to enhance listeriosis outbreak detection and investigation. Clin Infect Dis 63:380–386. doi:10.1093/cid/ciw242.
  4. Carleton H, Gerner-Smidt P. 2016. Whole-genome sequencing is taking over foodborne disease surveillance. Microbe Wash DC 11:311–317.
  5. Rouzeau-Szynalski K, Barretto C, Fournier C, Moine D, Gimonet J, Baert L. 2019. Whole-genome sequencing used in an industrial context reveals a Salmonella laboratory cross-contamination. Int J Food Microbiol 298:39–43. doi:10.1016/j.ijfoodmicro.2019.03.007.
  6. Frost LS, Leplae R, Summers AO, Toussaint A. 2005. Mobile genetic elements: the agents of open source evolution. Nat Rev Microbiol 3:722–732. doi:10.1038/nrmicro1235.
  7. Bennett PM. 2008. Plasmid encoded antibiotic resistance: acquisition and transfer of antibiotic resistance genes in bacteria. Br J Pharmacol 153(Suppl 1):S347–S357. doi:10.1038/sj.bjp.0707607.
  8. Silva C, Puente JL, Calva E. 2017. Salmonella virulence plasmid: pathogenesis and ecology. Pathog Dis 75:ftx070. doi:10.1093/femspd/ftx070.
  9. Marcus SL, Brumell JH, Pfeifer CG, Finlay BB. 2000. Salmonella pathogenicity islands: big virulence in small packages. Microbes Infect 2:145–156. doi:10.1016/S1286-4579(00)00273-2.
  10. Siguier P, Gourbeyre E, Chandler M. 2014. Bacterial insertion sequences: their genomic impact and diversity. FEMS Microbiol Rev 38:865–891. doi:10.1111/1574-6976.12067.
  11. Gillings MR. 2014. Integrons: past, present, and future. Microbiol Mol Biol Rev 78:257–277. doi:10.1128/MMBR.00056-13.
  12. Johnson CM, Grossman AD. 2015. Integrative and conjugative elements (ICEs): what they do and how they work. Annu Rev Genet 49:577–601. doi:10.1146/annurev-genet-112414-055018.
  13. Alikhan NF, Zhou Z, Sergeant MJ, Achtman M. 2018. A genomic overview of the population structure of Salmonella. PLoS Genet 14:e1007261. doi:10.1371/journal.pgen.1007261.
  14. Cody AJ, Bray JE, Jolley KA, McCarthy ND, Maiden M. 2017. Core genome multilocus sequence typing scheme for stable, comparative analyses of Campylobacter jejuni and C. coli human disease isolates. J Clin Microbiol 55:2086–2097. doi:10.1128/JCM.00080-17.
  15. Moura A, Criscuolo A, Pouseele H, Maury MM, Leclercq A, Tarr C, Bjorkman JT, Dallman T, Reimer A, Enouf V, Larsonneur E, Carleton H, Bracq-Dieye H, Katz LS, Jones L, Touchon M, Tourdjman M, Walker M, Stroika S, Cantinelli T, Chenal-Francisque V, Kucerova Z, Rocha EP, Nadon C, Grant K, Nielsen EM, Pot B, Gerner-Smidt P, Lecuit M, Brisse S. 2016. Whole genome-based population biology and epidemiological surveillance of Listeria monocytogenes. Nat Microbiol 2:16185. doi:10.1038/nmicrobiol.2016.185.
  16. Leache A, Oaks J. 2017. The utility of single nucleotide polymorphism (SNP) data in phylogenetics. Annu Rev Ecol Evol Syst 48:69–84. doi:10.1146/annurev-ecolsys-110316-022645.
  17. Katz LS, Griswold T, Williams-Newkirk AJ, Wagner D, Petkau A, Sieffert C, Van Domselaar G, Deng X, Carleton HA. 2017. A comparative analysis of the Lyve-SET phylogenomics pipeline for genomic epidemiology of foodborne pathogens. Front Microbiol 8:375. doi:10.3389/fmicb.2017.00375.
  18. Davis S, Pettengill JB, Luo Y, Payne J, Shpuntoff A, Rand H, Strain E. 2015. CFSAN SNP Pipeline: an automated method for constructing SNP matrices from next-generation sequence data. PeerJ Comput Sci 1:e20. doi:10.7717/peerj-cs.20.
  19. Pightling AW, Pettengill JB, Luo Y, Baugher JD, Rand H, Strain E. 2018. Interpreting whole-genome sequence analyses of foodborne bacteria for regulatory applications and outbreak investigations. Front Microbiol 9:1482. doi:10.3389/fmicb.2018.01482.
  20. Kwong JC, Mercoulia K, Tomita T, Easton M, Li HY, Bulach DM, Stinear TP, Seemann T, Howden BP. 2016. Prospective whole-genome sequencing enhances national surveillance of Listeria monocytogenes. J Clin Microbiol 54:333–342. doi:10.1128/JCM.02344-15.
  21. Katz LS. 2017. LyveSET FAQ. https://github.com/lskatz/lyve-SET/blob/master/docs/FAQ.md#how-do-i-choose-a-reference-genome. Accessed 28 February 2019.
  22. Luna S, Krishnasamy V, Saw L, Smith L, Wagner J, Weigand J, Tewell M, Kellis M, Penev R, McCullough L, Eason J, McCaffrey K, Burnett C, Oakeson K, Dimond M, Nakashima A, Barlow D, Scherzer A, Sarino M, Schroeder M, Hassan R, Basler C, Wise M, Gieraltowski L. 2018. Outbreak of Escherichia coli O157:H7 infections associated with exposure to animal manure in a rural community—Arizona and Utah, June-July 2017. MMWR Morb Mortal Wkly Rep 67:659–662. doi:10.15585/mmwr.mm6723a2.
  23. Luna S, Taylor M, Galanis E, Asplin R, Huffman J, Wagner D, Hoang L, Paccagnella A, Shelton S, Ladd-Wilson S, Seelman S, Whitney B, Elliot E, Atkinson R, Marshall K, Basler C. 2018. Outbreak of Salmonella Chailey infections linked to precut coconut pieces—United States and Canada, 2017. MMWR Morb Mortal Wkly Rep 67:1098–1100. doi:10.15585/mmwr.mm6739a5.
  24. Paszkiewicz K, Studholme DJ. 2010. De novo assembly of short sequence reads. Brief Bioinformatics 11:457–472. doi:10.1093/bib/bbq020.
  25. Carattoli A, Zankari E, García-Fernández A, Voldby Larsen M, Lund O, Villa L, Møller Aarestrup F, Hasman H. 2014. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother 58:3895–3903. doi:10.1128/AAC.02412-14.
  26. Kunin V, Goldovsky L, Darzentas N, Ouzounis CA. 2005. The net of life: reconstructing the microbial phylogenetic network. Genome Res 15:954–959. doi:10.1101/gr.3666505.
  27. Andam CP, Gogarten JP. 2011. Biased gene transfer in microbial evolution. Nat Rev Microbiol 9:543–555. doi:10.1038/nrmicro2593.
  28. Okoro CK, Kingsley RA, Connor TR, Harris SR, Parry CM, Al-Mashhadani MN, Kariuki S, Msefula CL, Gordon MA, de Pinna E, Wain J, Heyderman RS, Obaro S, Alonso PL, Mandomando I, MacLennan CA, Tapia MD, Levine MM, Tennant SM, Parkhill J, Dougan G. 2012. Intracontinental spread of human invasive Salmonella Typhimurium pathovariants in sub-Saharan Africa. Nat Genet 44:1215–1221. doi:10.1038/ng.2423.
  29. Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS, Pichon B, Baker S, Parry CM, Lambertsen LM, Shahinas D, Pillai DR, Mitchell TJ, Dougan G, Tomasz A, Klugman KP, Parkhill J, Hanage WP, Bentley SD. 2011. Rapid pneumococcal evolution in response to clinical interventions. Science 331:430–434. doi:10.1126/science.1198545.
  30. Deng X, Desai PT, den Bakker HC, Mikoleit M, Tolar B, Trees E, Hendriksen RS, Frye JG, Porwollik S, Weimer BC, Wiedmann M, Weinstock GM, Fields PI, McClelland M. 2014. Genomic epidemiology of Salmonella enterica serotype Enteritidis based on population structure of prevalent lineages. Emerg Infect Dis 20:1481–1489. doi:10.3201/eid2009.131095.
  31. Namouchi A, Didelot X, Schock U, Gicquel B, Rocha EP. 2012. After the bottleneck: genome-wide diversification of the Mycobacterium tuberculosis complex by mutation, recombination, and natural selection. Genome Res 22:721–734. doi:10.1101/gr.129544.111.
  32. Arndt D, Grant JR, Marcu A, Sajed T, Pon A, Liang Y, Wishart DS. 2016. PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res 44:W16–W21. doi:10.1093/nar/gkw387.
  33. Anonymous. 2019. Step-by-step workflow: Salmonella Agona, CFSAN SNP pipeline. https://snp-pipeline.readthedocs.io/en/latest/usage.html#step-by-step-workflow-salmonella-agona. Accessed 6 February 2019.
  34. Majowicz SE, Musto J, Scallan E, Angulo FJ, Kirk M, O’Brien SJ, Jones TF, Fazil A, Hoekstra RM, International Collaboration on Enteric Disease “Burden of Illness” Study. 2010. The global burden of nontyphoidal Salmonella gastroenteritis. Clin Infect Dis 50:882–889. doi:10.1086/650733.
  35. Grimont P, Weil F. 2007. Antigenic formulae of the Salmonella serovars. https://www.pasteur.fr/sites/default/files/veng_0.pdf. Accessed 7 February 2019.
  36. Deng X, Shariat N, Driebe EM, Roe CC, Tolar B, Trees E, Keim P, Zhang W, Dudley EG, Fields PI, Engelthaler DM. 2015. Comparative analysis of subtyping methods against a whole-genome-sequencing standard for Salmonella enterica serotype Enteritidis. J Clin Microbiol 53:212–218. doi:10.1128/JCM.02332-14.
  37. Anonymous. 2018. National Salmonella Surveillance. https://www.cdc.gov/nationalsurveillance/salmonella-surveillance.html. Accessed 5 February 2019.
  38. Aviv G, Tsyba K, Steck N, Salmon-Divon M, Cornelius A, Rahav G, Grassl GA, Gal-Mor O. 2014. A unique megaplasmid contributes to stress tolerance and pathogenicity of an emergent Salmonella enterica serovar Infantis strain. Environ Microbiol 16:977–994. doi:10.1111/1462-2920.12351.
  39. Guiney DG, Fierer J. 2011. The role of the spv genes in Salmonella pathogenesis. Front Microbiol 2:129. doi:10.3389/fmicb.2011.00129.
  40. diCenzo GC, Finan TM. 2017. The divided bacterial genome: structure, function, and evolution. Microbiol Mol Biol Rev 81:e00019-17. doi:10.1128/MMBR.00019-17.
  41. Matamoros S, van Hattem JM, Arcilla MS, Willemse N, Melles DC, Penders J, Vinh TN, Thi Hoa N, COMBAT Consortium, de Jong MD, Schultsz C. 2017. Global phylogenetic analysis of Escherichia coli and plasmids carrying the mcr-1 gene indicates bacterial diversity but plasmid restriction. Sci Rep 7:15364. doi:10.1038/s41598-017-15539-7.
  42. Smillie C, Garcillan-Barcia MP, Francia MV, Rocha EP, de la Cruz F. 2010. Mobility of plasmids. Microbiol Mol Biol Rev 74:434–452. doi:10.1128/MMBR.00020-10.
  43. Ilhan J, Kupczok A, Woehle C, Wein T, Hulter NF, Rosenstiel P, Landan G, Mizrahi I, Dagan T. 2019. Segregational drift and the interplay between plasmid copy number and evolvability. Mol Biol Evol 36:472–486. doi:10.1093/molbev/msy225.
  44. Robertson J, Nash J. 2018. MOB-suite: software tools for clustering, reconstruction, and typing of plasmids from draft assemblies. Microb Genom 4:000206. doi:10.1099/mgen.0.000206.
  45. Arredondo-Alonso S, Willems RJ, van Schaik W, Schürch AC. 2017. On the (im)possibility of reconstructing plasmids from whole-genome short-read sequencing data. Microb Genom 3:e000128. doi:10.1099/mgen.0.000128.
  46. Zhang S, Li S, Gu W, den Bakker H, Boxrud D, Taylor A, Roe C, Driebe E, Engelthaler DM, Allard M, Brown E, McDermott P, Zhao S, Bruce BB, Trees E, Fields PI, Deng X. 2019. Zoonotic source attribution of Salmonella enterica serotype Typhimurium using genomic surveillance data, United States. Emerg Infect Dis 25:82–91. doi:10.3201/eid2501.180835.
  47. Zhang S, Yin Y, Jones MB, Zhang Z, Deatherage Kaiser BL, Dinsmore BA, Fitzgerald C, Fields PI, Deng X. 2015. Salmonella serotype determination utilizing high-throughput genome sequencing data. J Clin Microbiol 53:1685–1692. doi:10.1128/JCM.00323-15.
  48. Allard MW, Strain E, Melka D, Bunning K, Musser SM, Brown EW, Timme R. 2016. Practical value of food pathogen traceability through building a whole-genome sequencing network and database. J Clin Microbiol 54:1975–1983. doi:10.1128/JCM.00081-16.
  49. Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. 2016. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol 17:132. doi:10.1186/s13059-016-0997-x.
  50. Katz LS. 2017. MashTree. https://github.com/lskatz/mashtree. Accessed 4 February 2019.
  51. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi:10.1093/bioinformatics/btu170.
  52. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi:10.1089/cmb.2012.0021.
  53. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075. doi:10.1093/bioinformatics/btt086.
  54. Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 59:307–321. doi:10.1093/sysbio/syq010.
  55. Xie Z, Tang H. 2017. ISEScan: automated identification of insertion sequence elements in prokaryotic genomes. Bioinformatics 33:3340–3347. doi:10.1093/bioinformatics/btx433.
  56. Cury J, Jove T, Touchon M, Neron B, Rocha EP. 2016. Identification and analysis of integrons and cassette arrays in bacterial genomes. Nucleic Acids Res 44:4539–4550. doi:10.1093/nar/gkw319.
  57. Abby SS, Neron B, Menager H, Touchon M, Rocha EP. 2014. MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems. PLoS One 9:e110726. doi:10.1371/journal.pone.0110726.
  58. Rognes T, Flouri T, Nichols B, Quince C, Mahe F. 2016. VSEARCH: a versatile open source tool for metagenomics. PeerJ 4:e2584. doi:10.7717/peerj.2584.
Implications of Mobile Genetic Elements for Salmonella enterica Single-Nucleotide Polymorphism Subtyping and Source Tracking Investigations
Shaoting Li, Shaokang Zhang, Leen Baert, Balamurugan Jagadeesan, Catherine Ngom-Bru, Taylor Griswold, Lee S. Katz, Heather A. Carleton, Xiangyu Deng
Applied and Environmental Microbiology Nov 2019, 85 (24) e01985-19; DOI: 10.1128/AEM.01985-19


KEYWORDS

Salmonella
mobile genetic elements
plasmid
subtyping
whole-genome sequencing
SNP
WGS


 

Print ISSN: 0099-2240; Online ISSN: 1098-5336