Genetic Diversity of Clostridium sporogenes PA 3679 Isolates Obtained from Different Sources as Resolved by Pulsed-Field Gel Electrophoresis and High-Throughput Sequencing

ABSTRACT Clostridium sporogenes PA 3679 is a nonpathogenic, nontoxic model organism for proteolytic Clostridium botulinum used in the validation of conventional thermal food processes due to its ability to produce highly heat-resistant endospores. Because of its public safety importance, the uncertain taxonomic classification and genetic diversity of PA 3679 are concerns. Therefore, isolates of C. sporogenes PA 3679 were obtained from various sources and characterized using pulsed-field gel electrophoresis (PFGE) and whole-genome sequencing. The phylogenetic relatedness and genetic variability were assessed based on 16S rRNA gene sequencing and whole-genome single nucleotide polymorphism (SNP) analysis. All C. sporogenes PA 3679 isolates were categorized into two clades (clade I containing ATCC 7955 NCA3679 isolates 1961-2, 1990, and 2007 and clade II containing PA 3679 isolates NFL, UW, FDA, and Campbell and ATCC 7955 NCA3679 isolate 1961-4). The 16S maximum likelihood (ML) tree clustered both clades within proteolytic C. botulinum strains, with clade I forming a distinct cluster with other C. sporogenes non-PA 3679 strains. SNP analysis revealed that clade I isolates were more similar to the genomic reference PA 3679 (NCTC8594) genome (GenBank accession number AGAH00000000.1) than clade II isolates were. The genomic reference C. sporogenes PA 3679 (NCTC8594) genome and clade I C. sporogenes isolates were genetically distinct from those obtained from other sources (University of Wisconsin, National Food Laboratory, U.S. Food and Drug Administration, and Campbell's Soup Company). Thermal destruction studies revealed that clade I isolates were more sensitive to high temperature than clade II isolates were. Considering the widespread use of C. sporogenes PA 3679 and its genetic information in numerous studies, the accurate identification and genetic characterization of C. sporogenes PA 3679 are of critical importance.

Food-borne botulism is a neuroparalytic disease that results from the ingestion of botulinum neurotoxin (BoNT) produced by clostridia, including Clostridium botulinum and rare strains of Clostridium baratii and Clostridium butyricum. C. botulinum is a heterogeneous species comprised of four distinct groups (groups I to IV) of strains that differ based on genetic and phenotypic properties (1). Group I (proteolytic) C. botulinum strains can also be uniquely differentiated from the other groups based on their ability to form highly heat-resistant endospores. Spores of this organism are ubiquitously found in the environment and are generally presumed to be a natural contaminant of foods that come in contact with the soil (2). The geographical distribution of C. botulinum varies by serotype. For example, in the United States, spores of C. botulinum strains producing type B toxin are typically found in the soils east of the Mississippi River, and type A strains are found in the western region of the United States. Serotype E strains are commonly found in the soil sediments of the Great Lakes regions and in Alaska (2,3).
Spore germination and outgrowth of group I C. botulinum can occur in media and foods that have a pH greater than 4.6 (4-6). As a result, commercial processing technologies are designed to deliver a process that eliminates the threat of toxin production by group I C. botulinum spores, the target pathogen, in shelf-stable low-acid canned food products (7). However, production of the C. botulinum spore crops required to validate these processes is exceedingly challenging due to potentially low spore yields and variable resistance to high temperature between batches, which highlights the need for a nontoxigenic surrogate that produces highly heat-resistant spores at a consistent yield.
Clostridium sporogenes PA 3679 is a nonpathogenic, nontoxic, Gram-positive putrefactive anaerobe that produces spores that are more heat resistant than spores of group I C. botulinum (8). Therefore, PA 3679 was widely adopted and used as the nontoxigenic model organism for proteolytic group I C. botulinum for validating thermal processing technologies for low-acid canned foods. PA 3679 was first isolated in 1927 from spoiled canned corn by E. J. Cameron at the National Canner's Association (NCA) (9), and the origin and identity of PA 3679 have been thoroughly reviewed by Brown et al. (10). Although PA 3679 is classified as a strain of C. sporogenes, its true identity remains uncertain. On the basis of serological testing, McClung (8) argued that PA 3679 was not a nontoxic C. botulinum strain but an undescribed species. Gross et al. (11) reported a private communication from Cameron that although PA 3679 resembled C. sporogenes, it is likely not identical to that species. That study also showed that PA 3679 was serologically identical to two other putrefactive anaerobes, S2 and 9x, isolated from spoiled canned meats; however, all three strains were culturally and serologically distinct from C. sporogenes strain Spray (11). The nontoxigenic counterpart for group I C. botulinum is considered to be C. sporogenes (12,13). Studies investigating the genetic relationship of C. sporogenes and C. botulinum that employed molecular techniques such as multiple-locus variable-number tandem-repeat analysis (MLVA) (14,15), DNA microarrays and analysis of the flagellar glycosylation island (FGI) (16), DNA-rRNA hybridization (17), 16S rRNA gene sequencing (18,19), and multilocus sequence typing (MLST) (20,21) have been reported and included other strains of C. sporogenes, but not PA 3679. A DNA-DNA hybridization study by Lee and Riemann (22) revealed that PA 3679 was 100% homologous to C. botulinum strain 62A, yet in a similar study, PA 3679 shared only 83% homology to C. botulinum strain A190 and 68% homology with C. sporogenes strain J-53 (23). Most recently, in a study by Weigand et al. (24), whole-genome sequencing (WGS) was employed to show that strains of C. sporogenes and group I C. botulinum were distinctly grouped into two clades based on 2,016 core orthologues shared between these organisms. The PA 3679 strain included in that study was C. sporogenes PA 3679 (NCTC8594) that was sequenced by Bradbury et al. (25).
Since 1927, PA 3679 has been globally distributed and studied in numerous laboratories; it was also deposited into the American Type Culture Collection (ATCC) in 1941 or earlier (10) and is listed as ATCC 7955. Additionally, the draft genome sequence of Clostridium sporogenes PA 3679 (National Collection of Type Cultures [NCTC] 8594) has been completed and is available through GenBank (accession number AGAH00000000.1) (25). According to the analysis of Bradbury et al., the 16S rRNA gene sequence of NCTC8594 shared 99 to 100% nucleotide similarity with that of group I C. botulinum and C. sporogenes strains, and MLST using seven group I C. botulinum housekeeping genes showed that NCTC8594 clustered with C. botulinum type A strain A207 (25). In this study, isolates of PA 3679 were collected from five different sources, where they were maintained by various laboratory culturing techniques over the past 88 years. Pulsed-field gel electrophoresis (PFGE) and whole-genome sequencing were used to examine the genetic diversity of these isolates. A whole-genome single nucleotide polymorphism (SNP)-based approach was used to assess the relatedness among the isolates and their relatedness to C. sporogenes PA 3679 (NCTC8594) and C. sporogenes ATCC 15579 (GenBank accession number NZ_ABKW00000000.2).

MATERIALS AND METHODS
Bacterial strains. Spore crops of C. sporogenes PA 3679 were obtained from five different sources, and the details are presented in Table 1. Three distinct freeze-dried spore crop lots of ATCC 7955 were received from the American Type Culture Collection (ATCC) and labeled for the year the crop was preserved (1961, 1990, and 2007). ATCC 15579 was also purchased from the ATCC and included in this study because it is a non-PA 3679 C. sporogenes strain and the nucleotide sequence is available in GenBank (accession number NZ_ABKW00000000.2). The freeze-dried vials received from ATCC were opened aseptically according to the instructions provided by ATCC, inoculated into sterile TPGY (Trypticase-peptone-glucose-yeast extract) broth, and incubated anaerobically at 37°C. The Campbell isolate was received from the Campbell's Soup Company as a liquid culture containing ~10^8 spores/ml, harvested in December 2003. The NFL isolate was received as a liquid spore culture from the National Food Laboratory (NFL). The UW isolate was received from the Eric A. Johnson laboratory at the University of Wisconsin-Madison as a 72-h TPGY broth culture revived from a freeze-dried sample from 13 June 1972 labeled "PA 3679-66." The FDA isolate was obtained from the U.S. Food and Drug Administration (FDA) culture collection (Bedford Park, IL) as a spore crop prepared in 2006 and suspended in water. All cultures from the various sources were inoculated (1:100) into 10 ml of TPGY broth and incubated anaerobically at 37°C for 24 to 48 h. Each culture was subsequently streaked for isolation onto egg yolk agar (EYA) and incubated anaerobically at 37°C for 24 to 48 h to assess the purity of the cultures by examining colony morphology.
Spore crop production. Spore crops were cultivated by inoculating 1 to 2 liters of TPGY broth (1:100) with an 18- to 24-h culture of each strain. The TPGY broth was incubated at 37°C anaerobically for up to 28 days. Spores were harvested by centrifuging the culture at 10,000 × g for 15 min at 4°C using a Sorvall Evolution RC (Thermo Scientific, Waltham, MA) centrifuge fitted with an SLA-3000 rotor. The spore pellet was washed four times with sterile deionized water. To eliminate vegetative cells, the spore samples were sonicated using a Vibracell VC-750 sonicator (Sonics and Materials, Inc., Newton, CT) at 50% amplitude for 20 min in an ice-water bath. The sonicated spore suspensions were centrifuged at 10,000 × g for 15 min at 4°C and washed six times in deionized water. The final spore pellet was resuspended in 10 ml of sterile deionized water. Spore crops were heat shocked at 80°C for 10 min and enumerated by pour plating using peptone-yeast extract-glucose-starch (PYGS) agar medium without glucose (26). The plates were incubated anaerobically at 37°C for 5 days, and then spore-forming units (SFU)/ml were counted.
Heat resistance studies. The thermal resistance of selected spore crops was evaluated at various temperatures (94 to 121°C). Spore crops were diluted to approximately 10^5 to 10^6 spores/ml in phosphate buffer (0.067 M, pH 7.0), and 1-ml quantities of diluted spores were dispensed into duplicate sterile nuclear magnetic resonance (NMR) tubes (Kimble Chase, Portsmouth, NH). The NMR tubes were heat sealed and completely submerged in a calibrated, heated Fluke high-precision bath (model 7103; Fluke, Everett, WA) with silicone oil as the heat transfer fluid. The spores were processed at 94°C, 97°C, 100°C, 103°C, 105°C, 108°C, 111°C, 114°C, 117°C, and 121°C for 5 min. The tubes were immediately placed in ice water to stop the heat treatment. The spore suspensions were removed from the tubes using a sterile blunt needle syringe, and the contents were transferred to duplicate 10-ml TPGY broth tubes. The samples were incubated anaerobically at 37°C for 5 days and visually examined for growth as exhibited by turbidity.
Isolates ATCC 7955 NCA3679 1961-2 and 2007, as well as isolate FDA, were selected for additional thermal studies to determine the D-values (the time required to reduce a population by 90%) at the highest temperatures at which survivors were obtained for each particular isolate (i.e., D 97°C [the D-value at 97°C, which was the highest temperature at which survivors were found] for isolate 1961-2; D 100°C for isolate 2007; D 117°C for isolate FDA). The spore crops of the C. sporogenes isolates 1961-2, 2007, and FDA were diluted to 10^6 to 10^7 spores/ml in phosphate buffer (0.067 M, pH 7.0) and placed into sterile NMR tubes (Kimble Chase). The NMR tubes containing spores of selected isolates were processed at 97°C ± 0.1°C for 2, 4, 6, 8, 10, 12, and 14 min (isolate 1961-2), 100°C ± 0.1°C for 2, 3, 4, 5, 6, 8, and 10 min (isolate 2007), or 117°C ± 0.1°C for 2, 4, 5, 6, and 8 min (isolate FDA). After each time interval, NMR tubes containing spores were immediately placed in ice water to halt further heating of the samples. The spore suspensions were removed from the tubes as described above, serially diluted in 0.1 M Butterfield's phosphate buffer (BPB) (pH 7.2), and pour plated using PYGS agar. The plates were incubated anaerobically for 5 to 7 days at 37°C prior to enumeration. Survivors were enumerated as log CFU per milliliter. The D-values of each spore crop isolate were calculated from the negative slope of the regression lines using the linear portions of the survivor curves. At least six replicates of each experiment were conducted.
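As an illustration of the D-value calculation described above, the following minimal Python sketch fits a least-squares line to log10 survivor counts versus heating time and reports the D-value as the negative reciprocal of the slope. The function name `d_value` and the example data are hypothetical and are not taken from this study. The sketch assumes that only the linear portion of the survivor curve is supplied.

```python
def d_value(times_min, log10_counts):
    """Return the D-value (min): the negative reciprocal of the
    least-squares slope of log10 survivors versus heating time."""
    n = len(times_min)
    if n < 2 or n != len(log10_counts):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times_min) / n
    mean_y = sum(log10_counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_min, log10_counts))
    slope = sxy / sxx  # log10 units per minute
    if slope >= 0:
        raise ValueError("survivor counts do not decline; D-value undefined")
    return -1.0 / slope


if __name__ == "__main__":
    # Hypothetical survivor curve (illustration only, not study data):
    # a loss of 0.5 log10 CFU/ml per minute of heating.
    times = [0, 2, 4, 6, 8]
    logs = [6.0, 5.0, 4.0, 3.0, 2.0]
    print(f"D = {d_value(times, logs):.2f} min")
```

With real data, each replicate would typically be fitted separately or pooled before regression; that choice is not specified here and is left to the analyst.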
PFGE. Isolated colonies of each isolate were selected, inoculated into 10 ml of fresh TPGY broth, and incubated anaerobically at 37°C until an optical density at 600 nm (OD600) of 0.6 was reached. Pulsed-field gel electrophoresis (PFGE) plug samples were prepared using 1.5-ml culture aliquots, and restriction digestion of the DNA samples embedded in agarose plugs was performed using restriction endonucleases XhoI (at 37°C) and SmaI (at 25°C) as previously described (26). The sample plugs were rinsed with 0.5× Tris-borate-EDTA (TBE) (45 mM Tris-borate, 1 mM EDTA [pH 8.0]), loaded onto a 1% agarose gel, and electrophoresed in 0.5× TBE containing thiourea (4.5 µg/ml) using a contour-clamped homogeneous electric field system (CHEF Mapper; Bio-Rad, Hercules, CA) with the following parameters: initial switch time of 0.5 s, final switch time of 40 s, and voltage of 6 V/cm at 14°C for 22 h. The gels were stained with ethidium bromide (1 µg/ml) for 20 min, destained in ultrapure water for 1 h, and visualized using a GelDocXRS+ camera. The restriction banding patterns generated using XhoI and SmaI were analyzed using the BioNumerics software, version 6.5 (Applied Maths, Sint-Martens-Latem, Belgium), according to the standardized PulseNet protocol (27,28). Similarity between the restriction banding patterns was determined using the Dice coefficient correlation. The dendrograms were constructed from the average of the composite data sets generated by SmaI and XhoI using the unweighted pair group method using average linkages (UPGMA). The optimization value and position tolerance were 1.5%.
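The Dice coefficient used for band-based comparison can be sketched as follows. This is a simplified illustration and does not reproduce the BioNumerics matching algorithm: it assumes that band positions have already been normalized to the range 0 to 1, and it treats the 1.5% position tolerance as an absolute window of 0.015 on that scale. The function name and the band positions are hypothetical.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.015):
    """Dice coefficient 2*m / (n_a + n_b), where m is the number of
    matched bands. Two bands match when their normalized positions
    differ by no more than `tolerance`; each band is matched once."""
    unmatched_b = sorted(bands_b)
    matches = 0
    for a in sorted(bands_a):
        for i, b in enumerate(unmatched_b):
            if abs(a - b) <= tolerance:
                matches += 1
                del unmatched_b[i]  # a band cannot be matched twice
                break
    total = len(bands_a) + len(bands_b)
    return 2.0 * matches / total if total else 1.0


if __name__ == "__main__":
    # Hypothetical normalized band positions for two lanes.
    lane_1 = [0.10, 0.25, 0.40, 0.60, 0.80]
    lane_2 = [0.10, 0.26, 0.45, 0.60, 0.80]
    print(f"Dice similarity: {dice_similarity(lane_1, lane_2):.2f}")
```

A matrix of such pairwise similarities is the input that an average-linkage (UPGMA) clustering step would then turn into a dendrogram.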
Genomic DNA extraction. C. sporogenes isolates were inoculated (1:100) into 10 ml of TPGY broth and incubated anaerobically overnight at 37°C. From the 18-h cultures, 3.0 ml was centrifuged using an Eppendorf microcentrifuge 5415R (Eppendorf, Hauppauge, NY) at 10,000 × g for 10 min to pellet the bacteria. The cell pellet was washed twice with TES (50 mM Tris-HCl, 25 mM EDTA [pH 8.0], 6.7% sucrose) buffer and stored at −80°C prior to DNA extraction using the MagAttract high-molecular-weight DNA kit for Gram-positive bacteria (Qiagen, Valencia, CA) according to the manufacturer's instructions, with some modifications. The cell pellets were thawed in a 37°C water bath and resuspended in 300 µl of buffer P1 (50 mM Tris, 10 mM EDTA [pH 8.0]). Lysozyme (100 mg/ml) was prepared in water, and 40 µl of the lysozyme solution was added to the cell suspension. The cells were incubated at 37°C for 2 h using a Thermo-Mixer C (Eppendorf) set at 900 rpm. After cell lysis, 40 µl of proteinase K, as supplied, was added and incubated at 56°C and 900 rpm for 1 h. RNase A (2 mg/ml) was added to the cell suspension, which was subsequently incubated at 25°C for 2 min. Next, 300 µl of the kit buffer AL was added, followed by 30 µl of the MagAttract suspension G and 560 µl of buffer MB. The samples were mixed by pulse vortexing and incubated at 25°C at 1,400 rpm for 3 min. The remaining steps of the DNA extraction were followed according to the manufacturer's instructions. The DNA was eluted in 50 µl of ultrapure water. The concentration and quality of each DNA sample were evaluated using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Wilmington, DE) and the Qubit dsDNA BR assay kit (Life Technologies, Carlsbad, CA) using a Qubit 2.0 fluorometer (Life Technologies). The DNA was stored at 4°C prior to further analysis.
Whole-genome sequencing (WGS) and phylogenetic analysis. Paired-end libraries were prepared using the Nextera XT DNA sample preparation kit version 2 (Illumina, San Diego, CA). Tagmentation of the input DNA (1 ng) and PCR amplification to add indexing primers were performed according to the manufacturer's instructions. The library insert size (range, 300 to 1,000 bp) was analyzed on an Agilent 2100 bioanalyzer using the Agilent high-sensitivity DNA chip (Agilent, Santa Clara, CA). The "Library Normalization" step was omitted, and instead the individual libraries were diluted to 4 nM and pooled into a single tube. The pooled amplicon libraries (5 µl) were denatured into single strands by adding an equal volume of freshly prepared 0.2 N NaOH. The DNA sample was mixed briefly by vortexing, centrifuged at 280 × g for 1 min using an Eppendorf centrifuge 5430R (Eppendorf), and incubated at 25°C for 5 min. Meanwhile, the PhiX DNA control library was prepared by mixing 2 µl of PhiX control DNA with 3 µl of 10 mM Tris-HCl (pH 8.5) and 0.1% Tween 20. The PhiX control DNA library was denatured by adding 5 µl of 0.2 N NaOH, briefly vortexed, centrifuged at 280 × g for 1 min, and incubated at 25°C for 5 min. The PhiX control DNA library was diluted to 12.5 pM using cold HT1 buffer, and the sample DNA library was diluted to 15 pM prior to loading onto the MiSeq reagent cartridge. Sequencing by synthesis of the paired-end 250-bp reads was conducted on a MiSeq benchtop sequencing platform (Illumina).

FIG 1
Dendrogram of nine C. sporogenes PA 3679 isolates and one C. sporogenes ATCC 15579 strain. Similarity between the restriction banding patterns was determined using the Dice coefficient correlation. The dendrogram was constructed using the average of the composite data sets generated by SmaI and XhoI and the unweighted pair group method using average linkages (UPGMA) using the average UPGMA linkage clustering tool. The optimization value and position tolerance were 1.5%.
Reads were analyzed using FastQC 0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and NexteraXT transposon sequences were found to be present in less than 0.14% of the reads across all data sets. Low-quality bases were filtered out with sickle 1.290 (a sliding-window, adaptive, quality-based trimming tool for FastQ files developed by N. A. Joshi and J. N. Fass and available at https://github.com/najoshi/sickle) using the quality options -q 33 and -t Sanger. Reads were then trimmed to the minimum length of 150 nucleotides (nt) with sickle using the option -l 150. Both trimmed and raw reads were each assembled using two de novo assemblers: Spades 3.1 (29) and Ray 2.3.1 (30). For the Ray assemblies, kmer size was determined using Kmergenie 1.6663 (31). Assemblies were compared using Quast 2.3 (32). For each genome, the best assembly was selected based on the number of contigs, N50, largest contig size, and total assembly size (see Table S1 in the supplemental material). Additional assembly statistics were calculated with the assemblathon_stats.pl script (33). Draft assemblies excluded contigs below 5 kbp. Draft annotation statistics were generated using Prokka 1.10 (34), and tRNA quantities were amended using tRNAscan-SE 1.3.1 (35).
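For readers unfamiliar with the N50 statistic used to compare assemblies, the following short Python sketch shows one common way to compute it after discarding contigs below a minimum length. It is an illustration only and is not the Quast or assemblathon_stats.pl implementation; the function name and contig lengths are hypothetical.

```python
def n50(contig_lengths, min_length=0):
    """Return the N50 of an assembly: the length of the shortest contig
    in the smallest set of longest contigs that together cover at least
    half of the total assembly size. Contigs shorter than `min_length`
    are discarded first."""
    kept = sorted((n for n in contig_lengths if n >= min_length),
                  reverse=True)
    if not kept:
        return 0
    half_total = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in bp; the 4-kbp contig is dropped
    # by the 5-kbp draft-assembly cutoff described above.
    contigs = [100_000, 80_000, 50_000, 30_000, 20_000, 4_000]
    print(n50(contigs, min_length=5_000))
```

Because the cutoff changes the total assembly size, the same set of contigs can yield a different N50 under the 5-kbp draft threshold than under a lower threshold.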
For final genome assemblies, the minimum contig size was set at 1 kbp. Assemblies were reannotated with Prokka 1.10, and tRNAs were verified and amended using tRNAscan-SE 1.3.1. Coding sequence (CDS) annotations were checked using InterProScan 5.3-46.0 (36) and manually corrected for improper product names. Consecutive product duplicates were verified with BLAST (37) protein homology searches against the NCBI nonredundant database to check for potential pseudogenes, which were then verified by read mapping in Geneious 7.1 (38). Genuine pseudogenes were then reannotated as such manually.
None of the eight isolates had a complete 16S rRNA gene sequence in its assembly. To generate these sequences, the contigs from each isolate's assembly were mapped to a reference 16S rRNA gene (Paenibacillus sp. strain 32O-W) using Geneious 7.1. A subset of these contigs which together spanned the full 16S rRNA genes were selected and de novo assembled using Geneious 7.1 (minimum overlap of 25 bp). The original sequencing reads were then mapped to this assembly to verify complete coverage, and a consensus sequence was then generated in Geneious 7.1 and appended to the final genome assembly. 16S rRNA gene sequences for each isolate were concatenated with related sequences retrieved from GenBank (as of 5 September 2014), the data set was aligned with MUSCLE 3.8.31 (39), and the resulting alignment was manually inspected for sequences presenting a wrong reverse complement orientation with SeaView 4.5.2 (40). Incorrectly oriented sequences were removed and reverse complemented with the FASTX-Toolkit 0.0.14 (http://hannonlab.cshl.edu/fastx_toolkit), reintroduced in the proper orientation, and then realigned with MUSCLE. Ambiguous regions in the alignment were removed with BMGE 1-1 (41) using the default parameters.   model of nucleotide substitution (four gamma categories) using a total of 100 bootstraps.
Reads from each data set were mapped on the ATCC 15579 genomic reference with BWA-SW 0.7.9a (43) and converted from SAM (sequence alignment) files to BAM (binary form of SAM) files with SAMtools 0.1.19 (44). SNPs were inferred with VarScan 2.3.7 (45) and converted to binary files (0 for absence; 1 for presence) with custom Perl scripts. SNP sliding windows were inferred with a step and window size of 1,000 bp using scripts from Pombert et al. (46) and displayed using Circos (47) (see Fig. 3).
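The windowed SNP counts described above can be sketched in a few lines of Python. This is not the Perl code of Pombert et al. (46); it is a minimal illustration that assumes 1-based SNP positions on a single contiguous reference sequence. The function name and the positions are hypothetical.

```python
from bisect import bisect_left, bisect_right


def snp_window_counts(snp_positions, reference_length,
                      window=1000, step=1000):
    """Count SNPs in windows of `window` bp advanced by `step` bp.
    Positions are 1-based. Returns a list of (start, end, count)
    tuples; the last window is truncated at the reference end."""
    positions = sorted(snp_positions)
    counts = []
    for start in range(1, reference_length + 1, step):
        end = min(start + window - 1, reference_length)
        # Number of sorted positions p with start <= p <= end.
        n = bisect_right(positions, end) - bisect_left(positions, start)
        counts.append((start, end, n))
    return counts


if __name__ == "__main__":
    # Hypothetical SNP positions on a 4,500-bp reference.
    snps = [5, 999, 1000, 1001, 2500, 4200]
    for start, end, n in snp_window_counts(snps, 4500):
        print(f"{start}-{end}\t{n}")
```

With equal step and window sizes, as used here, the windows do not overlap, so each SNP is counted exactly once.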
Nucleotide sequence accession number. The sequences of all eight PA 3679 genomes were submitted to DDBJ/EMBL/GenBank under the BioProject accession number PRJNA278445.

PFGE.
Pulsed-field gel electrophoresis (PFGE) was performed on the ATCC 7955 NCA3679 1961 spore crop prior to streaking for isolation to ascertain the purity of the culture. Both PFGE and culturing methods indicated that this crop contained at least two different isolates, which were designated 1961-2 and 1961-4. Isolate 1961-2 produced large, pale, irregularly shaped colonies on EYA, whereas isolate 1961-4 was characterized by small, white, irregularly shaped colonies with an iridescent sheen indicating lipase activity, a common phenotypic trait for C. sporogenes. These isolates also displayed distinct restriction banding patterns (Fig. 1). PFGE analysis of the C. sporogenes PA 3679 isolates revealed three clusters based on the restriction banding patterns generated for each isolate (Fig. 1). All of the ATCC 7955 NCA3679 isolates except isolate 1961-4 formed one cluster. Within this cluster, isolate 1961-2 showed 80% similarity to the 1961 spore crop and only 77% similarity to the 1990 and 2007 spore crops, which shared an identical pulsotype. The restriction banding pattern of isolate 1961-4 shared 95% similarity to the second cluster comprised of isolates Campbell, FDA, NFL, and UW, which shared the same pulsotype. C. sporogenes ATCC 15579 was used as a genomic reference to understand the similarity between PA 3679 and another strain of C. sporogenes. Strain ATCC 15579 was only 57% similar to the PA 3679 isolates examined and formed its own cluster.

FIG 2
Nodes present in all bootstraps are indicated with an asterisk. Species/strain names are indicated before the @ symbol, and accession numbers are indicated after the @ symbol. The Clostridium sporogenes strains from this study are colored light orange. The previous Clostridium sporogenes, Clostridium botulinum, Clostridium perfringens, Clostridium sardiniense, Clostridium butyricum, Clostridium cellulovorans, and Clostridium tetani species are colored dark orange, navy, green, pink, gold, cyan, and purple, respectively. OTUs, operational taxonomic units.
Whole-genome sequencing. The genomes of C. sporogenes ATCC 7955 NCA3679 isolates 1961-2, 1961-4, 1990, and 2007, PA 3679 isolates FDA, UW, NFL, and Campbell, and strain ATCC 15579 were sequenced, and the genome assembly and annotation statistics are presented in Tables 2 and 3. The average genome size of clade I organisms was 4.15 Mb, and the %GC content was 27.85. Clade II isolates shared a slightly smaller average genome size of ~3.98 Mb and a %GC content of 28.
A maximum likelihood (ML) tree of 16S rRNA gene sequences of Clostridium (Fig. 2) was constructed, and all C. sporogenes isolates analyzed in this study were included among group I (proteolytic) Clostridium botulinum strains. However, isolates 1961-2, 2007, and 1990 were identified as C. sporogenes based on 16S rRNA gene sequencing because they were located within a tight cluster comprised of other C. sporogenes strains including McClung 2004 (NR_029231), ATCC 15579, and C. sporogenes PA 3679 NCTC8594 (25). The 16S sequences NR_029231 and X68189 were both derived from strain ATCC 3584 L. S. McClung 2004 (18), which was deposited into ATCC by I. C. Hall (http://www.atcc.org/products/all/3584.aspx#history). Isolates FDA, UW, NFL, Campbell, and 1961-4 were not included within this group but instead were clustered with other group I (proteolytic) C. botulinum strains.
Single nucleotide polymorphism (SNP) profiling of the sequenced C. sporogenes isolates against the ATCC 15579 reference genome (GenBank accession number NZ_ABKW00000000.2) is illustrated in Fig. 3. The total numbers of SNPs, SNPs/kilobase, and indels inferred from VarScan 2.3 between each isolate and the genomic references ATCC 15579 and PA 3679 (NCTC8594) (GenBank accession number AGAH00000000.1) are presented in Table 4. Similar to the results obtained by PFGE analysis, the isolates were categorized into two clades. Clade I included isolates ATCC 15579, 1961-2, 1990, and 2007. The ATCC 15579 sequenced in this study contained only 85 SNPs and 31 indels compared with the reference genome sequence for this strain. These SNPs are likely due to assembly or sequencing errors. Similarly, when the reads were mapped to their own assembly for any particular isolate in either clade I or II, an average of 100 to 200 SNPs was typically found, except for isolate 1990, which had 421 SNPs. Isolates 1961-2, 1990, and 2007 were found to have a higher number of SNPs compared to ATCC 15579 than when these isolates were compared to PA 3679 (NCTC8594) (Table 4). The isolates in clade II included isolates 1961-4, Campbell, FDA, NFL, and UW and contained an average of 189,260 SNPs compared to ATCC 15579 and an average of 202,292 SNPs compared to PA 3679 (NCTC8594) (Table 4). Unlike the pairwise comparison of isolates in clade I, strains within clade II showed very low numbers of SNPs and indels, suggesting that they are highly similar to each other. Comparison of the FDA and NFL isolates showed the highest number of SNPs at 198 (0.05 SNPs/kb) but only 4 indels.
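The SNP density reported above can be checked with simple arithmetic. The sketch below assumes that SNPs/kb is the SNP count divided by the sequence length in kilobases and, as a further assumption, uses the approximate clade II average genome size of 3.98 Mb as the denominator. The denominator actually used for Table 4 is not stated here, and the function name is hypothetical.

```python
def snps_per_kb(snp_count, sequence_length_bp):
    """Return SNP density as SNPs per kilobase of sequence."""
    if sequence_length_bp <= 0:
        raise ValueError("sequence length must be positive")
    return snp_count / (sequence_length_bp / 1000.0)


if __name__ == "__main__":
    # 198 SNPs between isolates FDA and NFL; 3.98 Mb is the approximate
    # clade II average genome size (assumed denominator).
    density = snps_per_kb(198, 3_980_000)
    print(f"{density:.2f} SNPs/kb")
```

Under this assumption, the result rounds to the 0.05 SNPs/kb given in the text.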
Heat resistance studies. The thermal resistance of isolates 1961-2, 2007, 1961-4, FDA, NFL, UW, and Campbell was assessed at temperatures ranging from 94°C to 121°C, and the growth of survivors in TPGY broth was recorded (Table 5). Isolates representing clade I, isolates 1961-2 and 2007, were more sensitive to high temperature than the other isolates because no survivors were detected after 5 min at 100°C (isolate 1961-2) and 105°C (isolate 2007) (Table 5). Isolates of clade II were more resistant to high temperature, as isolates Campbell, FDA, and NFL survived heat treatment at 121°C, which is typical of the thermal resistance of C. sporogenes PA 3679. Isolates UW and 1961-4 were slightly less heat resistant, because survivors were obtained after heat treatment for 5 min at 117°C but not at 121°C.
Additional thermal studies of isolates 1961-2, 2007, and FDA were conducted at 97°C, 100°C, and 117°C, respectively, for the calculation of D-values (the time required to reduce a population by 90%) as a measure of heat resistance of these isolates representative of the two genetic clades identified via PFGE and whole-genome sequencing (Fig. 4). Initially, thermal studies of isolate 2007 were conducted at 103°C; however, enumeration of spores after 4 min was consistently below our level of detection using the plate count method. Therefore, thermal studies on this isolate were carried out at 100°C. Isolate 1961-2 had a D 97°C of 2.97 min, whereas isolate 2007 was found to have a D 100°C of 2.28 min. Isolate FDA was much more heat resistant than the other two strains, with a D 117°C of 2.97 min.

DISCUSSION
The strict USDA/CDC regulations required for working with C. botulinum and its toxins and the extreme biohazardous risk of this organism obstruct the routine use of this organism to validate thermal processing technologies for shelf-stable low-acid canned foods in industry and research facilities despite it being the primary health hazard for these products. C. sporogenes PA 3679 has been adopted as a suitable surrogate for group I (proteolytic) C. botulinum because it is phenotypically and genetically similar to C. botulinum, it is nontoxic, and its spores are more resistant to high temperature.
Over the last 8 decades, C. sporogenes PA 3679 has been distributed to and shared with a multitude of laboratories worldwide and subcultured routinely using various techniques and growth media. It is likely that the genetic makeup of this organism has drifted from the original isolate discovered in 1927. Furthermore, other  (Fig. 2). SNP comparisons between isolates 1990 and 2007, which number in the hundreds (Table 4), suggest that these isolates may be clonal, since a similar number of SNPs is observed for a particular strain when the reads are mapped against its own assembly. Approximately 7,000 SNPs were found between isolate 1961-2 and either isolate 1990 or isolate 2007, suggesting that this isolate is of similar origin but has since diverged slightly. Interestingly, clade I isolates are more similar to the genomic reference PA 3679 (NCTC8594) than clade II isolates, since an almost 10-fold increase in SNPs was observed between clade II isolates and NCTC8594. However, the levels of divergence observed between clade I isolates and NCTC8594 suggest that they have been drifting apart for multiple generations. Although clade I isolates were more genetically similar to NCTC8594, the thermal screening data generated in this study showed that they are sensitive to high temperature and thus, by definition, should not be classified as PA 3679.
In a study compiling 911 D-values of proteolytic C. botulinum and C. sporogenes PA 3679 collected from 38 studies, PA 3679 was found to have an estimated D 121°C of 1.28 min and a z-value (the temperature increase needed to achieve a 10-fold decrease in D-values) of 11.1°C (48). Comparatively, the mean D 121°C and z-value for C. botulinum were found to be 0.19 min and 11.3°C, respectively (48). Isolates 1961-2 and 2007 were significantly sensitive to high temperatures and had a D 97°C value and a D 100°C value of 2.97 min and 2.28 min, respectively. Production of spores that are resistant to high temperatures is the defining trait for C. sporogenes PA 3679 and an essential requirement for this strain to be used as a model organism for group I C. botulinum to validate thermal processing technologies. The thermal resistance of C. sporogenes PA 3679 (NCTC8594) was not determined in this study, and thermal resistance data for this particular strain were not available in the literature. However, a study by Bull et al. (49) examined the resistance of NCTC8594 spores to combined high temperature and pressure in various food matrices and compared it with that of spores of C. botulinum strains 7273B, 62-A, 213B, and NCTC2916 under the same conditions. They found that NCTC8594 was more sensitive to combined high temperature and pressure than the majority of C. botulinum spores tested and concluded that NCTC8594 may not be a suitable surrogate for C. botulinum, at least for high-pressure thermal process validation studies (49).
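The relationship between D-values and the z-value can be made concrete with a short calculation. The sketch below assumes the standard log-linear thermal death time model, D(T) = D_ref × 10^((T_ref − T)/z), and applies it to the literature estimates quoted above (48). The function name is hypothetical, and extrapolating outside the temperature range over which a z-value was estimated is an assumption rather than a validated procedure.

```python
def d_at_temperature(temp_c, d_ref_min, ref_temp_c, z_c):
    """Extrapolate a D-value (min) to `temp_c` under a log-linear model:
    D(T) = D_ref * 10 ** ((T_ref - T) / z)."""
    if z_c <= 0:
        raise ValueError("z-value must be positive")
    return d_ref_min * 10 ** ((ref_temp_c - temp_c) / z_c)


if __name__ == "__main__":
    # Literature estimates quoted in the text (48):
    # PA 3679: D 121°C = 1.28 min, z = 11.1°C
    # C. botulinum: D 121°C = 0.19 min, z = 11.3°C
    pa3679 = d_at_temperature(117.0, 1.28, 121.0, 11.1)
    cbot = d_at_temperature(117.0, 0.19, 121.0, 11.3)
    print(f"PA 3679, extrapolated D 117°C: {pa3679:.2f} min")
    print(f"C. botulinum, extrapolated D 117°C: {cbot:.2f} min")
```

Under these assumptions, the extrapolated D 117°C for PA 3679 is roughly 2.9 min, which is of the same order as the D 117°C of 2.97 min measured here for isolate FDA.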
Some of the factors that contribute to the overall resistance of bacterial spores include the type, pH, and mineral content of sporulation medium used (50)(51)(52)(53), heating menstruum (54), and sporulation temperature (55,56). It is also well established that the resistance of spore crops of a particular strain to high temperature and combined high temperature and pressure can vary from batch to batch as well as being strain dependent (9,26,49,57,58). Based on the results of this study, a reexamination and reevaluation of the thermal destruction data reported in the literature based on the "isolate" of PA 3679 used in a particular study are warranted. The second isolate purified from the ATCC 7955 1961 spore crop, isolate 1961-4, shared the most similarity with other clade II isolates, including isolates NFL, UW, Campbell, and FDA. The spores produced by clade II isolates showed a higher degree of heat resistance based on the thermal studies performed, and it is likely that clade II isolates more accurately represent E. J. Cameron's PA 3679 from 1927 (9). C. sporogenes PA 3679 was deposited into the ATCC around 1941 (9) and used in various laboratories as a model organism for group I (proteolytic) C. botulinum. It is possible that the 1961 crop contained E. J. Cameron's PA 3679 strain, but that the crop was contaminated and the strain was outcompeted over time. The later 1990 and 2007 spore crops of ATCC 7955 may have been distributed to various laboratories, and data published using subcultures of those spore crops may not adequately represent PA 3679.

TABLE 5 (footnotes)
a The growth of isolates was assessed by turbidity of TPGY broth tubes in duplicate. The results are shown as follows: +/+, growth was observed in both TPGY broth tubes; −/−, growth was not observed in either TPGY broth tube; +/−, growth was observed in only one TPGY broth tube; NT, not tested.
b The initial concentration is shown in parentheses after the isolate. Non-heat-treated controls were heat shocked at 80°C for 10 min and are reported as spore-forming units per milliliter.
Weigand et al. (24) identified a subset of C. sporogenes clade-specific genes. We are currently in the process of screening clade II C. sporogenes PA 3679 isolates for C. sporogenes clade-specific genes identified by Weigand et al. (24). A thorough genomic comparison between clade II isolates of C. sporogenes PA 3679 and group I C. botulinum strains to highlight the gene content differences between these organisms, especially genes that may be involved in increased spore heat resistance, is also in progress. Understanding the genetic mechanisms of spore heat resistance is paramount to the design and development of adequate countermeasures to ensure the production of wholesome and safe, minimally processed foods.

FUNDING INFORMATION
Y.W. was supported by an appointment to the Research Participation Program at the Center for Food Safety and Applied Nutrition administered by the Oak Ridge Institute for Science and Education via an interagency agreement between the U.S. Department of Energy and the FDA. The sponsor had no role in the study design, data collection and analysis, the decision to publish, or the preparation of the manuscript.