ABSTRACT
Genomes of extremely thermophilic Caldicellulosiruptor species encode novel cellulose binding proteins, called tāpirins, located proximate to the type IV pilus locus. The C-terminal domain of Caldicellulosiruptor kronotskyensis tāpirin 0844 (Calkro_0844) is structurally unique and has a cellulose binding affinity akin to that seen with family 3 carbohydrate binding modules (CBM3s). Here, full-length and C-terminal versions of tāpirins from Caldicellulosiruptor bescii (Athe_1870), Caldicellulosiruptor hydrothermalis (Calhy_0908), Caldicellulosiruptor kristjanssonii (Calkr_0826), and Caldicellulosiruptor naganoensis (NA10_0869) were produced recombinantly in Escherichia coli and compared to Calkro_0844. All five tāpirins bound to microcrystalline cellulose, switchgrass, poplar, and filter paper but not to xylan. Densitometry analysis of bound protein fractions visualized by SDS-PAGE revealed that Calhy_0908 and Calkr_0826 (from weakly cellulolytic species) associated with the cellulose substrates to a greater extent than Athe_1870, Calkro_0844, and NA10_0869 (from strongly cellulolytic species). Perhaps this relates to their specific needs to capture glucans released from lignocellulose by cellulases produced in Caldicellulosiruptor communities. Calkro_0844 and NA10_0869 share a higher degree of amino acid sequence identity (>80% identity) with each other than either does with Athe_1870 (∼50%). The levels of amino acid sequence identity of Calhy_0908 and Calkr_0826 to Calkro_0844 were only 16% and 36%, respectively, although the three-dimensional structures of their C-terminal binding regions were closely related. Unlike the parent strain, C. bescii mutants lacking the tāpirin genes did not bind to cellulose following short-term incubation, suggesting a role in cell association with plant biomass. 
Given the scarcity of carbohydrates in neutral terrestrial hot springs, tāpirins likely help scavenge carbohydrates from lignocellulose to support growth and survival of Caldicellulosiruptor species.
IMPORTANCE The mechanisms by which microorganisms attach to and degrade lignocellulose are important to understand if effective approaches for conversion of plant biomass into fuels and chemicals are to be developed. Caldicellulosiruptor species grow on carbohydrates from lignocellulose at elevated temperatures and have biotechnological significance for that reason. Novel cellulose binding proteins, called tāpirins, are involved in the way that Caldicellulosiruptor species interact with microcrystalline cellulose, and additional information about the diversity of these proteins across the genus, including binding affinity and three-dimensional structural comparisons, is provided here.
INTRODUCTION
The natural capacity to utilize both the cellulose and hemicellulose content of plant biomass as microbial growth substrates is relatively rare, especially among extreme thermophiles growing optimally above 70°C (1). However, in pH-neutral, terrestrial hot springs and thermal features, species from the genus Caldicellulosiruptor, all of which utilize hemicellulose, can be isolated, but only some can hydrolyze microcrystalline cellulose (2, 3). To degrade plant material, Caldicellulosiruptor species draw from an inventory of intracellular, surface (S)-layer-associated, and secreted glycoside hydrolases (GHs) with complementary modes of action (4, 5). Cellulosomal and “free” enzyme systems, which can be modular, are commonly found in other cellulolytic organisms, such as Clostridia (6) and Trichoderma (7), respectively. Many Caldicellulosiruptor carbohydrate-active enzymes (CAZymes [8]) are also modular, consisting of combinations of catalytic and noncatalytic (e.g., carbohydrate binding module [CBM]) domains connected by proline/threonine-rich linkers in various arrangements. The best-studied example is the multifunctional cellulase CelA, which is arranged as GH9-CBM3-CBM3-CBM3-GH48 domains, where the numbers refer to specific protein families (9–15). The synergy between the endoglucanase (GH9) and exoglucanase (GH48) domains contributes to a novel mode of action for CelA that involves physically burrowing into cellulose fibers, thereby creating cavities for further enzymatic access to the carbohydrate content of plant biomass (9).
While not directly responsible for lignocellulose (switchgrass) degradation, the noncatalytic domains in CelA also play an important role. In general, CBMs improve the efficacy of GHs by ensuring proximity to the substrate, as well as contributing to thermostability in some cases (16, 17). CBM3s, in particular, are specific to cellulose and allow enzymes such as CelA to attach to their substrates such that their GH domains are proximate to their substrate (9). Other noncatalytic protein features in Caldicellulosiruptor also play a role in orienting cells to lignocellulosic carbohydrates. S-layer homology (SLH) domains are associated with certain modular GHs in these bacteria (18, 19). For instance, Calkro_0402, a xylanase with GH10, CBM22, and CBM9 domains, is anchored to the cell surface of the strongly cellulolytic species Caldicellulosiruptor kronotskyensis and the gene encoding this enzyme is highly transcribed during growth on lignocellulose (switchgrass). When inserted into the genome of Caldicellulosiruptor bescii, Calkro_0402 improved the attachment of cells to xylan and significantly increased xylan degradation, despite the fact that wild-type C. bescii produces other xylanases (18).
In addition to CBMs within modular enzymes, Caldicellulosiruptor species also use noncatalytic proteins to bind to lignocellulosic substrates. Transcriptomic and proteomic analysis of cellulose-bound Caldicellulosiruptor cultures identified the presence of carbohydrate binding proteins (20), including the recently characterized “tāpirins” (21). Tāpirins typically have a Mr of approximately 70 kDa as a single polypeptide; recombinant versions from C. kronotskyensis have a specific affinity for cellulose fibers in plant material and an affinity for Avicel similar to that of CBM3s, despite the absence of significant structural homology (21). While the full-length structure has not been resolved, the cellulose-binding, 38.4-kDa C-terminal domain from C. kronotskyensis tāpirin 0844 (Calkro_0844) was successfully crystallized and shown to be novel within the current protein database. Hydrophobic and aromatic residues present on the face of a β-helix likely make up the binding pocket with a flexible loop overhanging and, potentially, protecting access to it (21). Interestingly, tāpirin genes are located near the type IV pilus (T4P) locus in Caldicellulosiruptor species, suggesting a potential functional connection.
In the present study, comparative assessment of tāpirins across the genus Caldicellulosiruptor was conducted (see Fig. 1). Structural data for two additional tāpirins from less-cellulolytic species are provided, as is an assessment of relative binding capacities. Additionally, the role of the tāpirins was further explored through deletion of genes in C. bescii and analysis of the resulting impact on binding.
Genomic organization of type IV pili, tāpirins, and glucan degradation loci in examined Caldicellulosiruptor species. Numbers refer to gene locus tags in sectioned loci. The tāpirins that were examined in this study are highlighted in purple. Abbreviations are as follows: Athe, C. bescii; Calhy, C. hydrothermalis; Calkr, C. kristjanssonii; Calkro, C. kronotskyensis; NA10, C. naganoensis.
RESULTS AND DISCUSSION
Tāpirins are necessary for rapid binding to cellulosic substrates. Degradation of lignocellulosic substrates by Caldicellulosiruptor populations is likely predicated on substrate attachment, as genes coding for key cellulases from the glucan degradation locus (GDL) (22) are genomic neighbors of a T4P locus and tāpirins (Fig. 1) (21). To assess the importance of the tāpirins and/or T4P in binding to cellulose, knockouts (KOs) of tāpirins (Athe_1870–1871) and the entire pilus locus plus tāpirins (Athe_1870–1885) were generated in C. bescii. After incubation with Avicel for 1 h, both knockouts, unlike the parent strain, showed no propensity for binding, based on changes in planktonic cell densities (Fig. 2). The changes in planktonic cell density for the tāpirin and tāpirin/T4P KOs in the presence of microcrystalline cellulose were statistically insignificant, while the planktonic cell density of the parent strain was reduced by more than 7 × 10⁷ cells/ml. This suggests that the tāpirins play a role in cell adherence to cellulosic substrates, at least during initial exposure, which could be critical for scavenging carbohydrates in otherwise nutrient-limited hot springs.
Whole-cell binding assay of the parent Caldicellulosiruptor bescii strain (MACB1018) versus C. bescii tāpirin and pilus deletion (ΔAthe_1870–1885; RKCB135) and tāpirin-only deletion (ΔAthe_1870–1871; RKCB136) strains. Cell concentration values refer to cells not attached to Avicel and/or the test tube wall after 1 h of incubation. Binding assays included 5 × 10⁸ to 1 × 10⁹ cells/ml in 671d medium incubated with or without 10 mg of Avicel for 1 h at 70°C. Planktonic cell concentrations were quantified using epifluorescence microscopy. Error bars represent standard errors of results from triplicate samples (the asterisk [*] indicates statistical significance; n.s., not significant).
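The depletion readout used in this assay can be illustrated with a short calculation. The following is a minimal sketch, assuming triplicate planktonic counts with and without Avicel; the function name and all numbers are hypothetical placeholders, not data from this study, and the standard error formula treats the two groups as independent samples.

```python
from math import sqrt
from statistics import mean, stdev


def planktonic_change(without_substrate, with_substrate):
    """Return (mean decrease in planktonic cells/ml, standard error of that difference).

    A positive decrease means fewer free cells in the presence of substrate,
    which is interpreted here as cells attaching to the substrate.
    """
    decrease = mean(without_substrate) - mean(with_substrate)
    # Standard error of a difference between two independent means.
    se = sqrt(
        stdev(without_substrate) ** 2 / len(without_substrate)
        + stdev(with_substrate) ** 2 / len(with_substrate)
    )
    return decrease, se


# Hypothetical triplicate counts (cells/ml); illustrative only.
parent_no_avicel = [7.9e8, 8.1e8, 8.0e8]
parent_avicel = [7.2e8, 7.3e8, 7.1e8]

drop, se = planktonic_change(parent_no_avicel, parent_avicel)
print(f"decrease: {drop:.2e} cells/ml (SE {se:.1e})")
```

A decrease that is small relative to its standard error would correspond to the "n.s." outcome reported for the knockout strains.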
Tāpirins are ubiquitous in the genus Caldicellulosiruptor. Putative tāpirins have been found in all genome-sequenced Caldicellulosiruptor species and belong to one of two groups (class 1 and class 2), based on their localization with genes encoding the glucan degradation locus and T4P; class 1 tāpirins are closest to the T4P locus and class 2 tāpirins (if there are two or more tāpirins) closest to the GDL (Fig. 1) (21). The GDL encodes up to seven GHs, which collectively contain GH5, GH9, GH10, GH12, GH44, GH48, GH74, and CBM3 domains, and these enzymes play an essential role in microcrystalline cellulose hydrolysis (22). The most cellulolytic Caldicellulosiruptor species (e.g., C. bescii, C. kronotskyensis, and C. naganoensis) produce six to seven GDL GHs, while genomes from the less cellulolytic species possess an incomplete set of the GDL enzymes, lacking modular enzymes with a GH48 domain. For example, C. kristjanssonii, which is less cellulolytic, produces two of the GDL enzymes, while C. hydrothermalis, which minimally degrades microcrystalline cellulose, lacks all of the GDL GHs (Fig. 1). Amino acid sequence analysis indicates that while some tāpirins are closely related (e.g., Calkro_0844 and NA10_0869 from two highly cellulolytic species, C. kronotskyensis and C. naganoensis, are 85% identical at the amino acid level), Calhy_0908 from the weakly cellulolytic C. hydrothermalis species is less than 18% identical to the other tāpirins (Table 1). Interestingly, the N termini of these tāpirins appear to share a high number of identical and conserved amino acids across the whole protein sequence (seen with residues 1 to 298 in Fig. S1 in the supplemental material, on the right side of the indicated linker), with the exception of Calhy_0908; in fact, Athe_1870 shares 86% amino acid identity (%ID) in this N-terminal range with both Calkro_0844 and NA10_0869 (see Table S1A in the supplemental material).
Overall, the N terminus of the tāpirin may be responsible for how tāpirins, in general, associate with the cell surface whereas the C terminus (see Table S1B) establishes the binding function, the latter of which was determined to be the case for Calkro_0844 (21).
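The percent identity (%ID) values discussed above depend on how aligned positions are counted. The following is a minimal sketch of one common convention, assuming two sequences that have already been aligned to equal length and counting only columns where neither sequence has a gap. The convention, the function name, and the short example sequences are assumptions for illustration; the source does not state which denominator was used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not tāpirin sequence data.
identity = percent_identity("MKT-AYLG", "MRTQAY-G")
print(f"{identity:.1f}% identity")
```

Restricting the input to a residue range (for example, only the N-terminal region) gives region-specific values of the kind reported in Table S1.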
Characteristics of selected tāpirins from Caldicellulosiruptor speciesa
In vitro binding assays with tāpirins and plant-based substrates. Tāpirins were initially characterized from C. kronotskyensis (Calkro_0844 and Calkro_0845) and from Caldicellulosiruptor saccharolyticus (Csac_1073) (21). Binding assays showed that tāpirins from these two species preferentially adhered to cellulose. To confirm that this substrate specificity was consistent across the genus Caldicellulosiruptor, four additional tāpirins from weakly to strongly cellulolytic species were recombinantly produced to examine their binding to plant biomass-related substrates (see Fig. 3). Athe_1870, Calhy_0908, Calkr_0826, and NA10_0869 from C. bescii, C. hydrothermalis, C. kristjanssonii, and C. naganoensis, respectively, along with the previously characterized Calkro_0844, were incubated with Avicel, filter paper, and xylan, as well as with lignocellulose (switchgrass and poplar). After incubation, the samples were split into “unbound” and “substrate-bound” fractions, where the substrate-bound protein was released upon denaturation in Laemmli sample buffer, after which both fractions were visualized with SDS-PAGE. No tāpirins bound to xylan to any significant extent, but they consistently adhered to the other substrates. It was interesting that tāpirins from weakly cellulolytic species (Calkr_0826 and Calhy_0908) bound to poplar and switchgrass to a greater extent than the tāpirins from more strongly cellulolytic species. In fact, the tāpirin from the least cellulolytic species tested, Calhy_0908, appears to adhere to more binding sites on cellulosic materials overall.
SDS-PAGE gel of tāpirin binding assay to assess binding to plant component substrates. Recombinant strains Athe_1870, Calhy_0908, Calkr_0826, Calkro_0844, and NA10_0869 cloned and produced from C. bescii, C. hydrothermalis, C. kristjanssonii, C. kronotskyensis, and C. naganoensis, respectively, were incubated with cellulose (Avicel and filter paper), lignocellulose (switchgrass and poplar), and xylan, along with a no-substrate control. Proteins were separated into bound (B) and unbound (U) fractions and visualized with SDS-PAGE; bands shown are representative of triplicate trials.
Densitometry analysis of the gels from the tāpirin binding assays supported the conclusions reached by visual inspection (see Fig. 4). All tāpirins tested had a binding preference for purified cellulose (to filter paper more than to Avicel, despite their similar crystallinities [23]). The larger particle size of filter paper (5/16-in.-wide circular disks) than of Avicel (50-μm-diameter particles), in addition to a higher protein absorption capacity (24, 25), may have been responsible for this difference. Interestingly, even though the tāpirins from the highly cellulolytic C. kronotskyensis and C. naganoensis species share 85% amino acid identity, NA10_0869, in contrast to Calkro_0844, bound equally well to switchgrass and to purified cellulose. Aside from C. naganoensis, the overall level of tāpirin binding to lignocellulosic substrates was lower than that seen with filter paper and Avicel, likely because of inaccessibility to microcrystalline cellulose in the plant biomasses. As indicated by the density of SDS-PAGE bands, tāpirins from both weakly cellulolytic species (Calhy_0908 and Calkr_0826) bound better to Avicel, poplar, switchgrass, and filter paper. Calhy_0908 bound better than Calkr_0826 to all substrates except switchgrass (Fig. 3). In fact, the bands corresponding to Calhy_0908 and Calkr_0826 were approximately 3 times darker than those corresponding to Athe_1870 and were 10-fold to 13-fold more intense than those corresponding to Calkro_0844 and NA10_0869 on filter paper (Table S2). Similar trends were noted, but to a lesser extent, on Avicel, with Calhy_0908 binding more than Calkr_0826 and Athe_1870 (2- to 3-fold) and significantly more than NA10_0869 (8-fold) and Calkro_0844 (16-fold) (Table S2). It is possible that since the less cellulolytic Caldicellulosiruptor species, such as C. kristjanssonii and C. hydrothermalis, cannot hydrolyze cellulose as well as other species (2), proximity to these substrates allows these species to exploit the collective hydrolytic capacity of cellulolytic communities in their natural environments.
Densitometry of recombinant proteins bound to cellulosic substrates as visualized with SDS-PAGE. Bound bands of recombinant Athe_1870 (gray), Calhy_0908 (orange), Calkr_0826 (yellow), Calkro_0844 (light blue), and NA10_0869 (dark blue) from C. bescii, C. hydrothermalis, C. kristjanssonii, C. kronotskyensis, and C. naganoensis, respectively, were quantified and normalized by the use of a 70-kDa Benchmark ladder band on the associated SDS-PAGE gels. Error bars represent the standard deviations of the intensities of triplicate samples.
Structural comparisons of tāpirins from Caldicellulosiruptor hydrothermalis, Caldicellulosiruptor kristjanssonii, and Caldicellulosiruptor kronotskyensis. The C-terminal domains of tāpirins from C. hydrothermalis and C. kristjanssonii, Calhy_0908C and Calkr_0826C, respectively, exhibited the same structural architecture as Calkro_0844C from C. kronotskyensis, with the core of the domain being a β-helix with a characteristic long loop connecting the ends of the helix (Fig. 5C). This fold was observed in the Calkro_0844C structure (21) and seems to be a common fold for tāpirins.
Crystal structures of Calkr_0826, Calhy_0908, and Calkro_0844. (A) Calkr_0826 and Calkro_0844 superimposed. (B) Calhy_0908 and Calkro_0844 superimposed. (C) Calhy_0908, Calkr_0826, and Calkro_0844 superimposed. (D) A 90° rotation of Calhy_0908, showing β-helix face designations (‘A’, ‘B’, and ‘C’). The colors correspond to the tāpirins as follows: green, Calhy_0908 (C. hydrothermalis); blue, Calkr_0826 (C. kristjanssonii); magenta, Calkro_0844 (C. kronotskyensis). Abbreviations are as follows: α, alpha helix; N, N terminus; C, C terminus. Colors of labels correspond to the colors of the peptides; black font is used to represent features shared across all tāpirins.
The Calkr_0826C structure was superimposed onto that of Calkro_0844C, with a root mean square deviation (RMSD) of 0.854 Å over 1,442 atoms, with the two domains having core β-helices of the same size (Fig. 5A). The β-helix in both cases has 11 complete turns, with the longest β-sheet (designated “face A”; see Fig. 5D) having 14 β-strands. A few differences, however, are evident: a loop between β8 and β9 in Calkro_0844C is 6 residues longer, β-strand β26 is missing in Calkro_0844C, and the α-helix in Calkro_0844C before β27 is not present in Calkr_0826C. The long loop connecting the ends of the β-helix (β28 to β29 [Calkr_0826C]) is of the same length (40 residues) in both Calkr_0826C and Calkro_0844C and features an α-helix located at the same position in the middle. The loop connecting the α2 helix to the β32 strand is 3 amino acid residues shorter in Calkr_0826C, which is compensated by the neighboring loop between β33 and β34, which is 9 residues longer in Calkr_0826C. Also, the loop between β36 and β37 is again 3 residues shorter in Calkr_0826C. It is worth noting that the connecting loops of different lengths (β8 to β9, α2 to β32, β33 to β34, and β36 to β37) are adjacent and represent about half of the edge between face B (the next face after face A following the direction of the polypeptide; see Fig. 5D) and face C (the face following face B; see Fig. 5D) of the β-helix, opposite the connecting loop. Similarly to Calkro_0844C, Calkr_0826C has a hydrophobic surface on face A covered by the connecting loop with multiple flat-on-a-surface aromatic sidechains lined up along the β-sheet (Fig. 5A).
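For readers less familiar with the RMSD values quoted for these superpositions, the quantity can be written out explicitly. The following is a minimal sketch, assuming the two structures have already been superimposed and that atoms are supplied as matched pairs of (x, y, z) coordinates in Å. It does not perform the superposition itself, and the coordinates shown are hypothetical; the source does not state which program produced the reported values.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed and that the i-th atom in
    coords_a corresponds to the i-th atom in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(total / len(coords_a))


# Hypothetical matched atoms: every pair is displaced by 1 Å along z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```

Because the sum runs only over matched atoms, the atom count reported alongside each RMSD indicates how much of the two structures was included in the comparison.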
The C-terminal domain of Calhy_0908C is longer than that of Calkro_0844C and shares the least amino acid sequence homology (Table S1) to the other tāpirins examined here. The lack of homology actually led to an incorrect NCBI-BLAST sequence alignment (26), which in turn translated into an incorrect homology model that was unsuccessfully employed for molecular replacement. When the molecular replacement attempt failed, the structure of Calhy_0908C was determined via single-wavelength anomalous diffraction (SAD) using the anomalous signal of iodine atoms incorporated into the crystal after a short soak. Once the structure was determined, a structure-based sequence alignment was found to be a more reliable basis for comparing different tāpirins (Fig. 5B and C).
Calhy_0908C exhibits the same overall architecture as the other two tāpirins with known structures (Fig. 5C and D): the β-helix as the core of the domain and a long loop connecting the ends of the β-helix. The major difference between Calhy_0908C and the two other structures is that the β-helix of Calhy_0908C is three turns longer than that of Calkro_0844C and Calkr_0826C, which is made possible by a massive 62-residue single insertion in the Calhy_0908C sequence. The rest of Calhy_0908C (residues 170 to 378 and 443 to 578) could be superimposed reasonably well onto Calkro_0844C, with an RMSD of 1.97 Å over 1,020 atoms.
Aside from additional turns of the β-helix, Calhy_0908C has other differences from Calkro_0844C (Fig. 5B). Similarly to the Calkr_0826C versus Calkro_0844C contrast (Fig. 5A), the set of loops connecting faces B and C of the β-helix is different. The loop connecting β5 and β6 is 7 residues longer in Calhy_0908C, and four neighboring loops are shorter in Calhy_0908C (the β8-β9 loop, β11-β12 loop, β39-β40 loop, and β42-β43 loop are 7, 2, 2, and 5 residues shorter, respectively). Another difference is the orientation of helix α3, which, while still present in the Calhy_0908C structure, is oriented almost perpendicularly to the corresponding helix of Calkro_0844C. To complement that rearrangement, the connecting loop in Calhy_0908C goes straight to the β36 without the 8-residue “detour” that is present in Calkro_0844C. Also, the β29 strand of Calkro_0844C is not present in Calhy_0908C, with its place taken by the repositioned α3 helix. Similarly to both Calkro_0844C and Calkr_0826C, in hydrophobic face A of Calhy_0908C, the β-helix features a line of aromatic residues (Fig. 6).
Surface features of tāpirins. (A) Calhy_0908 (C. hydrothermalis). (B) Calkro_0844 (C. kronotskyensis). (C) Calkr_0826 (C. kristjanssonii). The connecting loop and N and C termini were removed. Aromatic sidechains are highlighted in red.
Can the observed differences in tāpirin structures explain the differences in cellulose binding? As shown in Fig. 4, Calhy_0908 bound to more sites on cellulose than the other tāpirins examined here. Comparing the structural features of three of these tāpirins (Fig. 5C), there are two regions where differences are apparent. First, the sets of the loops connecting β-strands on the edge between faces B and C of the β-helix are different in these three proteins, varying in size and chemistry. However, it should be pointed out that the loops corresponding to Calhy_0908C are the least extensive, such that most of these are shorter than those in the corresponding regions in Calkr_0826C and Calkro_0844C. If these loops were responsible for the protein interaction with the cellulose, this difference would leave lower amounts of exposed surface area overall for the possible protein-cellulose interactions. However, that cannot be the case, as Calhy_0908 does indeed seem to bind well to cellulose despite the differences in the loop sets.
Still, another area of interest is the hydrophobic surface found on face A of the β-helix that is protected by the connecting loop in the crystallized conformations. We again tried to observe an interaction between the tāpirins and soluble cellooligosaccharides (C2 to C6); however, similarly to Calkro_0844C (21), neither Calkr_0826C nor Calhy_0908C cocrystallized or interacted with the oligosaccharides. Regardless, a possible cellulose-binding mechanism could involve repositioning of the connecting loop upon mechanical contact with the cellulose surface, exposing the line of aromatic sidechains positioned flat on the surface and spaced 5 Å apart, which corresponds to the distance between sugar units in the cellulose chain. This can be seen in Fig. 6, where Calhy_0908C has the largest hydrophobic surface of the three tāpirins as well as the largest number of aromatic side chains lined up on that surface.
Localization of tāpirins on the cell surface of Caldicellulosiruptor. There is evidence for tāpirin localization on the cell surface, based on immunofluorescence microscopy performed using antibodies directed against these proteins (Fig. 7). Tāpirins are most evident at the poles of the cell, although they also appear to decorate the cellular surface. This surface localization is consistent with their proposed role as cellulose-binding proteins, especially for initial attachment to substrates.
Fluorescence microscopy of Caldicellulosiruptor kronotskyensis performed using tāpirin antibodies. C. kronotskyensis cells (in orange) grown on filter paper were incubated with primary antibody targeting the Calkro_0844 tāpirin and were visualized (in green) with fluorescence microscopy and acridine orange staining (done as described in reference 18).
Summary. It is interesting that assays of binding to cellulosic substrates indicated that Calhy_0908 and Calkr_0826 from the weakly cellulolytic species C. hydrothermalis and C. kristjanssonii, respectively, appeared to bind in higher quantities to cellulose than those tāpirins from prolific microcrystalline cellulose degraders. Structures from the C termini of both Calhy_0908 and Calkr_0826, compared to the previously characterized Calkro_0844 (21), identified clear differences between the tāpirins, including a much longer potential binding platform in Calhy_0908, which contains the largest number of hydrophobic and aromatic residues among the three. Whether the densities of tāpirins on the cell surface differ across Caldicellulosiruptor species is not yet clear. But if they are somewhat equivalent within the genus, C. hydrothermalis may use tāpirin-based adhesion as a mechanism to support survival of a less-cellulolytic species in thermal environments.
While fluorescence microscopy indicates the presence of tāpirins on the outer side of the Caldicellulosiruptor cells, their exact cellular location needs to be resolved, especially as this relates to possible association with type IV pili. Another interesting issue going forward is whether tāpirins represent a uniquely Caldicellulosiruptor feature or whether counterparts are used by other microorganisms to attach to substrates or surfaces.
MATERIALS AND METHODS
Bacterial strains, plasmids, and substrates. The wild-type strain of Caldicellulosiruptor bescii was obtained from Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, and C. bescii strain MACB1018 and genetic vector pGL0100 were developed previously (27). Escherichia coli strains 5-alpha (New England BioLabs) and Rosetta (Millipore Sigma, Merck) were used for plasmid replication and protein production, respectively. Genes of interest were PCR amplified from extracted genomic DNA, as described previously (28), and Gibson assembly (29) (Gibson assembly master mix; New England BioLabs) or a KLD (kinase, ligase, DpnI) reaction (KLD Enzyme Mix, New England BioLabs) was used to insert the fragments into plasmids that had been extracted with ZymoPure midiprep and Zymo Research plasmid miniprep classic kits (Zymo Research). Additionally, Athe_1870 (GenBank accession number WP_015908253.1), Calhy_0908 (GenBank accession number WP_041723111.1), Calkr_0826 (GenBank accession number WP_013432170.1), NA10_0869 (GenBank accession number WP_045165102.1), and Calkro_0844 (GenBank accession number WP_013429865.1) genes were E. coli codon optimized (without transmembrane domains or signal peptides) with the Integrated DNA Technologies (IDT) Codon Optimization Tool and synthesized by the DOE JGI on a pET-45 plasmid. All proteins produced included an N-terminal hexahistidine affinity tag. Sequences of all plasmids and edited portions of final strains were confirmed with Sanger sequencing (Genewiz). The substrates used for growth and binding included Avicel PH-101 (FMC BioPolymer), Cave-in-Rock switchgrass (Panicum virgatum L. from fields in Monroe County, IA, retrieved by the National Renewable Energy Laboratory and ground and sieved using a Wiley mill [Thomas Scientific] and 40/80 mesh, respectively), beechwood xylan (Sigma-Aldrich), and poplar (Populus trichocarpa) (obtained from Vincent Chiang [30]).
Production of tāpirin proteins in E. coli. Expression plasmids (pET-45; see above) with synthesized Athe_1870, Calhy_0908, Calkr_0826, NA10_0869, and Calkro_0844 genes were transformed into E. coli (strain 5-alpha with 50 μg/ml carbenicillin selection and strain Rosetta with both 50 μg/ml carbenicillin and 34 μg/ml chloramphenicol selection) and cultured in Luria-Bertani (LB; 10 g/liter sodium chloride, 10 g/liter tryptone, and 5 g/liter yeast extract) liquid medium or on agar (1.5% [wt/vol]) plates at 37°C. For the production of protein, the cultures were grown in ZYM-5052 autoinduction medium (31) with chloramphenicol and carbenicillin in 1- to 3-liter volumes at 37°C and 250 rpm for 18 to 24 h and harvested by centrifugation at 6,000 × g for 10 min. Cells were then resuspended in 100 ml of a mixture containing 20 mM sodium phosphate (pH 7.4), 0.5 mM sodium chloride, and 5 mM imidazole; lysed with a French press at 16,014 lb/in²; subjected to heat treatment at 65°C for 30 min; and centrifuged at 25,000 × g for 30 min. Protein in soluble fractions was purified with a 5-ml HisTrap HP nickel-Sepharose immobilized metal affinity chromatography column (GE Healthcare) (operated according to the manufacturer's instructions) using a Biologic DuoFlow fast protein liquid chromatography (FPLC) instrument (Bio-Rad) and then stored at 4°C. Protein concentration was determined by the Bradford assay (32), and protein purity was visualized along with a Benchmark protein ladder (Life Technologies) by SDS-PAGE using 4% to 15% Mini-Protean TGX stain-free precast gels (Bio-Rad).
Production and purification of C-terminal tāpirins Calkr_0826C and Calhy_0908C. Purified protein was subjected to buffer exchange into reaction buffer (0.4 mM calcium chloride, 0.15 M sodium chloride, 50 mM Tris-chloride) (pH 8.0) and treated with thermolysin (Promega) (1 mg/ml in the same buffer) as described previously (21). A thermolysin-to-protein ratio of 1:500 and treatment times of 5 to 30 min at 70°C were used to cleave the protein and produce the C-terminal portion of the tāpirin (i.e., Calkr_0826C for C. kristjanssonii and Calhy_0908C for C. hydrothermalis). Reactions were halted by placing samples on ice, and samples were then stored long term at 4°C. Samples were imaged with SDS-PAGE as described above to verify cleavage. For the crystallization of Calkr_0826C and Calhy_0908C, cleaved products were further purified using an ÄKTA protein purification system (GE Healthcare Life Sciences) and a Superdex 75 pg (16/60) size exclusion chromatography column with 20 mM Tris (pH 7.5) and 100 mM sodium chloride.
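The 1:500 thermolysin-to-protein ratio translates into a simple dosing calculation. The following is a minimal sketch, assuming the ratio is by mass (wt/wt), which the text does not state explicitly, and using the 1 mg/ml thermolysin stock described above. The function name and the 5-mg example are hypothetical.

```python
def thermolysin_volume_ul(protein_mg, ratio=500.0, stock_mg_per_ml=1.0):
    """Volume of thermolysin stock (in microliters) for a 1:ratio enzyme-to-protein dose.

    Assumes a mass-based (wt/wt) ratio; this is an assumption, not a stated fact.
    """
    if protein_mg <= 0 or ratio <= 0 or stock_mg_per_ml <= 0:
        raise ValueError("all quantities must be positive")
    thermolysin_mg = protein_mg / ratio          # mass of enzyme needed
    volume_ml = thermolysin_mg / stock_mg_per_ml  # volume of stock containing that mass
    return volume_ml * 1000.0                     # convert ml to microliters


# Hypothetical example: digesting 5 mg of purified tāpirin.
volume = thermolysin_volume_ul(5.0)
print(f"add {volume:.1f} ul of 1 mg/ml thermolysin stock")
```

If the ratio were instead molar, the protein and enzyme masses would first need to be converted using their respective molecular masses.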
Crystallization. The crystals of Calkr_0826C and Calhy_0908C were initially obtained using sitting drop vapor diffusion and a 96-well plate with a polyethylene glycol (PEG) ion HT screen from Hampton Research (Aliso Viejo, CA). A 50-μl volume of well solution was added to the reservoir, and drops were made with 0.2 μl of well solution and 0.2 μl of protein solution using a Phoenix crystallization robot (Art Robbins Instruments, Sunnyvale, CA). The Calkr_0826C crystals were grown at 20°C using an optimization screen containing 0.1 M citric acid (pH 3.0 to 4.0) and 10% to 15% (wt/vol) PEG 3350 (the best crystals appeared in the pH range of 3.1 to 3.2 with PEG 3350 at 14% to 15%). The protein solutions contained 12 mg/ml of protein, 20 mM Tris (pH 7.5), 100 mM sodium chloride, 2% Hampton Research Tacsimate mix (pH 4), and a 5 mM concentration (each) of zinc acetate, potassium chloride, magnesium chloride, and calcium chloride. The Calhy_0908C crystals were grown at 20°C using an optimization screen containing 5 mM to 35 mM zinc acetate and 15% to 24% (wt/vol) PEG 3350 (the best crystals appeared with 15 mM zinc acetate and a PEG 3350 concentration of 17% to 18%). The protein solutions contained 7.5 mg/ml of protein, 20 mM Tris (pH 7.5), 100 mM sodium chloride, and 2% Hampton Research Tacsimate mix (pH 7).
All crystals were soaked in well solution, with the PEG 3350 concentration increased to 25% along with 5% to 10% ethylene glycol added for cryo-protection. For the purpose of structure determination, an iodine derivative was obtained for Calhy_0908C by quick soaking of the crystals in the cryoprotectant described above with 0.1 M potassium iodide added.
Crystallography data collection and processing. The crystals of Calkr_0826C and Calhy_0908C were flash-frozen in a nitrogen gas stream at 100 K before home source data collection was performed using an in-house Bruker X8 MicroStar X-ray generator with Helios mirrors and a Bruker Platinum 135 charge-coupled-device (CCD) detector. Data were indexed and processed with the Bruker Suite of programs, version 2014.9 (Bruker AXS, Madison, WI).
Crystal structure solution and refinement. Intensities were converted into structure factors, and free sets of the reflections (5% of the reflections for Calkr_0826C and 2% for Calhy_0908C) were flagged for Rfree calculations using F2MTZ, Truncate, CAD, and Unique software from the CCP4 package of programs (33). The structure of Calkr_0826C was solved by MOLREP (34) using Calkro_0844_C (21) (PDB identifier 4WA0) as a search model. Crank2 (35) was used to solve the structure of Calhy_0908C by utilizing iodine single-wavelength anomalous dispersion (36). Refinement and manual correction were performed using REFMAC5 (37) version 5.8.158, PHENIX (38) version 1.11, and Coot (39) version 0.8.8. The MOLPROBITY method (40) was used to analyze the Ramachandran plot, and RMSDs of bond lengths and angles were calculated from ideal values of Engh and Huber stereochemical parameters (41). Wilson B-factor values were calculated using CTRUNCATE version 1.15.10 (33). The data collection and refinement statistics are shown in Table S3 in the supplemental material.
Tāpirin binding assays. Recombinant proteins (both full length and truncated) were tested for attachment to various substrates in triplicate as described previously (21). All substrates were initially soaked overnight in 100 ml of 50 mM MES (morpholineethanesulfonic acid) and 3.9 mM sodium chloride at pH 7.2 (“binding buffer”) and were then dried overnight (both steps were performed at 70°C). A 9-mg portion of the washed substrates was mixed with 40 μg of purified tāpirin protein and incubated in a thermomixer (Eppendorf) at 70°C and 500 rpm for 1 h. Samples were then centrifuged at 13,000 × g and separated into unbound (supernatant) and substrate-bound (pellet) fractions. The bound fraction was then washed four times (by resuspending the substrate in binding buffer followed by vortex mixing, centrifuging the mixture at 13,000 × g, and discarding the supernatant) before being finally resuspended in 250 μl of buffer prior to SDS-PAGE. Equal volumes of bound and unbound sample were mixed with 2× Laemmli sample buffer and 5% 2-mercaptoethanol and were boiled for 30 min. Samples were then loaded on an SDS-PAGE gel as described above. Densitometry was completed by using ImageJ (42) to analyze band intensity (keeping the contrast across gels consistent) and by normalizing all bands to the 70-kDa Benchmark protein ladder (Life Technologies) band present on each gel.
Deletion of tāpirin and/or pilus genes in C. bescii. Knockout vectors were constructed with pyrE (Athe_1382) using Gibson assembly with pGL100 (27) as the backbone. Flanking regions outside Athe_1870–1871 and Athe_1870–1885 were PCR amplified using C. bescii MACB1018 genomic DNA as a template, while the vector backbone, kanamycin resistance gene (HTK), and SLP promoter (Athe_2303) were PCR amplified from a template plasmid before Gibson assembly was used to assemble the knockout vectors (see below). A new genetic vector, pLLL023, was also generated using a KLD reaction and pGL100 to create a plasmid without Pslp-HTK on the backbone, such that Pslp-HTK could be inserted into the genome. Details of the primers used for vector construction and genetic screening are included in Table 2. The resulting vectors were generated for the indicated strains as follows: pLLL024 for RKCB136 (Athe_1870–1871 KO and Pslp-HTK knock-in [KI]) and pLLL012 for RKCB135 (Athe_1870–1885 KO). After construction, plasmids were methylated with purified recombinant CbeI methyltransferase (M.CbeI) as described previously (27, 43).
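The ladder normalization used for densitometry can be sketched in a few lines. This is an illustrative sketch only: the intensity values are invented placeholders rather than measured data, the function names are hypothetical, and summarizing triplicates by mean and sample standard deviation is an assumption rather than a step stated in the text.

```python
from statistics import mean, stdev


def normalize_to_ladder(band_intensity, ladder_intensity):
    """Express a band intensity relative to the 70-kDa ladder band on the same gel."""
    if ladder_intensity <= 0:
        raise ValueError("ladder intensity must be positive")
    return band_intensity / ladder_intensity


def summarize_triplicate(measurements):
    """Return (mean, sample SD) of ladder-normalized intensities.

    `measurements` is a list of (band_intensity, ladder_intensity) pairs,
    one pair per replicate gel.
    """
    normalized = [normalize_to_ladder(band, ladder) for band, ladder in measurements]
    return mean(normalized), stdev(normalized)


if __name__ == "__main__":
    # Placeholder ImageJ intensities (arbitrary units), not real data.
    bound_bands = [(1200.0, 1000.0), (1500.0, 1200.0), (900.0, 800.0)]
    avg, sd = summarize_triplicate(bound_bands)
    print(f"normalized bound intensity: {avg:.3f} +/- {sd:.3f}")
```

Dividing by a ladder band that is present on every gel makes intensities comparable across gels even when staining or imaging conditions differ slightly between them.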
Primers used in this study
C. bescii genetic strains were all cultured anaerobically in either low-osmolality defined (LOD) medium or low-osmolality complex (LOC) medium (44) with cellobiose in liquid cultures at 75°C with a nitrogen headspace and on 1.5% (wt/vol) agar plates in an anaerobic chamber (Coy Laboratory) at 65°C. To transform KO plasmids into C. bescii strain MACB1018 (27), 1 μg of plasmid DNA was added to 50-μl aliquots of competent cells (prepared as described previously [27] in LOD media supplemented with 1 × 19 amino acid solution [45]) at room temperature. Cells were then electroporated with a Gene Pulser II system with a Pulse Controller Plus module (Bio-Rad) in 1-mm-gap cuvettes (USA Scientific) at 25 μF, 200 Ω, and 2 kV before being passaged to 10 ml of preheated LOC medium for recovery at 75°C. After 1 h, all recovery media were passaged to selective media (LOC with 50 μg/ml kanamycin) and incubated at 75°C until growth was noted (typically 1 to 3 days). Growing transformants were screened with PCR and then passaged multiple times into liquid selective medium and/or plated on selective media for purification. Second crossovers were then selected by plating transformed cells onto LOD medium containing 4 mM 5-fluoroorotic acid (5-FOA), 50 μg/ml kanamycin (for HTK KI strains), and 40 mM uracil. Successful PCR-screened colonies were plate purified on the same second-crossover plating media without 5-FOA.
Whole-cell binding assay. Caldicellulosiruptor cells were cultured at 75°C in 100 ml of modified 671 medium with 5 g/liter Avicel, as described previously. Cultures were measured for cell density with acridine orange epifluorescence microscopy, as described previously (46), before the entire sample was initially centrifuged at 400 × g for 5 min to lightly pellet Avicel. The supernatant was removed, and cells were harvested by centrifugation at 6,000 × g for 10 min before being concentrated to 5 × 10⁸ to 1 × 10⁹ cells/ml in 671d medium. A 1-ml volume of cells was incubated with 10 mg of Avicel for 1 h in a thermomixer (Eppendorf) at 70°C; a control without any substrate (only medium and cells) was also included, and all samples were run in triplicate. The supernatants with unbound cells were then separated, the cells were enumerated again with acridine orange epifluorescence microscopy, and the average planktonic cell densities were compared using a t test.
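The comparison of planktonic cell densities can be sketched as follows. The text specifies only "a t test", so the choice of Welch's unequal-variance form here is an assumption, as are the function names; the cell counts are invented placeholders, not results from this study. Only the test statistic and degrees of freedom are computed; a P value would require a t distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


def fraction_bound(with_substrate, control):
    """Fraction of cells removed from suspension relative to the no-substrate control."""
    return 1.0 - mean(with_substrate) / mean(control)


if __name__ == "__main__":
    # Placeholder planktonic densities (cells/ml) for triplicate samples.
    control = [8.0e8, 7.5e8, 8.5e8]
    with_avicel = [4.0e8, 4.5e8, 3.5e8]
    t, df = welch_t(with_avicel, control)
    print(f"t = {t:.2f}, df = {df:.1f}, fraction bound = {fraction_bound(with_avicel, control):.2f}")
```

A lower planktonic density in the Avicel samples than in the substrate-free control indicates that cells attached to the cellulose, which is the readout the assay relies on.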
ACKNOWLEDGMENTS
This work was supported by the BioEnergy Science Center (BESC), a U.S. Department of Energy Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. L.L.L. acknowledges support from a National Science Foundation Graduate Research Fellowship and an NIH T32 Biotechnology Traineeship (GM008776-11).
Synthetic genes for the production of the tāpirins were provided by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, which is supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC02-05CH11231.
FOOTNOTES
- Received 13 August 2018.
- Accepted 18 November 2018.
- Accepted manuscript posted online 26 November 2018.
Supplemental material for this article may be found at https://doi.org/10.1128/AEM.01983-18.
- Copyright © 2019 American Society for Microbiology.