Enrichment of Clinically Relevant Organisms in Spontaneous Preterm-Delivered Placentas and Reagent Contamination across All Clinical Groups in a Large Pregnancy Cohort in the United Kingdom

Preterm birth is associated with both psychological and physical disabilities and is the leading cause of infant morbidity and mortality worldwide. Infection is known to be an important cause of spontaneous preterm birth, and recent research has implicated variation in the “placental microbiome” in the risk of preterm birth. Consistent with data from previous studies, the abundances of certain clinically relevant species differed between spontaneous preterm- and nonspontaneous preterm- or term-delivered placentas. These results support the view that a proportion of spontaneous preterm births have an intrauterine-infection component. However, an additional observation from this study was that a substantial proportion of sequenced reads were contaminating reads rather than DNA from endogenous, clinically relevant species. This observation warrants caution in the interpretation of sequencing outputs from low-biomass samples such as the placenta.

KEYWORDS contamination, infection, microbiome, pregnancy, preterm birth

Preterm birth (PTB), defined as any delivery before 37 completed weeks of gestation, affects between 5 and 18% of pregnancies. It is the leading cause of neonatal morbidity and mortality worldwide and inflicts substantial physical, psychological, and economic costs upon affected families and wider society. Epidemiological studies implicate factors such as maternal ethnicity (1)(2)(3), age (4), body mass index (BMI) (5)(6)(7), and smoking (8). However, a detailed etiological understanding of spontaneous PTB (sPTB) remains limited.
An association between bacterial infection and sPTB has long been hypothesized, and evidence for its involvement continues to grow. It is estimated that between 25% and 40% of sPTBs involve intrauterine infection (8). This proportion increases steadily as the gestational age (GA) at birth decreases, and infection may be a mediator in as many as 79% of births at 23 weeks of gestation (9). Infection is hypothesized to lead to PTB by eliciting an inflammatory response in the mother and/or fetus, triggering early labor and/or membrane rupture (10).
The majority of studies reporting the presence of bacteria in intrauterine tissues have been designed to investigate the etiology of adverse pregnancy outcomes. However, the recovery of bacteria in healthy pregnancies has also been reported (14,18,19,23). Some authors have interpreted those observations as evidence that a "maternal microbiome" may be a functional component of normal human pregnancy (14,24). This has sparked considerable discussion regarding the meaning, reliability, and frequency of such nonpathogenic colonization (24)(25)(26)(27)(28)(29). It is a technical and statistical challenge to reliably differentiate endogenous, clinically meaningful bacterial DNA from contamination picked up during delivery or sample collection/preparation (30,31). Indeed, a recent study observed no difference between the microbial signatures of contamination controls obtained at all stages of collection and sampling and those of DNA extracted from placental samples (32).
We conducted a nested case-control study of placental samples from term and preterm deliveries to explore the nature of intrauterine bacterial colonization in healthy and adverse pregnancies. It was hypothesized that placental samples taken from pregnancies culminating in sPTB would harbor microbial profiles distinct from those from placental samples taken from pregnancies culminating in nonspontaneous preterm birth (nsPTB) or term birth. Using samples from a large, United Kingdom-based pregnancy cohort, the Baby Bio Bank (BBB), the composition and structure of bacterial communities recovered from samples from preterm and term births were assessed in order to test this hypothesis, using targeted 16S amplicon sequencing. The impact of reagent contamination on the sequencing data set was also investigated.

RESULTS
Subject demographics and sampling types. A total of 400 samples from 256 pregnancies were sequenced for this study, 50 of which were nonspontaneous preterm deliveries, 41 of which were spontaneous preterm deliveries, and 165 of which were term deliveries. The majority (89%) of biological samples were obtained from parenchyma tissue, and a small number of pregnancies also had matching villous tissue, from the maternal side of the placenta, available for sequencing. A summary of key maternal characteristics for the whole cohort is outlined in Table 1.
Significant proportions of total reads are reagent contaminants. We attempted to differentiate sequences amplified from original, endogenous bacterial DNA present in placental samples at delivery from contaminating reads. A total of 136 operational taxonomic units (OTUs) from 44 genera (see Table S1 in the supplemental material for a full list of contaminating genera identified) were flagged by using our definition of potential contaminants (see Materials and Methods). A total of 32 (73%) of these genera were reported previously to be reagent contaminants (30).
Some flagged contaminants present in negative extractions may have originated from the experimental samples themselves, rather than extraction kits. Sample-to-control crossover due to false index pairings during PCR was reported previously (33,34). On the basis of previous evidence and comparisons of relative abundances between negative and experimental samples, potential contaminant OTUs mapping to Lactobacillus spp., Veillonella spp., and Mycoplasma spp. were considered erroneously flagged and were not removed from samples.
The remaining "potentially contaminating" OTUs were removed from downstream analyses. This approach led to a substantial reduction in the size of the sample data set, discarding 933,083 reads, 20.3% of the total, and 132 OTUs from experimental samples. Following error checking and contaminant filtering, any sample with <500 reads was removed from the data set, and technical replicates were merged. A total of 3,590,138 reads were retained, mapping to 261 unique biological samples from 199 pregnancies (40 nsPTB, 33 sPTB, and 126 term pregnancies). Prior to filtering, the median number of reads per sample was 2,831 (interquartile range [IQR] = 976.5 to 8,741), and following filtering, this value was 2,526 (IQR = 1,166 to 8,479). There was no difference in the distributions of subject groupings following the removal of contaminant OTUs (chi-squared value = 0.17; P = 0.92). Following filtering, 146 pregnancies retained 1 biological replicate, 44 pregnancies had 2, and 9 pregnancies had 3.
The removal of contaminant reads and low-abundance samples led to an observable shift in the taxonomic composition of the data set. This can be seen by comparing the rank abundance curves for the 20 most widely abundant OTUs before and after filtering, between which only 4 OTUs were shared (Fig. 1). The most widely abundant OTU in the nonfiltered sample set mapped to the skin commensal Propionibacterium acnes and was present in 90.8% of the samples. In contrast, the most widely abundant OTU in the filtered data set mapped to Lactobacillus crispatus and was present in only 59.4% of samples. Eight of the 20 top OTUs in the filtered data set mapped to the Lactobacillus genus.
Delivery method is influential for certain highly abundant genera. The 20 most widely abundant OTUs in the filtered data set were all present in >20% of samples (Fig. 1). However, when the taxonomic makeups of vaginally delivered and caesarean section (CS)-delivered placentas were compared, clear differences were observed. The seven genera with the highest mean relative abundances from CS deliveries combined with the seven genera with the highest mean relative abundances from vaginal deliveries are shown in Fig. 2. A number of these highly abundant genera were shared between CS delivery and vaginal delivery groups. However, when the relative abundances of these genera were compared between the two groups, 6 were significantly differentially abundant by delivery method (Table 2). Two common vaginal genera, Lactobacillus and Bacteroides, were significantly more abundant in samples from vaginal deliveries. In contrast, the common skin flora genera Streptococcus and Corynebacterium were present at significantly higher abundance in CS samples.
sPTB is associated with novel and established genera from placental tissue. The primary hypothesis of this study was that certain organisms would be differentially abundant in placental tissue according to pregnancy outcome. Therefore, we compared the abundances of OTUs and genera in placental tissues from sPTB pregnancies with those from term or nsPTB pregnancies. Univariate analyses were first run on the 261 remaining filtered samples (41 sPTB, 47 nsPTB, and 173 term samples). Six genera had significantly higher abundances (P < 0.01) in sPTB tissue than in nsPTB tissue (Ureaplasma, Prevotella, Salinicoccus, Mycoplasma, Capnocytophaga, and Anaerococcus), and seven genera were more highly abundant in sPTB than in term samples (Tepidimonas, Salinicoccus, Capnocytophaga, Mycoplasma, Anaerococcus, Truepera, and Coprobacillus) (see Tables S2 and S3 in the supplemental material for full results of unadjusted comparisons).
Models were then adjusted for the potential confounding effects of delivery method, recruiting hospital, maternal ethnicity, BMI, smoking behavior, and tissue type. These confounders were selected on the basis of previous evidence of associations with both gestation length and microbiome profiles. This adjusted cohort was smaller than that used for univariate analyses (total, n = 219; sPTB, n = 41; nsPTB, n = 47; term birth, n = 131) due to missing data on delivery method and maternal BMI. The genera of those OTUs with higher abundances in sPTB placentas, compared to either nsPTB or term deliveries following adjustment, and with P values of <0.01 are listed in Table 3 (full results are listed in Tables S4 and S5 in the supplemental material). When this comparison was repeated with all OTUs pooled at the genus level, 4 genera were found at significantly higher abundances in sPTB than in nsPTB placentas at a threshold P value of <0.01 (Mycoplasma, Ureaplasma, Mogibacterium, and Salinicoccus) (Table 4). Eight genera were more highly abundant in the sPTB-versus-term comparisons at the same threshold (Anaerococcus, Capnocytophaga, Coprobacillus, Erwinia, Mycoplasma, Salinicoccus, Turicibacter, and Tepidimonas) (Table 5).
FIG 1 The 20 most widely abundant OTUs in the data sets before and after filtering for negative contaminants were identified, and their percent presences across the cohort are compared. Only 4 OTUs were shared between the filtered and unfiltered data sets, and many OTUs that were removed mapped to known contaminants.
Beta diversity does not differentiate between pregnancy outcomes. Recent work has suggested that a recognizable shift in structure of an overall "placental microbiome" may be observable in beta diversity comparisons between the different outcome groups (14). In order to investigate this hypothesis in our cohort, weighted UniFrac, unweighted UniFrac, and Bray-Curtis distance matrices were produced from the data to estimate beta diversity. Distances were then plotted by using principal-coordinate analysis (PCoA) to visualize the first two axes and colored by pregnancy outcome (Fig. 3). Samples did not clearly cluster by pregnancy outcome with any of the methods used. Results from analyses conducted to quantify differences in beta diversity between outcomes produced very similar R² values, all of which were significant at a P value of 0.001 (see Table S6 in the supplemental material). However, only a very small proportion of the variance between samples (~2%) was accounted for by these groupings using this method.

DISCUSSION
The results from this study show that bacterial DNA from a variety of organisms was present in placental samples taken from a cohort of both normal and complicated pregnancies in the United Kingdom. This observation of a low-level, relatively diverse placental microbial signature is supported by data from other recent molecular studies working with the same tissues (14,16,18,23,35). A number of organisms of potential clinical relevance were also shown to be enriched in sPTB placentas compared to nsPTB and term tissues. However, we also observed widespread contamination across all placental samples, regardless of clinical outcome. These observations implicated both the delivery method and reagent contamination as significant contributors to our overall sequencing output. In addition, beta diversity analyses did not support the existence of a "unique preterm microbiome" in terms of a structured community shared among this particular obstetric group and distinguishable from other outcome groups. The very subtle differences observed in overall taxonomic compositions between clinical groups, in conjunction with the unclustered data in the PCoA plots, imply that such variation is unlikely to be clinically relevant.
Sequencing of a tissue with a low microbial biomass, such as the placenta, is a significant methodological and statistical challenge. In order to understand the clinical significance of those organisms identified by 16S sequencing, establishing the provenance of the sequenced DNA is critical. The presence of nonendogenous sequences in sequencing outputs could have been the result of true bacterial contamination picked up during sample collection and experimental preparation processes or a function of PCR and sequencing artifacts. It was clear from our analyses that a significant proportion of the DNA identified was likely acquired during sample preparation, rather than being representative of true endogenous colonization. As a result, many samples were removed after filtering, as they did not yield sufficient noncontaminant DNA for analysis. Only four of the most widely abundant OTUs in the nonfiltered data set were also present in the filtered data set. Overall, the OTUs retained after filtering were spread less widely across the sample sets, which is in line with a conceptualization of contaminants as organisms that are likely to affect all samples equally. These data demonstrate the need to account for contaminants in microbiome studies and the potentially erroneous conclusions that could be drawn from the data if their impact is ignored, as has been noted previously by other authors (29,32).
Delivery method is a potentially important confounder in associations between GA at birth and placental microbial profiles. In our data, the highest proportion of vaginal deliveries was in the sPTB group, and the lowest proportion was in the nsPTB group. This trend may reflect the higher incidence of clinical indications, such as preeclampsia among nsPTB pregnancies, which require swift delivery to protect the health of the mother and baby. In our study, many OTUs were shared across placental samples, which could be interpreted as support for the existence of a common placental microbiome across pregnancies. However, it is notable that many of these shared OTUs mapped to skin and vaginal commensals that clearly varied by delivery method. Such comparisons in our study support the hypothesis that much of the signal observed in our data set may reflect contamination picked up during delivery (vaginal or CS), rather than a truly endogenous placental microbiome. These observations highlight the importance of accounting for delivery method in differential-abundance comparisons.
This study provided evidence for the enrichment of a number of organisms within sPTB placental samples, independent of the mode of delivery. Although the majority of OTU and genus groupings did not reach statistical significance once adjusted for multiple testing (Q > 0.1), some of the most significantly differentiated genera, such as Mycoplasma and Ureaplasma spp., were reported previously to be opportunistic intrauterine pathogens highly correlated with the incidence of sPTB (18,36-41), supporting the output from our study. Others, such as Capnocytophaga, have been less well studied with respect to sPTB pathogenesis. Interestingly, the main organisms that were associated with PTB in the most widely cited placental microbiome study by Aagaard et al. (14), such as Burkholderia spp., did not overlap those found here.
Capnocytophaga was present across a number of samples at a relatively low total abundance compared to genera such as Mycoplasma. However, it was one of only two genera, along with the well-known PTB-associated organism Ureaplasma, that remained significantly associated with sPTB placentas following adjustment for multiple testing. This anaerobic organism is usually isolated from the oral cavity and is rarely isolated from the genital tract. However, it was previously associated with intrauterine infection in a few reports (42)(43)(44). Outside pregnancy, Capnocytophaga infections tend to be most commonly reported in immunocompromised children and are known to be involved in periodontitis (45), which is interesting given the immunosuppressed state of pregnancy.
Dissection and extraction of samples for this study were carried out by using rigorously controlled, sterile procedures. However, our placental samples were not originally collected with the intention to be used in microbial analyses. We acknowledge that this is a limitation of our study. Similarly, the storage and cleaning reagents RNAlater and phosphate-buffered saline (PBS), which were used by the collection team, were not available for use as comparative sequencing controls, as the extraction reagents were. Therefore, the estimation of contaminating reads carried out during the data cleanup stage may have been an underestimate of the true burden of exogenous bacterial reads in our placental data set. However, adjustment in final models for hospital collection site would account for at least some site-specific contamination patterns.
The specific study of the placental microbiome remains in its infancy. This research and other similar data suggest that there may be a low-level nonpathogenic placental microbiome present in many, if not all, placentas. However, differentiating this from organisms picked up at delivery or during experimental handling is an ongoing challenge. In addition, analyses of the overall community structure in our samples did not reveal convincing evidence for the existence of a reproducible "preterm placental microbiome." Our study provides one of the largest cohorts of 16S-sequenced placental tissue from sPTBs in the literature. This study gathered novel data on a tissue that remains relatively unexplored, from an unbiased microbiological perspective. The cohort consisted of a large number of spontaneous, early preterm births, providing a powerful opportunity to detect colonization patterns relevant to adverse pregnancy outcomes. Furthermore, the use of a specifically defined "nonspontaneous" preterm birth group was a novel addition. These nsPTB placentas provided a comparison group that was essentially matched for GA with the sPTB cases but very likely had a different underlying etiology. Further work is required to elucidate the clinical significance of specific organisms identified here for sPTB initiation and develop more-targeted strategies to mitigate their pathogenic effect.

MATERIALS AND METHODS
Study design and recruitment. Samples used in this study were taken from the large United Kingdom-based resource for research into complications in pregnancy, the Baby Bio Bank (BBB) (https://www.ucl.ac.uk/tapb/sample-and-data-collections-at-ucl/biobanks-ucl/baby-biobank). Recruitment and sample collection occurred at Queen Charlotte and Chelsea, Chelsea and Westminster, and St Mary's hospitals in London. Detailed descriptions of the contents of the bank and sample collection procedures were reported in a previous study (46). Written informed consent was obtained from participants in advance of delivery. Preterm births were defined as any delivery at <37 weeks of gestation. Subjects were further divided into sPTB and nsPTB subcategories by using available clinical data on labor and delivery. sPTB was defined as any delivery precipitated by spontaneous labor and/or spontaneous membrane rupture. Nonspontaneous events consisted of artificial or no membrane rupture, combined with induced or no labor events, and were often precipitated by maternal clinical events such as preeclampsia.
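The outcome definitions above can be expressed as a small decision rule. The following Python sketch is illustrative only and is not the authors' code; the function name `classify_delivery` and its boolean arguments are hypothetical, and it assumes that gestational age and the labor and membrane rupture records are complete for each pregnancy.

```python
def classify_delivery(gestation_weeks: float,
                      spontaneous_labor: bool,
                      spontaneous_rupture: bool) -> str:
    """Assign a delivery to 'term', 'sPTB', or 'nsPTB'.

    Hypothetical helper mirroring the definitions in the text:
    preterm is any delivery at <37 weeks; sPTB is a preterm delivery
    precipitated by spontaneous labor and/or spontaneous membrane rupture;
    every other preterm delivery is treated as nsPTB.
    """
    if gestation_weeks >= 37:
        return "term"
    if spontaneous_labor or spontaneous_rupture:
        return "sPTB"
    return "nsPTB"


# Example: induced labor with artificial rupture at 34 weeks.
outcome = classify_delivery(34.0, spontaneous_labor=False,
                            spontaneous_rupture=False)
```

In practice, records with missing labor or rupture data would need explicit handling, which this sketch does not attempt.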
Placental sample collection and storage. All placental samples were collected by the hospital's maternity team and dissected by a BBB recruiter following delivery. For each placenta, 1-cm³ specimens were excised from four points below the membrane on the chorionic plate (placental parenchyma), close to the umbilical cord entrance. Villous tissue pooled from 6 sites on the maternal basal plate and combined into one collection tube was also collected from a subset of placentas. All samples were rinsed in PBS to remove excess maternal blood, placed into barcoded cryogenic tubes along with 5 ml of RNAlater, and stored at −80°C.
Placental DNA extraction. A total of 20 to 50 mg of either villous or chorionic plate placental tissue was excised from stored samples in a sterile laminar flow tissue culture hood, using sterile disposable scalpels and petri dishes. Total DNA was then extracted by using the Qiagen DNeasy blood and tissue kit, according to the manufacturer's instructions, with an additional bead-beating step with MPBio Lysing Matrix B beads to minimize Gram-negative extraction bias (47,48). All DNA was stored at −20°C until required. A negative extraction control, in which no tissue was added to extraction reagents and the normal protocol was carried out, was produced for every round of extractions.
16S rRNA gene high-throughput sequencing. Libraries were prepared by using primers targeting the V5-V7 regions of the 16S rRNA gene (785F [5′-GGATTAGATACCCBRGTAGTC-3′] and 1175R [5′-ACGTCRTCCCCDCCTTCCTC-3′]) (16,35). Primers were adapted for high-throughput sequencing with the addition of Illumina P5 or P7 adapter sequences and barcoded dual-index forward and reverse sequences taken from a previous study (49). Ultrapure Taq DNA polymerase (Molzym) was used to minimize the chance of contamination of the sequencing library from bacteria present in PCR reagents, and a 32-cycle endpoint PCR was run to amplify the bacterial template (see Tables 6 and 7 for reaction components and specific cycling conditions, respectively). PCR products were double cleaned by using 0.8× AMPure XP beads, according to the manufacturer's instructions. Custom primers were loaded onto an Illumina MiSeq instrument along with the cleaned and diluted library for a 500-cycle V2 kit.
Identification of contaminants. Negative extractions and PCR blanks from our study were examined for the presence of potential contaminants. Only negative samples with ≥500 reads were analyzed (n = 19). Any OTUs with at least two reads in at least two of the negative extraction samples were considered potential contaminating OTUs.
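The flagging rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name `flag_contaminants`, the nested-dictionary layout of the count data, and the example OTU identifiers are all hypothetical. The thresholds (controls with at least 500 reads, at least two reads in at least two controls) are taken from the text.

```python
from typing import Dict, Set


def flag_contaminants(negative_controls: Dict[str, Dict[str, int]],
                      min_control_depth: int = 500,
                      min_reads: int = 2,
                      min_controls: int = 2) -> Set[str]:
    """Return OTU ids seen with >= min_reads in >= min_controls usable controls.

    negative_controls maps a control id to {otu_id: read_count}
    (a hypothetical layout chosen for this sketch).
    """
    hits: Dict[str, int] = {}
    for counts in negative_controls.values():
        # Skip controls that do not reach the read-depth cutoff.
        if sum(counts.values()) < min_control_depth:
            continue
        for otu, n in counts.items():
            if n >= min_reads:
                hits[otu] = hits.get(otu, 0) + 1
    return {otu for otu, k in hits.items() if k >= min_controls}


# Toy data: neg3 has only 15 reads and is therefore ignored.
controls = {
    "neg1": {"otu_a": 400, "otu_b": 150, "otu_c": 1},
    "neg2": {"otu_a": 600, "otu_b": 1, "otu_c": 3},
    "neg3": {"otu_a": 5, "otu_b": 10},
}
flagged = flag_contaminants(controls)  # {"otu_a"}
```

OTUs that were later judged to be erroneously flagged (Lactobacillus, Veillonella, and Mycoplasma spp. in this study) would simply be subtracted from the returned set before removal from the experimental samples.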
Bioinformatic analyses. Paired-end 250-bp sequenced reads were merged by using FLASH v1.2.11. Reads were demultiplexed, quality filtered, and assigned taxonomic labels within the Quantitative Insights into Microbial Ecology (QIIME) 1.9.1 pipeline (50). The UCLUST algorithm (51) within QIIME was used to pick OTUs at 97% similarity against the Greengenes core reference database version 12.10 (52). Any sequences that failed to match at 97% similarity were clustered de novo by using UCLUST. De novo chimera removal was carried out by using UCHIME. A representative sequence was then chosen for each OTU, and this sequence was aligned to the Greengenes "Core Set" taxonomic alignment (53) by using PyNAST (54). These aligned sequences were used to build a phylogenetic tree by using FastTree 2.1.3 (55). Taxonomy was assigned by using RDP Classifier 2.2 (56) and the Greengenes taxonomy reference database, from which an OTU table was constructed.
Taxonomic labels were assigned to the highest possible level by using the QIIME pipeline, and data were converted to relative abundances for use in all subsequent analyses. The Phyloseq v1.22.3 package in R was used to combine OTU, clinical, taxonomic, and phylogenetic data into a single object suitable for relative-abundance comparisons and diversity analyses (57). Prior to analysis, samples with <500 reads were removed, as samples with low read depth do not capture the entire diversity of a sample, thus limiting the capacity to generate reliable diversity metrics.
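As an illustration of the depth filter and relative-abundance conversion described above, the following Python sketch drops samples with fewer than 500 reads and rescales the remaining counts to proportions. It is not the Phyloseq code used in the study; the function name `filter_and_normalize` and the dictionary layout are assumptions made for this example.

```python
from typing import Dict


def filter_and_normalize(otu_counts: Dict[str, Dict[str, int]],
                         min_depth: int = 500) -> Dict[str, Dict[str, float]]:
    """Drop samples below min_depth reads; convert the rest to proportions.

    otu_counts maps a sample id to {otu_id: read_count}
    (a hypothetical layout chosen for this sketch).
    """
    result: Dict[str, Dict[str, float]] = {}
    for sample, counts in otu_counts.items():
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too shallow to give reliable diversity estimates
        result[sample] = {otu: n / depth for otu, n in counts.items()}
    return result


# Toy data: s2 has 150 reads and is removed; s1 is rescaled to sum to 1.
counts = {
    "s1": {"otu_a": 300, "otu_b": 700},
    "s2": {"otu_a": 100, "otu_b": 50},
}
relative = filter_and_normalize(counts)
```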
To improve the power of regression models to identify differences in the placental microbiota between groups of interest, all biological replicates from both villous and parenchymal tissue were analyzed together in mixed-effect models that included a multilevel intercept for participant identifications. All analyses were carried out in R 3.4.3. Additional R packages and versions used are listed in the supplemental material.
Differential-abundance testing: Limma. Raw abundance data were normalized by using the variance-stabilizing transformation (VST) approach in DESeq2 v1.18.1 (58) within Phyloseq. These normalized data were then transformed into a Limma v3.34.6 (59) object to utilize the multilevel functions available in this package. Any OTUs unassigned at the level of genus and with ≤10 reads were removed. Adjusted models were corrected for the potential confounding influences of delivery method, maternal ethnicity, collection hospital, maternal BMI, tissue type, and maternal smoking. All models were run with a random intercept to account for correlations between biological replicates. P values were corrected for multiple testing by using the Benjamini-Hochberg procedure to produce Q values (60). OTUs were also merged to the level of genus, and models were then rerun, to investigate genus-level, rather than OTU-level, associations.
Calculation of beta diversity. Three common methods for assessing distance or dissimilarity between samples or groups of samples were performed by using the VST-normalized data of filtered OTU counts: weighted UniFrac (61), unweighted UniFrac (62), and the Bray-Curtis dissimilarity metric (63). VST matrices can include negative values, which represent zero or very low original counts and are not permitted by certain distance metrics. To mitigate this, any negative value was replaced with a zero, under the assumption that these cases were of very low, or near-zero, abundance and thus of negligible importance to the hypotheses under investigation. Following the computation of the three metrics, differences between the outcome groups of interest were visually explored by using PCoA. The adonis function in the Vegan v2.4.6 package (64) was used to quantify differences in beta diversity values between outcomes of interest. Significance tests were performed by using F-tests, from 999 permutations of the raw data.
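Two of the numerical steps described above, the Benjamini-Hochberg correction that yields Q values and the Bray-Curtis dissimilarity computed after negative VST values are set to zero, can be sketched in plain Python. This is an illustrative re-implementation rather than the R code (Limma, Phyloseq, and Vegan) used in the study; the function names are hypothetical, and returning 0.0 for two all-zero profiles is an assumption of this sketch.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted values (Q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that Q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity after replacing negative values with zero."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    u_clipped = [max(x, 0.0) for x in u]
    v_clipped = [max(x, 0.0) for x in v]
    numerator = sum(abs(a - b) for a, b in zip(u_clipped, v_clipped))
    denominator = sum(a + b for a, b in zip(u_clipped, v_clipped))
    if denominator == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    return numerator / denominator


# Toy inputs: four P values, and two three-OTU profiles with one negative value.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
d = bray_curtis([1.0, -0.5, 3.0], [2.0, 0.0, 1.0])
```

For the toy profiles, clipping turns the first profile into (1, 0, 3), so the dissimilarity is (1 + 0 + 2) / (3 + 0 + 4) = 3/7.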
Accession number(s). Sequencing data have been deposited within the EMBL-EBI European Nucleotide Archive under study accession no. PRJEB25986.

SUPPLEMENTAL MATERIAL
Supplemental material for this article may be found at https://doi.org/10.1128/AEM.00483-18.