ABSTRACT
The purpose of this study was to examine host distribution patterns among fecal bacteria in the order Bacteroidales, with the goal of using endemic sequences as markers for fecal source identification in aquatic environments. We analyzed Bacteroidales 16S rRNA gene sequences from the feces of eight hosts: human, bovine, pig, horse, dog, cat, gull, and elk. Recovered sequences did not match database sequences, indicating high levels of uncultivated diversity. The analysis revealed both endemic and cosmopolitan distributions among the eight hosts. Ruminant, pig, and horse sequences tended to form host- or host group-specific clusters in a phylogenetic tree, while human, dog, cat, and gull sequences clustered together almost exclusively. Many of the human, dog, cat, and gull sequences fell within a large branch containing cultivated species from the genus Bacteroides. Most of the cultivated Bacteroides species had very close matches with multiple hosts and thus may not be useful targets for fecal source identification. A large branch containing cultivated members of the genus Prevotella included cloned sequences that were not closely related to cultivated Prevotella species. Most ruminant sequences formed clusters separate from the branches containing Bacteroides and Prevotella species. Host-specific sequences were identified for pigs and horses and were used to design PCR primers to identify pig and horse sources of fecal pollution in water. The primers successfully amplified fecal DNAs from their target hosts and did not amplify fecal DNAs from other species. Fecal bacteria endemic to the host species may result from evolution in different types of digestive systems.
The animal gastrointestinal (G.I.) tract maintains a rich microbial community in which many of the inhabitants are engaged in mutualistic association with one another and/or with the host (21, 37, 46). G.I. systems provide ideal models for investigating evolutionary relationships between hosts and their resident microbes. However, the microbial diversity and community structure of G.I. systems still are not well characterized. Analyses of 16S rRNA gene sequences from fecal DNAs suggest 60 to 80% of the human intestinal microflora remains uncultivated (20, 41). Studies of bacterial communities in other animal hosts reveal even greater uncultivated representation (8, 26, 32, 47). Molecular studies are beginning to identify differences in bacterial species composition in the G.I. tract dependent on diet and age as well as spatial, temporal, and host differences (14, 16, 17, 22, 27, 30).
Enteric bacteria may coevolve with their hosts (21, 37), and if so, host-specific clustering may be evident in phylogenetic analyses of gene sequences recovered from different host species. Staley and Gosink (40) used the term “cosmopolitan” to describe bacteria with global distribution patterns, while endemism refers to the tendency to inhabit specific geographic regions or hosts. Hedlund and Staley (19) reported that Simonsiella strains isolated from the oral cavities of humans, cats, dogs, and sheep were endemic to the host species. However, bacteria in the G.I. tract often do not demonstrate host endemism. Multilocus enzyme electrophoresis analysis of Escherichia coli isolates from 16 mammalian species in Australia demonstrated that only 6% of the genetic variation could be attributed to host species differences (16). The same study found host differences explained 20% of the variation in Klebsiella pneumoniae.
Patterns of microbial diversity in the G.I. tract have important implications for human and environmental health. These communities become a source of pathogens when released into aquatic environments as fecal pollution. Community members that exhibit host-specific distributions represent a valuable resource as potential markers for fecal pollution from specific sources.
Several studies have suggested that some species in the genus Bacteroides have host-specific distributions (2, 11, 25). Bernhard and Field used length heterogeneity PCR and terminal restriction fragment length polymorphism (T-RFLP) to identify human- and ruminant-specific 16S rRNA genetic markers for fecal anaerobes in the order Bacteroidales (3). They used the markers to design PCR primers for fecal source identification in aquatic environments (4). Characterization of additional markers should expand the applicability of this method to other animal hosts.
Members of the phylum Bacteroidetes (formerly Cytophaga/Flavobacter/Bacteroides) are phenotypically and ecologically diverse (5, 14). They are present in aquatic and terrestrial environments and are numerically important constituents of the animal oral cavity and intestinal microflora (8, 20, 26, 36, 41, 45). The cultivated fecal members of this phylum are in the order Bacteroidales and the genera Bacteroides and Prevotella (13).
Culture and culture-independent methods have revealed that Bacteroides species account for 20 to 52% of the human fecal flora (9, 12, 20, 22, 38, 41). Analyses of 16S rRNA gene clone libraries suggest slightly less abundance in nonhuman hosts. Leser and colleagues found 11.2% of the phylotypes in a pig fecal clone library were related to Bacteroides or Prevotella (26). Representatives of the Bacteroidales accounted for 18% of the recovered sequences in a clone library of horse fecal DNAs (8). Analyses of feces-derived 16S rRNA gene sequences have also found large contributions from the Bacteroidales in cattle (3, 45).
The objectives of this study were to establish the extent to which host-specific distributions (endemism) occur in the Bacteroidales, thus providing potential new host-specific markers for fecal pollution. Understanding the evolutionary relationships and host distributions of enteric microorganisms is critical to evaluating their practical application as fecal source indicators in aquatic ecosystems. We report the results of our 16S rRNA gene sequence analysis of Bacteroidales from the feces of eight hosts: human, bovine, elk, pig, dog, cat, gull, and horse.
MATERIALS AND METHODS
Sample collection and DNA extraction. We collected individual fecal samples from 20 dogs, 20 cats, 10 elk, 20 pigs, and 10 gulls. Samples came from Oregon State University animal research facilities, local animal shelters, farms, pet owners, hunters, and colleagues. Ten samples each were collected from human and bovine sources in a previous study (3), and sequences from that analysis were used in this study. Horse sequences were recovered from a manure pile beside a creek in Cincinnati, Ohio, also in a previous study (39). Samples were collected in sterile containers and stored at −80°C. Approximately 300 mg of feces (wet weight) from each sample was used in a separate DNA extraction. The FastDNA kit for Soils (Q-Biogene, Carlsbad, CA) was used according to the manufacturer's protocol, with an additional wash using the SEWS-M reagent to reduce PCR inhibition due to phenolic compounds in feces.
PCR and clone library construction. We amplified 700-bp partial 16S rRNA gene sequences from individual fecal extracts with the Bacteroidales-specific PCR primers Bac32F (AACGCTAGCTACAGGCTT) (4) and Bac708R (CAATCGGAGTTCTTCGTG) (4). Each 50-μl PCR mixture contained 1× Taq polymerase buffer, 10 μM concentrations of each primer, 200 μM concentrations of each deoxynucleoside triphosphate, 1.25 U of Taq polymerase, 0.06% bovine serum albumin, and 1.5 mM MgCl2. Reactions were carried out for 30 cycles of 95°C for 1 min, 53°C for 45 s, and 72°C for 1 min. To obtain equal representation from individual hosts, separate PCR products from each sample were pooled in equal amounts based on ImageQuant analysis (Molecular Dynamics, Inc., Sunnyvale, CA). The pooled, host-specific amplicons were gel purified with the QiaQuick gel purification kit (QIAGEN, Inc., Valencia, CA) and cloned into TOPO TA vectors (Invitrogen, Inc., Carlsbad, CA) according to the manufacturer's protocol. The vectors were transformed into competent E. coli cells (One Shot TOP 10; Invitrogen, Corp., Carlsbad, CA). Ninety-six transformants were randomly selected from each host-specific library, inoculated into 100 μl LB-ampicillin broth in 96-well culture plates, and incubated overnight at 37°C. Replica plates were made from each original plate.
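As a quick check on the primer pair above, the following minimal sketch computes the length, GC fraction, and a rough melting temperature for Bac32F and Bac708R. The primer sequences are taken from the text; the Wallace rule (2°C per A/T, 4°C per G/C) is a generic rule of thumb for short oligonucleotides and is an assumption here, since the source does not state how the 53°C annealing temperature was chosen. The function names are illustrative only.

```python
# Primer sequences as given in the text (5' -> 3').
PRIMERS = {
    "Bac32F": "AACGCTAGCTACAGGCTT",
    "Bac708R": "CAATCGGAGTTCTTCGTG",
}


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule.

    This is a rule-of-thumb estimate for short oligos, not the
    method used in the study.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), gc_fraction(seq), wallace_tm(seq))
```

Under these assumptions both primers are 18 nucleotides long with 50% GC content, so the rule-of-thumb estimate is the same for each and falls close to the annealing temperature reported above.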
T-RFLP analysis. DNAs from the replica plates were amplified using 6-FAM-labeled Bac32F and Bac708R. Ninety-six clones from each host species library were digested with the restriction enzyme HaeIII, chosen based on previous work (3). The products (20 fmol) were separated electrophoretically on an ABI 377 automated sequencer (PE Applied Biosystems, Foster City, CA), and fragment sizes were estimated using GeneScan software (PE Applied Biosystems). Unique restriction patterns within each host-specific library were identified, and clones representing each pattern were chosen for sequencing. Duplicate clones were sequenced for dominant restriction patterns. Sequencing was bidirectional with the T7 promoter and M13R primers (Davis Sequencing, Davis, CA).
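The clone screening step can be illustrated in silico. The sketch below predicts the labeled terminal fragment length produced by HaeIII (recognition site GGCC, blunt cut between GG and CC) and groups clones that share a fragment length. It assumes the fluorescent label sits on the 5' end of the forward strand, which the text does not state explicitly, and the clone names and sequences in the usage example are hypothetical.

```python
from collections import defaultdict

HAEIII_SITE = "GGCC"    # HaeIII recognition sequence
HAEIII_CUT_OFFSET = 2   # blunt cut between GG and CC


def terminal_fragment_length(amplicon):
    """Length of the 5'-terminal fragment after HaeIII digestion.

    Returns the full amplicon length when no site is present.
    """
    seq = amplicon.upper()
    pos = seq.find(HAEIII_SITE)
    if pos == -1:
        return len(seq)
    return pos + HAEIII_CUT_OFFSET


def group_by_fragment(clones):
    """Map each predicted terminal fragment length to the clone names sharing it."""
    groups = defaultdict(list)
    for name, seq in clones.items():
        groups[terminal_fragment_length(seq)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical toy sequences, far shorter than real amplicons.
    toy_clones = {
        "cloneA": "ATATGGCCTTAA",
        "cloneB": "ATATGGCCGGTT",
        "cloneC": "ATATATATGGCC",
    }
    print(group_by_fragment(toy_clones))
```

In practice one representative per group would be picked for sequencing, which mirrors the selection of clones by unique restriction pattern described above.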
Phylogenetic analysis. Elk, pig, dog, cat, and gull sequences were aligned with human, bovine, and horse sequences from previous studies (3, 4, 39, 45) using the ARB software program (28). The shorter, cloned sequences were added to a tree of full-length sequences using the parsimony insertion tool in ARB. We removed potential chimeras based on both the CHECK_CHIMERA program (7) and on trees inferred from each half of the sequence data (24). Sequences with different placement in the trees made from the two halves were removed from the analysis. Phylogenetic analysis used PAUP* 4.0 beta 6 (42), with ambiguous regions of alignment removed. Trees were inferred from 526 sequence positions using three tree-building methods: neighbor joining with a Kimura-2 parameter correction, maximum parsimony with a heuristic search, and maximum likelihood with a heuristic search. Bootstrap values were obtained from a consensus of 1,000 neighbor-joining and parsimony trees.
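For readers unfamiliar with the distance correction named above, the following is a minimal sketch of the standard Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. It is not the PAUP* implementation used in the study; skipping gapped or ambiguous positions is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Positions containing anything other than A, C, G, or T in either
    sequence are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # One transition (A<->G) in ten positions.
    print(k2p_distance("AAAAAAAAAA", "AAAAAAAAAG"))
```

The corrected distance is slightly larger than the raw proportion of differences because the model allows for multiple substitutions at the same site.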
Marker identification and PCR primer design. Host-specific sequences were identified for pig and horse samples, and PCR primers were designed using the Primer Design and Probe Match functions in the ARB software program.
Primer specificity and sensitivity trials. Primers were tested for cross-reactivity against host pools of fecal DNAs representing 10 to 30 individual hosts from each species, with the exception of sheep (5 hosts) and deer (4 hosts). The pools were normalized to 3 ng/μl by Picogreen assay (Molecular Probes, Inc., Eugene, OR). Primer specificity was optimized by manipulation of annealing temperature, Mg2+ concentration, and cycle number. Marker distributions within target host species were tested using pig and horse feces from hosts not used in the clone library analysis and from different geographic regions.
The sensitivity of each primer set was estimated using serial dilutions of plasmid DNAs of known copy number. A theoretical detection limit was determined for the primer sets in pure water, creek water (Beaver Creek, Alsea, OR), and seawater (5 miles off the central Oregon coast). Three nanograms of total genomic DNA from one of the two natural water sources was added to each PCR mixture.
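The dilution series described above depends on converting a plasmid DNA mass into a copy number. The sketch below shows the usual conversion, assuming an average mass of about 650 g/mol per base pair of double-stranded DNA. The plasmid length of 4,600 bp and the starting amount are hypothetical values chosen for illustration; the source gives neither.

```python
AVOGADRO = 6.022e23     # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one dsDNA base pair


def copies_from_mass(mass_ng, length_bp):
    """Number of plasmid copies in a given mass of double-stranded DNA."""
    grams = mass_ng * 1e-9
    return grams * AVOGADRO / (length_bp * BP_MASS_G_PER_MOL)


def tenfold_series(start_copies, steps):
    """Copy numbers for a tenfold serial dilution, starting at start_copies."""
    return [start_copies / 10 ** i for i in range(steps)]


if __name__ == "__main__":
    # Hypothetical plasmid (vector plus insert) of 4,600 bp.
    print(copies_from_mass(1.0, 4600))
    print(tenfold_series(1e6, 5))
```

With these illustrative numbers, a five-step tenfold series starting at one million copies ends at 100 copies per reaction.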
RESULTS
Restriction and sequence analysis. We distinguished between 10 and 20 restriction patterns in each of the host clone libraries analyzed with T-RFLP. Ten to 24 clones per host species were selected for sequencing. Thirty-seven sequences (about 28%) were identified as potential chimeras and removed from the analysis. The final analysis contained 97 cloned sequences from the eight host species. There was a high degree of branching order agreement among the three tree-building methods, although bootstrap support varied.
Host distributions. Figure 1 is a diagrammatic representation of the host distributions from a phylogenetic analysis of the sequences. Members of the Bacteroidales were not strictly monophyletic with respect to host species; however, several host-specific clades could be identified. Ruminant, pig, and horse bacterial sequences tended to form unique clusters apart from other hosts, while human, dog, cat, and gull bacterial sequences clustered together almost exclusively.
Fig. 1. Schematic representation of host distributions based on a neighbor-joining tree inferred from 16S rRNA gene sequences from Bacteroidales bacteria. Host clades containing cultivated species are depicted as solid wedges. Open wedges represent clusters with no cultivated representatives. Prevotella brevis and Prevotella ruminicola are of ruminant origin; all other cultivated species in the tree are of human origin. Sequences in bold, this study.
Bacteroides-related sequences. A branch of the tree containing cultivated species from the genus Bacteroides was supported in 78% of the neighbor-joining bootstrap resamplings (Fig. 2). It included all the fecal Bacteroides species identified in a phylogenetic analysis of the genera Bacteroides, Prevotella, and Porphyromonas by Paster and colleagues (31). Within this branch, a major well-supported gene cluster contained Bacteroides vulgatus and fecal sequences from human, dog, and gull sources. The remaining fecal Bacteroides species and fecal sequences from human, cat, dog, pig, and gull sources clustered together with low bootstrap support.
Fig. 2. Phylogenetic relationships of partial Bacteroidales 16S rRNA gene sequences (526 positions) with cultivated members of the genus Bacteroides. Sequences from cultivated species are shown in italics. Environmental sequences were recovered from Tillamook Bay on the Oregon coast (3). Trees were inferred using three tree-building algorithms: neighbor joining with a Kimura-2 parameter correction, maximum parsimony, and maximum likelihood. Values at the nodes were obtained by bootstrap analysis based on 1,000 resamplings of both neighbor-joining and parsimony trees (above and below the line, respectively). Bootstrap values over 70% are shown. The closed and open circles at the nodes represent branching orders observed in all treeing methods and those observed in two of the three methods, respectively. The scale bar represents 0.5% estimated sequence divergence. Sequences in bold, this study.
Many of the human, dog, cat, and gull bacterial sequences were closely related to known Bacteroides species (97 to 99% identity). B. vulgatus, B. uniformis, and B. stercoris clusters included fecal sequences from two or more of these four host species. Some of the cloned sequences from different hosts appeared identical or almost identical to each other and to sequences from cultivated species. For example, a gull sequence was >99% identical to B. thetaiotaomicron; 694 of 695 bases matched the published B. thetaiotaomicron sequence. Similarly, cat, dog, and human bacterial sequences were 98% identical to B. stercoris, and gull, dog, and human sequences were 99% identical to B. vulgatus. Since sequence artifacts may be introduced during PCR due to Taq polymerase errors, these small differences may not be significant.
Two pig bacterial sequences also clustered with the genus Bacteroides but had only 93% sequence identity with the closest cultivated species.
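The identity values quoted above are simple ratios of matching positions to compared positions in an alignment. The sketch below shows one way to compute them, assuming pre-aligned sequences of equal length and skipping gap columns; it is not the calculation performed in ARB, and the synthetic sequences in the usage example only reproduce the 694-of-695 case arithmetically.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Synthetic stand-ins: 695 positions with a single mismatch.
    reference = "A" * 695
    clone = "A" * 694 + "G"
    print(round(percent_identity(reference, clone), 2))
```

A single mismatch over 695 positions gives roughly 99.86% identity, which is why the gull sequence above is reported as more than 99% identical to B. thetaiotaomicron.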
Ruminant sequences. We recovered no ruminant bacterial sequences in the branch that included the cultivated Bacteroides species. Two phylogenetic studies of the rumen also did not recover sequences from the genus Bacteroides (33, 44); however, isolates very closely related to B. thetaiotaomicron and B. uniformis have been recovered from sheep rumen (GenBank accession numbers AF139524 and AF139525) (personal communication, K. Gregg, Rumen Biotech, Murdoch University, Murdoch, Australia). Interestingly, one of the sheep rumen isolates was 100% identical to a gull sequence in our library.
Most ruminant (cow and elk) bacterial sequences clustered together, separate from the sequences from other hosts, and they were distantly related to cultivated relatives (87 to 91%). The largest ruminant cluster formed a sister group to the branch containing the genus Bacteroides (Fig. 2). A second small cluster, containing only sequences from elk, formed a sister group to the branch containing the genus Prevotella (Fig. 3). The placement of a third small cluster containing elk and cow sequences could not be resolved (Fig. 3).
Fig. 3. Phylogenetic relationships of partial 16S rRNA gene sequences from fecal members of the Bacteroidales with cultivated members of the genus Prevotella. See the legend to Fig. 2 for explanation.
Prevotella-related sequences. A branch containing sequences from the genus Prevotella was supported in 95% of the bootstrap resamplings (Fig. 3). None of the cloned sequences was closely related to any cultivated Prevotella species. A tight gene cluster of human, cat, and dog bacterial sequences exhibited 97 to 99% intraclade identity and an average 92% identity with the oral bacterium Prevotella oulorum. The range of pig sequence identities with the closest cultivated species was 88 to 93%. All but one of the pig bacterial sequences were most closely related to oral Prevotella species.
Horse bacterial sequences produced two unique clusters, one within the cluster containing Prevotella species and the second related to it.
Design of new host-specific markers. Host-specific sequences were identified from the 16S rRNA gene sequences for pig and horse fecal Bacteroidales bacteria. They were used to design PCR primers to identify pig (PF163F, GCGGATTAATACCGTATGA) and horse (HoF597F, CCAGCCGTAAAATAGTCGG) sources of fecal pollution in water. Both primers were paired with the Bacteroidales-specific reverse primer Bac708R (4). The primers were highly specific using fecal DNA pools from target and nontarget host species (Fig. 4). The new primers also amplified fecal DNAs from pig and horse sources not used to construct the clone libraries and from geographically distant sources. The host-specific markers were present in all 19 individual pig samples and 9 of 10 individual horse fecal samples tested (data not shown).
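The logic of pairing a host-specific forward primer with the shared reverse primer can be sketched as an in silico check. The primer sequences below are those given in the text; the template is a hypothetical toy construct, and exact string matching is a simplifying assumption, since real PCR tolerates some mismatches and depends on the reaction conditions. This is not the ARB Probe Match procedure used in the study.

```python
PF163F = "GCGGATTAATACCGTATGA"   # pig-specific forward primer (from the text)
HOF597F = "CCAGCCGTAAAATAGTCGG"  # horse-specific forward primer (from the text)
BAC708R = "CAATCGGAGTTCTTCGTG"   # shared reverse primer (from the text)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward, reverse):
    """Predicted product length for an exact-match primer pair.

    Returns None if either primer site is absent from the template
    (given 5' -> 3' on the forward strand).
    """
    t = template.upper()
    start = t.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = t.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical toy template carrying the pig marker site and the
# Bac708R binding site, separated by a short arbitrary spacer.
toy = "TT" + PF163F + "ACGTACGTAC" + reverse_complement(BAC708R) + "GG"

if __name__ == "__main__":
    print(amplicon_length(toy, PF163F, BAC708R))
    print(amplicon_length(toy, HOF597F, BAC708R))
```

On this toy template the pig primer pair yields a product, while the horse primer finds no binding site and returns None, which is the in silico analogue of the target and nontarget amplification results described above.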
Fig. 4. Host DNA pools amplified with host-specific Bacteroidales PCR primers. Pools consisted of 4 to 30 individual host fecal DNAs, with each pool normalized to 3 ng/μl total DNA (Picogreen assay). (A) Primer PF163F distinguished pig fecal DNA from other host sources. (B) Primer HoF597F distinguished horse fecal DNA from other host sources.
The theoretical limit of detection for both HoF597F and PF163F was 100 copies, and detection was not reduced by the presence of creek water or seawater DNA in PCR mixtures (Fig. 5).
Fig. 5. Theoretical sensitivity (detectable copy number) of host-specific primer sets using serial dilutions of template inserted into plasmids. One hundred template copies were detected with primer HoF597F in pure water (A) and in creek water (B). Primer PF163F detected 100 copies in pure water (C) and in seawater (D). Three nanograms of total DNA from seawater or creek water extracts was added to each reaction mixture.
DISCUSSION
A notable feature of this and other clone library analyses from G.I. environments is the number of sequences that do not cluster with any known species. Sequences from nonruminant feces fell within the genera Bacteroides and Prevotella, while most sequences from ruminant hosts did not cluster with any known species. Both the range of sequence identity with the closest known species (87 to 91%) and the interclade identity range (81 to 94%) suggest that taxonomic diversity exists among ruminant sequences. Representatives of these groups must be isolated before phenotypic characterization or classification can be determined.
Within the genus Bacteroides, cloned sequences were closely related to known species, reflecting the greater number of cultivated representatives in this genus. This in turn reflects a greater emphasis placed on the study of the human fecal flora relative to other animal hosts. Many cultivated Bacteroides species, particularly B. vulgatus, B. uniformis, B. thetaiotaomicron, and B. stercoris, may not provide useful targets for fecal source discrimination assays because of the many similar or identical sequences from nonhuman hosts.
Phylogenetic resolution may be limited by the use of partial 16S rRNA gene sequences in a comparative analysis (23). However, since the primers used here were designed for aquatic fecal source identification, they were constrained by the need to exclude amplification of closely related aquatic bacteria such as Cytophaga spp. (3). Bootstrap values compared favorably with those in the earlier study by Paster and colleagues (31), in which full-length sequences were used.
Twenty-eight percent of the cloned sequences were identified as potential chimeras, fewer than the 32% identified by Wang and Wang (43) in a study analyzing chimera frequency in 16S rRNA gene sequences from a mixed genomic population. They used PCR cycling parameters similar to those used in this study (30 cycles). Robison-Cox and colleagues (35) proposed comparing trees inferred from half sequences to identify chimeric sequences. This method and the CHECK_CHIMERA program reliably identify chimeras formed between distantly related sequences (23, 35); we eliminated such chimeras from our analyses. Any chimeras remaining in our analyses were formed between sequences too closely related to perturb placement of the chimera halves in trees; these would not affect our conclusions on host distributions. In addition, the probability of formation of identical chimeras in different gene libraries is low (10). Thus, recovery of clusters containing closely related or identical sequences from different libraries and hosts provides evidence that these clusters are not of chimeric origin.
Several evolutionary hypotheses could explain the existence of both cosmopolitan and endemic host distributions of fecal bacteria. Endemic distributions could occur among host species with limited physical contact and, thus, no horizontal transmission of fecal bacteria. Several studies have provided evidence for endemism when geographic barriers inhibit dispersal of bacterial populations (18, 40). If fecal bacteria within a host species had diversified in the time since host species diverged, each host species might contain unique types. In this case, we would expect to see types endemic to distantly related hosts such as humans and gulls. Instead, we see that humans and gulls share closely related Bacteroidales sequences.
A striking example of endemic distribution of fecal Bacteroidales is provided by ruminants, which share unusual clades and do not have the common groups shared by other host species. The unique ruminant digestive system may provide a different way for these organisms to make a living than those inhabiting nonruminant hosts. Populations of fecal bacteria endemic to host species may be the result of evolution in different types of digestive systems (29). Alternatively, the evolution of different G.I. systems may have been influenced by the microbial populations themselves (34, 37).
Cosmopolitan distributions could occur with frequent horizontal transmission of fecal bacteria among hosts with similar digestive systems. Sequences from multiple hosts were nearly identical to B. vulgatus, B. thetaiotaomicron, B. uniformis, B. fragilis, and B. stercoris, suggesting that these Bacteroides species are cosmopolitan with respect to host species. The “bushy tips” at branch termini within the genus Bacteroides also suggest a cosmopolitan population structure, in which adaptations to new host environments represent species “ecotypes” (1, 6, 15). Humans share proximity with domestic pets, making frequent occurrence of horizontal transmission of fecal bacteria likely. Gulls inhabit beaches, picnic areas, and landfills, where contact with human and domestic pet excrement may occur.
Prevotella-related clones exhibited more varied host distribution patterns than Bacteroides-related clones. A cluster of human and domestic pet Prevotella sequences was not closely related to any known species but again showed a cosmopolitan host distribution. Prevotella-related pig sequences clustered together, separate from other host species, as did the horse sequences, suggesting a more endemic distribution pattern. Pig-only and horse-only clades were used to identify unique sequences for these hosts, and PCR primers designed from these clades did not amplify DNA from nontarget hosts. However, this approach was not always successful, most likely because coverage of clone libraries was low. For example, a primer designed from a clade that appeared to be elk-specific amplified cattle fecal sequences (data not shown), probably because cattle clone library representation was lacking in that group. The overlap in domestic pet and human sequences also precluded the development of a unique dog or cat marker. The representation issue when using clone library sequences has led us to look for supplemental methods, including subtractive hybridization, to identify host-specific fecal markers in new hosts (8a).
The 16S rRNA gene sequence analysis of fecal Bacteroidales revealed both endemic and cosmopolitan distributions among the eight hosts used in the study. The evolution of host-enteric bacterial interactions is complex and will certainly not be understood in the context of a single gene. Multiple interactions and sources of gene flow result in evolutionary trajectories that may not be predictable. However, the evolutionary consequences of these host distribution patterns have potential for practical application in the field of fecal source identification.
ACKNOWLEDGMENTS
We are grateful for assistance from Stephanie Connon, Mike Simonich, and Mike Rappé.
This work was partially supported by grant no. R82-7639 from the U.S. Environmental Protection Agency, grant 00-S1130-9818 from the U.S. Department of Agriculture, and grant NA76RG0476 (project no. R/ECO-04) from the National Oceanic and Atmospheric Administration to the Oregon Sea Grant College Program.
FOOTNOTES
- Received 7 July 2004.
- Accepted 22 December 2004.
- Copyright © 2005 American Society for Microbiology