Applied and Environmental Microbiology, March 2009, p. 1427-1436, Vol. 75, No. 5
0099-2240/09/$08.00+0 doi:10.1128/AEM.01889-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Genetics and Microbiology Research Group, Department of Agrarian Production, Public University of Navarre, 31006 Pamplona, Spain,¹ and U.S. Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California 94598²
Received 14 August 2008/ Accepted 21 December 2008
The whole genome sequence of P. ostreatus is currently being assembled at the Joint Genome Institute (California). P. ostreatus is the first edible and the second lignin-degrading basidiomycete to be sequenced. The sequences of other basidiomycetes, such as Phanerochaete chrysosporium (48), Cryptococcus neoformans (44), Ustilago maydis (38), and Laccaria bicolor (47) have been published, and others (Postia placenta, Heterobasidion annosum, Agaricus bisporus, Serpula lacrymans, etc.) are in progress.
Telomeres are the protective DNA-protein complexes found at chromosome termini (6, 13, 76). In most eukaryotes, telomeric DNA consists of tandem arrays of 5- to 8-bp direct repeats where specific telomere-capping proteins bind to ensure chromosomal-end integrity. Telomeres are essential for genome stability, and their shortening (attrition) can lead to chromosome instability, replicative senescence, and apoptosis (43), while their loss causes activation of DNA damage responses (45, 66), cell cycle arrest (28), and chromosome fusions, such as nonreciprocal translocations (7, 32). Moreover, high recombination rates are frequent near telomeres (50).
Telomeres and subtelomeric regions are usually gene reservoirs that permit organisms to quickly adapt to new ecological niches (60). Two types of genes participate in this adaptive process: species-specific (18) and contingency genes (5). Species-specific genes are shorter than the core genes of the genomes in which they are present, contain fewer exons, exhibit a subtelomeric bias, and arise by duplication, diversification, and differential gene loss. The avirulence genes of some phytopathogenic fungi are contingency genes that appear near telomeres (15). Furthermore, it has recently been found in Fusarium species that pathogenicity-related genes co-occur with telomeric regions. In this case, chromosomal rearrangements (fusions) have maintained these structures. The Fusarium graminearum genome revealed a link between localized polymorphism and pathogen specialization (11). Genes frequently reported in the subtelomeric regions of Magnaporthe oryzae and Aspergillus sp. include transposons, telomere-linked RecQ helicases, clusters of secondary-metabolite genes, cytochrome oxidases, hydrolases, molecular transporters, and genes encoding secreted proteins, among others (18, 56).
RecQ helicases are highly conserved in evolution and are required for genome stability. Genes coding for these enzymes have been described in prokaryotes and eukaryotes (4, 9, 39, 71). There are a minimum of five RecQ helicase-like genes in humans, and three of them (BLM, WRN, and RECQL4) are mutated in the Bloom, Werner, and Rothmund-Thomson recessive autosomal syndromes, which exhibit genomic instability leading ultimately to cancer (9). Fungal RecQ helicase-like genes have been previously found associated with chromosome ends (23, 35, 56, 61).
In genome-sequencing projects, telomeres and subtelomeric regions are rarely present or assembled because of problems derived from their repetitive nature; therefore, it is necessary to perform direct cloning of the subtelomeric regions. The rice pathogen M. oryzae (56) is one of the few fungi with telomeric and subtelomeric regions characterized. Telomere-associated markers provide an accurate assessment of linkage group (LG) completeness and a better estimate of genetic size and help in establishing the synteny of LGs, especially in those organisms for which genetic-linkage maps are not available (34). Moreover, these markers inform us about the genome organization and the occurrence of species-specific and contingency genes (5, 18), as well as about the chromosome rearrangements that could have occurred in the evolution of the genome.
In this work, we mapped and studied the telomeric and subtelomeric regions of most of the P. ostreatus chromosomes, and we describe the main genes present in them. The study was carried out with a combination of genetic, molecular, and bioinformatics tools. The results obtained show the high complexity of these regions and confirm the presence of RecQ helicase-like, heterokaryotic incompatibility (het), and short-chain dehydrogenase genes that have also been found in other fungi. In addition, a laccase gene cluster is described for the first time in the subtelomeric region of chromosome 6. This study is the first step toward analyzing the effects that the subtelomeric positions of some fungal-species-specific genes (as the laccases may be in white rot lignocellulolytic fungi) could have on the adaptation to new growth substrates and on the generation of large families of apparently redundant elements.
Vectors, probes, and primers.
pBluescript SK(+) (Fermentas Inc., Burlington, Canada) and the pGEM-T Easy Vector (Promega, Southampton, United Kingdom) were used as ligation vectors. The plasmid pTEL1 was constructed by cloning 132 repeats of the human telomeric hexanucleotide (5'-TTAGGG-3') into pBluescript SK(+). The primers used to amplify internal adjacent telomeric sequences are described in Table S1 in the supplemental material.
Bal31 digestion.
Bal31 exonuclease degrades the 3' and 5' ends of duplex DNA. Genomic DNA (18 µg) of strain N001 was digested with 8 units of Bal31 at 30°C in the buffer supplied by the manufacturer (GE Healthcare). Aliquots containing 3 µg of DNA were removed after 0, 5, 10, 20, 30, and 40 min. Nuclease digestion was terminated by the addition of 1/10 volume of 0.5 M EDTA. The DNA was then recovered by phenol-chloroform (1:1) extraction, followed by ethanol precipitation. Each aliquot was subsequently digested with MboI prior to electrophoresis on 0.8% agarose gels and Southern blotted onto nylon membranes (Biobond Plus; Sigma). Restriction enzyme-digested products were hybridized with a digoxigenin-labeled (Roche Diagnostic GmbH, Mannheim, Germany) human telomeric probe (TEL1).
RFLP analysis and linkage mapping.
For restriction fragment length polymorphism (RFLP) analyses, genomic DNA was purified from the dikaryotic N001 strain, the protoclones PC9 and PC15, and the 80 monokaryons (haploid progeny) as described elsewhere (41). DNA samples were digested with different restriction enzymes (BglII, EcoRI, HindIII, PstI, SalI, XbaI, and XhoI) according to the suppliers' specifications. The digested products were separated on 0.8% agarose gels, Southern blotted, and probed with digoxigenin-labeled probes. When a given enzyme-probe combination detected polymorphism between protoclones PC9 and PC15, it was used for mapping the corresponding RFLP marker in the progeny of 80 monokaryons. The linkage analysis was performed using the MAPRF program as previously described (41, 58, 59).
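The segregation and linkage calculations behind this kind of mapping can be illustrated with a minimal Python sketch. It is not the MAPRF program: the function names, the 'A'/'B' coding of alleles by protoclone of origin, and the choice of the Kosambi mapping function are assumptions made only for illustration.

```python
import math

def chi_square_1to1(n_a: int, n_b: int) -> float:
    """Chi-square statistic (1 degree of freedom) for deviation from 1:1 segregation."""
    expected = (n_a + n_b) / 2
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected

def recombination_fraction(marker_x: str, marker_y: str) -> float:
    """Fraction of scored monokaryons whose alleles at two markers differ in origin.

    Each marker is a string with one character per monokaryon:
    'A' or 'B' for the protoclone of origin (hypothetical coding), '-' for missing data.
    """
    scored = [(x, y) for x, y in zip(marker_x, marker_y) if x != "-" and y != "-"]
    recombinants = sum(1 for x, y in scored if x != y)
    return recombinants / len(scored)

def kosambi_cm(r: float) -> float:
    """Kosambi map distance in centimorgans for a recombination fraction r < 0.5."""
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))

# A marker scored 50:30 in 80 monokaryons exceeds the 3.841 threshold (P = 0.05, 1 d.f.).
CRITICAL_P05 = 3.841
distorted = chi_square_1to1(50, 30) > CRITICAL_P05
```

With these definitions, two markers that differ in origin in 8 of 80 progeny give a recombination fraction of 0.1, which the Kosambi function converts to roughly 10 cM.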
Isolation of P. ostreatus telomeric DNA by SSP-PCR.
The single-specific-primer PCR (SSP-PCR) procedure described by Shyamala and Ames (64) and modified by Sohanpal et al. (67) was used to isolate telomeric and subtelomeric fragments. The detailed experimental data are shown in Fig. S1 in the supplemental material. Thirty micrograms of genomic DNA of the dikaryotic strain was used as starting material. PCR amplification products ranging from 600 to 1,600 bp were isolated, cloned into the pGEM-T Easy Vector (Promega, Southampton, United Kingdom), and sequenced to evaluate the telomeric and subtelomeric sequences.
Sequence analysis.
DNA sequences were analyzed by pairwise comparison using the Blast2seq tool (69) at the National Center for Biotechnology Information (NCBI) site. The hidden Markov model-based program FGENESH was used for Web-based gene prediction (http://www.softberry.com/berry.phtml). The cloned sequences were used as queries in the nucleotide and protein sequence databases at the NCBI using different tools of the BLAST suite (2). Protein motifs were identified using the ExPASy database tools (24). To identify the repetition basic unit present in the telomeric sequences, we used the Tandem Repeats Database (25), a public repository of information on tandem repeats in genomic DNA, and the Tandem Repeats Finder program publicly available at http://tandem.bu.edu/trf/trf.html.
Raw data analysis.
P. ostreatus whole-genome sequence data produced by the Joint Genome Institute (http://www.jgi.doe.gov) and available on 27 May 2007 were used to search for telomeric sequences. This preliminary sequence consisted of a 4X (redundancy number) coverage draft sequence assembly containing 6,202 contigs, which were analyzed using the Tandem Repeats Finder program described above.
Screening of the genomic library of the N001 strain.
A lambda EMBL4 genomic-DNA library derived from the dikaryotic strain N001 (55) was used to screen for telomeric and subtelomeric sequences. PCR-amplified sequences corresponding to subtelomeric regions identified by cloning or by bioinformatics analysis were used as probes.
Nucleotide sequence accession numbers.
The DNA sequence data have been deposited in the EMBL database under accession numbers FM202435 (clone 21, containing the putative RecQ helicase gene) and FM202436 (clone 22, containing the putative het gene).
FIG. 1. Southern hybridization of Bal31 exonuclease-treated P. ostreatus N001 genomic DNA. The DNA was digested with Bal31 for the indicated times, and MboI was subsequently used to digest the Bal31-treated DNA to completion. The samples were then electrophoresed, blotted, and hybridized with probes. (A) TEL1 (human telomeric probe). (B) RFLP marker linked to the mating-type locus (matA).
Segregation analysis of TRFs in the progeny of N001.
In order to genetically map the telomeres of P. ostreatus, the RFLP segregation of the DNA fragments revealed with the TEL1 probe was studied. The number of haploid chromosomes of P. ostreatus N001 is 11 (40). This indicates that 22 TRFs would be expected in each monokaryon of the progeny. The fingerprint and hybridization analyses, however, showed more than 22 discrete telomeric bands per enzyme and haploid genome, indicating the presence of de novo bands (Fig. 2). Constant bands in the whole progeny, as well as others that exhibited a hybridization signal stronger than that of N001, could also be observed. The TEL1 RFLP pattern was enzyme specific, with HindIII yielding the largest fragments (up to 12 kb) (data not shown) and SalI the shortest (1.2 kb) (Fig. 2). For the mapping analysis, most TRF bands larger than 7 kb were discarded because of the presence of comigrating bands, their fuzzy nature, and the large length variation observed in the progeny in relation to N001. Restriction bands smaller than 1,250 bp were also discarded because of their poor reproducibility.
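The band count and size criteria in this paragraph can be written out as a short Python sketch. The function name and the example band sizes are hypothetical, and the hard 7-kb cutoff is a simplification, since the text states that most, not all, bands above 7 kb were discarded.

```python
HAPLOID_CHROMOSOMES = 11                      # P. ostreatus N001 (40)
EXPECTED_TRFS = 2 * HAPLOID_CHROMOSOMES       # two telomeres per chromosome

def mappable_bands(band_sizes_bp, min_bp=1250, max_bp=7000):
    """Keep only TRF bands inside the size window used for mapping.

    Assumption: bands above max_bp are all dropped, which is stricter
    than the 'most bands larger than 7 kb' criterion in the text.
    """
    return sorted(size for size in band_sizes_bp if min_bp <= size <= max_bp)

# Hypothetical band sizes (bp) read from one enzyme digest.
example = mappable_bands([1200, 1500, 2100, 3900, 8000, 12000])
```

In this example only the 1,500-, 2,100-, and 3,900-bp bands would be carried forward to segregation analysis.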
FIG. 2. Segregation analysis of TRFs in P. ostreatus. The hybridization patterns of TEL1 to dikaryon N001 and to 14 assorted monokaryons of the offspring are shown. The restriction was done using SalI. Some of the de novo telomere fragments are indicated with filled circles, fragments that showed a TEL1 hybridization signal stronger than that of N001 are highlighted with open circles, and one constant band is indicated with a solid black arrow.
χ² fit and the shortest linkage distance to the outermost molecular marker mapped to a given LG end were assigned and incorporated into the map (Table 1), whereas the rest of the TRFs were assigned but not incorporated into the graphic map output (see Table S2 in the supplemental material).
TABLE 1. TRFs incorporated into P. ostreatus LGs
χ² value and the shortest linkage distance to the outermost molecular marker in the chromosome (poxC; 12 centimorgans [cM]). Analysis of the physical distance of each of these five TRFs from poxC sorted them into two clusters, one formed by two TRFs (XhoI2100 and BglII2050) mapping at 12 and 17 cM, respectively, from poxC and the other formed by the three other TRFs (EcoRI3900, PstI3100, and XbaI1500), mapping at 31 to 36 cM from poxC (Table 1; see Table S2 in the supplemental material). Taking into account the physical-distance-to-linkage-distance ratio estimated for chromosome 6 (approximately 25 kb/cM), the linkage size differences between the two protoclones found at the LG6 upper end account for a 12% difference in the physical sizes of the two homologous chromosomes. This value is quite similar to that observed by pulsed-field gel electrophoresis separation of the two chromosomes (40). A similar situation was observed for LG7. In addition, a cluster of ligninolytic enzymes mapped to the upper end of LG6 (55).
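The conversion from linkage distance to approximate physical distance used in this paragraph can be sketched as follows. The function name is hypothetical, the 25 kb/cM ratio is the approximate chromosome 6 estimate quoted above, and the sketch does not reproduce the 12% figure, which also depends on the chromosome size reported elsewhere (40).

```python
KB_PER_CM = 25.0  # approximate physical-to-linkage ratio quoted for chromosome 6

def linkage_to_physical_kb(distance_cm: float, kb_per_cm: float = KB_PER_CM) -> float:
    """Convert a linkage distance (cM) into an approximate physical distance (kb)."""
    return distance_cm * kb_per_cm

# Nearest TRF of each cluster, in cM from poxC, as given in the text.
near_cluster_kb = linkage_to_physical_kb(12)
far_cluster_kb = linkage_to_physical_kb(31)
gap_kb = far_cluster_kb - near_cluster_kb
```

Under this ratio, the 19-cM gap between the nearest TRFs of the two clusters would correspond to roughly 475 kb.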
FIG. 3. P. ostreatus genetic map containing the TRFs (blue boxes) and clones (green boxes) assigned to and incorporated into different LGs. Marker names are listed on the right. The dashes across the linkage lines indicate the locations of the markers. Map units (cM) are indicated on the left of dashes for each LG. Markers that deviated from the expected 1:1 segregation (P < 0.05) appear with an asterisk to the right of the marker name.
Bioinformatic isolation of telomeric and telomere-adjacent sequences in P. ostreatus.
A preliminary 4X assembly of the whole genome sequence of P. ostreatus was bioinformatically screened for telomeric and telomere-adjacent sequences using the Tandem Repeats Finder program (25). The analysis revealed 15 genomic regions (clones) that contained more than 20 repetitions of the basic telomere unit and that appeared in different sequence scaffolds (see Fig. S4 in the supplemental material).
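The screening criterion (more than 20 repetitions of the basic telomere unit) can be illustrated with a minimal Python sketch. It is not the Tandem Repeats Finder algorithm, which tolerates mismatches and indels; this version counts only exact, uninterrupted copies of TTAGGG or its reverse complement, and the function names and contig dictionary are assumptions for illustration.

```python
import re

UNIT = "TTAGGG"      # basic telomeric unit
RC_UNIT = "CCCTAA"   # reverse complement, as seen at the opposite chromosome end

def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of exact back-to-back copies of `unit` found in `seq`."""
    pattern = re.compile(f"(?:{unit})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)

def telomeric_contigs(contigs: dict, min_repeats: int = 21) -> dict:
    """Map contig name -> repeat count for contigs with more than 20 tandem units."""
    hits = {}
    for name, seq in contigs.items():
        n = max(longest_tandem_run(seq, UNIT), longest_tandem_run(seq, RC_UNIT))
        if n >= min_repeats:
            hits[name] = n
    return hits

# Hypothetical toy contigs; a real run would read the assembly from a FASTA file.
toy = {
    "c1": "ACGT" + "TTAGGG" * 25,
    "c2": "CCCTAA" * 21 + "GATTACA",
    "c3": "TTAGGG" * 20 + "ACGT",
    "c4": "ACGTACGT",
}
hits = telomeric_contigs(toy)
```

Because matching is exact, a single sequencing error inside a telomeric array would split it into two shorter runs, so this sketch can undercount relative to a mismatch-tolerant tool.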
Comparative analyses of the 28 isolated clones.
A total of 28 sequences (13 derived from the SSP-PCR approach and 15 derived from the bioinformatics study) were queried against the NCBI and ExPASy databases. In summary, the numbers of repetitions of the basic unit ranged from 7 (clones 10 and 11) to 54 (clone 8), and the lengths of the telomere-adjacent sequences varied from 105 bp (clones 5 and 8) to 6,087 bp (clone 21). Three clones (numbers 2, 3, and 7) that contained a sequence of 34 residues overlapping the U. maydis clone UT5 telomere-associated RecQ helicase-like gene (E value, 10⁻³, corresponding to 47% sequence identity) were found to be multicopy.
Ten telomere clones (clones 15, 16, 17, 18, 20, 22, 23, 25, 26, and 28) were unique. They harbored between 20 and 54 repetitions of the telomeric unit and telomere-adjacent sequences ranging from 474 to 1,279 bp. Each telomere-adjacent sequence was amplified by PCR using primers specific for that sequence (see Table S1 in the supplemental material), and the occurrence of polymorphisms that permitted their linkage mapping was investigated. No amplification polymorphisms were detected in clones 17 (see Fig. S5 in the supplemental material) and 26, whereas different types of polymorphisms were detected in the other clones: differences in the intensity (clones 16 and 20) or size (clones 23, 25, and 28) of the amplified fragment, restriction polymorphism within the amplified product (clone 18), and presence versus absence of an amplified band (clone 22). The last polymorphism corresponded to a hemizygotic locus, as confirmed by RFLP analysis (see Fig. S6 in the supplemental material). Amplification monomorphic clones 17 and 26 were mapped by RFLP using the telomere-adjacent region as a probe (see Fig. S7 in the supplemental material). No genetic polymorphism was detected for clone 15, which in the end could not be mapped. The remaining 18 clones could be sorted into eight groups on the basis of sequence identity, as shown in Fig. 4. Four of these groups were based on SSP-PCR and one on bioinformatics evidence, and two groups were supported by both types of evidence (Table 2).
FIG. 4. Schematic representation of alignment between clones with partially or totally identical nucleotide sequences. Clones identified by SSP-PCR and bioinformatics are shown in green and blue, respectively. The sequence differences are indicated by white bars. The number of each telomere clone is indicated at the right end of the bar, the size of the subtelomeric region is indicated at the bar's left end, and the number of telomeric basic units is indicated in the circle.
TABLE 2. Locations of clones obtained by using SSP-PCR and clones obtained from the Joint Genome Institute raw data
Two putative genes were detected in the phage identified using clone 21 as a probe: a RecQ helicase and a 3'-5' exonuclease (see Table S3 in the supplemental material). The putative RecQ helicase was further characterized as a member of the DEAD/DEXH helicase family (71) because it contained the conserved helicase motifs proposed by Gorbalenya et al. (29) (see Fig. S8 in the supplemental material). We have named this putative protein PoTAH (for P. ostreatus telomere-associated helicase). The second putative gene identified in this phage corresponded to a gene encoding a Werner syndrome ATP-dependent helicase homolog. Both sequences mapped to the LG8 upper end.
Four genes were identified in the phage positive to clone 22 (see Table S3 in the supplemental material). The one with the highest similarity corresponded to a protein similar to a predicted Physcomitrella patens protein (E value, 10⁻¹⁰⁴, and 33% sequence identity in a 735-residue overlap) that contains 150 residues of the conserved Pfam06985 domain present in ascomycete HET proteins (heterokaryon incompatibility protein). In addition, a gene coding for a putative short-chain dehydrogenase family member (PTHR19410) and two genes coding for proteins similar to others predicted in Coprinopsis cinerea were found. This phage mapped to the lower end of LG5.
The identification of sequences with high and medium numbers of repetitions, such as telomeric and subtelomeric regions, is difficult in whole-genome-sequencing projects, since they are frequently missing in final sequence assemblies (57, 72). Moreover, different authors have described the difficulties in cloning telomeric and subtelomeric regions (21, 30).
In this paper, we describe the use of a combination of molecular and bioinformatics approaches to clone, map, and characterize 19 out of the 22 telomeres of P. ostreatus N001. We were unable, however, to identify any telomere-related sequence mapping to the lower ends of chromosomes 1, 2, and 11 (Fig. 3). This could be due to a deletion of the chromosomal terminal sequence, as has been described in M. oryzae (56). The telomere mapping to the lower end of LG1 was assigned but not incorporated because it distorted the segregation observed and produced a long linkage distance between the telomere and the nearest mapped marker. This result could be explained as a consequence of a gene conversion event at the telomeres, as has been observed to occur in Plasmodium falciparum (19).
The molecular approach used was based on PCR (SSP-PCR) (21) as described by Shyamala and Ames (64). We have estimated that the average length of Pleurotus telomeres ranges, at a minimum, from 25 to 150 repetitions of the basic telomeric unit TTAGGG. Because the strains used in this study have been maintained by successive subcultures and as the segregation of the telomere lengths in the progeny of the dikaryotic strain N001 has not been analyzed, we cannot rule out variations in telomere length, such as those described in F1 and F2 progeny of two Arabidopsis lines (63) or the changes detected in second- and third-generation M. oryzae FaMS 96-1 cultures (16). The telomere fingerprints obtained by restriction analysis of genomic DNA from dikaryon N001, as well as from several of its individual progeny, however, have shown the occurrence of several de novo telomere bands, as had been previously described in other systems (16, 68). The presence of these new telomeric bands can be explained by genome rearrangements at the telomeres, such as those described in a cross between two rice-pathogenic isolates (17), or by unequal crossovers between homologous telomeric repeats in chromosomes with large length polymorphisms (77).
The bioinformatics approach was based on the use of the open-access Tandem Repeats Finder program to look for repetitive telomeric sequences in more than 6,200 contigs of the 4X P. ostreatus preliminary genome assembly. The joint analysis of data containing sequences derived from both approaches showed that many SSP-PCR clones were included in clones identified bioinformatically. As an example, bioinformatically identified clone 19 is almost four times larger than the SSP-PCR-identified clones 10 and 11, while clone 21 is approximately 10 times the length of clone 14 (Fig. 4). We also identified four subtelomeric regions (chromosomes 4 and 10, and 8 and 9) that had high nucleotide similarity with slight differences due to transitions and transversions between chromosomes 4 and 10. We suggest that these common regions would facilitate the alignment of heterologous chromosomes, allowing gene conversions such as those described for the var genes of P. falciparum (19).
Telomeric repeats are not confined to chromosome termini but can also be found in interstitial and centromeric regions (46, 51, 53). The origin of the interstitial telomeric sequences is unknown (3, 22), although they could represent relics of chromosome rearrangements that occurred during karyotype evolution (51, 73). We have found an interstitial telomeric sequence (clone 18) mapping close to a putative recombination hot spot in LG3. At present, we cannot explain the mechanism for this sequence to have moved to this map position.
The analysis of the TRF-linked sequences revealed the presence, among others, of genes coding for laccases (26, 54). Several authors have reported that the genes involved in lignin degradation appeared to form clusters resulting from genome duplications (42). A cluster of laccase genes was mapped to the LG6 subtelomeric region at 250 kb, TRF XhoI2100 (Fig. 3). Considering that subtelomeric regions are genome areas where important rearrangements and gene regulation mechanisms occur, we suggest that the subtelomeric location of the laccase gene cluster in Pleurotus could have an evolutionary significance permitting the fungus to adapt rapidly to new lignocellulosic substrates, acting, then, as species-specific genes (18). A gene coding for the atypical peroxidase DYP (14) was also found approximately 200 kb from the lower subtelomeric region of LG9.
RecQ helicase-like (TLH) genes have been found in the telomere-adjacent DNA of filamentous fungi, such as U. maydis (61), M. oryzae (23), and Metarhizium anisopliae (35). Helicases are essential motor enzymes involved in processes requiring the separation of nucleic acid strands. They are defined by their directionality and classified into families (SF1, SF2, and SF3) according to the presence of certain conserved motifs. The SF1 family includes single-stranded DNA translocases, while the SF2 members are double-stranded DNA translocases (70). It has been proposed that this type of protein could participate in the protection of telomeres from accidental shortening (23) via recombination-mediated mechanisms (31), although this is still controversial (56). In this paper, we describe a 3'-5' RecQ SF2 helicase mapping to the upper subtelomeric region of LG8 (PoTAH). CLUSTALW alignment (74) of PoTAH with other RecQ helicases showed that it contains all the domains described for these enzymes (see Fig. S8 in the supplemental material) in positions compatible with those of other examples (35). Because helicase genes belong to a large gene family, we have also found some other helicase genes mapping to the telomere-adjacent regions of LG3 and LG5 and to an internal region of LG7 (data not shown).
Until now, the Werner syndrome helicase (WRN) has been the only reported 3'-5' RecQ helicase containing a 3'-5' exonuclease domain in the N-terminal region. This domain is separated from the helicase domain by about 200 residues that include some tandem-repeat sequences (20). In the vicinity of PoTAH, there is a similar 3'-5' exonuclease domain 4 kb from the helicase domain. No homology was found between the linker domains in PoTAH and WRN; consequently, we did not explore whether they both belong to a unique gene interrupted by a mobile sequence or whether they are two different linked genes.
A second sequence putatively coding for a short-chain dehydrogenase similar to that described in Aspergillus terreus has been found mapping to the lower end of LG5 (see Table S3 in the supplemental material). Short-chain dehydrogenases participate in different catabolic and anabolic pathways involving redox reactions of hydroxy or keto functions (36) and display a wide substrate spectrum, including alcohols, sugars, steroids, aromatic compounds, and xenobiotics (37). The presence of short-chain dehydrogenases in secondary-metabolism gene clusters mapping near the telomeres has been described in M. oryzae and Aspergillus fumigatus (18, 56).
A sequence putatively coding for a HET-like protein was found at the lower end of LG5 (see Table S3 in the supplemental material). HET proteins are coded by species-specific genes in ascomycetes, where they are involved in heterokaryon incompatibility. In Aspergillus species, the number of het genes varies between 7 and 38, and it has been suggested that different mechanisms could have given rise to them (18). The het-like gene found in P. ostreatus N001 is hemizygous and is present in only half of the segregating population. The gene coding for the short-chain dehydrogenase sequence and the het-like gene appear to form part of a bipartite conserved domain architecture with unknown function previously described in Pezizomycotina fungi, such as Neurospora crassa and A. terreus, but absent in basidiomycetes (see Fig. S10 in the supplemental material).
Telomeres are more than simple structural elements. The results described in this work suggest that high selective pressure would maintain the organization of telomeric and subtelomeric regions, as well as the reservoir of genes they contain. It is tempting to suggest that mechanisms such as position effect and major rearrangements taking place at the telomeres could play an important role in the adaptation of fungi to new environments. The analysis of other fungal genomes could shed some light on the mechanisms recruiting genes to the telomere-adjacent regions.
We thank Nerea Markina for her helpful technical assistance.
L.R. led and coordinated the project, G.P. did the experimental work and data analysis, and J.P. coordinated the 4X assembly of the genome. The manuscript was prepared by L.R., G.P., and A.G.P.
Published ahead of print on 29 December 2008.
Supplemental material for this article may be found at http://aem.asm.org/.