Substitution Rates in a New Silene latifolia Sex-Linked Gene, SlssX/Y
Molecular Biology and Evolution
School of Biosciences, The University of Birmingham, Edgbaston, Birmingham, UK
Correspondence: E-mail: d.filatov@bham.ac.uk
Abstract
Dioecious white campion Silene latifolia has sex chromosomal sex determination, with homogametic (XX) females and heterogametic (XY) males. This species has become popular in studies of sex chromosome evolution. However, the lack of genes isolated from the X and Y chromosomes of this species is a major obstacle for such studies. Here, I report the isolation of a new sex-linked gene, Slss, with strong homology to spermidine synthase genes of other species. The new gene has homologous intact copies on the X and Y chromosomes (SlssX and SlssY, respectively). Synonymous divergence between the SlssX and SlssY genes is 4.7%, and nonsynonymous divergence is 1.4%. Isolation of a homologous gene from nondioecious S. vulgaris provided a root to the gene tree and allowed the estimation of the silent and replacement substitution rates along the SlssX and SlssY lineages. Interestingly, the Y-linked gene has higher synonymous and nonsynonymous substitution rates. The elevated synonymous rate in the SlssY gene, compared with SlssX, confirms our previous suggestion that the S. latifolia Y chromosome has a higher mutation rate, compared with the X chromosome. When differences in silent substitution rate are taken into account, the Y-linked gene still demonstrates significantly faster accumulation of nonsynonymous substitutions, which is consistent with the theoretical prediction of relaxed purifying selection in Y-linked genes, leading to the accumulation of nonsynonymous substitutions and genetic degeneration of the Y-linked genes.
Key Words: Silene latifolia; sex chromosomes; substitution rates; spermidine synthase; segregation analysis
Introduction
Sex chromosomes are not as common in plants as in animals, but they have been found in several phylogenetically distant groups, such as Rumex, Cannabis, and Silene (Westergaard 1958). The genus Silene contains over 500 species. Most of these are gynodioecious or hermaphroditic, but there are two clusters of dioecy that, apparently, evolved independently from each other (Desfeux et al. 1996). One such cluster is represented by five very closely related dioecious species (Silene latifolia, S. diclinis, S. dioica, S. heuffelii, and S. marizii), all of which have male heterogametic sex determination (Westergaard 1958). Silent nucleotide divergence of these species from nondioecious Silene species is about 15% (Filatov and Charlesworth 2002), suggesting that the common ancestor of these species diverged from a nondioecious ancestor about 10 to 20 MYA. The recent divergence of the sex chromosomes in S. latifolia provides an opportunity to study early stages of sex chromosome evolution, and so S. latifolia became a popular model species for such studies (Guttman and Charlesworth 1998; Filatov et al. 2000, 2001; Atanassov et al. 2002; Filatov and Charlesworth 2002; Moore et al. 2003; Matsunaga et al. 2003).
Despite their independent origins, the properties of sex chromosomes in different groups of organisms are quite similar: recombination is restricted between the Y and the X chromosomes, and Y chromosomes are usually genetically degenerate (Bull 1983), containing few functional genes (Skaletsky et al. 2003). Degeneration is thought to occur because of the reduced efficacy of selection on the nonrecombining Y chromosome (reviewed in Charlesworth and Charlesworth [2000]). Deleterious mutations may be carried to fixation by linked advantageous mutations ("selective sweeps," [Rice 1987]) or by "Muller's ratchet" (stochastic loss of chromosomes with the fewest mutations [Gordo and Charlesworth 2000]), and selective elimination of deleterious mutations ("background selection") may accelerate the stochastic fixation of mildly detrimental mutations (Charlesworth 1994). These processes should lead to reduced effective population size and sequence diversity in actively degenerating Y chromosomes (Charlesworth and Charlesworth 2000), which was indeed reported for several animal and plant species (Zurovcova and Eanes 1999; Yi and Charlesworth 2000; Filatov et al. 2000, 2001).
Although there is fairly good evidence for genetic degeneration of animal Y chromosomes, in plants the situation is less clear. The evidence for genetic degeneration of the Silene latifolia Y chromosome is based only on the finding of a degenerate Y-linked copy belonging to the MROS3 gene family (Guttman and Charlesworth 1998), which has functional copies on the X chromosome and autosomes (Kejnovsky et al. 2001), and on the fact that YY S. latifolia plants (having no X) are usually inviable (Ye et al. 1990). The degenerate Y-linked copy of the MROS3 gene could be a pseudogene that originated because of translocation or (retro)transposition of an autosomal copy, rather than a remnant of a functional Y-linked MROS3 gene. Inviability of the YY plants may be the result of a breakdown of gene regulation (e.g., in the sex determination system), rather than caused by degeneracy of the Y chromosome. Thus, although suggestive, these findings cannot be taken as solid evidence for Y chromosome degeneration in S. latifolia. Moreover, it is not clear whether the plant Y chromosomes can degenerate: active gene expression in haploid pollen (e.g., Engel et al. 2003) may help to efficiently eliminate deleterious mutations from plant Y-linked genes. This may explain why, despite the clear evidence for drastic reduction of genetic diversity on the S. latifolia Y chromosome (Filatov et al. 2000, 2001; Laporte et al. 2004), the previous study of substitution rates in S. latifolia sex-linked genes (Filatov and Charlesworth 2002) found no evidence for relaxation of purifying selection in the S. latifolia Y-linked SlY4 gene, compared with its X-linked homolog SlX4. The effect of reduced effective population size upon the nonsynonymous replacement rate depends on the distribution of the mutation selection coefficients, which is unknown (Keightley 1998). The SlX4/Y4 genes encode fructose-2,6-bisphosphatases (Atanassov et al. 2002), and selection against amino acid replacements in this housekeeping gene may be strong enough to remove most mutations, despite more than 10-fold reduction in the effective population size on the Y chromosome. Thus, more S. latifolia sex-linked genes have to be analyzed to detect the expected relaxation of purifying selection on the Y chromosome.
Despite substantial efforts by several laboratories to isolate S. latifolia sex-linked genes (reviewed in Filatov [2004]), only three genes with intact X-linked and Y-linked copies are known: SlX1/SlY1 (Delichère et al. 1999), SlX4/SlY4 (Atanassov et al. 2002), and DD44X/Y (Moore et al. 2003). In addition, an autosomal Slap3 gene was reported to have a functional Y-linked homolog (Matsunaga et al. 2003), and the MROS3 gene family has at least one active copy on the X chromosome (Guttman and Charlesworth 1998; Kejnovsky et al. 2001) and a degenerate nonfunctional copy on the Y chromosome (Guttman and Charlesworth 1998). The reasons for the low success rate with isolation of S. latifolia sex-linked genes are discussed in detail elsewhere (Filatov 2004).
Here, I report the isolation of a new pair of S. latifolia sex-linked genes, using segregation analysis of random cDNA clones. Unlike previous approaches used to search for sex-linked genes (Desfeux et al. 1996; Atanassov et al. 2002; Moore et al. 2003), this method is technically simple and does not depend on X/Y divergence of the homologous copies. The analysis of silent and nonsilent substitution rates in the new X-linked and Y-linked genes demonstrated accelerated accumulation of nonsynonymous substitutions in the Y-linked gene and is consistent with the theoretical prediction of relaxed purifying selection on the Y chromosome.
Methods
Plant Material
Five families for use in segregation analyses were produced by crosses between three male S. latifolia plants (mSa9, mCB26, and mBM11), four S. latifolia females (fSa12, fBM4, fCB7, and fCB24), and one S. dioica female (fSd7): family 1 (fBM4xmCB26), family 2 (fCB7xmBM11), family 3 (fCB24xmBM11), family 4 (mSa9xfSa12), and family 5 (mSa9xfSd7). The parent S. latifolia and S. dioica plants were either grown from seeds collected by Dr. J. Ironside in the Cluj botanic garden, Romania (CB plants), and in Oradea, on the Czech/Romanian border (BM plants), or were provided by Prof. D. Charlesworth (Sa and Sd plants). The single S. vulgaris individual, which was used as an outgroup in molecular evolution analyses, was grown from seeds provided by Prof. D. Charlesworth.
Genomic DNA was isolated from the leaves of S. latifolia parents and F1 progeny, as well as from a single S. vulgaris plant using Plant DNAzol reagent (Invitrogen), according to the manufacturer's instructions.
Segregation Analysis of S. latifolia cDNA Clones
Analysis of the previous work devoted to the isolation of plant sex-linked genes suggested that the most promising approach is segregation analysis of random genes (Filatov 2004). Currently, there are a limited number of S. latifolia genes available in GenBank, many of which have been tested for sex linkage previously (Guttman and Charlesworth 1998; Laporte and Charlesworth 2001). Thus, as the first step in the search for new sex-linked genes, I sequenced 96 random cDNA clones taken from an S. latifolia male flower bud cDNA library (provided by Dr. S. Grant). Clones with homology to transposable elements, and those without open reading frames (ORFs), were excluded. Only 36 clones with long ORFs (>150 amino acids) and encoding proteins with homology to known proteins were selected for further analysis. The sequences of these cDNA clones were used to design pairs of primers for PCR amplification of the corresponding genes from genomic DNA of the parents of the genetic families. Only primers yielding one or two PCR products in parents were used for further segregation analysis. If the PCR products amplified from parent genomic DNA differed in size (e.g., fig. 1), the size differences were used as markers in segregation analysis. Otherwise, the parental PCR products were sequenced and single-nucleotide polymorphisms (SNPs) fixed between parents were used as genetic markers. In the former case, the segregation in the F1 progeny was tested in 1% to 2% agarose gels, whereas in the latter case, PCR products of the F1 progeny were sequenced directly from one primer, and sex linkage was inferred from the pattern of SNP inheritance.
FIG. 1. Segregation analysis of the SlssX/Y gene in family 5. PCR amplification was conducted using the c2B12+1 and c2B12–2 primer pair. Lanes 8 and 10 correspond to male and female parents, respectively. Lanes 1 to 7 correspond to male F1 progeny, and lanes 11 to 18 correspond to female F1 progeny.
Isolation and Sequencing of SlssX and SlssY Genes
Amplification of shorter (1 kb) fragments of the SlssX/Y gene was conducted with c2B12+1 and c2B12–2 primers (table 1) using Promega Taq polymerase. The PCR products were separated on 2% agarose gels under low voltage (overnight) to ensure good separation of fast and slow bands in males. The bands were isolated from the gel using the Qiagen gel extraction kit and sequenced directly from c2B12+1 and c2B12–2 primers using the ABI BigDye version 3.1 sequencing kit and an ABI3700 automatic sequencer.
Table 1 Primers Used for PCR Amplification and Sequencing
The PCR amplification of the longer (4 kb) regions was conducted using the c2B12+6deg and c2B12–4 primers and the Expand™ Long-Range PCR System (Roche). PCR products were separated on 1% agarose gels, isolated from the gel using the Qiagen gel extraction kit, cloned using the TOPO TA Cloning Kit (Invitrogen), and sequenced using the ABI BigDye version 3.1 sequencing kit and the ABI3700 automatic sequencer. The sequencing of the SlssX and SlssY genes was conducted using the following primers (table 1): universal M13F and M13R, c2B12+9, c2B12–13, c2B12+10, c2B12–12, c2B12–5, and c2B12+1. In addition, sequencing of the longer SlssY gene required c2B12–11 and c2B12+19 to cover the entire region. The same primers and conditions were used for isolation and sequencing of the Svss gene from S. vulgaris. GenBank accession numbers for the SlssX, SlssY, and Svss genes are AY705437, AY705438, and AY705436, respectively.
The sequence traces were checked, base calls were corrected, and contigs were assembled using ProSeq software (Filatov 2002). The sequences of SlssX, SlssY, and Svss genes were aligned by the mcalign program (Keightley and Johnson 2004).
The intron-exon structures and coding regions of the SlssX, SlssY, and Svss genes were inferred by aligning them with the sequences of the S. latifolia cDNA clone c2B12 and with cDNA sequences of spermidine synthases of Arabidopsis (NM_102230), Pisum (AF043108), Coffea (AB015599), Malus (AB072915), and Datura (Y08253).
Nucleotide Substitution Rates
Intronic, synonymous, and nonsilent pairwise divergence values were calculated using MEGA version 2 software (Kumar et al. 2000). MEGA2 was also used to conduct Tajima's (1993) relative-rate test. Synonymous and nonsynonymous divergence was estimated using the Nei-Gojobori method (Nei and Gojobori 1986) with Jukes-Cantor correction (Jukes and Cantor 1969). Kimura's two-parameter distance (Kimura 1980) was used for divergence in intron regions.
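The two distance corrections named above can be illustrated with a minimal sketch. It is not the MEGA2 implementation; the function names and the input proportions are hypothetical and serve only to show the standard formulas, d = -(3/4) ln(1 - 4p/3) for Jukes-Cantor and d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)] for Kimura's two-parameter model.

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor distance from the observed proportion of differing sites p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)

def kimura_2p(P, Q):
    """Kimura two-parameter distance from the observed proportions of
    transitional (P) and transversional (Q) differences."""
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(a * math.sqrt(b))

# Hypothetical input: 9% of sites differ, one third of them by transitions.
print(jukes_cantor(0.09))
print(kimura_2p(0.03, 0.06))
```

When transitions make up exactly one third of the observed differences, as in the hypothetical input, the two corrections give the same distance; they diverge as the transition/transversion ratio departs from that value.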
To compare substitution rates in introns, the baseml program from the PAML version 3.13d software package (Yang 2001) was used. The "local clock" mode (Yoder and Yang 2000) was used to test the significance of substitution rate difference between the SlssX and SlssY lineages. To estimate the silent (Ks) and nonsilent (Ka) substitution rates and the Ka/Ks ratios for the X-linked and the Y-linked genes, the codeml program from the PAML package was used, assuming the phylogeny in figure 2. Each branch was assumed to have a separate substitution rate ("no clock mode"), and three Ka/Ks ratios were assigned as shown in figure 2 by different branch shadings. For the likelihood-ratio test, the model with three Ka/Ks ratios was compared with a model with just two ratios, one for autosomal and one for the X-linked and the Y-linked genes.
FIG. 2. The gene tree used in the ML analyses. Different shading shows the separate Ka/Ks ratios used for the SlssX, SlssY, and Svss genes. The number of amino acid replacements and synonymous substitutions in every lineage is shown as numerator and denominator, respectively.
Results
Isolation of the SlssX/Y Genes
To isolate more S. latifolia sex-linked genes, I conducted segregation analysis of random cDNA clones isolated from male flower bud cDNA library (provided by Dr. Sarah Grant), as described in the Methods. In total, 21 clones were tested for sex linkage, and one of the clones, c2B12 (accession number AY705439), appeared to correspond to a sex-linked gene. PCR primers c2B12+1 and c2B12–2, designed using the sequence of the c2B12 cDNA clone, amplified an approximately 1-kb region from all male and female S. latifolia and S. dioica individuals. In addition, male S. latifolia individuals had a second PCR product, which was slightly longer (fig. 1). Male-specificity of the longer fragment suggested that it might be Y linked. Y linkage of this fragment was confirmed by segregation analysis in five families (36 male and 34 female F1 individuals in total), which demonstrated that this fragment is always inherited from father to sons but not to daughters (fig. 1). To find molecular markers for the segregation analysis of the shorter PCR product, it was sequenced directly for the female fBM4 and male mCB26 plants used as parents of family 1. This revealed two nucleotide differences, with mCB26 hemizygous for the T and fBM4 homozygous for C in both polymorphic sites. None of the nucleotide differences affected restriction sites, so the PCR products from eight female and seven male F1 progeny from family 1 were sequenced directly. All the female F1 progeny were heterozygous for paternal and maternal alleles (A/C), whereas the male F1 progeny were hemizygous for the maternal variant (C). Thus, the gene corresponding to the shorter PCR product is X linked.
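The inheritance logic used in the segregation analysis can be summarized in a short sketch. The function name and the boolean encoding of marker presence are hypothetical; the sketch only classifies the pattern of a marker carried by the father and does not replace the gel-based or sequence-based scoring described above.

```python
def classify_paternal_marker(sons, daughters):
    """Classify a marker carried by the father from its presence in F1 offspring.

    sons, daughters: lists of booleans, True where the paternal marker
    (a band or a SNP allele) was detected in that individual.
    """
    if not sons or not daughters:
        return "uninformative"  # both sexes must be scored
    if all(sons) and not any(daughters):
        return "Y-linked pattern"  # father to every son, never to daughters
    if all(daughters) and not any(sons):
        return "X-linked pattern"  # father to every daughter, never to sons
    return "not a sex-linked pattern"

# Family 1 as described in the text: seven sons without the paternal allele
# of the shorter product, eight daughters carrying it.
print(classify_paternal_marker([False] * 7, [True] * 8))
```

A marker that recombines freely with the sex-determining region, such as an autosomal or a distal pseudoautosomal one, would normally appear in some offspring of both sexes and fall into the last category.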
Although the smaller and the larger PCR products demonstrate X linkage and Y linkage, there remains a possibility that these genes are located in the pseudoautosomal region. If so, the X-linked and Y-linked copies should occasionally recombine with each other. However, sequencing of the X-linked and Y-linked genes demonstrated that intron divergence among the X-linked and the Y-linked genes exceeds 8% (see below), which is too high to be accounted for by divergence between the recombining alleles. Also, the male specificity of the longer PCR product (over 30 males from wild populations were tested [data not shown]) makes the pseudoautosomal location unlikely.
Sequencing of the longer (Y-linked) and the shorter (X-linked) PCR fragments amplified by the c2B12+1 and c2B12–2 primers confirmed that both fragments are homologous to the c2B12 cDNA clone. This clone has strong (over 80% DNA sequence identity) homology to coding regions of spermidine synthase genes from Arabidopsis (accession number NM_102230), Pisum (accession number AF043108), Coffea (accession number AB015599), Malus (accession number AB072915), and Datura (accession number Y08253). Thus, the X-linked and Y-linked genes amplified by the c2B12+1 and c2B12–2 primers were assumed to encode S. latifolia spermidine synthase and were named SlssX and SlssY, respectively.
To isolate a longer region of the SlssX and SlssY genes, I used the sequences of the coding regions of plant spermidine synthase genes to design a degenerate forward primer, c2B12+6deg, which was used in a pair with c2B12–4 to amplify the region from intron 1 to the 3' untranslated region of the SlssX and SlssY genes. These primers amplified a single band of approximately 4 kb in S. latifolia females, and two bands of approximately 4 kb and approximately 4.5 kb in S. latifolia males. The larger and the smaller fragments were cloned and sequenced as described in the Methods section. As expected, the sequence of the 3' region of the larger fragment was identical to the fragment of the SlssY gene amplified with c2B12+1 and c2B12–2 primers, and the smaller fragment corresponded to SlssX sequence. Segregation analysis in the family 1 (seven male and eight female F1 progeny) also confirmed that the larger (4.5 kb long) band is Y linked. The X linkage of the smaller (4 kb long) PCR product was confirmed by partial direct sequencing of the fragment amplified from the maternal (fBM4) and paternal (mCB26) individuals and the same set of F1 offspring of family 1, as used for the segregation of the smaller 1-kb PCR product, amplified with c2B12+1 and c2B12–2 primers. As expected, all the male F1 progeny inherited the maternal allele but not the paternal allele, whereas all the female F1 progeny inherited maternal and paternal alleles.
Exon-Intron Structure of the SlssX/SlssY Genes
The alignment of sequences of the SlssX and SlssY genomic fragments and the cDNA clone revealed the presence of three insertions of 80, 370, and 250 nt in the genomic fragments. According to their positions in the cDNA sequence, these insertions correspond to introns 6, 7, and 8 of the A. thaliana spermidine synthase I gene. As the S. latifolia cDNA clone c2B12 contained only the 3' portion of the spermidine synthase coding region (exons 9 to 6 and 12 nt of exon 5), the exon-intron structure of the 5' region of the SlssX and SlssY genes had to be established from comparisons with spermidine synthases of the other plant species. Alignment of the SlssX and SlssY genomic fragments with the sequences of coding regions of Arabidopsis, Pisum, Coffea, Malus, and Datura spermidine synthase genes allowed the positions of the other introns and exons to be established. The presence of splice-site consensus sequences (5'-GT...AG-3') supports the position of introns in the SlssX and SlssY genes, except intron 7, which had a rare (GC) 5' splice site. In the case of intron 7, the position of the intron was unambiguously established from the comparison of the cDNA and genomic sequences.
Divergence Between the SlssX and SlssY Genes
Overall, 837 nt of coding region and 2,554 nt of intron sequences were analyzed after exclusion of insertion/deletion (indel) regions and regions of uncertain alignment in the first exon and in the 3' UTR. Most indels were located in the first intron, including the 481 bp long insertion in SlssY, resulting in a substantial length difference between the SlssX and SlssY PCR products. Synonymous divergence between the SlssX and SlssY genes is 4.7% ± 1.6%, and divergence in introns is much higher, 8.1% ± 0.6%. The higher divergence in introns may partly be caused by misalignment in the first intron, which contains multiple insertions and deletions. However, exclusion of the first intron from the analysis does not reduce the divergence; in fact it slightly increases divergence (8.8% ± 0.8%).
There are only nine nonsynonymous differences between the SlssX and SlssY genes, and the nonsynonymous divergence is 1.4% ± 0.5%, substantially lower than synonymous divergence, suggesting that at least one of these genes is under purifying selective constraint. Interestingly, the sequence of the S. latifolia c2B12 cDNA clone is identical to the exons of the SlssY and differs from SlssX by 12 nt. Thus, the Y-linked copy of the S. latifolia spermidine synthase gene is actively transcribed. The open reading frame is preserved in both X-linked and Y-linked copies of the gene, strengthening the evidence that both copies are functional or had been functional until very recently.
Substitution Rates in the SlssX and SlssY Genes
If one of the copies of the SlssX/Y gene became nonfunctional after the X-linked and Y-linked copies stopped recombining with each other, it would be expected to accumulate multiple nonsynonymous substitutions. Even if both copies are functional, theory predicts that purifying selection should be less efficient in the nonrecombining Y-linked copy, and it is expected to accumulate more nonsynonymous substitutions than the X-linked copy (reviewed in Charlesworth and Charlesworth [2000]).
To detect whether the mutations in the SlssX or in the SlssY lineages disproportionately contributed to nonsilent and silent divergence between the two copies, I PCR-amplified and sequenced a homologous region (referred to below as Svss) from nondioecious Silene vulgaris, using the c2B12+6deg and c2B12–2 primers. Using the sequence of the S. vulgaris Svss gene, it is possible to root the gene tree for the S. latifolia SlssX and SlssY genes and to establish how many of the nine amino acid differences between the SlssX and SlssY genes occurred in the X-linked and in the Y-linked genes. Interestingly, in all nine cases, the amino acid in the SlssX gene matched that in the Svss; thus, all nine of the nonsynonymous changes have probably occurred along the SlssY lineage. The excess of nonsynonymous substitutions in the Y-linked gene is significant (χ² = 7.0, 1 df, P = 0.008) by Tajima's relative-rate test (Tajima 1993) and is consistent with relaxed selective constraint on the nonrecombining Y chromosome, as suggested by theory. Interestingly, one of the mutations in SlssY (the Asn→Gly replacement at position 173 of the protein sequence alignment shown in figure 3 in Korolev et al. [2002]) is especially likely to disrupt the spermidine synthase activity of the protein encoded by the Y-linked copy of the gene, because according to the crystal structure of the bacterial enzyme, this residue interacts with the substrate (Korolev et al. 2002).
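As an illustration of how Tajima's relative-rate statistic is computed, the sketch below uses the standard form, chi-square = (m1 - m2)^2 / (m1 + m2) with 1 df, where m1 and m2 are the numbers of sites at which only the first or only the second ingroup sequence differs from the other two. The function name is hypothetical, and the counts in the example are illustrative values chosen only because they give a statistic of 7.0; the site counts actually entered into MEGA2 are not listed in the text.

```python
import math

def tajima_relative_rate(m1, m2):
    """Tajima's (1993) relative-rate test.

    m1, m2: counts of sites where only lineage 1 (or only lineage 2)
    differs from the other ingroup sequence and the outgroup.
    Returns the chi-square statistic (1 df) and its P value.
    """
    if m1 + m2 == 0:
        raise ValueError("no informative sites")
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative counts only (an assumption, not taken from the paper).
chi2, p_value = tajima_relative_rate(7, 0)
print(f"chi2 = {chi2:.1f}, P = {p_value:.3f}")
```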
Alternatively, the elevated nonsilent substitution rate in the SlssY gene could be caused by a higher mutation rate on the S. latifolia Y chromosome (Filatov and Charlesworth 2002). Pairwise Svss/SlssX and Svss/SlssY synonymous divergence is 9.9% ± 2.4% and 11.6% ± 2.6%, respectively, suggesting that the Y-linked gene accumulates synonymous substitutions (and probably mutates) faster than the X-linked homolog. To test whether the mutation rate is indeed higher in the SlssY than in the SlssX gene, I conducted a maximum-likelihood ratio test to compare substitution rates in introns (Ki) of these genes. The model with three substitution rates (KiX, KiY, and KiA) was compared with the model with two rates (KiX = KiY and KiA). The likelihoods for these models were calculated using the baseml program with a local molecular clock (Yoder and Yang 2000), and the significance was tested using the likelihood-ratio test. The model with three rates fits the data significantly better than the model with two rates (2ΔlnL = 5.75, P < 0.05), demonstrating that the intron substitution rate in the SlssY gene (KiY = 0.058) is significantly elevated, compared with that in the SlssX gene (KiX = 0.04). The difference in the substitution rates, however, is not as drastic as for the SlX4/SlY4 genes reported previously (Filatov and Charlesworth 2002).
The higher mutation rate in the Y-linked gene could have elevated the nonsynonymous substitution rate and has to be taken into account when comparing the SlssX and SlssY genes. For this purpose, I used the codeml program (Yang 2001) to compare two models, one with three Ka/Ks ratios (separate ratios for branches) and the other with the ratios for SlssX and SlssY genes forced to be equal. The former model fits the data significantly better than the latter model (2ΔlnL = 4.003, P < 0.05), demonstrating that, taking into account the possible differences in underlying mutation rates between the SlssX and SlssY genes, the Y-linked gene accumulates nonsynonymous substitutions significantly faster than the X-linked homolog. The Ka/Ks ratios are well below unity for both genes (0.000 and 0.519 for the SlssX and SlssY, respectively), consistent with both genes being functional and subject to purifying selective constraint.
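The significance of the two likelihood-ratio statistics reported above can be checked with a minimal sketch. It assumes that twice the log-likelihood difference follows a chi-square distribution with 1 df, because in each comparison the nested models differ by a single parameter (three rates or ratios versus two); the function name is hypothetical and the calculation is independent of PAML.

```python
import math

def lrt_p_value_1df(statistic):
    """P value of a likelihood-ratio statistic (twice the log-likelihood
    difference) under a chi-square distribution with 1 df."""
    if statistic < 0:
        raise ValueError("the statistic must be non-negative")
    return math.erfc(math.sqrt(statistic / 2))

# Statistics reported in the text for the intron-rate and Ka/Ks comparisons.
for label, statistic in [("intron rates", 5.75), ("Ka/Ks ratios", 4.003)]:
    print(f"{label}: P = {lrt_p_value_1df(statistic):.3f}")
```

Both statistics exceed 3.84, the 5% critical value of the chi-square distribution with 1 df, which agrees with the P < 0.05 reported for each test.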
Discussion
Here, I reported the isolation of a new pair of homologous X-linked and Y-linked genes, SlssX and SlssY, from S. latifolia. This is the fourth known pair of S. latifolia sex-linked genes with intact homologous copies on the X and Y chromosomes. Previous genes were isolated using fairly sophisticated molecular biology methods: screening of a S. latifolia cDNA library with a Y-specific probe, obtained using microdissection of the chromosomes (SlX1/Y1 and SlX4/Y4 genes [see Delichère et al. {1999} and Atanassov et al. {2002}]) and differential display (DD44X/Y genes [Moore et al. 2003]). The current study demonstrates that a much simpler approach, segregation analysis of random S. latifolia genes obtained from a male flower bud cDNA library, may be successfully used to isolate new S. latifolia sex-linked genes. A similar approach has already been used by Guttman and Charlesworth (1998), who found an X-linked gene, MROS3X, after testing only four genes for sex linkage. Laporte and Charlesworth (2001), however, found no sex-linked genes after testing six genes. In this study, I tested over 20 genes before a single sex-linked gene was discovered. This plant species has 11 pairs of autosomes and a pair of sex chromosomes (2n = 22 + XX or XY). Assuming that all the chromosomes contain approximately the same number of genes, every 12th gene taken at random should be sex linked. However, as the Y chromosome is the largest, and the X chromosome is the second largest, the proportion of sex-linked genes should be higher than 1/12. To isolate the SlssX/Y gene pair, I tested 21 genes for sex linkage. Assuming that the proportion of sex-linked genes is 10%, according to the binomial distribution, the probability of such an outcome is about 25%. One may need to test as many as 30 genes to reduce the chance of failure to find a sex-linked gene to below 5%.
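The binomial reasoning above can be reproduced with a short sketch. It assumes, as one reading that matches the quoted figure, that "such an outcome" means exactly one sex-linked gene among 21 tested; the function names are hypothetical and the 10% proportion is the value assumed in the text.

```python
from math import ceil, comb, log

def prob_exactly(k, n, p):
    """Binomial probability of exactly k sex-linked genes among n tested."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_none(n, p):
    """Probability that none of n tested genes is sex linked."""
    return (1 - p) ** n

def min_genes_to_test(p, max_failure):
    """Smallest n for which the chance of finding no sex-linked gene
    drops below max_failure."""
    return ceil(log(max_failure) / log(1 - p))

p_sex_linked = 0.10  # proportion assumed in the text
print(f"P(exactly 1 of 21) = {prob_exactly(1, 21, p_sex_linked):.3f}")
print(f"P(none of 30)      = {prob_none(30, p_sex_linked):.3f}")
print(f"minimum n for <5% failure: {min_genes_to_test(p_sex_linked, 0.05)}")
```

Under these assumptions the exact threshold falls just under 30 genes, in line with the rounded figure given above.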
The new sex-linked genes encode proteins with strong homology to spermidine synthase, an enzyme that catalyzes the biosynthesis of spermidine (a ubiquitous polyamine). Thus, similar to the previously isolated SlX1/Y1, SlX4/Y4, and DD44X/Y genes, the new gene belongs to a group of housekeeping genes with X-linked and Y-linked copies resident singly on X and Y chromosomes. Such sex-linked genes were called class I by Lahn, Pearson, and Jegalian (2001) and are thought to represent the genes originally located on the protosex chromosomes that have evolved into X and Y chromosomes.
Although I have not isolated the 5' region of the new genes (the first exon and the 5' UTR), the low Ka/Ks ratio and the absence of stop codons in the coding sequence suggest that the SlssX and SlssY genes are functional. Moreover, the original c2B12 cDNA clone corresponds to the SlssY gene, demonstrating that this gene is actively transcribed. However, the comparison with the amino acid sequences of the human, Arabidopsis, and bacterial spermidine synthases revealed that three highly conserved residues are mutated in the protein encoded by the SlssY gene (at positions 161, 167, and 173 in the amino acid alignment shown in figure 3 in Korolev et al. [2002]), suggesting that its activity may be highly reduced.
Exon-intron structure also corresponds to that in the spermidine synthase genes of other plant species. Interestingly, the sequence of the SlssY gene is longer than that of the SlssX gene because of an approximately 0.5 kb long insert in the first intron. This resembles the situation in the other S. latifolia sex-linked genes. With the exception of the SlX1/Y1 gene, which has low X/Y divergence (Delichère et al. 1999; Filatov and Charlesworth 2002), all the other Y-linked genes are substantially longer than their X-linked homologs. Perhaps this reflects the general trend on the Y chromosome and may explain why the Y chromosome "overgrew" in size the originally homologous X chromosome. The tendency of the S. latifolia Y chromosome to accumulate junk DNA is in line with theoretical predictions and empirical observations on the Y chromosomes of other species (Charlesworth, Sniegowski, and Stephan 1994; Junakovic et al. 1998).
Synonymous divergence between the SlssX and SlssY genes (4.7%) is higher than for SlX1/Y1 genes (3%) but lower than for DD44X/Y (8%) and SlX4/Y4 (16%) genes. If the X/Y divergence of homologous genes corresponds to the order of the genes on the sex chromosomes, as suggested by the "evolutionary strata" theory (Lahn, Pearson, and Jegalian 2001), then according to synonymous SlssX/SlssY divergence, the new gene should be located between the SlX1/Y1 and DD44X/Y. However, the SlssX/SlssY divergence in introns is higher than synonymous divergence, and according to intronic divergence, the new genes may fall into the same stratum as the DD44X/Y genes.
The rate of silent divergence in the SlssY gene is significantly higher than in SlssX gene, suggesting that the S. latifolia Y-linked gene accumulates substitutions, and probably mutates, faster than the X-linked homolog. This is consistent with the previous report of an elevated mutation rate on the S. latifolia Y chromosome (Filatov and Charlesworth 2002). However, the difference in silent substitution rates between the SlssX and SlssY genes is much lower than in the SlX4 and SlY4 genes reported previously, suggesting that the mutation rate may vary across the Y chromosome. Interestingly, all the amino acid replacements, which differ between the SlssX and SlssY genes, apparently occurred in the Y-linked gene. Some of these mutations affect highly conserved amino acid residues and are likely to disrupt the function of the SlssY gene. This is consistent with the theoretical prediction of relaxed purifying selection on the nonrecombining degenerating Y chromosomes (Charlesworth and Charlesworth 2000). Drastically reduced diversity in the S. latifolia Y-linked genes (Filatov et al. 2000, 2001; Laporte et al. 2004) should result in reduced efficacy of selection on the Y chromosome. The previous studies, however, have not been able to detect any evidence for this. The elevated nonsilent substitution rate was previously reported in the SlY4 gene (Filatov and Charlesworth 2002). However, because of a very high silent substitution rate in this gene, the Ka/Ks ratio does not differ significantly between the SlX4 and SlY4 genes, and the elevation of nonsilent substitution rate in SlY4 was attributed to a higher mutation rate in this gene.
Acknowledgements
I thank Sarah Grant for providing the S. latifolia male flower bud cDNA library, Deborah Charlesworth and Joe Ironside for providing Silene seeds, and Joe Ironside and Dave Gerrard for critical reading of the manuscript. This work was funded by the BBSRC.
References
Atanassov, I., C. Delichère, D. A. Filatov, D. Charlesworth, I. Negrutiu, and F. Moneger. 2002. Analysis and evolution of two functional Y-linked loci in a plant sex chromosome system. Mol. Biol. Evol. 18:2162–2168.
Bull, J. J. 1983. Evolution of sex determining mechanisms. The Benjamin/Cummings Publishing Company, Menlo Park, Calif.
Charlesworth, B. 1994. The effect of background selection against deleterious alleles on weakly selected, linked variants. Genet. Res. 63:213–228.
Charlesworth, B., and D. Charlesworth. 2000. The degeneration of Y chromosomes. Philos. Trans. R. Soc. Lond. B Biol. Sci. 355:1563–1572.
Charlesworth, B., P. Sniegowski, and W. Stephan. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215–220.
Delichère, C., J. Veuskens, M. Hernould, N. Baarbacar, A. Mouras, I. Negrutiu, and F. Monéger. 1999. SlY1, the first active gene cloned from a plant Y chromosome, encodes a WD-repeat protein. EMBO J. 18:4169–4179.
Desfeux, C., S. Maurice, J. P. Henry, B. Lejeune, and P. H. Gouyon. 1996. Evolution of reproductive systems in the genus Silene. Proc. R. Soc. Lond. B Biol. Sci. 263:409–414.
Engel, M. L., A. Chaboud, C. Dumas, and S. McCormick. 2003. Sperm cells of Zea mays have complex complement of mRNAs. Plant J. 34:697–707.
Filatov, D. A. 2002. A software for preparation and evolutionary analysis of DNA sequence data sets. Mol. Ecol. Notes 2:621–624.
———. 2004. Isolation of genes from plant Y chromosomes. Meth. Enzymol. (in press).
Filatov, D. A., and D. Charlesworth. 2002. Substitution rates in the X- and Y-linked genes of the plants, Silene latifolia and S. dioica. Mol. Biol. Evol. 19:898–907.
Filatov, D. A., V. Laporte, C. Vitte, and D. Charlesworth. 2001. DNA diversity in sex linked and autosomal genes of the plant species Silene latifolia and S. dioica. Mol. Biol. Evol. 18:1442–1454.
Filatov, D. A., F. Moneger, I. Negrutiu, and D. Charlesworth. 2000. Low variability in a Y-linked plant gene and its implications for Y-chromosome evolution. Nature 404:388–390.
Gordo, I., and B. Charlesworth. 2000. On the speed of Muller's ratchet. Genetics 156:2137–2140.
Guttman, D. S., and D. Charlesworth. 1998. An X-linked gene with a degenerate Y-linked homologue in a dioecious plant. Nature 393:263–266.
Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein molecules. Pp. 21–132 in N. H. Munro, ed. Mammalian protein metabolism. Academic Press, New York.
Junakovic, N., A. Terrinoni, C. Di Franco, C. Vieira, and C. Loevenbruck. 1998. Accumulation of transposable elements in the heterochromatin and on the Y chromosome of Drosophila simulans and Drosophila melanogaster. J. Mol. Evol. 46:661–668.
Keightley, P. D. 1998. Inference of genome-wide mutation rates and distributions of mutation effects for fitness traits: a simulation study. Genetics 150:1283–1293.
Keightley, P. D., and T. Johnson. 2004. MCALIGN: stochastic alignment of noncoding DNA sequences based on an evolutionary model of sequence evolution. Genome Res. 14:442–450.
Kejnovsky, E., J. Vrana, S. Matsunaga, P. Soucek, J. Siroky, J. Dolezel, and B. Vyskot. 2001. Localisation of male-specifically expressed MROS genes of Silene latifolia by PCR and flow-sorted sex chromosomes and autosomes. Genetics 158:1269–1277.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.
Korolev, S., Y. Ikeguchi, T. Skarina, S. Beasley, C. Arrowsmith, A. Edwards, A. Joachimiak, A.E. Pegg, and A. Savchenko. 2002. The crystal structure of spermidine synthase with a multisubstrate adduct inhibitor. Nat. Struct. Biol. 9:27–31.
Kumar, S., K. Tamura, I. Jacobsen, and M. Nei. 2000. MEGA2: molecular evolutionary genetics analysis. Version 2.0. Pennsylvania and Arizona State Universities, University Park, Pa. and Tempe, Ariz.
Lahn, B. T., N. M. Pearson, and K. Jegalian 2001. The human Y chromosome, in the light of evolution. Nat. Rev. Genet. 2:207–216.
Laporte, V., and D. Charlesworth. 2001. Non-sex-linked, nuclear cleaved amplified polymorphic sequences in Silene latifolia. J. Hered. 92:357–359.
Laporte, V., D. A. Filatov, E. Kamau, and D. Charlesworth. 2004. Indirect evidence from DNA sequence diversity for genetic degeneration of the Y-chromosome in dioecious species of the plant Silene. J. Evol. Biol. (in press).
Matsunaga, S., E. Isono, E. Kejnovsky, B. Vyskot, J. Dolezel, S. Kawano, and D. Charlesworth. 2003. Duplicative transfer of a MADS box gene to a plant Y chromosome. Mol. Biol. Evol. 20:1062–1069.
Moore, R. C., O. Kozyreva, S. Lebel-Hardenack, J. Siroky, R. Hobza, B. Vyskot, and S. R. Grant. 2003. Genetic and functional analysis of DD44, a sex-linked gene from the dioecious plant Silene latifolia, provides clues to early events in sex chromosome evolution. Genetics 163:321–334.
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426.
Rice, W. R. 1987. Genetic hitchhiking and the evolution of reduced genetic activity of the Y sex chromosome. Genetics 116:161–167.
Skaletsky, H., T. Kuroda-Kawaguchi, P. J. Minx et al. (40 co-authors). 2003. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423:825–837.
Tajima, F. 1993. Simple methods for testing molecular clock hypotheses. Genetics 135:599–607.
Westergaard, M. 1958. The mechanism of sex determination in dioecious flowering plants. Adv. Genet. 9:217–281.
Yang, Z. 2001. Phylogenetic analysis by maximum likelihood (PAML), version 3e. University College London.
Ye, D., P. Installe, D. Ciupercescu, J. Veuskens, Y. Wu, G. Salesses, M. Jacobs, and I. Negrutiu. 1990. Sex determination in the dioecious Melandrium. I. First lessons from androgenic haploids. Sex. Plant Reprod. 3:179–186.
Yi, S., and B. Charlesworth. 2000. Contrasting patterns of molecular evolution of the genes on the new and old sex chromosomes of Drosophila miranda. Mol. Biol. Evol. 17:703–717.
Yoder, A., and Z. Yang. 2000. Estimation of primate speciation dates using local molecular clocks. Mol. Biol. Evol. 17:1081–1090.
Zurovcova, M., and W. F. Eanes. 1999. Lack of nucleotide polymorphism in the Y-linked sperm flagellar dynein gene Dhc-Yh3 of Drosophila melanogaster and D. simulans. Genetics 153:1709–1715.
(Dmitry A. Filatov)
Introduction
Sex chromosomes are not as common in plants as in animals, but they have been found in several phylogenetically distant groups, such as Rumex, Cannabis, and Silene (Westergaard 1958). The genus Silene contains over 500 species. Most of these are gynodioecious or hermaphroditic, but there are two clusters of dioecy that, apparently, evolved independently from each other (Desfeux et al. 1996). One such cluster is represented by five very closely related dioecious species (Silene latifolia, S. diclinis, S. dioica, S. heuffelii, and S. marizii), all of which have male heterogametic sex determination (Westergaard 1958). Silent nucleotide divergence of these species from nondioecious Silene species is about 15% (Filatov and Charlesworth 2002), suggesting that the common ancestor of these species diverged from a nondioecious ancestor about 10 to 20 MYA. The recent divergence of the sex chromosomes in S. latifolia provides an opportunity to study early stages of sex chromosome evolution, and so S. latifolia became a popular model species for such studies (Guttman and Charlesworth 1998; Filatov et al. 2000, 2001; Atanassov et al. 2002; Filatov and Charlesworth 2002; Moore et al. 2003; Matsunaga et al. 2003).
Despite their independent origins, the properties of sex chromosomes in different groups of organisms are quite similar: recombination is restricted between the Y and the X chromosomes, and Y chromosomes are usually genetically degenerate (Bull 1983), containing few functional genes (Skaletsky et al. 2003). Degeneration is thought to occur because of the reduced efficacy of selection on the nonrecombining Y chromosome (reviewed in Charlesworth and Charlesworth [2000]). Deleterious mutations may be carried to fixation by linked advantageous mutations ("selective sweeps," [Rice 1987]) or by "Muller's ratchet" (stochastic loss of chromosomes with the fewest mutations [Gordo and Charlesworth 2000]), and selective elimination of deleterious mutations ("background selection") may accelerate the stochastic fixation of mildly detrimental mutations (Charlesworth 1994). These processes should lead to reduced effective population size and sequence diversity in actively degenerating Y chromosomes (Charlesworth and Charlesworth 2000), which was indeed reported for several animal and plant species (Zurovcova and Eanes 1999; Yi and Charlesworth 2000; Filatov et al. 2000, 2001).
Although there is fairly good evidence for genetic degeneration of animal Y chromosomes, in plants the situation is less clear. The evidence for genetic degeneration of the Silene latifolia Y chromosome is based only on the finding of a degenerate Y-linked copy belonging to the MROS3 gene family (Guttman and Charlesworth 1998), which has functional copies on the X chromosome and autosomes (Kejnovsky et al. 2001), and on the fact that YY S. latifolia plants (having no X) are usually inviable (Ye et al. 1990). The degenerate Y-linked copy of the MROS3 gene could be a pseudogene that originated because of translocation or (retro)transposition of an autosomal copy, rather than a remnant of a functional Y-linked MROS3 gene. Inviability of the YY plants may be the result of a breakdown of gene regulation (e.g., in the sex determination system), rather than caused by degeneracy of the Y chromosome. Thus, although suggestive, these findings cannot be taken as solid evidence for Y chromosome degeneration in S. latifolia. Moreover, it is not clear whether the plant Y chromosomes can degenerate: active gene expression in haploid pollen (e.g., Engel et al. 2003) may help to efficiently eliminate deleterious mutations from plant Y-linked genes. This may explain why, despite the clear evidence for drastic reduction of genetic diversity on the S. latifolia Y chromosome (Filatov et al. 2000, 2001; Laporte et al. 2004), the previous study of substitution rates in S. latifolia sex-linked genes (Filatov and Charlesworth 2002) found no evidence for relaxation of purifying selection in the S. latifolia Y-linked SlY4 gene, compared with its X-linked homolog SlX4. The effect of reduced effective population size upon the nonsynonymous replacement rate depends on the distribution of the mutation selection coefficients, which is unknown (Keightley 1998). The SlX4/Y4 genes encode fructose-2,6-bisphosphatases (Atanassov et al. 2002), and selection against amino acid replacements in this housekeeping gene may be strong enough to remove most mutations, despite more than 10-fold reduction in the effective population size on the Y chromosome. Thus, more S. latifolia sex-linked genes have to be analyzed to detect the expected relaxation of purifying selection on the Y chromosome.
Despite substantial efforts by several laboratories to isolate S. latifolia sex-linked genes (reviewed in Filatov [2004]), only three genes with intact X-linked and Y-linked copies are known: SlX1/SlY1 (Delichère et al. 1999), SlX4/SlY4 (Atanassov et al. 2002), and DD44X/Y (Moore et al. 2003). In addition, an autosomal Slap3 gene was reported to have a functional Y-linked homolog (Matsunaga et al. 2003), and a MROS3 gene family has at least one active copy on the X chromosome (Guttman and Charlesworth 1998, Kejnovsky et al. 2001) and a degenerate nonfunctional copy on the Y chromosome (Guttman and Charlesworth 1998). The reasons for the low success rate with isolation of S. latifolia sex-linked genes are discussed in detail elsewhere (Filatov 2004).
Here, I report isolation of new S. latifolia sex-linked genes, using segregation analysis of random cDNA clones. Unlike previous approaches used to search for sex-linked genes (Desfeux et al. 1996; Atanassov et al. 2002; Moore et al. 2003) this method is technically simple and does not depend on X/Y divergence of the homologous copies. The analysis of silent and nonsilent substitution rates in the new X-linked and Y-linked genes demonstrated accelerated accumulation of nonsynonymous substitutions in the Y-linked gene and is consistent with the theoretical prediction of relaxed purifying selection on the Y chromosome.
Methods
Plant Material
Five families for use in segregation analyses were produced by crosses between three male S. latifolia plants (mSa9, mCB26, and mBM11), four S. latifolia females (fSa12, fBM4, fCB7, and fCB24), and one S. dioica female (fSd7): family 1 (fBM4xmCB26), family 2 (fCB7xmBM11), family 3 (fCB24xmBM11), family 4 (mSa9xfSa12), and family 5 (mSa9xfSd7). The parent S. latifolia and S. dioica plants were either grown from seeds collected by Dr. J. Ironside in the Cluj botanic garden, Romania (CB plants), and in Oradea, on the Hungarian/Romanian border (BM plants), or were provided by Prof. D. Charlesworth (Sa and Sd plants). The single S. vulgaris individual, which was used as an outgroup in molecular evolution analyses, was grown from seeds provided by Prof. D. Charlesworth.
Genomic DNA was isolated from the leaves of S. latifolia parents and F1 progeny, as well as from a single S. vulgaris plant using Plant DNAzol reagent (Invitrogen), according to the manufacturer's instructions.
Segregation Analysis of S. latifolia cDNA Clones
Analysis of the previous work devoted to the isolation of plant sex-linked genes suggested that the most promising approach is segregation analysis of random genes (Filatov 2004). Currently, there are a limited number of S. latifolia genes available in GenBank, many of which have been tested for sex-linkage previously (Guttman and Charlesworth 1998; Laporte and Charlesworth 2001). Thus, as the first step in the search for new sex-linked genes, I sequenced 96 random cDNA clones taken from an S. latifolia male flower bud cDNA library (provided by Dr. S. Grant). Clones with homology to transposable elements, and without open reading frames (ORFs), were excluded. Only 36 clones with long ORFs (>150 amino acids) and encoding proteins with homology to known proteins were selected for further analysis. The sequences of these cDNA clones were used to design pairs of primers for PCR amplification of the corresponding genes from genomic DNA of the parents of the genetic families. Only primers yielding one or two PCR products in parents were used for further segregation analysis. If the PCR products amplified from parent genomic DNA differed in size (e.g., fig. 1), the size differences were used as markers in segregation analysis. Otherwise, the parental PCR products were sequenced and single-nucleotide polymorphisms (SNPs) fixed between parents were used as genetic markers. In the former case, the segregation in the F1 progeny was tested in 1% to 2% agarose gels, whereas in the latter case, PCR products of the F1 progeny were sequenced directly from one primer, and sex linkage was inferred from the pattern of SNP inheritance.
FIG. 1. Segregation analysis of the SlssX/Y gene in family 5. PCR amplification was conducted using the c2B12+1 and c2B12–2 primer pair. Lanes 8 and 10 correspond to male and female parents, respectively. Lanes 1 to 7 correspond to male F1 progeny, and lanes 11 to 18 correspond to female F1 progeny.
Isolation and Sequencing of SlssX and SlssY Genes
Amplification of shorter (1 kb) fragments of the SlssX/Y gene was conducted with c2B12+1 and c2B12–2 primers (table 1) using Promega Taq polymerase. The PCR products were separated on 2% agarose gels under low voltage (overnight) to ensure good separation of fast and slow bands in males. The bands were isolated from the gel using the Qiagen gel extraction kit and sequenced directly from c2B12+1 and c2B12–2 primers using the ABI BigDye version 3.1 sequencing kit and an ABI3700 automatic sequencer.
Table 1 Primers Used for PCR Amplification and Sequencing
The PCR amplification of the longer (4 kb) regions was conducted using the c2B12+6deg and c2B12–4 primers and the Expand Long-Range PCR System (Roche). PCR products were separated on 1% agarose gels, isolated from the gel using the Qiagen gel extraction kit, cloned using the TOPO TA Cloning Kit (Invitrogen), and sequenced using the ABI BigDye version 3.1 sequencing kit and the ABI3700 automatic sequencer. The sequencing of the SlssX and SlssY genes was conducted using the following primers (table 1): universal M13F and M13R, c2B12+9, c2B12–13, c2B12+10, c2B12–12, c2B12–5, and c2B12+1. In addition, sequencing of the longer SlssY gene required c2B12–11 and c2B12+19 to cover the entire region. The same primers and conditions were used for isolation and sequencing of the Svss gene from S. vulgaris. GenBank accession numbers for the SlssX, SlssY, and Svss genes are AY705437, AY705438, and AY705436, respectively.
The sequence traces were checked, base calls were corrected, and contigs were assembled using ProSeq software (Filatov 2002). The sequences of SlssX, SlssY, and Svss genes were aligned by the mcalign program (Keightley and Johnson 2004).
The intron-exon structures and coding regions of the SlssX, SlssY, and Svss genes were inferred by aligning them with the sequences of the S. latifolia cDNA clone c2B12 and with cDNA sequences of spermidine synthases of Arabidopsis (NM_102230), Pisum (AF043108), Coffea (AB015599), Malus (AB072915), and Datura (Y08253).
Nucleotide Substitution Rates
Intronic, synonymous, and nonsilent pairwise divergence values were calculated using MEGA version 2 software (Kumar et al. 2000). MEGA2 was also used to conduct Tajima's (1993) relative-rate test. Synonymous and nonsynonymous divergence was estimated using the Nei-Gojobori method (Nei and Gojobori 1986) with Jukes-Cantor correction (Jukes and Cantor 1969). Kimura's two-parameter distance (Kimura 1980) was used for divergence in intron regions.
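The two distance corrections named above have simple closed forms, and a minimal sketch may help make them concrete. The function names and the example proportions below are illustrative assumptions; they are not taken from MEGA, and the Nei-Gojobori step of counting synonymous and nonsynonymous sites is not reproduced here.

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor distance from the proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)

def kimura_2p(P, Q):
    """Kimura two-parameter distance from the proportions of
    transitional (P) and transversional (Q) differences."""
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("P and Q are too large for the correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)

# Hypothetical input: 4.5% of sites differ between two sequences.
print(round(jukes_cantor(0.045), 4))
# Hypothetical input: 5% transitions and 3% transversions.
print(round(kimura_2p(0.05, 0.03), 4))
```

At low divergence both corrections change the raw proportion only slightly, which is why corrected and uncorrected values are close for recently diverged gene pairs.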
To compare substitution rates in introns, the baseml program from the PAML version 3.13d software package (Yang 2001) was used. The "local clock" mode (Yoder and Yang 2000) was used to test the significance of substitution rate difference between the SlssX and SlssY lineages. To estimate the silent (Ks) and nonsilent (Ka) substitution rates and the Ka/Ks ratios for the X-linked and the Y-linked genes, the codeml program from the PAML package was used, assuming the phylogeny in figure 2. Each branch was assumed to have a separate substitution rate ("no clock mode"), and three Ka/Ks ratios were assigned as shown in figure 2 by different branch shadings. For the likelihood-ratio test, the model with three Ka/Ks ratios was compared with a model with just two ratios, one for autosomal and one for the X-linked and the Y-linked genes.
FIG. 2. The gene tree used in the ML analyses. Different shading shows the separate Ka/Ks rates used for SlssX, SlssY, and Svss genes. The number of amino acid replacements and synonymous substitutions in every lineage is shown as numerator and denominator, respectively.
Results
Isolation of the SlssX/Y Genes
To isolate more S. latifolia sex-linked genes, I conducted segregation analysis of random cDNA clones isolated from a male flower bud cDNA library (provided by Dr. Sarah Grant), as described in the Methods. In total, 21 clones were tested for sex linkage, and one of the clones, c2B12 (accession number AY705439), appeared to correspond to a sex-linked gene. PCR primers c2B12+1 and c2B12–2, designed using the sequence of the c2B12 cDNA clone, amplified an approximately 1-kb region from all male and female S. latifolia and S. dioica individuals. In addition, male S. latifolia individuals had a second PCR product, which was slightly longer (fig. 1). Male-specificity of the longer fragment suggested that it might be Y linked. Y linkage of this fragment was confirmed by segregation analysis in five families (36 male and 34 female F1 individuals in total), which demonstrated that this fragment is always inherited from father to sons but not to daughters (fig. 1). To find molecular markers for the segregation analysis of the shorter PCR product, it was sequenced directly for the female fBM4 and male mCB26 plants used as parents of family 1. This revealed two nucleotide differences, with mCB26 hemizygous for T and fBM4 homozygous for C in both polymorphic sites. None of the nucleotide differences affected restriction sites, so the PCR products from eight female and seven male F1 progeny from family 1 were sequenced directly. All the female F1 progeny were heterozygous for paternal and maternal alleles (T/C), whereas the male F1 progeny were hemizygous for the maternal variant (C). Thus, the gene corresponding to the shorter PCR product is X linked.
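To see why perfect father-to-son transmission in this many offspring is strong evidence for Y linkage, one can ask how often an autosomal band would produce the same pattern by chance. The sketch below assumes, purely for illustration, that each father is heterozygous for a dominant presence/absence band, so that every offspring inherits it with probability 1/2 regardless of sex; the function name is hypothetical.

```python
def prob_perfect_cosegregation(n_sons, n_daughters, p_transmit=0.5):
    """Probability that every son carries a paternal band and every
    daughter lacks it, if the band were autosomal and transmitted
    independently to each offspring with probability p_transmit."""
    return p_transmit ** n_sons * (1 - p_transmit) ** n_daughters

# Offspring counts pooled over the five families reported in the text.
p = prob_perfect_cosegregation(36, 34)
print(f"{p:.2e}")
```

Under these assumptions the chance of observing the male-specific pattern from an autosomal marker is negligible.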
Although the smaller and the larger PCR products demonstrate X linkage and Y linkage, there remains a possibility that these genes are located in the pseudoautosomal region. If so, the X-linked and Y-linked copies should occasionally recombine with each other. However, sequencing of the X-linked and Y-linked genes demonstrated that intron divergence among the X-linked and the Y-linked genes exceeds 8% (see below), which is too high to be accounted for by divergence between the recombining alleles. Also, the male specificity of the longer PCR product (over 30 males from wild populations were tested [data not shown]) makes the pseudoautosomal location unlikely.
Sequencing of the longer (Y-linked) and the shorter (X-linked) PCR fragments amplified by the c2B12+1 and c2B12–2 primers confirmed that both fragments are homologous to the c2B12 cDNA clone. This clone has strong (over 80% DNA sequence identity) homology to coding regions of spermidine synthase genes from Arabidopsis (accession number NM_102230), Pisum (accession number AF043108), Coffea (accession number AB015599), Malus (accession number AB072915), and Datura (accession number Y08253). Thus, the X-linked and Y-linked genes amplified by the c2B12+1 and c2B12–2 primers were assumed to encode S. latifolia spermidine synthase and were named SlssX and SlssY, respectively.
To isolate a longer region of the SlssX and SlssY genes, I used the sequences of the coding regions of plant spermidine synthase genes to design a degenerate forward primer, c2B12+6deg, which was used in a pair with c2B12–4 to amplify the region from intron 1 to the 3' untranslated region of the SlssX and SlssY genes. These primers amplified a single band of approximately 4 kb in S. latifolia females, and two bands of approximately 4 kb and approximately 4.5 kb in S. latifolia males. The larger and the smaller fragments were cloned and sequenced as described in the Methods section. As expected, the sequence of the 3' region of the larger fragment was identical to the fragment of the SlssY gene amplified with c2B12+1 and c2B12–2 primers, and the smaller fragment corresponded to SlssX sequence. Segregation analysis in the family 1 (seven male and eight female F1 progeny) also confirmed that the larger (4.5 kb long) band is Y linked. The X linkage of the smaller (4 kb long) PCR product was confirmed by partial direct sequencing of the fragment amplified from the maternal (fBM4) and paternal (mCB26) individuals and the same set of F1 offspring of family 1, as used for the segregation of the smaller 1-kb PCR product, amplified with c2B12+1 and c2B12–2 primers. As expected, all the male F1 progeny inherited the maternal allele but not the paternal allele, whereas all the female F1 progeny inherited maternal and paternal alleles.
Exon-Intron Structure of the SlssX/SlssY Genes
The alignment of sequences of the SlssX and SlssY genomic fragments and the cDNA clone revealed the presence of three insertions of 80, 370, and 250 nt in the genomic fragments. According to the position in the cDNA sequence, these insertions correspond to the introns 6, 7, and 8 of the A. thaliana spermidine synthase I gene. As the S. latifolia cDNA clone c2B12 contained only the 3' portion of the spermidine synthase coding region (exons 9 to 6 and 12 nt of exon 5), the exon-intron structure of the 5' region of the SlssX and SlssY genes had to be established from the comparisons with spermidine synthases of the other plant species. Alignment of the SlssX and SlssY genomic fragments with the sequences of coding regions of Arabidopsis, Pisum, Coffea, Malus, and Datura spermidine synthase genes allowed the positions of the other introns and exons to be established. The presence of splice-site consensus sequences (5'-GT...AG-3') supports the position of introns in the SlssX and SlssY genes, except intron 7, which had a rare (GC) 5' splice site. In the case of intron 7, the position of the intron was unambiguously established from the comparison of the cDNA and genomic sequences.
Divergence Between the SlssX and SlssY Genes
Overall, 837 nt of coding region and 2,554 nt of intron sequences were analyzed after exclusion of insertion/deletion (indel) regions and regions of uncertain alignment in the first exon and in the 3' UTR. Most indels were located in the first intron, including the 481 bp long insertion in SlssY, resulting in a substantial length difference between the SlssX and SlssY PCR products. Synonymous divergence between the SlssX and SlssY genes is 4.7% ± 1.6%, and divergence in introns is much higher, 8.1% ± 0.6%. The higher divergence in introns may partly be caused by misalignment in the first intron, which contains multiple insertions and deletions. However, exclusion of the first intron from the analysis does not reduce the divergence; in fact it slightly increases divergence (8.8% ± 0.8%).
There are only nine nonsynonymous differences between the SlssX and SlssY genes, and the nonsynonymous divergence is 1.4% ± 0.5%, substantially lower than synonymous divergence, suggesting that at least one of these genes is under purifying selective constraint. Interestingly, the sequence of the S. latifolia c2B12 cDNA clone is identical to the exons of the SlssY and differs from SlssX by 12 nt. Thus, the Y-linked copy of the S. latifolia spermidine synthase gene is actively transcribed. The open reading frame is preserved in both X-linked and Y-linked copies of the gene, strengthening the evidence that both copies are functional or had been functional until very recently.
Substitution Rates in the SlssX and SlssY Genes
If one of the copies of the SlssX/Y gene became nonfunctional after the X-linked and Y-linked copies stopped recombining with each other, it would be expected to accumulate multiple nonsynonymous substitutions. Even if both copies are functional, theory predicts that purifying selection should be less efficient in the nonrecombining Y-linked copy, and it is expected to accumulate more nonsynonymous substitutions than the X-linked copy (reviewed in Charlesworth and Charlesworth [2000]).
To detect whether the mutations in the SlssX or in the SlssY lineages disproportionately contributed to nonsilent and silent divergence between the two copies, I PCR-amplified and sequenced a homologous region (referred to below as Svss) from nondioecious Silene vulgaris, using the c2B12+6deg and c2B12–2 primers. Using the sequence of the S. vulgaris Svss gene, it is possible to root the gene tree for the S. latifolia SlssX and SlssY genes and to establish how many of the nine amino acid differences between the SlssX and SlssY genes occurred in the X-linked and in the Y-linked genes. Interestingly, in all nine cases, the amino acid in the SlssX gene matched that in the Svss; thus, all nine of the nonsynonymous changes have probably occurred along the SlssY lineage. The excess of nonsynonymous substitutions in the Y-linked gene is significant (χ² = 7.0, 1 df, P = 0.008) by Tajima's relative-rate test (Tajima 1993) and is consistent with relaxed selective constraint on the nonrecombining Y chromosome, as suggested by theory. Interestingly, one of the mutations in SlssY (the Asn→Gly replacement at position 173 of the protein sequence alignment shown in figure 3 in Korolev et al. [2002]) is especially likely to disrupt the spermidine synthase activity of the protein encoded by the Y-linked copy of the gene, because according to the crystal structure of the bacterial enzyme, this residue interacts with the substrate (Korolev et al. 2002).
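Tajima's relative-rate test compares the numbers of sites at which each ingroup sequence alone differs from the other two sequences, and a minimal sketch of its one-degree-of-freedom form is given below. The function name is hypothetical, and the counts 7 and 0 are an assumption chosen because they reproduce a statistic of 7.0; the text does not state which site counts entered the reported test.

```python
import math

def tajima_relative_rate(m1, m2):
    """Tajima's (1993) 1D relative-rate test.

    m1, m2: numbers of sites at which only lineage 1 (or only lineage 2)
    differs from the other ingroup sequence and the outgroup.
    Returns the chi-square statistic and its P value for 1 df.
    """
    if m1 + m2 == 0:
        raise ValueError("no informative sites")
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Chi-square survival function for 1 df, via the complementary
    # error function.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Assumed counts: 7 Y-specific and 0 X-specific sites.
chi2, p = tajima_relative_rate(7, 0)
print(chi2, round(p, 3))
```

With equal rates in the two lineages, m1 and m2 are expected to be equal, so a large asymmetry in either direction inflates the statistic.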
Alternatively, the elevated nonsilent substitution rate in the SlssY gene could be caused by a higher mutation rate on the S. latifolia Y chromosome (Filatov and Charlesworth 2002). Pairwise Svss/SlssX and Svss/SlssY synonymous divergence is 9.9% ± 2.4% and 11.6% ± 2.6%, respectively, suggesting that the Y-linked gene accumulates synonymous substitutions (and probably mutates) faster than the X-linked homolog. To test whether the mutation rate is indeed higher in the SlssY than in the SlssX gene, I conducted a maximum-likelihood ratio test to compare substitution rates in introns (Ki) of these genes. The model with three substitution rates (KiX, KiY, and KiA) was compared with the model with two rates (KiX = KiY and KiA). The likelihoods for these models were calculated using the baseml program with a local molecular clock (Yoder and Yang 2000), and the significance was tested using the likelihood-ratio test. The model with three rates fits the data significantly better than the model with two rates (2ΔlnL = 5.75, P < 0.05), demonstrating that the intron substitution rate in the SlssY gene (KiY = 0.058) is significantly elevated, compared with that in the SlssX gene (KiX = 0.04). The difference in the substitution rates, however, is not as drastic as for the SlX4/SlY4 genes reported previously (Filatov and Charlesworth 2002).
The higher mutation rate in the Y-linked gene could have elevated the nonsynonymous substitution rate and has to be taken into account when comparing the SlssX and SlssY genes. For this purpose, I used the codeml program (Yang 2001) to compare two models, one with three Ka/Ks ratios (separate ratios for branches) and the other with the ratios for SlssX and SlssY genes forced to be equal. The former model fits the data significantly better than the latter model (2ΔlnL = 4.003, P < 0.05), demonstrating that, taking into account the possible differences in underlying mutation rates between the SlssX and SlssY genes, the Y-linked gene accumulates nonsynonymous substitutions significantly faster than the X-linked homolog. The Ka/Ks ratios are well below unity for both genes (0.000 and 0.519 for the SlssX and SlssY, respectively), consistent with both genes being functional and subject to purifying selective constraint.
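Both likelihood-ratio tests in this section compare nested models that differ by one free parameter, so twice the log-likelihood difference can be referred to a chi-square distribution with 1 df. The sketch below converts the two reported statistics to approximate P values under that assumption; the function name is hypothetical and the calculation is independent of PAML.

```python
import math

def lrt_p_value_1df(two_delta_lnl):
    """P value for a likelihood-ratio statistic with 1 degree of freedom.

    For 1 df the chi-square survival function reduces to
    erfc(sqrt(x / 2)).
    """
    if two_delta_lnl < 0:
        raise ValueError("statistic must be non-negative")
    return math.erfc(math.sqrt(two_delta_lnl / 2))

# Statistics reported in the text for the intron-rate and Ka/Ks tests.
for label, stat in [("intron rates", 5.75), ("Ka/Ks ratios", 4.003)]:
    print(label, round(lrt_p_value_1df(stat), 3))
```

Both values fall below 0.05, in agreement with the significance levels quoted in the text, although the Ka/Ks comparison is close to the threshold.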
Discussion
Here, I report the isolation of a new pair of homologous X-linked and Y-linked genes, SlssX and SlssY, from S. latifolia. This is the fourth known pair of S. latifolia sex-linked genes with intact homologous copies on the X and Y chromosomes. Previous genes were isolated using fairly sophisticated molecular biology methods: screening of a S. latifolia cDNA library with a Y-specific probe, obtained using microdissection of the chromosomes (SlX1/Y1 and SlX4/Y4 genes [see Delichère et al. {1999} and Atanassov et al. {2002}]) and differential display (DD44X/Y genes [Moore et al. 2003]). The current study demonstrates that a much simpler approach, segregation analysis of random S. latifolia genes obtained from a male flower bud cDNA library, may be successfully used to isolate new S. latifolia sex-linked genes. A similar approach has already been used by Guttman and Charlesworth (1998), who found an X-linked gene, MROS3X, after testing only four genes for sex linkage. Laporte and Charlesworth (2001), however, found no sex-linked genes after testing six genes. In this study, I tested over 20 genes before a single sex-linked gene was discovered. This plant species has 11 pairs of autosomes and a pair of sex chromosomes (2n = 22 + XX or XY). Assuming that all the chromosomes contain approximately the same number of genes, every 12th gene taken at random should be sex linked. However, as the Y chromosome is the largest, and the X chromosome is the second largest, the proportion of sex-linked genes should be higher than 1/12. To isolate the SlssX/Y gene pair, I tested 21 genes for sex linkage. Assuming that the proportion of sex-linked genes is 10%, according to the binomial distribution, the probability of such an outcome (exactly one sex-linked gene among 21 tested) is about 25%. One may need to test as many as 30 genes to reduce the chance of failure to find a sex-linked gene to below 5%.
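The binomial reasoning in the preceding paragraph can be reproduced with a few lines of code. This is a minimal sketch, assuming that each tested gene is independently sex linked with probability 0.10, as in the text. The function names are hypothetical and are not taken from any software used in the study.

```python
from math import comb


def prob_exactly_k(n_genes, k, p_sex_linked):
    """Binomial probability of exactly k sex-linked genes among n_genes tested."""
    return comb(n_genes, k) * p_sex_linked**k * (1 - p_sex_linked) ** (n_genes - k)


def prob_no_sex_linked(n_genes, p_sex_linked):
    """Probability that none of the n_genes tested is sex linked."""
    return (1 - p_sex_linked) ** n_genes


def genes_needed(p_sex_linked, max_failure):
    """Smallest number of genes for which the chance of finding none is below max_failure."""
    n = 1
    while prob_no_sex_linked(n, p_sex_linked) >= max_failure:
        n += 1
    return n


p = 0.10  # assumed proportion of sex-linked genes, as in the text

print(f"P(exactly 1 of 21 sex linked) = {prob_exactly_k(21, 1, p):.3f}")
print(f"P(none of 30 sex linked)      = {prob_no_sex_linked(30, p):.3f}")
print(f"Genes needed for <5% failure  = {genes_needed(p, 0.05)}")
```

Under this assumption, the probability of exactly one sex-linked gene among 21 is about 0.255, and the chance of finding none first falls below 5% at 29 genes, so testing 30 genes is sufficient.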
The new sex-linked genes encode proteins with strong homology to spermidine synthase, an enzyme that catalyzes the biosynthesis of spermidine (a ubiquitous polyamine). Thus, similar to the previously isolated SlX1/Y1, SlX4/Y4, and DD44X/Y genes, the new gene belongs to a group of housekeeping genes with X-linked and Y-linked copies resident singly on X and Y chromosomes. Such sex-linked genes were called class I by Lahn, Pearson, and Jegalian (2001) and are thought to represent the genes originally located on the protosex chromosomes that have evolved into X and Y chromosomes.
Although I have not isolated the 5' region of the new genes (the first exon and the 5' UTR), the low Ka/Ks ratio and the absence of stop codons in the coding sequence suggest that the SlssX and SlssY genes are functional. Moreover, the original c2B12 cDNA clone corresponds to the SlssY gene, demonstrating that this gene is actively transcribed. However, the comparison with the amino acid sequences of the human, Arabidopsis, and bacterial spermidine synthases revealed that three highly conserved residues are mutated in the protein encoded by the SlssY gene (at positions 161, 167, and 173 in the amino acid alignment shown in figure 3 of Korolev et al. [2002]), suggesting that its activity may be highly reduced.
Exon-intron structure also corresponds to that in the spermidine synthase genes of other plant species. Interestingly, the sequence of the SlssY gene is longer than that of the SlssX gene because of an approximately 0.5 Mb long insert into the first intron. This resembles the situation in the other S. latifolia sex-linked genes. With the exception of the SlX1/Y1 gene, which has low X/Y divergence (Delichère et al. 1999; Filatov and Charlesworth 2002), all the other Y-linked genes are substantially longer than their X-linked homologs. Perhaps this reflects the general trend on the Y chromosome and may explain why the Y chromosome "overgrew" in size the originally homologous X chromosome. The tendency of the S. latifolia Y chromosome to accumulate junk DNA is in line with theoretical predictions and empirical observations on the Y chromosomes of other species (Charlesworth, Sniegowski, and Stephan 1994; Junakovic et al. 1998).
Synonymous divergence between the SlssX and SlssY genes (4.7%) is higher than for SlX1/Y1 genes (3%) but lower than for DD44X/Y (8%) and SlX4/Y4 (16%) genes. If the X/Y divergence of homologous genes corresponds to the order of the genes on the sex chromosomes, as suggested by the "evolutionary strata" theory (Lahn, Pearson, and Jegalian 2001), then according to synonymous SlssX/SlssY divergence, the new gene should be located between the SlX1/Y1 and DD44X/Y. However, the SlssX/SlssY divergence in introns is higher than synonymous divergence, and according to intronic divergence, the new genes may fall into the same stratum as the DD44X/Y genes.
The rate of silent divergence in the SlssY gene is significantly higher than in the SlssX gene, suggesting that the S. latifolia Y-linked gene accumulates substitutions, and probably mutates, faster than the X-linked homolog. This is consistent with the previous report of an elevated mutation rate on the S. latifolia Y chromosome (Filatov and Charlesworth 2002). However, the difference in silent substitution rates between the SlssX and SlssY genes is much lower than in the SlX4 and SlY4 genes reported previously, suggesting that the mutation rate may vary across the Y chromosome. Interestingly, all the amino acid replacements that differ between the SlssX and SlssY genes apparently occurred in the Y-linked gene. Some of these mutations affect highly conserved amino acid residues and are likely to disrupt the function of the SlssY gene. This is consistent with the theoretical prediction of relaxed purifying selection on the nonrecombining degenerating Y chromosomes (Charlesworth and Charlesworth 2000). Drastically reduced diversity in the S. latifolia Y-linked genes (Filatov et al. 2000, 2001; Laporte et al. 2004) should result in reduced efficacy of selection on the Y chromosome. The previous studies, however, have not been able to detect any evidence for this. The elevated nonsilent substitution rate was previously reported in the SlY4 gene (Filatov and Charlesworth 2002). However, because of a very high silent substitution rate in this gene, the Ka/Ks ratio does not differ significantly between the SlX4 and SlY4 genes, and the elevation of nonsilent substitution rate in SlY4 was attributed to a higher mutation rate in this gene.
Acknowledgements
I thank Sarah Grant for providing the S. latifolia male flower bud cDNA library, Deborah Charlesworth and Joe Ironside for providing Silene seeds, and Joe Ironside and Dave Gerrard for critical reading of the manuscript. This work was funded by the BBSRC.
References
Atanassov, I., C. Delichère, D. A. Filatov, D. Charlesworth, I. Negrutiu, and F. Moneger. 2002. Analysis and evolution of two functional Y-linked loci in a plant sex chromosome system. Mol. Biol. Evol. 18:2162–2168.
Bull, J. J. 1983. Evolution of sex determining mechanisms. The Benjamin/Cummings Publishing Company, Menlo Park, Calif.
Charlesworth, B. 1994. The effect of background selection against deleterious alleles on weakly selected, linked variants. Genet. Res. 63:213–228.
Charlesworth, B., and D. Charlesworth. 2000. The degeneration of Y chromosomes. Philos. Trans. R. Soc. Lond. B Biol. Sci. 355:1563–1572.
Charlesworth, B., P. Sniegowski, and W. Stephan. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215–220.
Delichère, C., J. Veuskens, M. Hernould, N. Barbacar, A. Mouras, I. Negrutiu, and F. Monéger. 1999. SlY1, the first active gene cloned from a plant Y chromosome, encodes a WD-repeat protein. EMBO J. 18:4169–4179.
Desfeux, C., S. Maurice, J. P. Henry, B. Lejeune, and P. H. Gouyon. 1996. Evolution of reproductive systems in the genus Silene. Proc. R. Soc. Lond. B Biol. Sci. 263:409–414.
Engel, M. L., A. Chaboud, C. Dumas, and S. McCormick. 2003. Sperm cells of Zea mays have complex complement of mRNAs. Plant J. 34:697–707.
Filatov, D. A. 2002. A software for preparation and evolutionary analysis of DNA sequence data sets. Mol. Ecol. Notes 2:621–624.
———. 2004. Isolation of genes from plant Y chromosomes. Meth. Enzymol. (in press).
Filatov, D. A., and D. Charlesworth. 2002. Substitution rates in the X-linked and Y-linked genes of the plants, Silene latifolia and S. dioica. Mol. Biol. Evol. 19:898–907.
Filatov, D. A., V. Laporte, C. Vitte, and D. Charlesworth. 2001. DNA diversity in sex linked and autosomal genes of the plant species Silene latifolia and S. dioica. Mol. Biol. Evol. 18:1442–1454.
Filatov, D. A., F. Moneger, I. Negrutiu, and D. Charlesworth. 2000. Low variability in a Y-linked plant gene and its implications for Y-chromosome evolution. Nature 404:388–390.
Gordo, I., and B. Charlesworth. 2000. On the speed of Muller's ratchet. Genetics 156:2137–2140.
Guttman, D. S., and D. Charlesworth. 1998. An X-linked gene with a degenerate Y-linked homologue in a dioecious plant. Nature 393:263–266.
Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein molecules. Pp. 21–132 in H. N. Munro, ed. Mammalian protein metabolism. Academic Press, New York.
Junakovic, N., A. Terrinoni, C. Di Franco, C. Vieira, and C. Loevenbruck. 1998. Accumulation of transposable elements in the heterochromatin and on the Y chromosome of Drosophila simulans and Drosophila melanogaster. J. Mol. Evol. 46:661–668.
Keightley, P. D. 1998. Inference of genome-wide mutation rates and distributions of mutation effects for fitness traits: a simulation study. Genetics 150:1283–1293.
Keightley, P. D., and T. Johnson. 2004. MCALIGN: stochastic alignment of noncoding DNA sequences based on an evolutionary model of sequence evolution. Genome Res. 14:442–450.
Kejnovsky, E., J. Vrana, S. Matsunaga, P. Soucek, J. Siroky, J. Dolezel, and B. Vyskot. 2001. Localisation of male-specifically expressed MROS genes of Silene latifolia by PCR and flow-sorted sex chromosomes and autosomes. Genetics 158:1269–1277.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.
Korolev, S., Y. Ikeguchi, T. Skarina, S. Beasley, C. Arrowsmith, A. Edwards, A. Joachimiak, A.E. Pegg, and A. Savchenko. 2002. The crystal structure of spermidine synthase with a multisubstrate adduct inhibitor. Nat. Struct. Biol. 9:27–31.
Kumar, S., K. Tamura, I. Jacobsen, and M. Nei. 2000. MEGA2: molecular evolutionary genetics analysis. Version 2.0. Pennsylvania and Arizona State Universities, University Park, Pa. and Tempe, Ariz.
Lahn, B. T., N. M. Pearson, and K. Jegalian. 2001. The human Y chromosome, in the light of evolution. Nat. Rev. Genet. 2:207–216.
Laporte, V., and D. Charlesworth. 2001. Non-sex-linked, nuclear cleaved amplified polymorphic sequences in Silene latifolia. J. Hered. 92:357–359.
Laporte, V., D. A. Filatov, E. Kamau, and D. Charlesworth. 2004. Indirect evidence from DNA sequence diversity for genetic degeneration of the Y-chromosome in dioecious species of the plant Silene. J. Evol. Biol. (in press).
Matsunaga, S., E. Isono, E. Kejnovsky, B. Vyskot, J. Dolezel, S. Kawano, and D. Charlesworth. 2003. Duplicative transfer of a MADS box gene to a plant Y chromosome. Mol. Biol. Evol. 20:1062–1069.
Moore, R. C., O. Kozyreva, S. Lebel-Hardenack, J. Siroky, R. Hobza, B. Vyskot, and S. R. Grant. 2003. Genetic and functional analysis of DD44, a sex-linked gene from the dioecious plant Silene latifolia, provides clues to early events in sex chromosome evolution. Genetics 163:321–334.
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426.
Rice, W. R. 1987. Genetic hitchhiking and the evolution of reduced genetic activity of the Y sex chromosome. Genetics 116:161–167.
Skaletsky, H., T. Kuroda-Kawaguchi, P. J. Minx et al. (40 co-authors). 2003. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423:825–837.
Tajima, F. 1993. Simple methods for testing molecular clock hypotheses. Genetics 135:599–607.
Westergaard, M. 1958. The mechanism of sex determination in dioecious flowering plants. Adv. Genet. 9:217–281.
Yang, Z. 2001. Phylogenetic analysis by maximum likelihood (PAML), version 3e. University College London.
Ye, D., P. Installe, D. Ciupercescu, J. Veuskens, Y. Wu, G. Salesses, M. Jacobs, and I. Negrutiu. 1990. Sex determination in the dioecious Melandrium. I. First lessons from androgenic haploids. Sex. Plant Reprod. 3:179–186.
Yi, S., and B. Charlesworth. 2000. Contrasting patterns of molecular evolution of the genes on the new and old sex chromosomes of Drosophila miranda. Mol. Biol. Evol. 17:703–717.
Yoder, A., and Z. Yang. 2000. Estimation of primate speciation dates using local molecular clocks. Mol. Biol. Evol. 17:1081–1090.
Zurovcova, M., and W. F. Eanes. 1999. Lack of nucleotide polymorphism in the Y-linked sperm flagellar dynein gene Dhc-Yh3 of Drosophila melanogaster and D. simulans. Genetics 153:1709–1715.
(Dmitry A. Filatov)