Substitution Rate and Structural Divergence of 5'UTR Evolution: Comparative Analysis Between Human and Cynomolgus Monkey cDNAs
    Division of Genetic Resources, National Institute of Infectious Diseases, Tokyo, Japan; Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Chiba, Japan; Center of Information Biology, National Institute of Genetics, Research Organization of Information and Systems, Shizuoka, Japan; Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan; and Department of Ecology and Evolution, University of Chicago

    E-mail: khashi@nih.go.jp.

    Abstract

    The substitution rate and structural divergence in the 5'-untranslated region (UTR) were investigated by using human and cynomolgus monkey cDNA sequences. Due to the weaker functional constraint in the UTR than in the coding sequence, the divergence between humans and macaques would provide a good estimate of the nucleotide substitution rate and structural divergence in the 5'UTR. We found that the substitution rate in the 5'UTR (K5UTR) averaged 10%–20% lower than the synonymous substitution rate (Ks). However, both the K5UTR and nonsynonymous substitution rate (Ka) were significantly higher in the testicular cDNAs than in the brain cDNAs, whereas the Ks did not differ. Further, an in silico analysis revealed that 27% (169/622) of macaque testicular cDNAs had an altered exon-intron structure in the 5'UTR compared with the human cDNAs. The fraction of cDNAs with an exon alteration was significantly higher in the testicular cDNAs than in the brain cDNAs. We confirmed by using reverse transcriptase–polymerase chain reaction that about one-third (6/16) of in silico "macaque–specific" exons in the 5'UTR were actually macaque specific in the testis. The results imply that positive selection increased K5UTR and structural alteration rate of a certain fraction of genes as well as Ka. We found that both positive and negative selection can act on the 5'UTR sequences.

    Key Words: evolution; substitution rate; 5'UTR; alternative splicing; primates

    Introduction

    The most pronounced evolutionary conservation of genomic sequences reflects constraints on protein structure and function, and as a result, protein-coding sequences are much more conserved than noncoding sequences. Because of this, it has long been argued that changes in gene regulation may be more important to phenotypic evolution than changes in protein-coding sequences (King and Wilson 1975; Enard et al. 2002). Although functional constraints in intergenic sequences such as promoter or enhancer sequences have been thoroughly studied because of their utility as markers of functional parts of noncoding sequences (e.g., Bejerano et al. 2004; Suzuki et al. 2004), the study of the evolution of the untranslated region (UTR) of transcripts has been limited owing to the paucity of transcript sequences in appropriate species. A typical mRNA contains UTRs upstream (5'UTR) and downstream (3'UTR) of the protein-coding sequence. Watanabe et al. (2004) reported that orthologous genes with high divergences in their 5'UTRs tend to show differences in expression levels between humans and chimpanzees, while no correlation between the nucleotide and expression divergence was found in the 3'UTR. Thus, molecular evolutionary study of the 5'UTR is important for understanding how our genome has evolved and become organized. As an initial step, we should measure the nucleotide substitution rate and the magnitude of structural divergence in the 5'UTR to infer what type of natural selection influences 5'UTR evolution.

    Due to the weaker functional constraint in the UTR than in the coding sequence (CDS, e.g., Miyata, Yasunaga, and Nishida 1980; Li 1997; Makalowski and Boguski 1998), the study of UTR sequence evolution using distantly related species is limited. For example, most UTR sequences between humans and mice are not conserved well enough to be aligned, which would considerably hamper the evolutionary analysis, especially when we want to find the signature of positive selection (i.e., accelerated evolution). The divergence between humans and macaques is approximately 5%–7% at the nucleotide level (e.g., Savatier et al. 1987; Kawamura et al. 1991; Osada et al. 2002b; Wang et al. 2003), which allows us to compare the macaque UTR sequence with that of humans. So far, the macaque is the only model organism for which UTRs are readily alignable. We constructed cDNA libraries from cynomolgus monkey (Macaca fascicularis) brain and testis by the oligo-capping method for a variety of purposes, such as identification of novel human genes (Osada et al. 2001, 2002a, 2002b) and evolutionary comparative analysis (Osada et al. 2002c; Mesak et al. 2003). The cynomolgus monkey cDNA libraries were used to conduct a comparative analysis of the 5'UTR sequences of humans and macaques.

    Recent studies have suggested that the 5'UTRs in humans may have been under positive selection because of the higher substitution rate and lower polymorphism in the 5'UTR than in the synonymous sites (Hellmann et al. 2003). In this report, we compare macaque cDNA sequences from two different organs (brain and testis) with human orthologous cDNAs and report the substitution rates for nonsynonymous sites (Ka), synonymous sites (Ks), and 5'UTR sites (K5UTR).

    Structural divergence of UTRs, such as gains and/or losses of exons, should be an important factor in 5'UTR evolution, as well as in the evolution related to nucleotide substitution. Several reports have shown species-specific gains and/or losses of protein-coding exons between humans and rodents that have an important role in the creation of proteomic diversity (Modrek and Lee 2003; Nurtdinov et al. 2003; Pan et al. 2005). Similarly, several studies have found that transcripts from the same locus have different 5'UTRs as a result of alternative splicing or use of different promoters (Chew et al. 2003; Bernard, Woodruff, and Plant 2004). In some cases, tissue-specific transcripts are generated starting at a particular transcription start site by using cis-regulatory elements different from those used in other tissues (Mao, Chirala, and Wakil 2003; Newton et al. 2003), and many such transcripts have been identified for genes expressed in the testis (P. Mezquita, C. Mezquita, and J. Mezquita 1999; Newton et al. 2003; Sugiura et al. 2003). These variants in the exons for 5'UTRs also have a quantitative effect on the translation of the genes (Wang et al. 1999; Chamas and Sabban 2002; Gellersen et al. 2002; Lammich et al. 2004). In this study, we analyzed the structural divergence in the 5'UTR between human cDNAs registered in public databases and cynomolgus monkey cDNAs.

    Materials and Methods

    cDNA Library from Cynomolgus Monkey

    Two cynomolgus monkeys, a 15-year-old male and a 16-year-old female, were used for tissue collection. The monkeys were cared for and handled according to the guidelines established by the Institutional Animal Care and Use Committee of the National Institute of Infectious Diseases (NIID) of Japan. The tissues were harvested at the P3 facility for monkeys of the Tsukuba Primate Center of NIID, in accordance with all the guidelines in the Laboratory Biosafety Manual of the World Health Organization. Immediately after the organs were removed, the tissue samples were frozen in liquid nitrogen. Oligo-capped cDNA libraries were constructed according to the method described previously (Suzuki et al. 1997). The 5'-ends of the cDNAs were capped with oligonucleotides to preserve the full length of the transcript.

    Sequencing of Cynomolgus Monkey cDNA Clones

    The 5'-ends of the testicular cDNA clones were sequenced with an ABI 3700 sequencer (Applied Biosystems Japan, Chuo-ku, Tokyo, Japan) and clustered with DYNACLUST (DYNACOM, Mobara, Chiba, Japan). We isolated 10,426 cDNA clones, and sequencing their 5'-ends yielded 4,980 clusters of sequences. To investigate how many macaque cDNA clones have valid human homologous genes, we performed a Blast search of the human RefSeq (Pruitt, Tatusova, and Maglott 2003) database. The 5'-end sequences of 6,151 clones had homology to 2,343 human RefSeq genes at a cutoff value of 1 × 10^-60.

    The entire sequences of the clones were determined by the primer walking method. Cycle sequencing was performed with an ABI PRISM BigDye Terminator Sequencing kit (Applied Biosystems) according to the manufacturer's instructions. We sequenced approximately 2,200 cDNA clones whose 5'-end sequences had homology to human RefSeq sequences. The CDSs of the macaque cDNA clones were searched against the University of California Santa Cruz (UCSC) human genome database (http://genome.ucsc.edu, verified in May 2004) and inspected visually. For further analysis, we selected the macaque cDNA clones whose CDSs covered the translation start site and nearly the entire CDS of the homologous human cDNA and obtained 785 human-macaque cDNA pairs. The homologous regions of the human-macaque cDNA sequences were aligned to each other using ClustalW (Thompson, Higgins, and Gibson 1994). After removing ambiguous alignments shorter than 300 bp (100 aa) in the CDS and redundant cDNA pairs (we occasionally sequenced more than one macaque cDNA per RefSeq gene), we compiled 622 one-to-one human-macaque orthologous alignments.

    cDNA sequences from the brain were also compiled as a control group for the cDNA sequences from the testis. So far, over 8,000 cDNA sequences from the macaque brain have been accumulated. However, we used only 443 cDNA sequences preliminarily extracted from the initial phase of our macaque cDNA sequencing project. The selected 443 brain cDNAs are supposed to have no assortment bias. The analysis using all the macaque brain cDNA sequences will be presented elsewhere. We identified 302 orthologous gene pairs of human and macaque cDNAs with the same procedure as above.

    The names of the macaque cDNAs indicate the anatomical part from which each clone was derived (Qtr: temporal lobe, Qfl: frontal lobe, Qnp: parietal lobe, Qcc: cerebellum cortex, Qts: testis; see Supplementary Table 1 [Supplementary Material online] for the accession numbers of the macaque cDNA clones and human orthologs). All 2,331 macaque cDNA sequences analyzed in this study were deposited in the DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank DNA database (accession numbers AB168131–AB169925, AB178956–AB179491).

    Computational Analyses

    We calculated Ka and Ks by the method of Li-Pamilo-Bianchi (Li 1993; Pamilo and Bianchi 1993). The K5UTR was estimated by using Kimura's two-parameter method (Kimura 1980). A bootstrap test was performed to estimate the sampling variance of the substitution rate for the concatenated 5'UTR sequences. We reconstructed the 5'UTR sequences of the testicular and brain genes by randomly choosing nucleotides from the original concatenated sequences and estimated K5UTR 1,000 times. The BLAT program was run against the human genome sequence at the UCSC human genome database (http://genome.ucsc.edu, verified in May 2004) to visually inspect whether the macaque 5'UTRs had any structural divergence from the homologous human cDNAs in the database.
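
    As an illustration of the K5UTR estimate, the following is a minimal sketch of Kimura's two-parameter distance, K = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name and the handling of gaps (sites with a non-ACGT character in either sequence are skipped) are assumptions made for this example and are not taken from the original analysis.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment (hypothetical data): one transition in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

    Multiplying the returned value by 100 gives the rate per 100 sites, the unit used in table 1.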

    Reverse Transcriptase–Polymerase Chain Reaction

    The templates of human total RNA from brain, liver, and testis were purchased from Clontech (Mountain View, Calif.). Total RNA of the cynomolgus monkey brain, liver, and testis was isolated using TRIzol (Invitrogen, Carlsbad, Calif.). One microliter of total RNA was amplified using the One Step RNA PCR Kit (TakaraBio, Otsu, Shiga, Japan). The thermal cycling schedule was 40 cycles of 94°C for 30 s, 58°C for 30 s, and 72°C for 1.5 min. The primers were designed to match both human and macaque cDNA sequences, and their sequences are presented in Supplementary Table 2 (Supplementary Material online).

    Results

    Substitution Rate

    Using the full-insert sequences of macaque cDNAs, we obtained 622 and 302 orthologous pairs of human and macaque cDNAs derived from the testis and brain, respectively (see Materials and Methods for further information). There was no overlap of genes between the two data sets. The average length of the alignments was 103.88 bp in the 5'UTR and 1,248.17 bp in the CDS. The Ka, Ks, K5UTR, and length of each alignment are presented in Supplementary Table 1 (Supplementary Material online). Because the 5'UTR sequences of some genes were only several base pairs long, we filtered out the alignments containing a 5'UTR shorter than 20 bp to estimate the K5UTR, which yielded 480 and 254 5'UTR alignments for the testicular and brain cDNAs, respectively. After the filtering, the average length of the alignments in the 5'UTR was 128.91 bp. The mean values and standard errors of the Ka, Ks, and K5UTR per 100 sites are shown in table 1.

    Table 1 The Means and Standard Errors of the Substitution Rate Per 100 Sites Between the Human and Cynomolgus Monkey cDNAs

    In both organs, the Ka and K5UTR were significantly lower than the Ks by the Wilcoxon matched-pair signed-rank test (P values ranged from 1 × 10^-5 to 1 × 10^-15). Next, we tested whether the substitution rates differ between the testicular and brain cDNAs by the Wilcoxon rank-sum test. The testicular cDNAs showed significantly higher Ka (P < 1 × 10^-12) and K5UTR (P = 0.014) than the brain cDNAs, whereas the Ks did not significantly differ (P = 0.359, table 1 and fig. 1). The cumulative distributions of Ka, Ks, and K5UTR are shown in figure 1. Because we did not use the short alignments (less than 20 bp) in the 5'UTR for the former statistical test, we subsequently concatenated all 5'UTR alignments and estimated the K5UTR. Whether the K5UTR of the testicular cDNAs is significantly greater than the K5UTR of the brain cDNAs was tested by a bootstrap method with 1,000 iterations. The result was highly significant (P < 0.001).
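
    The bootstrap comparison of the concatenated 5'UTR alignments can be sketched as follows. This is one plausible reading of the procedure, not the original implementation: alignment columns are resampled with replacement within each tissue, K5UTR is recomputed by Kimura's two-parameter formula, and the P value is taken as the fraction of replicates in which the testicular estimate does not exceed the brain estimate. The function names, the column representation (pairs of uppercase bases without gaps), and the fixed random seed are assumptions made for this example.

```python
import math
import random

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p(columns):
    """K2P distance from a list of (human_base, macaque_base) columns."""
    n = len(columns)
    p = sum(1 for col in columns if col in TRANSITIONS) / n
    q = sum(1 for a, b in columns if a != b and (a, b) not in TRANSITIONS) / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def bootstrap_p_value(testis_cols, brain_cols, n_iter=1000, seed=0):
    """Fraction of replicates in which K5UTR(testis) <= K5UTR(brain)."""
    rng = random.Random(seed)
    not_greater = 0
    for _ in range(n_iter):
        # Resample alignment columns with replacement within each tissue.
        t = rng.choices(testis_cols, k=len(testis_cols))
        b = rng.choices(brain_cols, k=len(brain_cols))
        if k2p(t) <= k2p(b):
            not_greater += 1
    return not_greater / n_iter


# Hypothetical toy data: 1,000 columns per tissue.
testis = [("A", "G")] * 80 + [("A", "C")] * 20 + [("A", "A")] * 900
brain = [("A", "G")] * 20 + [("A", "C")] * 5 + [("A", "A")] * 975
print(bootstrap_p_value(testis, brain))
```

    With 1,000 replicates, a result in which no replicate reverses the observed order corresponds to P < 0.001, the resolution limit of the test.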

    FIG. 1.— The cumulative distribution of the Ka (A), Ks (B), and K5UTR (C) between the human and cynomolgus monkey cDNAs. The solid lines represent the substitution rate of the testicular cDNAs, and dotted lines denote the substitution rate of the brain cDNAs. The testicular cDNAs showed a significantly different distribution from the brain cDNAs in both Ka and K5UTR, while the Ks did not significantly differ.

    Because the high mutability of CpG sites may increase the substitution rate of each class (Hellmann et al. 2003), we estimated the substitution rate after masking all CG to CA and TG substitutions between human and macaque cDNA sequences, but there was no change in the trend as a result of masking (Supplementary Table 3, Supplementary Material online).
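
    A minimal sketch of such CpG masking is given below. It assumes a gap-free pairwise alignment and treats a site as masked when one sequence carries CG and the other carries TG or CA at the same dinucleotide position; masked sites are replaced by N in both sequences so that a downstream distance calculation can skip them. The function name and the use of N as the mask character are assumptions made for this example, and the original masking procedure may differ in detail.

```python
def mask_cpg_substitutions(seq_a, seq_b):
    """Mask CG<->TG and CG<->CA differences in an aligned sequence pair."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    a = list(seq_a.upper())
    b = list(seq_b.upper())
    masked = set()
    for i in range(len(a) - 1):
        di_a = a[i] + a[i + 1]
        di_b = b[i] + b[i + 1]
        # Check both directions, because the ancestral state is unknown.
        for x, y in ((di_a, di_b), (di_b, di_a)):
            if x == "CG" and y == "TG":
                masked.add(i)      # C/T difference at the C of the CpG
            elif x == "CG" and y == "CA":
                masked.add(i + 1)  # G/A difference at the G of the CpG
    for i in masked:
        a[i] = b[i] = "N"
    return "".join(a), "".join(b)


print(mask_cpg_substitutions("ACGT", "ATGT"))  # ('ANGT', 'ANGT')
```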

    Structural Divergence

    When the macaque testicular cDNA sequences were aligned to the sequences of the human orthologs, we frequently found unaligned blocks of 5'UTR sequence between the two. We used the BLAT program (Kent 2002) at the UCSC human genome database (http://genome.ucsc.edu) to visually inspect the loci on the human genome where the macaque testicular cDNAs were mapped and found that many blocks of macaque 5'UTR sequence, which aligned with the human genome sequence, did not show any homology to the human cDNA sequences mapped to the same loci. Unaligned macaque sequences that were possibly due to an extension of the macaque transcription start site were removed from further analysis. The exonlike blocks in the human genome appeared to be "macaque-specific" exons in the 5'UTR. Note that a macaque-specific exon is not necessarily specific to macaques in the literal sense (e.g., chimpanzees may also transcribe the macaque-specific exon). Throughout this report, however, we use the term macaque specific with respect to the comparison between humans and cynomolgus monkeys. The number of testicular clones with macaque-specific exons in the 5'UTR amounted to 169 (27%) of the 622 gene pairs. Supplementary Table 4 (Supplementary Material online) lists the orthologous pairs that were found to carry a macaque-specific exon by in silico analysis.

    We investigated whether alteration of the 5'UTR in macaque transcripts occurs more frequently with the clones derived from the testis than the clones from the brain, and the results showed that 41 (14%) of the 302 orthologous pairs in the brain had altered 5'UTRs (data not shown). Thus, the testis-derived transcripts had a significantly higher rate of alteration than the brain-derived transcripts (P < 1 × 10^-5, Fisher's exact test).
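
    The comparison of 169/622 testicular pairs with 41/302 brain pairs can be reproduced with a two-sided Fisher's exact test on the corresponding 2 x 2 table. The following is a minimal standard-library sketch; the function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is a common convention that is assumed rather than stated in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as observed;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: testis, brain. Columns: altered 5'UTR, unaltered 5'UTR.
p_value = fisher_exact_two_sided(169, 622 - 169, 41, 302 - 41)
print(p_value)
```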

    We classified the macaque-specific exons in the 5'UTR into three categories. Class A consists of macaque cDNA clones whose first exon differs between human and macaque cDNAs. The transcripts in class A have different transcription start sites, and thus may use different promoters and transcription regulatory elements (Supplementary Fig. 1A, Supplementary Material online). Class B consists of clones whose first exons start at the same region as human cDNAs but whose intron-exon structure in the 5'UTR differs between humans and macaques (Supplementary Fig. 1B, Supplementary Material online). Class C consists of cDNA pairs, part of whose 5'UTR was not found in the human genome (Supplementary Fig. 1C, Supplementary Material online).

    We randomly selected 18 of the 169 macaque-specific exons found by in silico analysis in the testis and performed the reverse transcriptase–polymerase chain reaction (RT-PCR) to examine whether these 18 in silico macaque-specific exons occurred in vivo. Primers were designed to match both human and macaque sequences, with one primer of each pair located on the potential macaque-specific exon. The primer sequences are shown in Supplementary Table 2 (Supplementary Material online). Human and cynomolgus monkey total RNA samples from brain, liver, and testis were used for the expression analysis. Of the 16 pairs of primers that amplified a product in at least one of the macaque samples, six pairs amplified products of the expected size only in the macaque samples. The results are summarized in Supplementary Table 2 (Supplementary Material online). Figure 2A (QtsA-12177) and 2B (QtsA-17708) show examples of the exon-intron structures of the transcripts and gel images of the RT-PCR products. The finding that about one-third of the in silico macaque-specific exons were actual macaque-specific exons in vivo indicates that around 10% (169/622 × 6/16) of the testicular clones carry macaque-specific exons in their 5'UTRs.
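
    The arithmetic behind the 10% figure can be written out as a short sketch: the in silico candidate rate (169/622) is scaled by the RT-PCR confirmation rate (6/16). The function name is hypothetical, and the calculation assumes that the 16 informative primer pairs are representative of all 169 candidates.

```python
def confirmed_fraction(n_candidates, n_pairs, n_confirmed, n_tested):
    """Scale the in silico candidate rate by the RT-PCR confirmation rate."""
    in_silico_rate = n_candidates / n_pairs     # 169 / 622
    confirmation_rate = n_confirmed / n_tested  # 6 / 16
    return in_silico_rate * confirmation_rate


estimate = confirmed_fraction(169, 622, 6, 16)
print(f"{estimate:.3f}")  # prints 0.102
```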

    FIG. 2.— Example of a macaque-specific exon confirmed by the RT-PCR experiments. All results and primer sequences are summarized in Supplementary Table 2. Upper panels: open boxes and closed boxes represent the 5'UTR and CDS, respectively. The genes are transcribed from left to right. The primers were designed to match both human and macaque sequences and are indicated by arrows. Lower panels: images of the RT-PCR gels. Tissues from three organs of humans and macaques (brain, liver, and testis) were examined.

    Discussion

    The 622 human-macaque alignments yielded 169 macaque cDNAs carrying macaque-specific exons in the testis in silico. We confirmed that about one-third of the 16 macaque-specific exons in silico are not transcribed in humans but that the rest of them are expressed in human tissues. However, the fact that we did not find any human cDNAs corresponding to the macaque-specific exons in the public databases suggests that these macaque-type transcripts are very rare in humans. This indicates that human cDNA databases require registration of more transcript variants for one locus to represent the whole transcriptome and that transcriptional resources from other species, especially primates, would help to complement human transcriptome databases.

    It might be the case that human brain transcripts are overrepresented in the public databases, making it less likely to miss a splice variant than in other tissues. We further surveyed how many transcripts are registered per locus that we used for the study. The average and median of the number of transcripts per locus are presented in Supplementary Table 5 (Supplementary Material online). Indeed, there are more transcripts per locus in the public database for the brain genes than for the testicular genes. However, if this assortment bias affected the finding rate of evolutionarily altered exons, the transcripts with macaque-specific exons should have fewer homologous human transcripts in the public database than the evolutionarily conserved transcripts. As shown in Supplementary Table 5 (Supplementary Material online), we did not find any systematic trend among them. Hence, the assortment bias is unlikely to invalidate our interpretation.

    We estimated that around 10% of macaque transcripts contain macaque-specific exons in the testis. We found that all the six experimentally confirmed macaque-specific exons retain the consensus splicing donor-acceptor site (GT-AG) in the human genome, in spite of the fact that the exonlike blocks were not transcribed in the human tissues. Therefore, it is plausible that the exonlike sequences in the human genome were inactivated during the evolution so that they are not expressed or are expressed in only certain tissues. If we assume that the evolutionary exon alteration is mainly due to the exon loss and the evolutionary rate is the same in human and macaque lineages, the number of evolutionarily altered exons in humans and cynomolgus monkeys would be around 20% of the total number of transcripts in the testis.

    We found a slower substitution rate in the 5'UTR than in the synonymous sites, suggesting negative selection acting on 5'UTR evolution at the genome-wide level. However, it is possible that the K5UTR of some genes might have been increased by positive selection (Hellmann et al. 2003; Kohn, Fang, and Wu 2004). Because the 5'UTR sequence of a gene is sometimes very short and the sampling variance of K5UTR is substantially large, an analysis that calculates K5UTR/Ks for each gene cannot be easily applied to our data set. Another source of evidence for inferring the type of natural selection would be a different selection pressure acting on different functional classes of genes. Significantly higher Ka and K5UTR in the cDNAs from the testis than in those from the brain were observed in this study (table 1 and fig. 1). There is a great deal of evidence that the Ka of genes expressed in reproductive tissues, especially in the testis, has been increased by positive selection (Wyckoff, Wang, and Wu 2000; Swanson and Vacquier 2002). In our data set, more testicular genes showed a signature of positive selection in the CDS (Ka/Ks > 1) than the brain genes (25/622 and 5/302 from the testis and brain, respectively). If we apply the same argument to the 5'UTR, the higher K5UTR in the testis than in the brain would be driven by positive selection as well. When a mutation yields a gene expression pattern that is spatially and/or temporally beneficial to an organism, the mutation would be fixed in the population faster than neutral mutations.

    We should note that stronger constraints on brain genes than on genes expressed in other tissues might be the cause of the difference between the substitution rates in the brain and testis (Duret and Mouchiroud 2000). Because the brain is a special tissue, especially in primates, whether the evolution of brain genes is fast or slow is still under debate (Dorus et al. 2004). Reference genes expressed in other tissues, such as housekeeping genes, would be useful to assess how much of the acceleration of evolution in the testis is due to positive selection.

    Nucleotide substitution is not the only source of genetic change in evolution. We estimated that about 10% of macaque testicular cDNAs have exons that are not transcribed in humans, and this would have a larger impact on the divergence of gene regulation. The fraction of cDNAs with an exon alteration found by in silico analysis was significantly higher in the testicular cDNAs than in the brain cDNAs. Thus, it is plausible that the 5'UTRs of a certain fraction of genes are under positive selection in terms of both substitution rate and structural alteration.

    In this report, we found that both positive and negative selection can act on not only the protein-coding (Fay, Wyckoff, and Wu 2002) and promoter sequences (Kohn, Fang and Wu 2004) but also the 5'UTR sequences. It is worthwhile to study how the 5'UTR divergence affects gene expression and modifies the phenotype of organisms.

    Supplementary Material

    Supplementary tables 1–5 and supplementary figure 1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

    Acknowledgements

    This study was supported in part by a Health Science Research grant from the Human Genome Program of the Ministry of Health, Labor and Welfare of Japan. We thank Michael H. Kohn for comments and discussions. We also thank two anonymous reviewers for helpful suggestions.

    References

    Bejerano, G., M. Pheasant, I. Makunin, S. Stephen, W. J. Kent, J. S. Mattick, and D. Haussler. 2004. Ultraconserved elements in the human genome. Science 304:1321–1325.

    Bernard, D. J., T. K. Woodruff, and T. M. Plant. 2004. Cloning of a novel inhibin alpha cDNA from rhesus monkey testis. Reprod. Biol. Endocrinol. 2:71.

    Chamas, F., and E. L. Sabban. 2002. Role of the 5' untranslated region (UTR) in the tissue-specific regulation of rat tryptophan hydroxylase gene expression by stress. J. Neurochem. 82:645–654.

    Chew, C. H., M. R. Samian, N. Najimudin, and T. S. Tengku Muhammad. 2003. Molecular characterisation of six alternatively spliced variants and a novel promoter in human peroxisome proliferator-activated receptor alpha. Biochem. Biophys. Res. Commun. 305:235–243.

    Dorus, S., E. J. Vallender, P. D. Evans, J. R. Anderson, S. L. Gilbert, M. Mahowald, G. J. Wyckoff, C. M. Malcom, and B. T. Lahn. 2004. Accelerated evolution of nervous system genes in the origin of Homo sapiens. Cell 119:1027–1040.

    Duret, L., and D. Mouchiroud. 2000. Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate. Mol. Biol. Evol. 17:68–74.

    Enard, W., P. Khaitovich, J. Klose et al. (13 co-authors). 2002. Intra- and interspecific variation in primate gene expression patterns. Science 296:340–343.

    Fay, J. C., G. J. Wyckoff, and C. I. Wu. 2002. Testing the neutral theory of molecular evolution with genomic data from Drosophila. Nature 415:1024–1026.

    Gellersen, B., R. Kempf, R. Sandhowe, G. F. Weinbauer, and R. Behr. 2002. Novel leader exons of the cyclic adenosine 3',5'-monophosphate response element modulator (CREM) gene, transcribed from promoters P3 and P4, are highly testis-specific in primates. Mol. Hum. Reprod. 8:965–976.

    Hellmann, I., S. Zollner, W. Enard, I. Ebersberger, B. Nickel, and S. Paabo. 2003. Selection on human genes as revealed by comparisons to chimpanzee cDNA. Genome Res. 13:831–837.

    Kawamura, S., H. Tanabe, Y. Watanabe, K. Kurosaki, N. Saitou, and S. Ueda. 1991. Evolutionary rate of immunoglobulin alpha noncoding region is greater in hominoids than in Old World monkeys. Mol. Biol. Evol. 8:743–752.

    Kent, W. J. 2002. BLAT—the BLAST-like alignment tool. Genome Res. 12:656–664.

    Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.

    King, M. C., and A. C. Wilson. 1975. Evolution at two levels in humans and chimpanzees. Science 188:107–116.

    Kohn, M. H., S. Fang, and C. I. Wu. 2004. Inference of positive and negative selection on the 5' regulatory regions of Drosophila genes. Mol. Biol. Evol. 21:374–383.

    Lammich, S., S. Schobel, A. K. Zimmer, S. F. Lichtenthaler, and C. Haass. 2004. Expression of the Alzheimer protease BACE1 is suppressed via its 5'-untranslated region. EMBO Rep. 5:620–625.

    Li, W. H. 1993. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96–99.

    ———. 1997. Molecular evolution. Sinauer Associates, Sunderland, Mass.

    Makalowski, W., and M. S. Boguski. 1998. Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences. Proc. Natl. Acad. Sci. USA 95:9407–9412.

    Mao, J., S. S. Chirala, and S. J. Wakil. 2003. Human acetyl-CoA carboxylase 1 gene: presence of three promoters and heterogeneity at the 5'-untranslated mRNA region. Proc. Natl. Acad. Sci. USA 100:7515–7520.

    Mesak, F. M., N. Osada, K. Hashimoto, Q. Y. Liu, and C. E. Ng. 2003. Molecular cloning, genomic characterization and over-expression of a novel gene, XRRA1, identified from human colorectal cancer cell HCT116Clone2_XRR and macaque testis. BMC Genomics 4:32.

    Mezquita, P., C. Mezquita, and J. Mezquita. 1999. Novel transcripts of carbonic anhydrase II in mouse and human testis. Mol. Hum. Reprod. 5:199–205.

    Miyata, T., T. Yasunaga, and T. Nishida. 1980. Nucleotide sequence divergence and functional constraint in mRNA evolution. Proc. Natl. Acad. Sci. USA 77:7328–7332.

    Modrek, B., and C. J. Lee. 2003. Alternative splicing in the human, mouse and rat genomes is associated with an increased frequency of exon creation and/or loss. Nat. Genet. 34:177–180.

    Newton, D. C., S. C. Bevan, S. Choi, G. B. Robb, A. Millar, Y. Wang, and P. A. Marsden. 2003. Translational regulation of human neuronal nitric-oxide synthase by an alternatively spliced 5'-untranslated region leader exon. J. Biol. Chem. 278:636–644.

    Nurtdinov, R. N., I. I. Artamonova, A. A. Mironov, and M. S. Gelfand. 2003. Low conservation of alternative splicing patterns in the human and mouse genomes. Hum. Mol. Genet. 12:1313–1320.

    Osada, N., M. Hida, J. Kusuda, R. Tanuma, M. Hirata, M. Hirai, K. Terao, Y. Suzuki, S. Sugano, and K. Hashimoto. 2002a. Prediction of unidentified human genes on the basis of sequence similarity to novel cDNAs from cynomolgus monkey brain. Genome Biol. 3:RESEARCH0006.

    Osada, N., M. Hida, J. Kusuda, R. Tanuma, M. Hirata, Y. Suto, M. Hirai, K. Terao, S. Sugano, and K. Hashimoto. 2002b. Cynomolgus monkey testicular cDNAs for discovery of novel human genes in the human genome sequence. BMC Genomics 3:36.

    Osada, N., M. Hida, J. Kusuda et al. (12 co-authors). 2001. Assignment of 118 novel cDNAs of cynomolgus monkey brain to human chromosomes. Gene 275:31–37.

    Osada, N., J. Kusuda, M. Hirata, R. Tanuma, M. Hida, S. Sugano, M. Hirai, and K. Hashimoto. 2002c. Search for genes positively selected during primate evolution by 5'-end-sequence screening of cynomolgus monkey cDNAs. Genomics 79:657–662.

    Pamilo, P., and N. O. Bianchi. 1993. Evolution of the Zfx and Zfy genes: rates and interdependence between the genes. Mol. Biol. Evol. 10:271–281.

    Pan, Q., M. A. Bakowski, Q. Morris, W. Zhang, B. J. Frey, T. R. Hughes, and B. J. Blencowe. 2005. Alternative splicing of conserved exons is frequently species-specific in human and mouse. Trends Genet. 21:73–77.

    Pruitt, K. D., T. Tatusova, and D. R. Maglott. 2003. NCBI Reference Sequence project: update and current status. Nucleic Acids Res. 31:34–37.

    Savatier, P., G. Trabuchet, Y. Chebloune, C. Faure, G. Verdier, and V. M. Nigon. 1987. Nucleotide sequence of the delta-beta-globin intergenic segment in the macaque: structure and evolutionary rates in higher primates. J. Mol. Evol. 24:297–308.

    Sugiura, S., S. Kashiwabara, S. Iwase, and T. Baba. 2003. Expression of a testis-specific form of TBP-related factor 2 (TRF2) mRNA during mouse spermatogenesis. J. Reprod. Dev. 49:107–111.

    Suzuki, Y., R. Yamashita, M. Shirota, Y. Sakakibara, J. Chiba, J. Mizushima Sugano, K. Nakai, and S. Sugano. 2004. Sequence comparison of human and mouse genes reveals a homologous block structure in the promoter regions. Genome Res. 14:1711–1718.

    Suzuki, Y., K. Yoshitomo Nakagawa, K. Maruyama, A. Suyama, and S. Sugano. 1997. Construction and characterization of a full length-enriched and a 5'-end-enriched cDNA library. Gene 200:149–156.

    Swanson, W. J., and V. D. Vacquier. 2002. The rapid evolution of reproductive proteins. Nat. Rev. Genet. 3:137–144.

    Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

    Wang, H. Y., H. Tang, C. K. Shen, and C. I. Wu. 2003. Rapidly evolving genes in human. I. The glycophorins and their possible role in evading malaria parasites. Mol. Biol. Evol. 20:1795–1804.

    Wang, Y., D. C. Newton, G. B. Robb, C. L. Kau, T. L. Miller, A. H. Cheung, A. V. Hall, S. VanDamme, J. N. Wilcox, and P. A. Marsden. 1999. RNA diversity has profound effects on the translation of neuronal nitric oxide synthase. Proc. Natl. Acad. Sci. USA 96:12150–12155.

    Watanabe, H., A. Fujiyama, M. Hattori et al. 2004. DNA sequence and comparative analysis of chimpanzee chromosome 22. Nature 429:382–388.

    Wyckoff, G. J., W. Wang, and C. I. Wu. 2000. Rapid evolution of male reproductive genes in the descent of man. Nature 403:304–309.

    Yeo, G., D. Holste, G. Kreiman, and C. B. Burge. 2004. Variation in alternative splicing across human tissues. Genome Biol. 5:R74.

    (Naoki Osada*,1, Makoto Hi)