Thirteen-exon-motif signature for vertebrate nuclear and mitochondrial type IB topoisomerases
Nucleic Acids Research
Laboratory of Molecular Pharmacology and 1 Laboratory of Experimental Carcinogenesis, Center for Cancer Research, National Cancer Institute, NIH, DHHS, Bethesda, MD, USA
*To whom correspondence should be addressed. Tel: +1 301 496 5944; Fax: +1 301 402 0752; Email: pommier@nih.gov
DDBJ/EMBL/GenBank accession nos: AF362952, AF503620, AY429654, AY429655 and TPA BK001786
ABSTRACT
DNA topoisomerases contribute to various cellular activities that involve DNA. We previously identified a human nuclear gene that encodes a mitochondrial DNA topoisomerase. Here we show that genes for mitochondrial DNA topoisomerases (type IB) exist only in vertebrates. A 13-exon topoisomerase motif was identified as a characteristic of genes for both nuclear and mitochondrial type IB topoisomerases. The presence of this signature motif is thus an indicator of the coexistence of nuclear and mitochondrial type IB DNA topoisomerases. We hypothesize that the prototype topoisomerase IB with the 13-exon structure formed first, and then duplicated. One topoisomerase specialized for nuclear DNA and the other for mitochondrial DNA.
INTRODUCTION
DNA topoisomerases play important roles in cellular activities that involve DNA and have been classified as either type I or type II enzymes (1–3). The type I enzymes break and reseal one strand of duplex DNA at a time, whereas the type II enzymes break and reseal both strands in concert. Topoisomerases are further divided into types IA, IB, IIA and IIB. Six human DNA topoisomerases have been identified to date: two type IA enzymes (topoisomerases IIIα and IIIβ), two type IB enzymes and two type IIA enzymes (topoisomerases IIα and IIβ).
In eukaryotic cells, mitochondrial DNA (mtDNA) constitutes extranuclear genetic material. A typical mammalian cell contains 1000 mitochondria, with each of these organelles containing five to ten copies of covalently closed circular mtDNA of 16–18 kb. Human mtDNA consists of a circular DNA duplex of 16 569 bp and encodes 22 tRNAs, 13 mRNAs, and 12S and 16S rRNAs (4,5). Unwinding of mtDNA in human cells is mediated by a specific type IB enzyme encoded by a nuclear gene (Hs-TOP1mt) (6). A minor proportion of topoisomerase IIIα (a type IA enzyme) molecules is also present in mitochondria (7).
We have now investigated the existence of TOP1mt genes in other species. After finding that such genes are restricted to and conserved among vertebrates, we compared the structures of TOP1mt and TOP1 (nuclear) genes. We show that the conserved regions at the 3' end of the TOP1mt and vertebrate TOP1 genes each consist of 13 exons. This terminal 13-exon structure thus appears to be a common signature of both mitochondrial and nuclear topoisomerases I, and the fact that this signature exists only in vertebrates suggests that both genes arose from the duplication of a common ancestor gene.
MATERIALS AND METHODS
Cloning of Top1mt
DNA manipulation, PCR and DNA sequencing were performed according to standard protocols. We obtained clone BF139529 from Incyte Genomics (St Louis, MO), IMAGE clone 2601221 from ATCC (Manassas, VA) and clone pgf2n.pk002.c13 from Delaware Biotechnology Institute (Newark, DE), and sequenced them on a 377 DNA sequencer using ABI Prism Big Dye Terminator (PE Applied Biosystems). The missing 5' end portions of TOP1mt genes were amplified using a GeneRacer kit (Invitrogen, Carlsbad, CA). The 5' end was joined to the corresponding clones to generate full-length TOP1mt. All oligonucleotide sequences used for cDNA identification are available upon request.
Fluorescence microscopy, FISH localization, DNA relaxation assays and DNA cleavage assays
These procedures were carried out as described previously (6).
Database searches and alignment
We identified putative homologous genes using discontiguous MegaBLAST (http://www.ncbi.nlm.nih.gov) to search all available NCBI databases. We aligned DNA sequences and corresponding amino acid sequences with available TOP1 and TOP1mt genes using ClustalW in MacVector (Accelrys, San Diego, CA).
RESULTS
Identification of a mouse mitochondrial topoisomerase I gene (Mm-TOP1mt)
Screening of the NCBI database with the BLAST search engine and human mitochondrial topoisomerase I (Hs-top1mt) as the bait yielded a mouse cDNA sequence (DDBJ/EMBL/GenBank accession no. BF139529). Sequencing of BF139529 revealed an open reading frame encoding a polypeptide with high homology to Hs-top1mt. The GeneRacer protocol (Invitrogen, Carlsbad, CA) was used to determine the sequence of the 5' end of the gene. The combination of both approaches yielded a 2011 bp cDNA sequence that encodes a 593 amino acid protein. This protein, which we have designated Mm-top1mt, shares 73% sequence identity and 84% similarity with Hs-top1mt (6) (Fig. 1).
Figure 1. Alignment of the amino acid sequences of top1mt proteins. (A) Alignment of the sequences encoded by the first exon of the genes for the five identified mitochondrial topoisomerases I. (B) Alignment of the sequences encoded by the last 13 exons of the genes for the five mitochondrial (mt) and corresponding nuclear (n) topoisomerases I. Residues encoded by each exon are marked by solid or open bars above the sequences; exon numbers for the mitochondrial and nuclear enzymes are outside and inside the parentheses, respectively. The catalytic tyrosine (Y) is marked with an asterisk and the critical basic amino acids (RKR) are marked with plus signs.
We next designed two sets of PCR primers based on the 5' and 3' ends of the Mm-TOP1mt cDNA for the purpose of screening a mouse genomic library. A bacterial artificial chromosome clone containing the full-length Mm-TOP1mt gene was obtained. We then used this clone as a probe to determine the chromosomal location of Mm-TOP1mt by fluorescence in situ hybridization. In two independent experiments with biotin- or digoxigenin-labeled probes, most metaphase spreads with informative signals and minimal nonspecific background fluorescence yielded symmetrical fluorescent spots on a small chromosome. Furthermore, 27 out of a total of 30 labeled spreads recorded in the two experiments exhibited a specific signal at the same site, bands E2–E3, on both chromosomes 15, to which we therefore assign Mm-TOP1mt (Fig. 2). This region of mouse chromosome 15 is homologous to human chromosome 8q24.3 (8), the site of Hs-TOP1mt (6).
Figure 2. Fluorescence in situ hybridization analysis demonstrating the location of the Mm-TOP1mt gene at bands E2–E3 on chromosomes 15. Both chromosomes with symmetrical FITC signals on sister chromatids are identified by arrows.
Both Hs-TOP1mt and Mm-TOP1mt are positioned between locus H of the lymphocyte antigen 6 complex and the rhophilin (Rho GTPase binding protein 1) gene. The region of the mouse genome containing Mm-TOP1mt is thus syntenic to that of the human genome containing Hs-TOP1mt, suggesting that these regions share a common ancestor.
To determine the structure of Mm-TOP1mt, we sequenced the 5' end of the gene and combined the resulting sequence with that available in the NCBI database. Like Hs-TOP1mt, Mm-TOP1mt contains 14 exons. This 14-exon structure is also shared by other TOP1mt genes (Table 1; see below). All TOP1mt genes also exhibit the same intron phases (Table 1). Furthermore, the corresponding introns of the human and mouse TOP1mt genes are similar in size, with the exception of intron 7 which is larger in human (Homo sapiens) than in mouse (Mus musculus) (Table 1).
Table 1. Structure of the genes for mitochondrial topoisomerases I (TOP1mt)
TOP1mt genes are present only in vertebrates
We examined the available eukaryotic DNA sequences to determine which species possess genes for both mitochondrial and nuclear topoisomerases I. With Hs-top1mt and Mm-top1mt as baits, we detected TOP1mt genes in all the other vertebrate genomes examined: zebra fish (Danio rerio) (Dr), chicken (Gallus gallus) (Gg) and rat (Rattus norvegicus) (Rn).
For chicken top1mt, we derived most of the sequence from a cDNA clone (clone ID, pgf2n.pk002.c13) and used GeneRacer to obtain the remaining 5' sequence. For zebra fish, the cDNA sequence was directly derived from a single clone (IMAGE clone ID, 2601221). Expression experiments revealed that both the recombinant chicken (Gg-top1mt) and zebra fish (Dr-top1mt) proteins possess topoisomerase I activity. Cleavage assays (6) also confirmed that Gg-top1mt is a type IB topoisomerase, given that it forms a covalent bond with the 3' end of the cleaved DNA (data not shown).
The sequences and structures of the rat, chicken and zebra fish TOP1mt genes were derived from the recently released databases (NCBI). The rat (Rn-TOP1mt) and chicken (Gg-TOP1mt) genes, like the human and mouse genes, comprise 14 exons (Table 1). For the zebra fish gene (Dr-TOP1mt), we were able to compile only 11 exons from the incomplete genomic sequence (Table 1). The exon sizes for these five vertebrate TOP1mt genes vary for the first exon but are identical for the remaining 13 exons, with the minor exception that exons 2 and 13 of the rodent genes are 3 bp shorter (corresponding to deletion of one amino acid and likely a characteristic of the common rodent ancestor).
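The logic of comparing exon sizes and intron phases can be illustrated with a minimal Python sketch. It assumes the usual definition of intron phase (cumulative coding length modulo 3) and uses hypothetical exon lengths, not the values in Table 1; the function name intron_phases is likewise illustrative. The sketch shows that deleting one codon (3 bp) from an exon, as seen in the rodent genes, leaves every downstream intron phase unchanged.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1 or 2) of the intron following each exon.

    The phase is the number of bases of an incomplete codon at the end
    of the upstream exon, i.e. the cumulative coding length modulo 3.
    """
    cumulative = list(accumulate(coding_exon_lengths))
    # The last exon is not followed by an intron, so its total is dropped.
    return [total % 3 for total in cumulative[:-1]]


# Hypothetical coding-exon lengths in bp (assumption, not from Table 1).
reference = [120, 95, 150, 74, 113]
# The same hypothetical gene with one codon (3 bp) deleted from exon 2.
one_codon_shorter = [120, 92, 150, 74, 113]
# A 1 bp deletion, by contrast, shifts the downstream phases.
one_base_shorter = [120, 94, 150, 74, 113]

print(intron_phases(reference))
print(intron_phases(one_codon_shorter))
print(intron_phases(one_base_shorter))
```

Under these assumptions, two genes can differ by 3 bp in an exon and still share the same intron phases, which is why size and phase are reported together.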
The NH2-terminal portion of Hs-top1mt encoded by exon 1 contains the mitochondrial localization signal (6). Alignment of the corresponding NH2-terminal regions of the vertebrate top1mt polypeptides revealed that they share little sequence homology (Fig. 1A). To verify that the newly identified top1mt proteins are indeed mitochondrial enzymes, we transfected M059J human neuroblastoma cells with expression vectors for either Mm-top1mt or Gg-top1mt tagged at their COOH-termini with green fluorescent protein (GFP) and then examined the transfected cells by fluorescence microscopy, as previously described for Hs-top1mt (6). Both of the GFP fusion proteins localized to mitochondria (data not shown), demonstrating the presence of a functional mitochondrial targeting sequence in both mouse and chicken top1mt. We also examined whether, despite their low homology, the amino acid sequences encoded by exon 1 of the various TOP1mt genes might function as mitochondrial leader sequences with the use of the Mitoprot program (http://www.mips.biochem.mpg.de/cgi-bin/proj/medgen/mitofilter). The probability of mitochondrial targeting was high for all five identified top1mt proteins: 92, 98, 99, 99 and 98% for zebra fish, chicken, mouse, rat and human top1mt, respectively. When the sequences encoded by the first exons were removed, however, low scores were obtained for all five proteins, indicating that the mitochondrial-targeting sequences are located in the regions encoded by exon 1 of the TOP1mt genes.
The terminal 13-exon motif is present in all vertebrate TOP1 genes and is highly conserved between TOP1 and TOP1mt genes
The conservation of the terminal 13-exon structure among TOP1mt genes as well as the human gene for nuclear topoisomerase I (Hs-TOP1) (6) led us to investigate whether this structure was common to other type IB topoisomerases. All the vertebrate TOP1 genes examined (rat, mouse, chicken and zebra fish) consist of 21 exons, of which the last 13 exons (exons 9–21) are conserved with regard to size and phase (Table 2).
Table 2. Structure of the genes for nuclear topoisomerases I (TOP1)
Alignment of the amino acid sequences encoded by the last 13 exons of both nuclear and mitochondrial topoisomerases I revealed a high degree of conservation between the nuclear and mitochondrial enzymes (Fig. 1B). The catalytic residues, including the critical basic amino acids (RKR, marked with plus signs) and tyrosine residue (Y, marked with an asterisk), are all preserved.
Pairwise comparisons of the 13-exon motifs revealed high homology among the 10 topoisomerases examined (Table 3). At the nucleotide level, the TOP1 genes exhibited a higher level of identity (83.43 ± 6.87%) than did the TOP1mt genes (73.98 ± 7.91%); the level of identity between the TOP1 and TOP1mt genes was lower (67.72 ± 2.08%).
Table 3. Comparison of the nucleotide and predicted amino acid sequences of the last 13 exons of the genes for vertebrate mitochondrial (mt) and nuclear (n) type IB topoisomerases
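The pairwise summary reported above (mean ± SD of percentage identity within a group of genes) can be sketched as follows. This is an illustration only: the aligned fragments are invented, gapped columns are skipped by assumption, and the sample standard deviation is used because the text does not state which estimator was applied.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_identity(a, b):
    """Percent identity of two equal-length aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def summarize(aligned):
    """Mean and sample SD of identity over all unordered sequence pairs."""
    names = sorted(aligned)
    values = [percent_identity(aligned[p], aligned[q])
              for p, q in combinations(names, 2)]
    return mean(values), stdev(values)


# Toy aligned fragments, invented for illustration (not real TOP1 data).
toy = {
    "seqA": "ATGGCCAAGT",
    "seqB": "ATGGCTAAGT",
    "seqC": "ATGACTAAGA",
}

avg, sd = summarize(toy)
print(f"{avg:.2f} ± {sd:.2f}%")
```

With five genes per group, the same function would average over the ten unordered pairs within that group.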
Both the 13-exon motif and the presence of two type IB topoisomerase (mitochondrial and nuclear) genes are restricted to vertebrates
We next investigated the existence of genes for type IB topoisomerases in nonvertebrate eukaryotes. The 13-exon topoisomerase motif was not detected in budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe), fruit fly (Drosophila melanogaster), nematode (Caenorhabditis elegans), rice (Oryza sativa) or thale cress (Arabidopsis thaliana). In budding yeast, the TOP1 gene contains no introns. The fission yeast TOP1 gene contains two introns at its 5' end. The fruit fly TOP1 gene consists of eight exons, and the nematode TOP1 gene comprises five exons. Both the rice and the two thale cress TOP1 genes share a common 15-exon structure unrelated to the vertebrate 13-exon topoisomerase motif (not shown). The existence of a distinct but shared (in length and phase) 15-exon structure in these plant species indicates that they are derived from a common ancestor.
The sea squirt (Ciona intestinalis) (Ci) has a single TOP1 gene that is markedly similar to those of vertebrates (Table 2). We determined the structure of exons 3–21 of Ci-TOP1, assuming that the gene consists of 21 exons (Table 2). For exons 3–8, the homology with vertebrate TOP1 genes is low. In contrast, the homology (in terms of size and phase) is high for exons 10–18 and for exon 21 of Ci-TOP1 and vertebrate TOP1 genes. Moreover, the cumulative length of exons 19 and 20 of Ci-TOP1 (174 + 71 = 245 bp) is equal to the total size of the corresponding exons in vertebrates (95 + 150 = 245 bp), and exon 18 of Ci-TOP1 is only 3 bp longer than that of vertebrate TOP1 genes. Thus Ci-TOP1 resembles the vertebrate TOP1 genes in its terminal 12 exons.
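The exon-length arithmetic in the preceding paragraph can be checked with a short Python sketch. The exon lengths are those quoted in the text; the variable and function names are illustrative assumptions.

```python
def same_total_length(exons_a, exons_b):
    """True if two runs of consecutive exons span the same number of bases."""
    return sum(exons_a) == sum(exons_b)


# Lengths in bp of exons 19 and 20, as quoted in the text.
ci_exons_19_20 = [174, 71]          # Ci-TOP1
vertebrate_exons_19_20 = [95, 150]  # vertebrate TOP1

# The two exon pairs cover the same 245 bp of coding sequence.
print(sum(ci_exons_19_20), sum(vertebrate_exons_19_20))
print(same_total_length(ci_exons_19_20, vertebrate_exons_19_20))

# Only the position of the intron between the two exons differs.
boundary_shift = ci_exons_19_20[0] - vertebrate_exons_19_20[0]
print(boundary_shift)

# Exon 18 of Ci-TOP1 is 3 bp longer, i.e. exactly one codon,
# so the reading frame downstream is preserved.
exon18_difference = 3
print(exon18_difference % 3 == 0)
```

The equal totals are what allow exons 19 and 20 to be treated as corresponding to the vertebrate pair even though the individual exon sizes differ.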
Finally, we compared the common portions of the vertebrate type IB topoisomerases encoded by the last 13 exons of their genes and the corresponding sequences of other type IB topoisomerases (Fig. 3). Three main clusters, corresponding to the vertebrate mitochondrial enzymes, the vertebrate nuclear enzymes and the nonvertebrate enzymes, were obtained, with the sea squirt topoisomerase being positioned between the vertebrate and other nonvertebrate enzymes.
Figure 3. Cluster analysis of the mitochondrial and nuclear topoisomerases I. The amino acid sequences encoded by the 13-exon topoisomerase motif of vertebrate genes for nuclear (blue bracket) and mitochondrial (red bracket) enzymes (Fig. 1B) were used for this analysis. For the other enzymes (green bracket), the corresponding regions, based on sequence alignment and size, were used. Hs, H.sapiens; Rn, R.norvegicus; Mm, M.musculus; Gg, G.gallus; Dr, D.rerio; Ci, C.intestinalis; Dm, D.melanogaster; Ce, C.elegans; Sp, S.pombe; Sc, S.cerevisiae; At, A.thaliana; Os, O.sativa. The bar indicates the scale for 10% dissimilarity between amino acid sequences.
DISCUSSION
From zebra fish to human, all the vertebrate type IB topoisomerases examined possess a common terminal 13-exon motif, suggesting that this motif is characteristic of vertebrates. This 13-exon motif, encoding the topoisomerase activity, corresponds to the portion of human topoisomerase I resolved by crystallography (9). Interestingly, the exon–intron boundaries do not occur at the boundaries of the domains identified in the crystal structures of human topoisomerase I.
Given that the organisms with this 13-exon motif possess nuclear genes for both a nuclear and a mitochondrial topoisomerase IB, it is likely that both genes evolved from a common ancestor gene. During evolution, gene duplication might thus have resulted in the emergence of one gene for an enzyme targeted to nuclear DNA and of another gene for an enzyme targeted to mitochondrial DNA. The common topoisomerase I catalytic domain is encoded by the last 13 exons of each gene, and the targeting sequences are encoded by the first exon of the genes for the mitochondrial enzymes and by the first eight exons of the genes for the nuclear enzymes. The structure of the sea squirt TOP1 gene shares similarities with the vertebrate genes in its last 12 exons, suggesting that this gene might share a common ancestor with an early precursor of the vertebrate TOP1 and TOP1mt genes, but stopped short of the vertebrate 13-exon structure.
The absence of a specific TOP1mt gene in the other eukaryotic species raises the question of how these organisms perform mitochondrial DNA metabolism functions. It is possible that other topoisomerases (types II or IA) perform such functions in these species. However, the only specific mitochondrial topoisomerase enzymes identified to date are type IB enzymes. We cannot exclude the possibility that other types of topoisomerase contain mitochondrial targeting sequences that are short and not readily recognizable. Alternatively, a single gene might encode two polypeptides that are targeted either to the nucleus or to mitochondria. The human TOP3 gene, for example, contains two start codons that yield two distinct enzymes, one for the nucleus and the other for mitochondria (7).
The sequence data from this study have been submitted to GenBank under the following accession numbers: Mm-TOP1mt cDNA, AF362952; the 5' end of Mm-TOP1mt gene, AF503620; Gg-TOP1mt cDNA, AY429654; Dr-TOP1mt cDNA, AY429655; Rn-TOP1mt cDNA, TPA BK001786.
ACKNOWLEDGEMENT
We wish to thank Dr Kurt W. Kohn for insightful discussions.
REFERENCES
1. Champoux,J.J. (2001) DNA topoisomerases: structure, function and mechanism. Annu. Rev. Biochem., 70, 369–413.
2. Krogh,B.O. and Shuman,S. (2002) A poxvirus-like type IB topoisomerase family in bacteria. Proc. Natl Acad. Sci. USA, 99, 1853–1858.
3. Wang,J.C. (2002) Cellular roles of DNA topoisomerases: a molecular perspective. Nat. Rev. Mol. Cell. Biol., 3, 430–440.
4. Anderson,L. (1981) Identification of mitochondrial proteins and some of their precursors in two-dimensional electrophoretic maps of human cells. Proc. Natl Acad. Sci. USA, 78, 2407–2411.
5. Bibb,M.J., Van Etten,R.A., Wright,C.T., Walberg,M.W. and Clayton,D.A. (1981) Sequence and gene organization of mouse mitochondrial DNA. Cell, 26, 167–180.
6. Zhang,H., Barcelo,J.M., Lee,B., Kohlhagen,G., Zimonjic,D.B., Popescu,N.C. and Pommier,Y. (2001) Human mitochondrial topoisomerase I. Proc. Natl Acad. Sci. USA, 98, 10608–10613.
7. Wang,Y., Lyu,Y.L. and Wang,J.C. (2002) Dual localization of human DNA topoisomerase IIIalpha to mitochondria and nucleus. Proc. Natl Acad. Sci. USA, 99, 12114–12119.
8. DeBry,R.W. and Seldin,M.F. (1996) Human/mouse homology relationships. Genomics, 33, 337–351.
9. Redinbo,M.R., Stewart,L., Kuhn,P., Champoux,J.J. and Hol,W.G. (1998) Crystal structures of human topoisomerase I in covalent and noncovalent complexes with DNA. Science, 279, 1504–1513.
(Hongliang Zhang, Ling-Hua Meng, Drazen B)
*To whom correspondence should be addressed. Tel: +1 301 496 5944; Fax: +1 301 402 0752; Email: pommier@nih.gov
+AF362952, AF503620, AY429654, AY429655 and TPA BK001786
DDBJ/EMBL/GenBank accession nos+.
ABSTRACT
DNA topoisomerases contribute to various cellular activities that involve DNA. We previously identified a human nuclear gene that encodes a mitochondrial DNA topoisomerase. Here we show that genes for mitochondrial DNA topoisomerases (type IB) exist only in vertebrates. A 13-exon topoisomerase motif was identified as a characteristic of genes for both nuclear and mitochondrial type IB topoisomerases. The presence of this signature motif is thus an indicator of the coexistence of nuclear and mitochondrial type IB DNA topoisomerases. We hypothesize that the prototype topoisomerase IB with the 13-exon structure formed first, and then duplicated. One topoisomerase specialized for nuclear DNA and the other for mitochondrial DNA.
INTRODUCTION
DNA topoisomerases play important roles in cellular activities that involve DNA and have been classified as either type I or type II enzymes (1–3). The type I enzymes break and reseal one strand of duplex DNA at a time, whereas the type II enzymes break and reseal both strands in concert. Topoisomerases are further divided into types IA, IB, IIA and IIB. A total of six human DNA topoisomerases has been identified to date, including two type IA enzymes (topoisomerases III and III?), two type IB enzymes and two type IIA enzymes (topoisomerases II and II?).
In eukaryotic cells, mitochondrial DNA (mtDNA) constitutes extranuclear genetic material. A typical mammalian cell contains 1000 mitochondria, with each of these organelles containing five to ten copies of covalently closed circular mtDNA of 16–18 kb. Human mtDNA consists of a circular DNA duplex of 16 569 bp and encodes 22 tRNAs, 13 mRNAs, and 12S and 16S rRNAs (4,5). Unwinding of mtDNA in human cells is mediated by a specific type IB enzyme encoded by a nuclear gene (Hs-TOP1mt) (6). A minor proportion of topoisomerase III (a type IA enzyme) molecules is also present in mitochondria (7).
We have now investigated the existence of TOP1mt genes in other species. After finding that such genes are restricted to and conserved among vertebrates, we compared the structures of TOP1mt and TOP1 (nuclear) genes. We show that the TOP1mt and vertebrate TOP1 genes consist of 13 exons at the end of the genes in the conserved regions. This terminal 13-exon structure thus appears to be a common signature of both mitochondrial and nuclear topoisomerases I, and the fact that this signature exists only in vertebrates suggests that both genes arose from the duplication of a common ancestor gene.
MATERIALS AND METHODS
Cloning of Top1mt
DNA manipulation, PCR and DNA sequencing were performed according to standard protocols. We obtained clone BF139529 from Incyte Genonics (St Louis, MO), IMAGE clone 2601221 from ATCC (Manassas, VA) and clone pgf2n.pk002.c13 from Delaware Biotechnology Institute (Newark, DE), and sequenced them on a 377 DNA sequencer using ABI Prism Big Dye Terminator (PE Applied Biosystems). The missing 5' end portions of TOP1mt genes were amplified using a GeneRacer kit (Invitrogen, Carlsbad, CA). The 5' end was joined to the corresponding clones to generate full-length TOP1mt. All oligonucleotide sequences used for cDNA identification are available upon request.
Fluorescence microscopy, FISH localization, DNA relaxation assays and DNA cleavage assays
These procedures were carried out as described previously (6).
Database searches and alignment
We identified putative homologous genes using the discontiguous Mega BLAST (http://www.ncbi.nlm.nih.gov) to search all available NCBI databases. We aligned DNA sequences and corresponding amino acid sequences with available TOP1 and TOP1mt genes using the ClustalW in MacVector (Accelrys, San Diego, CA).
RESULTS
Identification of a mouse mitochondrial topoisomerase I gene (Mm-TOP1mt)
Screening of the NCBI database with the BLAST search engine and human mitochondrial topoisomerase I (Hs-top1mt) as the bait yielded a mouse cDNA sequence (DDBJ/EMBL/GenBank accession no. BF139529 ). Sequencing of BF139529 revealed an open reading frame encoding a polypeptide with high homology to Hs-top1mt. The GeneRacer protocol (Invitrogen, Carlsbad, CA) was used to determine the sequence of the 5' end of the gene. The combination of both approaches yielded a 2011 bp cDNA sequence that encodes a 593 amino acid protein. This protein, which we have designated Mm-top1mt, shares 73% sequence identity and 84% similarity with Hs-top1mt (6) (Fig. 1).
Figure 1. Alignment of the amino acid sequences of top1mt proteins. (A) Alignment of the sequences encoded by the first exon of the genes for the five identified mitochondrial topoisomerases I. (B) Alignment of the sequences encoded by the last 13 exons of the genes for the five mitochondrial (mt) and corresponding nuclear (n) topoisomerases I. Residues encoded by each exon are marked by solid or open bars above the sequences; exon numbers for the mitochondrial and nuclear enzymes are outside and inside the parentheses, respectively. The catalytic tyrosine (Y) is marked with an asterisk and the critical basic amino acids (RKR) are marked with plus signs.
We next designed two sets of PCR primers based on the 5' and 3' ends of the Mm-TOP1mt cDNA for the purpose of screening a mouse genomic library. A bacterial artificial chromosome clone containing the full-length Mm-TOP1mt gene was obtained. We then used this clone as a probe to determine the chromosomal location of Mm-TOP1mt by fluorescence in situ hybridization. In two independent experiments with biotin- or digoxigenin-labeled probes, most metaphase spreads with informative signals and minimal nonspecific background fluorescence yielded symmetrical fluorescent spots on a small chromosome. Furthermore, 27 out of a total of 30 labeled spreads recorded in the two experiments exhibited a specific signal at the same site, bands E2–E3, on both chromosomes 15, to which we therefore assign Mm-TOP1mt (Fig. 2). This region of mouse chromosome 15 is homologous to human chromosome 8q24.3 (8), the site of Hs-TOP1mt (6).
Figure 2. Fluorescence in situ hybridization analysis demonstrating the location of the Mm-TOP1mt gene at bands E2–E3 on chromosomes 15. Both chromosomes with symmetrical FITC signals on sister chromatids are identified by arrows.
Both Hs-TOP1mt and Mm-TOP1mt are positioned between locus H of the lymphocyte antigen 6 complex and the rhophilin (Rho GTPase binding protein 1) gene. The region of the mouse genome containing Mm-TOP1mt is thus syntenic to that of the human genome containing Hs-TOP1mt, suggesting that these regions share a common ancestor.
To determine the structure of Mm-TOP1mt, we sequenced the 5' end of the gene and combined the resulting sequence with that available in the NCBI database. Like Hs-TOP1mt, Mm-TOP1mt contains 14 exons. This 14-exon structure is also shared by other TOP1mt genes (Table 1; see below). All TOP1mt genes also exhibit the same intron phases (Table 1). Furthermore, the corresponding introns of the human and mouse TOP1mt genes are similar in size, with the exception of intron 7 which is larger in human (Homo sapiens) than in mouse (Mus musculus) (Table 1).
Table 1. Structure of the genes for mitochondrial topoisomerases I (TOP1mt)
TOP1mt genes are present only in vertebrates
We examined the available eukaryotic DNA sequences to determine which species possess genes for both mitochondrial and nuclear topoisomerases I. With Hs-top1mt and Mm-top1mt as baits, we detected TOP1mt genes in all the vertebrate genomes: zebra fish (Danio rerio) (Dr), chicken (Gallus gallus) (Gg) and rat (Rattus norvegicus) (Rn).
For chicken top1mt, we derived most of the sequence from a cDNA clone (clone ID, pgf2n.pk002.c13) and used GeneRacer to obtain the remaining 5' sequence. For zebra fish, the cDNA sequence was directly derived from a single clone (IMAGE clone ID, 2601221). Expression experiments revealed that both the recombinant chicken (Gg-top1mt) and zebra fish (Dr-top1mt) proteins possess topoisomerase I activity. Cleavage assays (6) also confirmed that Gg-top1mt is a type IB topoisomerase, given that it forms a covalent bond with the 3' end of the cleaved DNA (data not shown).
The sequences and structures of the rat, chicken and zebra fish TOP1mt genes were derived from the recently released databases (NCBI). The rat (Rn-TOP1mt) and chicken (Gg-TOP1mt) genes, like the human and mouse genes, comprise 14 exons (Table 1). For the zebra fish gene (Dr-TOP1mt), we were able to compile only 11 exons from the incomplete genomic sequence (Table 1). The exon sizes for these five vertebrate TOP1mt genes vary for the first exon but are identical for the remaining 13 exons, with the minor exception that exons 2 and 13 of the rodent genes are 3 bp shorter (corresponding to deletion of one amino acid and likely a characteristic of the common rodent ancestor).
The NH2-terminal portion of Hs-top1mt encoded by exon 1 contains the mitochondrial localization signal (6). Alignment of the corresponding NH2-terminal regions of the vertebrate top1mt polypeptides revealed that they share little sequence homology (Fig. 1A). To verify that the newly identified top1mt proteins are indeed mitochondrial enzymes, we transfected M059J human neuroblastoma cells with expression vectors for either Mm-top1mt or Gg-top1mt tagged at their COOH-termini with green fluorescent protein (GFP) and then examined the transfected cells by fluorescence microscopy, as previously described for Hs-top1mt (6). Both of the GFP fusion proteins localized to mitochondria (data not shown), demonstrating the presence of a functional mitochondrial targeting sequence in both mouse and chicken top1mt. We also examined whether, despite their low homology, the amino acid sequences encoded by exon 1 of the various TOP1mt genes might function as mitochondrial leader sequences with the use of the Mitoprot program (http://www.mips.biochem.mpg.de/cgi-bin/proj/medgen/mitofilter). The probability of mitochondrial targeting was high for all five identified top1mt proteins: 92, 98, 99, 99 and 98% for zebra fish, chicken, mouse, rat and human top1mt, respectively. When the sequences encoded by the first exons were removed, however, low scores were obtained for all five proteins, indicating that the mitochondrial-targeting sequences are located in the regions encoded by exon 1 of the TOP1mt genes.
The terminal 13-exon motif is present in all vertebrate TOP1 genes and is highly conserved between TOP1 and TOP1mt genes
The conservation of the terminal 13-exon structure among TOP1mt genes as well as the human gene for nuclear topoisomerase I (Hs-TOP1) (6) led us to investigate whether this structure was common to other type IB topoisomerases. All the vertebrate TOP1 genes examined (rat, mouse, chicken and zebra fish) consist of 21 exons, of which the last 13 exons (exons 9–21) are conserved with regard to size and phase (Table 2).
Table 2. Structure of the genes for nuclear topoisomerases I (TOP1)
Alignment of the amino acid sequences encoded by the last 13 exons of both nuclear and mitochondrial topoisomerases I revealed a high degree of conservation between the nuclear and mitochondrial enzymes (Fig. 1B). The catalytic residues, including the critical basic amino acids (RKR, marked with plus signs) and tyrosine residue (Y, marked with an asterisk), are all preserved.
Pairwise comparisons of the 13-exon motifs revealed high homology among the 10 topoisomerases examined (Table 3). At the nucleotide level, the TOP1 genes exhibited a higher level of identity (83.43 ± 6.87%) than did the TOP1mt genes (73.98 ± 7.91%); the level of identity between the TOP1 and TOP1mt genes was lower (67.72 ± 2.08%).
Table 3. Comparison of the nucleotide and predicted amino acid sequences of the last 13 exons of the genes for vertebrate mitochondrial (mt) and nuclear (n) type IB topoisomerases
Both the 13-exon motif and the presence of two type IB topoisomerase (mitochondrial and nuclear) genes are restricted to vertebrates
We next investigated the existence of genes for type IB topoisomerases in nonvertebrate eukaryotes. The 13-exon topoisomerase motif was not detected in budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharo myces pombe), fruit fly (Drosophila melanogaster), nematode (Caenorhabditis elegans), rice (Oryza sativa) or thale cress (Arabidopsis thaliana). In budding yeast, the TOP1 gene contains no introns. The fission yeast TOP1 gene contains two introns at its 5' end. The fruit fly TOP1 gene consists of eight exons, and the nematode TOP1 gene comprises five exons. Both the rice and the two thale cress TOP1 genes share a common 15-exon structure unrelated to the vertebrate 13-exon topoisomerase motif (not shown). The existence of a distinct but shared (in length and phase) 15-exon structure in these plant species indicates that they are derived from a common ancestor.
The sea squirt (Ciona intestinalis) (Ci) has a single TOP1 gene that is markedly similar to those of vertebrates (Table 2). We determined the structure of exons 3–21 of Ci-TOP1, assuming that the gene consists of 21 exons (Table 2). For exons 3–8, the homology with vertebrate TOP1 genes is low. In contrast, the homology (in terms of size and phase) is high for exons 10–18 and for exon 21 of Ci-TOP1 and vertebrate TOP1 genes. Moreover, the cumulative length of exons 19 and 20 of Ci-TOP1 (174 + 71 = 245 bp) is equal to the total size of the corresponding exons in vertebrates (95 + 150 = 245 bp), and exon 18 of Ci-TOP1 is only 3 bp longer than that of vertebrate TOP1 genes. Thus Ci-TOP1 resembles the vertebrate TOP1 genes in its terminal 12 exons.
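The arithmetic for exons 19 and 20 can be checked with a short sketch. It uses only the four exon lengths quoted in the paragraph above; the function name and the `start_offset` parameter are hypothetical, and the codon offset at the start of exon 19 is an assumption because Table 2 is not reproduced here. The comparison between the two genes does not depend on that starting value.

```python
def boundary_offsets(exon_lengths, start_offset=0):
    """Codon offset (0, 1 or 2) reached at the 3' end of each exon.

    start_offset is the offset at the 5' end of the first exon; it is an
    assumed value in this sketch, not taken from Table 2.
    """
    offsets = []
    total = start_offset
    for length in exon_lengths:
        total += length
        offsets.append(total % 3)
    return offsets


ci_exons_19_20 = [174, 71]          # Ci-TOP1 lengths quoted in the text (bp)
vertebrate_exons_19_20 = [95, 150]  # vertebrate TOP1 lengths quoted in the text (bp)

# The combined length is the same in both cases.
print(sum(ci_exons_19_20), sum(vertebrate_exons_19_20))

# Whatever the starting offset, the offset after exon 20 is identical,
# while the position of the internal intron differs.
for start in (0, 1, 2):
    ci = boundary_offsets(ci_exons_19_20, start)
    vert = boundary_offsets(vertebrate_exons_19_20, start)
    print(start, ci, vert)
```

Because the two pairs sum to the same length, the reading frame entering exon 21 is unchanged; only the position of the intron between exons 19 and 20 differs between the sea squirt and vertebrate genes.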
Finally, we compared the common portions of the vertebrate type IB topoisomerases encoded by the last 13 exons of their genes and the corresponding sequences of other type IB topoisomerases (Fig. 3). Three main clusters, corresponding to the vertebrate mitochondrial enzymes, the vertebrate nuclear enzymes and the nonvertebrate enzymes, were obtained, with the sea squirt topoisomerase being positioned between the vertebrate and other nonvertebrate enzymes.
Figure 3. Cluster analysis of the mitochondrial and nuclear topoisomerases I. The amino acid sequences encoded by the 13-exon topoisomerase motif of vertebrate genes for nuclear (blue bracket) and mitochondrial (red bracket) enzymes (Fig. 1B) were used for this analysis. For the other enzymes (green bracket), the corresponding regions, based on sequence alignment and size, were used. Hs, H.sapiens; Rn, R.norvegicus; Mm, M.musculus; Gg, G.gallus; Dr, D.rerio; Ci, C.intestinalis; Dm, D.melanogaster; Ce, C.elegans; Sp, S.pombe; Sc, S.cerevisiae; At, A.thaliana; Os, O.sativa. The bar indicates the scale for 10% dissimilarity between amino acid sequences.
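A cluster analysis of this kind can be sketched as follows. The clustering method used for Figure 3 is not stated in this section, so the sketch substitutes plain average-linkage agglomerative clustering on the fraction of differing residues; this is an assumption, not the authors' procedure. The peptide sequences and their labels are hypothetical toy data.

```python
from itertools import combinations


def dissimilarity(a, b):
    """Fraction of differing residues over ungapped aligned columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def average_linkage(aligned):
    """Agglomerative clustering; returns the tree as nested tuples of labels."""
    # Pairwise dissimilarities between individual sequences.
    dist = {
        frozenset((p, q)): dissimilarity(aligned[p], aligned[q])
        for p, q in combinations(aligned, 2)
    }

    def cluster_distance(m1, m2):
        # Average dissimilarity over all cross-cluster sequence pairs.
        total = sum(dist[frozenset((p, q))] for p in m1 for q in m2)
        return total / (len(m1) * len(m2))

    # Each cluster is (subtree, list of member labels).
    clusters = [(name, [name]) for name in sorted(aligned)]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical aligned peptides, for illustration only.
toy = {
    "nuc_1": "MKRLYAAG",
    "nuc_2": "MKRLYAAS",
    "mt_1": "MRRIYTAG",
    "mt_2": "MRRIYTSG",
}

print(average_linkage(toy))
```

With the toy input the two "nuc" sequences and the two "mt" sequences pair up before the groups join, which mirrors the separation of nuclear and mitochondrial clusters described in the text without reproducing its data.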
DISCUSSION
From zebra fish to human, all the vertebrate type IB topoisomerases examined possess a common terminal 13-exon motif, suggesting that this motif is characteristic of vertebrates. This 13-exon motif, encoding the topoisomerase activity, corresponds to the portion of human topoisomerase I resolved by crystallography (9). Interestingly, the exon–intron boundaries do not occur at the boundaries of the domains identified in the crystal structures of human topoisomerase I.
Given that the organisms with this 13-exon motif possess nuclear genes for both a nuclear and a mitochondrial topoisomerase IB, it is likely that both genes evolved from a common ancestral gene. During evolution, gene duplication might thus have resulted in the emergence of one gene for an enzyme targeted to nuclear DNA and of another gene for an enzyme targeted to mitochondrial DNA. The common topoisomerase I catalytic domain is encoded by the last 13 exons of each gene, and the targeting sequences are encoded by the first exon of the genes for the mitochondrial enzymes and by the first eight exons of the genes for the nuclear enzymes. The structure of the sea squirt TOP1 gene shares similarities with the vertebrate genes in its last 12 exons, suggesting that this gene might share a common ancestor with an early precursor of the vertebrate TOP1 and TOP1mt genes, but falls short of the complete vertebrate 13-exon motif.
The absence of a specific TOP1mt gene in the other eukaryotic species raises the question of how these organisms perform mitochondrial DNA metabolism functions. It is possible that other topoisomerases (types II or IA) perform such functions in these species. However, the only specific mitochondrial topoisomerase enzymes identified to date are type IB enzymes. We cannot exclude the possibility that other types of topoisomerase contain mitochondrial targeting sequences that are short and not readily recognizable. Alternatively, a single gene might encode two polypeptides that are targeted either to the nucleus or to mitochondria. The human TOP3 gene, for example, contains two start codons that yield two distinct enzymes, one for the nucleus and the other for mitochondria (7).
The sequence data from this study have been submitted to GenBank under the following accession numbers: Mm-TOP1mt cDNA, AF362952; the 5' end of the Mm-TOP1mt gene, AF503620; Gg-TOP1mt cDNA, AY429654; Dr-TOP1mt cDNA, AY429655; Rn-TOP1mt cDNA, TPA BK001786.
ACKNOWLEDGEMENT
We wish to thank Dr Kurt W. Kohn for insightful discussions.
REFERENCES
1. Champoux,J.J. (2001) DNA topoisomerases: structure, function and mechanism. Annu. Rev. Biochem., 70, 369–413.
2. Krogh,B.O. and Shuman,S. (2002) A poxvirus-like type IB topoisomerase family in bacteria. Proc. Natl Acad. Sci. USA, 99, 1853–1858.
3. Wang,J.C. (2002) Cellular roles of DNA topoisomerases: a molecular perspective. Nat. Rev. Mol. Cell. Biol., 3, 430–440.
4. Anderson,L. (1981) Identification of mitochondrial proteins and some of their precursors in two-dimensional electrophoretic maps of human cells. Proc. Natl Acad. Sci. USA, 78, 2407–2411.
5. Bibb,M.J., Van Etten,R.A., Wright,C.T., Walberg,M.W. and Clayton,D.A. (1981) Sequence and gene organization of mouse mitochondrial DNA. Cell, 26, 167–180.
6. Zhang,H., Barcelo,J.M., Lee,B., Kohlhagen,G., Zimonjic,D.B., Popescu,N.C. and Pommier,Y. (2001) Human mitochondrial topoisomerase I. Proc. Natl Acad. Sci. USA, 98, 10608–10613.
7. Wang,Y., Lyu,Y.L. and Wang,J.C. (2002) Dual localization of human DNA topoisomerase IIIalpha to mitochondria and nucleus. Proc. Natl Acad. Sci. USA, 99, 12114–12119.
8. DeBry,R.W. and Seldin,M.F. (1996) Human/mouse homology relationships. Genomics, 33, 337–351.
9. Redinbo,M.R., Stewart,L., Kuhn,P., Champoux,J.J. and Hol,W.G. (1998) Crystal structures of human topoisomerase I in covalent and noncovalent complexes with DNA. Science, 279, 1504–1513.
(Hongliang Zhang, Ling-Hua Meng, Drazen B)