Evolution of the Transposable Element Pokey in the Ribosomal DNA of Species in the Subgenus Daphnia (Crustacea: Cladocera)
Molecular Biology and Evolution, 2004, Issue 9
Department of Zoology, University of Guelph, Guelph, Ontario, Canada
E-mail: tcrease@uoguelph.ca.
Abstract
Pokey is a member of the piggyBac (previously called the TTAA-specific) family of transposons and inserts into a conserved region of the large subunit ribosomal RNA gene. This location is a "hot spot" for insertional activity, as it is known to contain other arthropod transposable elements. However, Pokey is unique in that it is the first DNA transposon yet known to insert into this region. All other insertions are class I non-LTR retrotransposons. This study surveyed variation in Pokey elements through phylogenetic analysis of the 3' ends of Pokey elements from ribosomal DNA (rDNA) in species from the nominate subgenus of the genus Daphnia (Crustacea: Cladocera). The results suggest that Pokey has been stably, vertically inherited within rDNA over long periods of evolutionary time. No evidence was found to support horizontal transfer, which commonly occurs in other DNA transposons, such as P and mariner. Furthermore, Pokey has diverged into sublineages that have persisted across speciation events in some groups. In addition, a new highly divergent paralogous Pokey element was discovered in the rDNA of one species.
Key Words: transposon · Pokey · R1/R2 · piggyBac · rDNA · Daphnia · horizontal transfer
Introduction
Transposable element evolution within host genomes is a dynamic process that involves both vertical transmission within a species and horizontal transfer between species. A typical transposon life cycle is thought to consist of vertical transmission through progeny, the eventual silencing of active elements by the host, and stochastic loss of inactive elements. The latter two stages are both balanced by the reintroduction of elements through horizontal transfer (Lohe et al. 1995; Hartl, Lohe, and Lozovskaya 1997). Life cycles vary both within and between the different classes of transposons. Class I retrotransposons tend to exhibit long periods of vertical transmission (Malik, Burke, and Eickbush 1999; Stuart-Rogers and Flavell 2001), with rare interspecific transfer events (Gonzalez and Lessios 1999; Zupunski, Gubensek, and Kordis 2001). In contrast, class II DNA transposons, such as the P and mariner families, display more frequent cycles of loss and horizontal transfer, which, at times, leads to patchy distributions across species (Clark and Kidwell 1997; Hartl, Lohe, and Lozovskaya 1997).
Some transposons undergo stable vertical transmission, and, as a result, they are maintained within a species over long evolutionary periods (Eickbush and Eickbush 1995). Such cases illustrate not only that transposons can be a stable component of a host's genome but also that horizontal transfer is not always required to evade extinction. Often, cases of stable vertical transmission involve retrotransposable elements that have high copy numbers per genome, most of which are defective. R1 and R2, which insert into a conserved region of the arthropod large subunit ribosomal RNA (LSU rRNA) gene, are two such examples. They are both non–long terminal repeat (non-LTR) retrotransposons that have been found in a multitude of divergent arthropods (Jakubczak, Burke, and Eickbush 1991; Burke et al. 1993, 1998). Moreover, their phylogenetic relationships are consistent with those of their host species (Eickbush and Eickbush 1995), which suggests that they have been present in arthropods from the origin of the phylum, some 500 MYA (Burke et al. 1999) and are stable components of their genomes. These two elements insert in a sequence-specific manner within the LSU rRNA gene, at sites only 74 bp apart (Eickbush 2002).
Several other highly divergent families of transposons have been documented in arthropods within the conserved region of the LSU rRNA gene occupied by R1 and R2 (Burke, Muller, and Eickbush 1995; Sullender 1993). The high frequency of elements specific to this gene could be explained if rRNA genes are a suitable "habitat" for the persistence of insertion sequences. Indeed, Kidwell and Lisch (2001) suggested that genomes are composed of various "ecological niches" that are exploited by different types of transposons. The LSU rRNA gene occurs in tandemly repeated ribosomal DNA (rDNA) units, which form a multigene family that undergoes concerted evolution through gene conversion and unequal crossing over (Coen, Thoday, and Dover 1982; Arnheim 1983; Dvorak, Jue, and Lassner 1987). As a result, the transposons that occupy this niche are subjected to the same homogenization forces as are the genes themselves (Eickbush and Eickbush 1995). In fact, Eickbush and Eickbush (1995) suggested that it is these characteristics that make the rDNA "niche" ideal for the propagation and long-term persistence of transposable elements.
The presence of another site-specific transposon, Pokey, has been detected in the "hot spot" region of the LSU rRNA gene (Sullender 1993). In fact, Pokey inserts only base pairs away from R1 and R2 (fig. 1). Pokey is a class II DNA transposon that possesses terminal inverted repeats of 16 bp and a single 1.5-kb open reading frame (ORF) that codes for a putative transposase (Penton, Sullender, and Crease 2002). It creates a 4-bp target-site duplication (TTAA) on insertion, which makes it a member of the piggyBac family of transposons (Cary et al. 1989; Wang, Fraser, and Cary 1989; Beames and Summers 1990; Penton, Sullender, and Crease 2002; Sarkar et al. 2003). Pokey was originally found within a specific TTAA site in approximately 10% of the LSU rRNA genes of the cladoceran crustacean Daphnia pulex, although Pokey also inserts at many other genomic locations (Sullender 1993; Sullender and Crease 2001).
FIG. 1. Location of the arthropod transposable elements in the LSU rRNA gene. A portion of the Daphnia pulicaria gene is shown. Arrows indicate the insertion site of the various elements based on the 3'-end junction of the element within the rRNA gene. Vertical lines represent top and bottom strand cleavage sites generated by the endonuclease encoded by each element based on the R2 model of integration. The diagram is modified from Burke, Muller, and Eickbush (1995). ETS = external transcribed spacer, ITS = internal transcribed spacer, IGS = intergenic spacer, 18S = small subunit ribosomal RNA gene, and 28S = large subunit ribosomal RNA gene
Here, we report the results of a phylogenetic analysis of the 3' end of Pokey elements that were amplified from rDNA in species from the nominate subgenus of the genus Daphnia (Crustacea: Cladocera). In addition, we used sequences from two genes, the mitochondrial small subunit (mtSSU) rRNA gene and the nuclear LSU rRNA gene, to construct phylogenies of the "host" Daphnia species from which the elements are derived. We show that Pokey is widely distributed in this subgenus and that the phylogeny of the elements is consistent with that of the host species, which suggests that Pokey has been stably inherited within rDNA over long periods of evolutionary time. Furthermore, we find that Pokey has diverged into multiple sublineages that have persisted across speciation events.
Methods
Daphnia Samples
We analyzed 32 isolates representing 14 species of the subgenus Daphnia (table 1). Total genomic DNA was extracted by use of the Isoquick kit (Orca Research) from single animals that were flash frozen in liquid nitrogen in the field or from multiple animals that were propagated parthenogenetically from a single female in the laboratory by the standard culture technique of Hebert and Crease (1980).
Table 1 Daphnia Samples Used in This Study.
DNA Amplification and Sequencing
We used the polymerase chain reaction (PCR) to amplify an approximately 1,820-bp fragment of the 3' end of Pokey elements located in rDNA by use of an internal Pokey primer, Pok5026F (5'-TCGAACCTGCAGCCGGACGAATTTGCAG-3'), and a primer located in the LSU rRNA gene about 200 bp downstream of the element insertion site, 28SBR (5'-CGTCTCCCACTTATGCTACACCTC-3'). The Pokey primer is located in the ORF, which may code for a transposase (Penton, Sullender, and Crease 2002). A few samples that did not amplify well were amplified using an alternate reverse primer 28SR (5'-TCCATTCGTGCGCGTCACTAATTAGATGAC-3'), which is located only 46 bp downstream of the TTAA target site. All PCR reactions were of 50 μl total volume and contained 1.5 mM MgCl2, 5 pmol of each primer, 40 μM dNTPs, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 10 to 50 ng of genomic DNA, and 1 unit of Taq DNA polymerase (Roche). The amplification reactions were performed in an MJ PTC-100 thermal cycler (MJ Research Inc.). The thermocycling profile consisted of 1 cycle of 1 min at 94°C, 35 cycles of 30 sec at 94°C, 30 sec at 55°C and 2 min at 72°C, with a final incubation of 5 min at 72°C.
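As a rough check on the thermocycling profile just described, the programmed hold time can be tallied in a few lines of Python. This is an illustrative sketch only: the constant and function names are ours, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# Programmed hold times (seconds) taken from the thermocycling profile in the text.
# Ramp times between temperatures are instrument-dependent and are ignored here.
INITIAL_DENATURATION = 60      # 1 cycle of 1 min at 94 C
CYCLE_STEPS = [30, 30, 120]    # 30 s at 94 C, 30 s at 55 C, 2 min at 72 C
N_CYCLES = 35
FINAL_EXTENSION = 300          # 5 min at 72 C


def programmed_seconds():
    """Total programmed hold time for the profile, excluding ramping."""
    return INITIAL_DENATURATION + N_CYCLES * sum(CYCLE_STEPS) + FINAL_EXTENSION


total = programmed_seconds()
print(f"{total} s = {total / 60:.0f} min")  # 6660 s = 111 min
```

Under these assumptions, the run occupies just under two hours of programmed time, most of it in the 35 extension steps.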
To verify the taxon identification of each Daphnia DNA sample, we amplified a 596-bp region of the mtSSU rRNA gene by use of primers 12SA (5'-CCAGTACATCTACTTTGTTACGAC-3') and 12SB (5'-AAATCGTGCCAGCCGTCGCGG-3') (Colbourne and Hebert 1996) and sequenced the PCR product directly. The sequences were compared with those obtained by Colbourne and Hebert (1996) in their phylogenetic analysis of the genus Daphnia in North America. In addition, we amplified two regions of the LSU rRNA gene. A 694-bp region that spans the insertion site of Pokey was amplified by use of primers 28SF (5'-CTGCCCAGTGCTCTGAATGTCAAAGTGAAG-3') and 28SCR (5'-GATGTACCGCCCCAGTCAAACTCC-3'). A 1134-bp region upstream of the Pokey insertion site was amplified by use of primers 28S77F (5'-AACCTCGCGCCCGGTTGAGC-3') and 28S1211R (5'-TCCGACGATCGATTTGCACG-3'). The PCR conditions were identical to those described above.
All PCR products were electrophoresed on 0.8% TAE agarose gels, stained with ethidium bromide, and visualized under UV light. The DNA fragments were excised from the gel and purified using either a QIAEX II Agarose Gel Extraction kit (Qiagen) or a freeze/thaw method. The freeze/thaw protocol was as follows: after excision, the agarose slice was frozen in the top of a filter-plugged pipette tip (50 μL), thawed, and then spun at maximum speed for 10 min in a 1.5 mL microfuge tube. The resulting eluant was precipitated in ethanol. All purified samples were sequenced by use of 20 to 50 ng of template with 5 pmol of the primers PokF, 28SBR, 28SR, 12SA, 28SF, 28SCR, 28S77F, and 28S1211R as appropriate, using the ABI Prism TaqFS dye terminator kit (PerkinElmer). The sequences were resolved on an ABI 377 automated sequencer. Sequences that were well resolved on the electropherograms were only sequenced in one direction. However, sequences longer than 700 bp were sequenced from both ends so that overlapping data were available for the middle region of the fragments. Samples that provided low quality sequence data were cloned by use of the TOPO TA Cloning kit for Sequencing (Invitrogen). For this process we used high fidelity JumpStart™ Taq DNA polymerase (Sigma-Genosys) to generate the cloning template to avoid error associated with the misincorporation of nucleotides by the Taq DNA polymerase. When necessary, we sequenced these clones with an internal primer, Pok5338F (5'-TGTCTRGTGAAYAGCTGGATATGC-3').
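The internal primer Pok5338F contains two degenerate positions (R and Y), so it denotes a small pool of oligonucleotides rather than a single sequence. The following Python sketch expands the primer into its component sequences using the standard IUPAC ambiguity codes. The function name is ours and is illustrative only; it is not part of any software used in the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


POK5338F = "TGTCTRGTGAAYAGCTGGATATGC"  # primer sequence as given in the text
variants = expand_primer(POK5338F)
for seq in variants:
    print(seq)
print(len(variants), "sequences")  # 4 sequences (2 choices at R x 2 at Y)
```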
Sequences of Pokey from the rDNA of Daphnia pulicaria (D. pulicaria-SK-C1 and D. pulicaria-SK-C2) were taken from GenBank (accession numbers AY115589 and AY115590). The Pokey sequence D. pulex-IL+IN-consensus is the consensus of 20 cloned sequences from two isolates of D. pulex from IL and IN. In addition, the Pokey sequences D. pulex-PQ-C1, D. obtusa NA1-TN-C11, and European D. pulex-GR2-C9 were taken from full-length rDNA Pokey clones generated for another study (Penton, unpublished data). Sequences of the LSU rRNA gene for D. pulicaria (accession number AF346514) and D. ambigua (accession number AF346513) were taken from Omilian and Taylor (2001).
Alignment and Phylogenetic Analyses
Pokey, mtSSU rRNA gene, and LSU rRNA gene sequences were all aligned by the Align program (Person et al. 1997) and/or by eye with the aid of Bioedit (Hall 1999). Use of the complete gene sequences reported by Omilian and Taylor (2001) in their phylogenetic reconstruction of daphniids facilitated alignment of the LSU rRNA gene sequences. Highly variable regions of expansion helices were removed from this alignment before phylogenetic analysis.
We performed a Bayesian phylogenetic analysis for all three data sets by application of MrBayes (http://morphbank.ebc.uu.sc/mrbayes) (Huelsenbeck 2000). We first chose the model of DNA substitution by characterizing the sequences with Modeltest version 3.0 (http://inbio.byu.edu/Faculty/kac/crandall_lab/modeltest.htm) (Posada and Crandall 1998). The results of these analyses were then incorporated into the Bayesian phylogenetic analyses. The number of generations run for all three data sets was 500,000. All trees constructed before confluence (10,000 generations) were discarded as "burn-in" (Huelsenbeck 2000).
We performed cladistic analyses by implementation of the maximum-parsimony criterion in PAUP version 4.0b10 (Swofford 2002) and use of the program's heuristic search algorithm and tree-bisection and reconnection (TBR) feature. The results of the Modeltest analyses were also incorporated into these analyses. Sequences were added randomly in 50 replicate trials, with one tree held at each step. Bootstrap values for maximum-parsimony trees were based upon 1,000 pseudoreplicates.
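For readers unfamiliar with bootstrap pseudoreplicates, the resampling step can be sketched in Python as follows. This illustrates the general technique (sampling alignment columns with replacement), not the PAUP implementation. The toy alignment, taxon names, and function name are hypothetical, and the tree search that would follow each pseudoreplicate is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping taxon name -> aligned sequence (all equal length).
    Returns a new alignment of the same dimensions.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment, for illustration only.
toy = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTTCGTAC",
    "taxonC": "ACGAACGTTC",
}
rng = random.Random(1)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

Each pseudoreplicate would then be analyzed with the same tree-building method, and the support for a node is the percentage of pseudoreplicate trees that contain it.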
We used the Kimura two-parameter model (Kimura 1980) to estimate sequence divergence for all three genes in MEGA version 2.1 (Kumar, Tamura, and Nei 1993). Phenetic analysis of the resulting distance matrices was performed by use of the neighbor-joining (NJ) method in MEGA (Saitou and Nei 1987), with pairwise deletion of missing sites. The bootstrap percentages from 1,000 pseudoreplicates were calculated in MEGA. We constructed NJ trees separately for the different domains (coding versus noncoding) of the Pokey transposon to determine the presence of any region-specific differences in the pattern of Pokey sequence divergence.
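The Kimura two-parameter distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively (Kimura 1980). A minimal Python sketch of this calculation is given below. It is not the MEGA implementation; the function name is ours, and we assume that any site carrying a gap or ambiguous base in either sequence is skipped, which is one reading of pairwise deletion.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # pairwise deletion: skip gaps and ambiguous bases
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n
    q = transversions / n
    # math.log raises ValueError if the sequences are too divergent
    # (saturated) for the model to yield a finite distance.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (C/T) in 10 comparable sites: the corrected distance
# is slightly larger than the uncorrected proportion of 0.1.
print(round(k2p_distance("ACGTACGTAC", "ACGTACGTAT"), 4))
```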
We rooted all phylogenetic trees through the sequence of D. ambigua based on Colbourne and Hebert's (1996) phylogenetic analysis of the genus Daphnia from North America. We performed tests of neutrality on the ORF region of the Pokey sequences by application of the Z-test (Nei and Kumar 2000) as implemented in MEGA. The Nei-Gojobori p-distance model with pairwise deletion of missing data was used to estimate the number of the synonymous and nonsynonymous substitutions between pairs of sequences.
Statistical Analysis of Phylogenetic Congruency
We used three statistical approaches to test whether the topologies constructed from the Pokey sequences and the gene sequences from the Daphnia "host" species are congruent. The first approach, ParaFit (Legendre, Desdevises, and Bazin 2002), which is available on-line at http://www.fas.umontreal.ca/biol/casgrain/en/labo/parafit.html, evaluates the correlation between sequence divergence in the "parasite" Pokey and its host species. Both the overall coevolutionary structure (ParaFitGlobal statistic) and the significance of individual host-parasite links (ParaFitLink1 statistic) are evaluated. We first generated sequence divergence matrices for the Pokey sequences and the mtSSU rRNA gene sequences from the Daphnia hosts in MEGA and then translated them into principal coordinate data with the program DistPCoA (http://www.fas.umontreal.ca/biol/casgrain/en/labo/distpcoa.html). The principal coordinate data were then used in the ParaFit analysis. Correction for any negative eigenvalues was obtained by the Lingoes method. The number of permutations in the ParaFit analysis was 999.
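The choice of 999 permutations determines the resolution of the resulting probabilities. Under the common convention in which the observed statistic is counted as one member of the reference distribution (we assume, but have not verified, that ParaFit follows this convention), the smallest attainable P value is 1/(999 + 1) = 0.001. The Python sketch below illustrates that convention only. The function name and the placeholder permuted values are hypothetical, and the code does not compute the ParaFit statistic itself.

```python
def permutation_p_value(observed, permuted_stats):
    """One-tailed permutation P value, counting the observed statistic
    as a member of the reference distribution."""
    n_at_least = sum(1 for s in permuted_stats if s >= observed)
    return (n_at_least + 1) / (len(permuted_stats) + 1)


# Hypothetical case: none of 999 permuted statistics reaches the observed value.
placeholder_permuted = [0.0] * 999
print(permutation_p_value(0.02092, placeholder_permuted))  # 0.001
```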
The divergence estimates between Daphnia LSU rRNA gene sequences were too small to be evaluated with ParaFit. Thus, we also used the tree-mapping procedure of Page (1990) and the Shimodaira-Hasegawa (SH) test (Shimodaira and Hasegawa 1999) to evaluate congruence between the topology of the Pokey tree and the trees generated from the mtSSU rRNA and LSU rRNA genes of the Daphnia hosts. The tree-mapping analysis was done by the RECONCILE WITH TREE routine in the MAPPING program from the software package COMPONENT, available on-line at http://taxonomy.zoology.gla.ac.uk/rod/cpw/index.html. The program provides a host tree that is fully reconciled with the parasite tree under the assumption that no host switching by the parasite has occurred. When complete congruence exists between the host and parasite trees, the number of terminal nodes on the reconciled host tree will equal the number of host taxa. However, incongruencies with the parasite tree will require that host taxa be "gained" or "lost" multiple times on the reconciled tree. To evaluate the statistical significance of the association between host and parasite topologies, we used the RANDOM option to generate 1,000 random Daphnia trees, and then mapped each of these trees onto the Pokey tree as above. We compared the distribution of "gains" and "losses" for these 1,000 reconciled Daphnia trees to the number of "gains" and "losses" required for the observed Daphnia tree.
To perform the SH test, which was implemented in PAUP*, we compared the NJ tree generated from the Pokey sequences to constrained topologies generated from the mtSSU rRNA and the LSU rRNA gene sequences from the Daphnia hosts.
Results
Phylogenetic Analyses
All sequences generated for this study are available on GenBank. Accession numbers for the mtSSU rRNA gene sequences are AY626352 to AY626366, accession numbers for the LSU rRNA gene sequences are AY630599 to AY630618, and accession numbers for the Pokey sequences are AY630579 to AY630598. All of the alignments are available at the Molecular Biology and Evolution Web site as Supplementary Material online.
Pairwise sequence divergence between mtSSU rRNA gene sequences ranges from 0.2% to 21.2%. Of the total 502 nucleotide positions, 177 are polymorphic, 119 of which are phylogenetically informative. The phylogeny (fig. 2) produced from these sequences is consistent with that of Colbourne and Hebert (1996). However, support is low for a few nodes that define the relationships between major groups. Nonetheless, the monophyly of these major groups is well supported.
FIG. 2. Neighbor-joining phylogenies based on variation in Pokey and the mtSSU rRNA gene from species in the subgenus Daphnia. Solid lines denote significant coevolutionary relationships between parasite (Pokey) and host (Daphnia), whereas dotted lines represent nonsignificant host-parasite links. Numbers beside major nodes represent bootstrap support (1,000 pseudoreplicates) from neighbor-joining and maximum-parsimony analyses and clade credibility values from Bayesian analysis (NJ/MP/BA). Asterisks indicate no support in the MP or BA analyses for the topology shown. The scale bar indicates sequence divergence. Taxon labels are as shown in table 1. Pokey sequences obtained from cloned PCR products are denoted with a clone number as in D. pileata-OK-C1. Significant rate variation occurs across sites in Pokey based on the HKY85 model of DNA substitution (number of substitution types = 2) and the gamma model (P < 0.000001). The gamma distribution shape parameter is 0.5513 and the Ti/tv ratio is 0.9671. Significant rate variation occurs across sites in the mtSSU rRNA gene based on the TVM model of DNA substitution (number of substitution types = 6) and the gamma model (P < 0.000001). The gamma distribution shape parameter is 0.2382 and the substitution rate matrix is R[A-C] = 1.0834, R[A-G] = 11.7885, R[A-T] = 2.7280, R[C-G] = 0.1182, R[C-T] = 11.7885, R[G-T] = 1.0000
Because of the slow evolution of reproductive isolation in the genus Daphnia, interspecific hybridization is not uncommon (Colbourne and Hebert 1996). As mitochondria exhibit uniparental transmission, cases of hybridization could go undetected. In addition, support is low for some nodes in the mitochondrial trees. Thus, we also constructed a phylogeny based on a nuclear gene. For the combined LSU rRNA gene fragments, 142 nucleotides of a total 1,481 are polymorphic, 55 of which are phylogenetically informative. Pairwise sequence divergence for these sequences ranges from 0.1% to 6.4%. Phylogenies produced from the LSU rRNA gene (fig. 3) are generally consistent with the trees constructed from the mtSSU rRNA gene. Although support is low for a few nodes that define relationships among major groups, the monophyly of those major groups is well supported. The only major difference between the two Daphnia phylogenies concerns the sister group relationship between D. parvula/D. retrocurva and the D. obtusa group in the LSU rRNA gene tree. In the mtSSU rRNA gene tree, D. parvula/D. retrocurva clusters with D. catawba/D. minnehaha (fig. 2).
FIG. 3. Neighbor-joining phylogenies based on variation in Pokey and the LSU rRNA gene from species in the subgenus Daphnia. Dotted lines connect Pokey isolates to corresponding host genomes. These relationships were not tested statistically and are included for display only. Numbers beside major nodes represent bootstrap support (1,000 pseudoreplicates) from neighbor-joining and maximum-parsimony analyses and clade credibility values from Bayesian analysis (NJ/MP/BA). Asterisks indicate no support in the MP or BA analyses for the topology shown. The scale bar indicates sequence divergence. Significant rate variation occurs across sites in the LSU rRNA gene based on the TrN model of DNA substitution (number of substitution types = 6) and the invariant-gamma model (P = 0.000002). The gamma distribution shape parameter is 0.7347, the proportion of invariable sites is 0.7437, and the substitution rate matrix is R[A-C] = 1.0000, R[A-G] = 3.9170, R[A-T] = 1.0000, R[C-G] = 1.0000, R[C-T] = 7.9080, R[G-T] = 1.0000. Details for the Pokey tree are given in figure 2
In the Pokey sequence alignment, 824 nucleotides of 1,711 are polymorphic, 525 of which are phylogenetically informative. The pairwise divergence between unique sequences ranges from 0.1% to 44.0%. This sequence contains approximately 355 bp of the 3' end of the ORF, which may encode a transposase, and approximately 1,350 bp of noncoding DNA. The pairwise sequence divergence for the ORF is much smaller than that of the noncoding sequence and ranges from 0% to 18% (average of 6.5%). That of the noncoding region ranges from 0.1% to 54.7% (average of 23.2%).
Most Pokey sequences share the same stop codon. However, D. obtusa (NA1 and NA2) and D. catawba possess different deletions that both lead to the same downstream stop codon, which adds 7 and 8 amino acids (aa) to the proteins, respectively. On the other hand, D. pileata has a 1-bp insertion that leads to a premature stop codon, which truncates the protein by 3 aa. Finally, D. ambigua has a large deletion at the 3' end of the ORF, which leads to the absence of a nearby stop codon. Because of these differences, an ORF data set was constructed that contained the 315 nucleotide positions up to and including the codon directly upstream of the large deletion in the sequence of D. ambigua. This region contains 78 variable nucleotide sites leading to 24 aa changes. The majority of the nucleotide changes (66.67%) are at third codon positions (52 of 78), whereas 17 (21.79%) and 9 (11.54%) are at first and second codon positions, respectively. The high proportion of third position differences suggests that the ORF is conserved because of functional constraint. Indeed, Penton, Sullender, and Crease (2002) and Sarkar et al. (2003) have shown that the putative protein encoded by this ORF is similar to the functional transposase of the piggyBac element. Moreover, Sarkar et al. (2003) found a putative DDD amino acid motif located in the middle of a conserved core region (D268, D346, and D447) of the transposases from the piggyBac family. The widely documented DDD motif is believed to be the functional catalytic domain of the transposase proteins from the Tc1 and mariner groups (Doak et al. 1994; Robertson and Lampe 1995). As this study only examined the 3' end of the ORF, we could not detect the entire DDD motif. However, the third aspartic acid (D) residue is present in all Pokey sequences examined here.
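The codon-position percentages quoted above follow directly from the reported counts, as the short Python check below shows. The counts are taken from the text; the function and variable names are ours.

```python
# Variable sites by codon position in the 315-bp ORF data set (counts from the text).
VARIABLE_SITES = {"first": 17, "second": 9, "third": 52}


def position_percentages(counts):
    """Percentage of variable sites falling at each codon position."""
    total = sum(counts.values())
    return {pos: round(100 * n / total, 2) for pos, n in counts.items()}


print(sum(VARIABLE_SITES.values()))          # 78
print(position_percentages(VARIABLE_SITES))  # first 21.79, second 11.54, third 66.67
```

If substitutions were spread evenly across codon positions, roughly one-third would fall at each; the excess at third positions, where many changes are synonymous, is what points to functional constraint.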
An overall Z-test of neutrality that included all Pokey sequences indicated that purifying selection, rather than positive selection, is responsible for the pattern of nucleotide substitution in the ORF across species (Z = 5.445, P ≪ 0.001). Because some species were represented by multiple sequences, we repeated the analysis but only included one sequence for each species (D. pulex-ON for D. pulex/D. pulicaria, Euro D. pulex-GR2-C9, D. obtusa NA1-OK-C1, D. obtusa NA2-IL2-C4, and D. parvula-ON). Again, the results strongly suggest that purifying selection is responsible for the pattern of nucleotide substitution observed among the species (Z = 6.53, P ≪ 0.001).
Trees constructed from the Pokey sequences by different phylogenetic methods are generally consistent with one another and the support is high for most nodes (fig. 2). Two additional Pokey trees were constructed; one based entirely on the coding regions from the ORF (see below) and the other based on the noncoding DNA (data not shown). These two phylogenies are generally consistent with the one shown in figure 2, which indicates a lack of region-specific differences in the pattern of Pokey sequence divergence.
Statistical Analysis of Phylogenetic Congruency
The ParaFit global test of an association between Pokey sequence divergence and that of its host's mtSSU rRNA gene is highly significant (i.e., coevolutionary) (ParaFitGlobal = 0.02092, P = 0.001). The analysis of individual host-parasite links (fig. 2) shows only the D. cheraphila link to be nonsignificant (i.e., random) (ParaFitLink1 = 0.00065, P = 0.115), even though the topology of the two trees is congruent in this region.
Fourteen Daphnia sequences appear on the mtSSU rRNA gene tree (Euro D. pulex was only included once in the topology) and 17 sequences appear on the Pokey tree used in the tree-mapping analysis. The reconciled Daphnia tree requires the addition of 33 nodes and the loss of 11 nodes. The frequency distribution of nodes added in 1,000 random Daphnia trees ranges from 66 to 151, and the distribution of nodes lost ranges from 31 to 74. Both of these ranges are significantly higher than the number of gains and losses observed in the reconciled Daphnia tree. Thus, the association between the host mtSSU rRNA gene tree and the Pokey tree is highly nonrandom. Twelve Daphnia sequences appear on the LSU rRNA gene tree (this gene was not analyzed in D. cheraphila and D. arenata), and, thus, the Pokey sequences were correspondingly reduced to 15. The reconciled Daphnia tree requires the addition of 23 nodes and the loss of 11 nodes. The frequency distribution of nodes added in 1,000 random Daphnia trees ranges from 38 to 105, and the distribution of nodes lost ranges from 21 to 60. Again, the association between the host LSU rRNA gene tree and the Pokey tree is highly nonrandom.
The only major difference between the Pokey tree and the Daphnia LSU rRNA gene tree is the position of D. pileata, but two major differences exist between the Pokey tree and the Daphnia mtSSU rRNA gene tree: (1) the relationships among D. pulex/D. pulicaria/D. arenata, Euro D. pulicaria, and Euro D. pulex and (2) the position of the D. parvula/D. retrocurva group. A separate SH test was used to determine whether each of these three differences was significant. A single representative of Pokey from each species was used in this analysis (D. pulex-ON for D. pulex/D. pulicaria/D. arenata, Euro D. pulex-GR2-C9, D. obtusa NA1-OK-C1, D. obtusa NA2-IL2-C4, and D. parvula-ON). The difference between the LSU rRNA gene topology and the Pokey topology is not significant, but both differences between the mtSSU rRNA gene topology and the Pokey topology are significant (table 2).
Table 3 Mean Sequence Divergence (Above the Diagonal) Among Pokey Elements from Groups of Daphnia Identified on the Neighbor-Joining Phylogram.
Intraspecific Variation in Pokey
To determine the level of Pokey sequence divergence within a geographically widespread species, we analyzed a smaller segment (763 bp) of the 3' end of Pokey from additional isolates of D. obtusa from across its North American range (table 1 and fig. 4). Based on analysis of the mitochondrial cytochrome c oxidase subunit I gene, Penton, Hebert, and Crease (2004) showed that D. obtusa in the United States diverged into two morphologically cryptic species, denoted NA1 and NA2, at least 12 MYA. Furthermore, D. obtusa NA1 radiated into four lineages with largely allopatric distributions during the Pleistocene (< 1 MYA). Even so, the mean sequence divergence between Pokey sequences from all D. obtusa isolates, including the two species, is only 0.7%. However, cloned sequences from two isolates, NA1-TN and NA2-IL2, are substantially more divergent from the others with a mean sequence divergence of 1.2%. When these two isolates are removed from the analysis, the mean divergence between the remaining Pokey sequences decreases to 0.4%.
FIG. 4. Map showing the 13 collection sites of North American Daphnia obtusa used in this study
We cloned and sequenced the larger Pokey fragment (1,618 bp) from isolates NA1-NV, NA1-OK, NA1-TN, NA2-IL2, and NA2-OH. The mean divergence between these larger Pokey fragments is 3.9%. This increase reflects the fact that the additional sequence includes a hypervariable region. In this case, the NA1-TN and NA2-IL2 Pokey fragments show 6.6% sequence divergence from the other three sequences, whereas the mean divergence among the other sequences is only 0.9%. These results indicate the existence of at least two lineages of Pokey elements in D. obtusa and that both occur in each of the two species.
Inspection of the electropherograms generated by direct sequencing of PCR fragments from several of the other D. obtusa isolates shows that "double peaks" often occur at nucleotide positions where the two types of Pokey elements differ. In fact, the sequence in one highly variable region where most of the differences occur was impossible to read because of the presence of both of these lineages within these individuals. This situation is similar to the one seen in D. pulicaria, where two different lineages of Pokey elements, differing by 6% sequence divergence, were PCR amplified and cloned from the genome of a single individual (Penton, Sullender, and Crease 2002). These two sequences are included in the Pokey tree in figures 2 and 3 (D. pulicaria-SK-C1 and D. pulicaria-SK-C2). Note that the sequences from D. pulex are clearly more closely related to the SK-C1 sequence. To date, no indication has emerged that the second lineage (SK-C2) is present in the D. pulex isolates that have been sampled, but a conclusion that it has been lost from this species altogether is clearly premature.
Two isolates of D. parvula (table 1) were also included in the Pokey analysis, one from Ontario (ON) and a second from Arkansas (AR), and the divergence between them is 0.8%. In contrast, the sequence divergence between Pokey elements from different species is often substantial (table 3). For example, sequence divergence ranges from 39.1% to 42.6% (average of 41.0%) between Pokey from the most divergent taxon, D. ambigua, and from all of the other taxa. Overall, this finding suggests that each species contains one or more highly homogeneous lineages of rDNA Pokey elements whose interspecific sequence divergence is highly correlated with that of their hosts.
A New Paralogous Pokey Lineage
Two of the cloned Pokey elements from D. obtusa NA2-OH that we sequenced were so divergent from the others that we were not able to completely align their noncoding regions. An NJ phylogeny produced from the ORF region (fig. 5) shows that these elements form a sister group to the sequences obtained from all of the other species in the subgenus, and, thus, they represent a new paralogous Pokey family in the rDNA of the subgenus Daphnia. This new family is hereafter denoted as PokeyB, and the original family is denoted as PokeyA.
FIG. 5. Neighbor-joining phylogeny based on a portion of the open reading frame from Pokey elements in the rDNA of species from the subgenus Daphnia. The sequence alignment includes a 315-bp segment from the 3' end of the putative Pokey transposase protein. Numbers beside nodes represent bootstrap support from 1,000 pseudoreplicates. The scale bar indicates nucleotide sequence divergence
We incorporated the ORF sequence of PokeyB into the alignment of known piggyBac elements (which includes PokeyA) generated by Sarkar et al. (2003), calculated a pairwise matrix of the number of amino acid differences, and then used it to generate an NJ tree (data not shown). PokeyA and PokeyB cluster with one another with strong bootstrap support (999/1,000 replicates) on this tree, which indicates that they are both members of the same major group of transposons.
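The first step described above, a pairwise matrix of the number of amino acid differences, can be sketched as follows. This is an illustrative Python example only: the function names and the toy peptides are hypothetical, skipping gapped positions is an assumed convention, and the tree itself would still be built from the matrix with standard phylogenetic software.

```python
from itertools import combinations


def aa_differences(seq_a, seq_b):
    """Count aligned positions where two protein sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Positions with a gap in either sequence are skipped (assumed convention).
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != "-" and b != "-" and a != b
    )


def difference_matrix(alignment):
    """Build a symmetric matrix of pairwise differences from {name: sequence}."""
    names = list(alignment)
    matrix = {name: {name: 0} for name in names}
    for a, b in combinations(names, 2):
        d = aa_differences(alignment[a], alignment[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


if __name__ == "__main__":
    # Toy aligned peptides for illustration only; not real transposase data.
    toy = {"elemA": "MKT-L", "elemB": "MRTAL", "elemC": "MKSAL"}
    matrix = difference_matrix(toy)
    for name in toy:
        print(name, [matrix[name][other] for other in toy])
```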
The D. obtusa NA2-OH isolate from which PokeyB was isolated also contains PokeyA elements (fig. 5), so clearly these two families can coexist in the same genome. Analysis of the complete nucleotide fragment for PokeyB (fig. 6) reveals that the 3' end of its ORF is intact, although the location of its stop codon would produce a protein that is 11 aa shorter than the one produced by the PokeyA elements from most species. Sequence divergence between the consensus ORF of PokeyA and PokeyB is only 16%, whereas that of the noncoding DNA is greater than 50%. This finding suggests that the ORF is conserved across Pokey families because of functional constraint, and that PokeyB is likely to be an autonomous transposon family. A hypervariable region that is not alignable across the two Pokey families is also highly variable within PokeyA and is, thus, not likely to be necessary for transposon function.
FIG. 6. Schematic diagram of the alignment between the paralogous PokeyA and PokeyB families from species in the subgenus Daphnia. Percentages represent the amount of sequence divergence between different regions of the two families. Solid boxes represent the 16-bp 3'-terminal inverted repeats, and the vertically striped boxes denote a hypervariable region that is not alignable between the two families. Stippled boxes represent a segment of the 3' end of a putative transposase open reading frame. The total number of amino acid substitutions found between the consensus transposase sequences of PokeyB and PokeyA is given
Discussion
Colbourne and Hebert (1996) found weak support for the relationships among major groups in their phylogenetic reconstruction of North American Daphnia. Similarly, the relationships between major groups are not well supported in either the mtSSU rRNA gene or the LSU rRNA gene phylogenies constructed here (figs. 2 and 3). A comparison between these two host phylogenies reveals one major difference, the sister group relationship of D. parvula/D. retrocurva to the D. obtusa group in the LSU rRNA gene tree. Hybridization is unlikely to be the explanation for this occurrence, as these events have only been documented between species possessing mtSSU rRNA gene sequence divergence of less than 14% (Colbourne and Hebert 1996, and references therein). Other than this difference, and despite the low support of relationships among major groups on both trees, phylogenies based on the mtSSU rRNA gene and the LSU rRNA gene are consistent with one another. In addition, trees based on Pokey sequences are generally congruent with the phylogenies based on its host genome (figs. 2 and 3). Furthermore, the sister group relationship of D. parvula/D. retrocurva to the D. obtusa group is strongly supported on the Pokey tree. Indeed, the Pokey elements from all members of this monophyletic clade except D. cheraphila contain the same 74-bp insertion.
Two significant differences exist between the Pokey tree and the mtSSU rRNA gene tree, but the position of D. pileata is not well resolved in the latter. Moreover, the discrepancy in relationships among members of the D. pulex group could also be explained by hybridization, which is known to occur in this group (Colbourne and Hebert 1996). Overall, no conclusive evidence of horizontal transfer was detected, and the results of the ParaFit and Component analyses revealed a coevolutionary association between Pokey and its host. Altogether, these results suggest that Pokey has persisted in the rDNA locus of Daphnia species via stable vertical transmission.
Rare horizontal transfers of Pokey may occur, but they appear not to be necessary for the long-term survival of this element in the genomes of Daphnia species. The age of the genus Daphnia is on the order of 50 to 100 Myr (Colbourne and Hebert 1996). Thus, we can infer that Pokey has existed in the rDNA of species in the subgenus Daphnia for long periods of evolutionary time. Whether this pattern of long-term persistence, or even Pokey itself, occurs outside of this subgenus is unknown. Although PCR-based attempts to detect the presence of Pokey in species from the other two subgenera of Daphnia, Hyalodaphnia and Ctenodaphnia, yielded negative results (unpublished data), the failure to detect the element could reflect a failure of the PCR primers to bind to sequences from such distantly related taxa. Studies that use Southern hybridization are now underway to determine the distribution of Pokey beyond the subgenus Daphnia.
The pattern of evolution in Pokey contrasts with the pattern documented in the two best-known DNA transposons, P and mariner. These two elements belong to vastly different superfamilies of class II elements, yet both exhibit short life cycles within a species and horizontal transfer of element copies between different species. For example, a horizontal transfer event of a P element from the genome of Drosophila willistoni to Drosophila melanogaster has been documented (Daniels et al. 1990; Clark and Kidwell 1997). In this case, P elements from the genomes of the two fly species differ by only a single nucleotide position despite the 30 to 50 Myr of divergence between them. Likewise, numerous horizontal transfer events have been described for mariner, a member of the DD34D variant of the DD35E motif group. For example, a mariner-like element from the earwig Forficula auricularia is more than 90% similar to that from the honeybee despite more than 265 Myr of divergence between these insect species (Robertson 1993; Robertson and MacLeod 1993). Horizontal transfer of mariner elements has even been demonstrated between species from distant phyla (Hartl, Lohe, and Lozovskaya 1997). Similarly, Leaver (2001) found evidence of horizontal transfer of another member of the DDD motif group, a Tc1-like element, between the genomes of frogs and fishes. Indeed, cases of horizontal transfer have been documented for Drosophila elements Minos (Arca and Savakis 2000) and Mos1-like (Brunet et al. 1999) as well as the Fusarium element Fot1 (Daboussi et al. 2002), all of which are members of the Tc1-mariner superfamily of elements.
Recent studies have questioned the prevalence of horizontal transfer; indeed, Capy, Anxolabehere, and Langin (1994) argue that phylogenetic inconsistencies may have been inaccurately attributed to horizontal transfer when they could be explained by other factors that are consistent with vertical transmission. For example, comparison of paralogous elements (i.e., different lineages), varying rates of sequence evolution of elements between species, and retention of ancestral polymorphisms (Capy, Anxolabehere, and Langin 1994) can lead to a transposon tree that is incongruent with that of the host species, even in the complete absence of horizontal transfer. The few discrepancies between the Pokey tree and the Daphnia gene trees in the present study are likely the result of one or more of the above factors, especially given the occurrence of multiple lineages of rDNA Pokey elements in the genomes of both D. pulicaria (Penton, Sullender, and Crease 2002) and D. obtusa.
The occurrence of multiple, long-lived lineages such as we observed in Pokey has also been observed in the arthropod rDNA-specific non-LTR retrotransposons R1 and R2. For example, Gentile, Burke, and Eickbush (2001) detected two paralogous lineages of R1, A and B, in five species groups of Drosophila. All 35 species surveyed contained the A lineage, whereas 11 species, from three of the five species groups, also contained the B lineage. In addition, the A lineage had diverged into two sublineages, A1 and A2, in the melanogaster species group. All the species in this group had the A1 lineage, whereas the five species in the takahashii subgroup (which is within the melanogaster species group) also contained the A2 lineage. Results obtained by Burke et al. (1993) in an analysis of elements from five divergent insect species (Bombyx mori [Lepidoptera], D. melanogaster [Diptera], Sciara coprophila [Diptera], Popillia japonica [Coleoptera], and Nasonia vitripennis [Hymenoptera]) suggest that multiple paralogous lineages of R1 can persist for even longer periods of time. Four lineages of R1 were detected in N. vitripennis, but they were not monophyletic. One of them was most closely related to the element obtained from B. mori, whereas the other three formed a monophyletic cluster that grouped with the elements from D. melanogaster and S. coprophila (figure 6 in Burke et al. [1993]). Based on this limited sampling of taxa, at least two paralogous lineages of R1 appear to exist in insects.
As is the case for Pokey elements in Daphnia, intraspecific sequence divergence between copies of R1 and R2 elements from the same lineage in Drosophila is generally less than 1%, whereas sequence divergence between lineages is much higher (Gentile, Burke, and Eickbush 2001). Originally, such homogeneity of elements was thought to be maintained by concerted evolution within rDNA, but this hypothesis made it difficult to explain how new lineages could evolve. However, Pérez-González and Eickbush (2001, 2002) showed that the sequence homogeneity within R1 and R2 lineages is most likely a function of rapid turnover of elements, which includes both elimination and insertion. Individual variants are rapidly eliminated by concerted evolution, which is primarily a consequence of intrachromosomal crossing over (Pérez-González, Burke, and Eickbush 2003), and then replaced by new copies so that sequence homogeneity is mainly a function of recent retrotransposition. Consequently, active elements give rise to progeny that diverge from one another independently, which provides opportunities for lineage sorting. The presence of a transposable element in an LSU rRNA gene results in a nonfunctional copy (Long and Dawid 1979; Kidd and Glover 1981). Thus, the number of LSU rRNA genes that can be inactivated by R1 and R2 without affecting the fitness of the host creates a limiting resource, especially at particular times during development when a very high proportion of active genes is required to meet the requirements of rRNA transcription. The ensuing competition results in the persistence of only a small number of lineages within any one species (Pérez-González and Eickbush 2001, 2002).
Gentile, Burke, and Eickbush (2001) extracted DNA from single stocks of each of the Drosophila species that they analyzed, so they were not able to obtain information on geographic variation among R1 elements. The present study shows that sequence divergence within a lineage of Pokey is extremely low even across very broad geographic areas. This condition is difficult to attribute to common ancestry from a small pool of active elements, given the substantial divergence in allozymes (Hebert and Finston 1996) and mtDNA (Penton, Hebert, and Crease 2004) that is known to have occurred among the D. obtusa populations that were surveyed. More likely, concerted evolution accounts for such homogeneity, as it does for the rDNA itself. If this conclusion is correct, how then can new lineages of Pokey evolve and persist for such long periods of time?
Nothing is known about rates of Pokey transposition or its precise mechanism. However, if Pokey transposes by the same "cut and paste" mechanism used by piggyBac (Elick, Bauser, and Fraser 1996; Lobo, Li, and Fraser 1999), then transposition is not replicative, as is retrotransposition. Thus, increases in Pokey's copy number in rDNA could be facilitated by concerted evolution or by transposition of copies from other genomic locations, as active copies of Pokey are known to occur outside of rDNA (Sullender and Crease 2001). Such exchange would also help explain how different lineages of Pokey are able to evolve despite the homogenizing effect of concerted evolution. Genomic copies of Pokey could act as a reservoir of new variants that have evolved independently of lineages currently occupying rDNA. A survey of sequence variation among rDNA and non-rDNA copies of Pokey screened from a cosmid library produced from a single individual of D. pulicaria is currently underway to determine if transposition does occur between rDNA and other genomic locations.
rDNA has been suggested to possess characteristics that make it ideal for the evolution of insertion sequences. Although the presence of a transposon within the LSU rRNA gene results in a nonfunctional copy, rDNA is highly repetitive, so individuals have more units than they need for survival. Thus, elements can exist to a certain threshold level in this specific location within a genome without any noticeable phenotypic effects at the organismal level (Eickbush and Eickbush 1995; Malik and Eickbush 1999). High levels of insertion of R1 and R2 have been exhibited in Drosophila mercatorum and D. melanogaster/D. hydei, where they are associated with the abnormal abdomen and bobbed phenotypes, respectively (Franz and Kunz 1981; Templeton et al. 1989; Lathe et al. 1995; Malik and Eickbush 1999). Overall, however, targeting a site within a multigene family reduces the risk that the element will insert into single-copy genes and adversely affect the phenotype. In addition, the rRNA genes are actively transcribed, ensuring expression of the element's transposase, and highly conserved, ensuring a population of uniform future target sites (Eickbush and Eickbush 1995). This feature is a definite advantage as transposons with nonspecific insertion sites are at risk of transposing into areas of the genome that are not transcriptionally active, such as heterochromatin.
Overall, the characteristics of rDNA combine to define a genomic location that is advantageous for the propagation of elements. Indeed, phylogenetic reconstruction that involves complete R2 elements from various arthropod species revealed that they have been present in these lineages from the origin of the phylum, some 500 MYA (Burke et al. 1999). A subsequent analysis of the RT domain from non-LTR retrotransposons identified 11 distinct clades and dated the origin of this superfamily of elements to the Precambrian era (Malik, Burke, and Eickbush 1999). No evidence for horizontal transfer was found either within or between clades. Malik, Burke, and Eickbush (1999) suggested that the lack of horizontal transfer and subsequent vertical nature of inheritance of non-LTR retrotransposons could be caused by their target-primed reverse transcription mechanism. However, Zupunski, Gubensek, and Kordis (2001) recognized cases of horizontal transfer in the RTE clade of non-LTR retrotransposons, which indicates that this superfamily of elements is indeed capable of horizontal transfer. Thus, the lack of horizontal transfer in R1 and R2 may be more a function of their occupation of rDNA than of their transposition mechanism. Indeed, Pokey is a very different type of element from R1 and R2, and yet its pattern of evolution in rDNA is very similar to theirs, which suggests that the rDNA location itself has a major impact on the evolution of the elements that insert within it.
The presence of multiple, divergent elements within a specific region of the rDNA unit suggests that it is a "hot spot" for insertional mutations. This region is small, only about 100 bp in length (fig. 1), and occurs in a core region (domain V) upstream of the D9 expansion segment (Gray and Schnare 1990) in a highly conserved section of the LSU rRNA gene, about one-third of the way upstream from its 3' end. The rDNA-specific transposons insert into both helices and into unpaired regions between helices. However, which specific features of this region of the rDNA unit account for its susceptibility to insertion by transposons is still unclear. Sullender (1993) showed that Pokey inserts into one particular TTAA target site in this region, even though other target sites exist in the LSU rRNA gene; in fact, another TTAA is just downstream of the actual insertion site. If Pokey did insert there, we would be able to detect it with the PCR primers that we use.
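The observation that several TTAA sites exist in the LSU rRNA gene, but only one is used, can be explored by listing every candidate site in a sequence. The Python below is a minimal sketch with a hypothetical function name and a made-up example sequence; it does not use the actual Daphnia LSU rRNA sequence. Because TTAA is its own reverse complement, scanning one strand is sufficient.

```python
def find_target_sites(sequence, motif="TTAA"):
    """Return the 0-based start positions of every occurrence of the motif."""
    seq = sequence.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one base later so that overlapping matches are not missed
        # if a different, self-overlapping motif is supplied.
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    example = "GGTTAACCTTAATTAA"
    for pos in find_target_sites(example):
        print(f"TTAA at position {pos}")
```

Comparing the flanking sequence around each reported position would be one way to ask what distinguishes the site that Pokey actually uses from the others.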
Before the discovery of Pokey, no DNA transposon was known to have such site specificity (Eickbush and Malik 2002), although Pokey does not seem to be site specific in other genomic locations beyond the specificity for TTAA. We have sequenced several hundred base pairs of the 3' flanking region of four non-rDNA Pokey elements screened from the D. pulicaria cosmid library (unpublished data) and found no primary sequence similarity among these sequences and the LSU rRNA gene immediately downstream from the TTAA target site. However, further analysis is required to determine whether these sequences share other attributes to which Pokey is attracted. Even so, the fact that Pokey does occur at other genomic locations may have provided the advantage that it required to establish itself in the rDNA "hot spot," despite the occupation of this region by the highly successful R1 and R2. In fact, the insertion site of R2 begins one nucleotide downstream from Pokey's TTAA site (fig. 1), which suggests the possibility of competition between these elements. Preliminary PCR analysis suggests that R2 does occur in Daphnia (unpublished data). Thus, future study of the interaction between these elements within the rDNA of a single species may provide important insights into competition and survival patterns of transposable elements.
Supplementary Material
Both sequential (FastA) and interleaved Nexus files of the sequences used in this study are available at the Molecular Biology and Evolution Web site as Supplementary data.
Supplementary Data File 1. supplement-mbe040092–01.txt: Nucleotide FASTA file of 15 Daphnia mtSSU rRNA gene sequences and Interleaved NEXUS alignment file of 15 Daphnia nucleotide mtSSU rRNA gene sequences.
Supplementary Data File 2. supplement-mbe040092–02.txt: Nucleotide FASTA file of 12 Daphnia LSU rRNA gene sequences and Interleaved NEXUS alignment file of 12 Daphnia nucleotide LSU rRNA gene sequences.
Supplementary Data File 3. supplement-mbe040092–03.txt: Nucleotide FASTA file of 22 Pokey sequences from Daphnia and Interleaved NEXUS alignment file of 22 nucleotide Pokey sequences from Daphnia.
Table 2 Comparison of Different Pokey Topologies by the Shimodaira-Hasegawa Test.
Acknowledgements
Financial support for this work was provided by a research grant from the Natural Sciences and Engineering Research Council to T.J.C. J. Colbourne, P. Hebert, C. Prokopowich, D. Taylor, and L. Weider generously provided samples of Daphnia. We thank A. Holliss for sequencing, Jeremy DeWaard for his assistance with the SH tests, and Hope Hollocher, Denis Lynn, and two anonymous reviewers for comments on earlier drafts of the manuscript.
Literature Cited
Arca, B., and C. Savakis. 2000. Distribution of the transposable element Minos in the genus Drosophila. Genetica 108:263-267.
Arnheim, N. 1983. Concerted evolution of multigene families. Pp. 38–61 in M. Nei and R. K. Koehn, eds. Evolution of genes and proteins. Sinauer Associates, Sunderland, Mass.
Beames, B., and M. D. Summers. 1990. Sequence comparison of cellular and viral copies of host cell DNA insertions found in Autographa californica nuclear polyhedrosis virus. Virology 174:354-363.
Brunet, F., F. Godin, C. Bazin, and P. Capy. 1999. Phylogenetic analysis of Mos1-like transposable elements in the Drosophilidae. J. Mol. Evol. 49:760-768.
Burke, W. D., D. G. Eickbush, Y. Xiong, J. Jakuczak, and T. H. Eickbush. 1993. Sequence relationship of retrotransposable elements R1 and R2 within and between divergent insect species. Mol. Biol. Evol. 10:163-185.
Burke, W. D., H. S. Malik, J. P. Jones, and T. H. Eickbush. 1999. The domain structure and retrotransposition mechanism of R2 elements are conserved throughout arthropods. Mol. Biol. Evol. 16:502-511.
Burke, W. D., H. S. Malik, W. C. Lathe, III, and T. H. Eickbush. 1998. Are retrotransposons long-term hitchhikers? Nature 392:141-142.
Burke, W. D., F. Muller, and T. H. Eickbush. 1995. R4, a non-LTR retrotransposon specific to the large subunit rRNA genes of nematodes. Nucleic Acids Res. 23:4628-4634.
Capy, P., D. Anxolabehere, and T. Langin. 1994. The strange phylogenies of transposable elements: Are horizontal transfers the only explanation? Trends Genet. 10:7-12.
Cary, L. C., M. Goebel, H. H. Corsaro, H. H. Wang, E. Rosen, and M. J. Fraser. 1989. Transposon mutagenesis of baculovirus: analysis of Trichoplusia ni transposon IFP2 insertions within the FP-locus of nuclear polyhedrosis virus. Virology 172:156-169.
Clark, J. B., and M. G. Kidwell. 1997. A phylogenetic perspective on P transposable element evolution in Drosophila. Proc. Natl. Acad. Sci. USA 94:11428-11433.
Coen, E. S., J. M. Thoday, and G. Dover. 1982. Rate of turnover of structural variants in the rDNA gene family of Drosophila melanogaster. Nature 295:564-568.
Colbourne, J. K., and P. D. N. Hebert. 1996. The systematics of North American Daphnia (Crustacea: Anomopoda): a molecular phylogenetic approach. Philos. Trans. R Soc. Lond. B Biol. Sci. 351:349-360.
Daboussi, M. J., J. M. Daviere, S. Graziani, and T. Langin. 2002. Evolution of the Fot1 transposons in the genus Fusarium: discontinuous distribution and epigenetic inactivation. Mol. Biol. Evol. 19:510-520.
Daniels, S. B., K. R. Peterson, L. D. Strausbaugh, M. G. Kidwell, and A. Chovnick. 1990. Evidence for horizontal transmission of the P transposable element between Drosophila species. Genetics 124:339-355.
Doak, T. G., F. P. Doerder, C. L. Jahn, and G. Herrick. 1994. A proposed superfamily of transposase genes: transposon-like elements in ciliated protozoa and a common "D35E" motif. Proc. Natl. Acad. Sci. USA 91:942-946.
Dvorak, J., D. Jue, and M. Lassner. 1987. Homogenization of tandemly repeated nucleotide sequences by distance dependent nucleotide sequence conversion. Genetics 116:487-498.
Eickbush, T. H. 2002. R2 and related site-specific non-long terminal repeat retrotransposons. Pp. 813–835 in N. Craig, R. Craigie, M. Gellert, and A. Lambowitz, eds. Mobile DNA II. American Society of Microbiology Press, Washington, DC.
Eickbush, D. G., and T. H. Eickbush. 1995. Vertical transmission of the retrotransposable elements R1 and R2 during the evolution of the Drosophila melanogaster species subgroup. Genetics 139:671-684.
Eickbush, T. H., and H. S. Malik. 2002. Origins and evolution of retrotransposons. Pp. 1111–1144 in N. Craig, R. Craigie, M. Gellert, and A. Lambowitz, eds. Mobile DNA II. American Society of Microbiology Press, Washington, DC.
Elick, T. A., C. A. Bauser, and M. J. Fraser. 1996. Excision of the piggyBac transposable element in vitro is a precise event that is enhanced by the expression of its encoded transposase. Genetica 98:33-41.
Franz, G., and W. Kunz. 1981. Intervening sequences in ribosomal RNA genes and bobbed phenotype in Drosophila hydei. Nature 292:638-640.
Gentile, K. L., W. D. Burke, and T. H. Eickbush. 2001. Multiple lineages of R1 retrotransposable elements can coexist in the rDNA loci of Drosophila. Mol. Biol. Evol. 18:235-245.
Gonzalez, P., and H. A. Lessios. 1999. Evolution of sea urchin retroviral-like (SURL) elements: evidence from 40 echinoid species. Mol. Biol. Evol. 16:938-952.
Gray, M. W., and M. N. Schnare. 1990. Evolution of the modular structure of rRNA. Pp. 589–597 in W. Hill, A. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger, and J. R. Warner, eds. The ribosome: structure, function and evolution. American Society of Microbiology Press, Washington, D.C.
Hall, T. 1999. BioEdit: a user friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids. Symp. Ser. 41:95-98.
Hartl, D. L., A. R. Lohe, and E. R. Lozovskaya. 1997. Modern thoughts on an ancyent marinere: function, evolution, regulation. Annu. Rev. Genet. 31:337-358.
Hebert, P. D. N., and T. J. Crease. 1980. Clonal coexistence in Daphnia pulex (Leydig): another planktonic paradox. Science 207:1363-1365.
Hebert, P. D. N., and T. L. Finston. 1996. Genetic differentiation in Daphnia obtusa: a continental perspective. Freshwater Biol. 35:311-321.
Huelsenbeck, J. P. 2000. MrBayes: Bayesian inference of phylogeny. Distributed by the author, Department of Biology, University of Rochester.
Jakubczak, J. L., W. D. Burke, and T. H. Eickbush. 1991. Retrotransposable elements R1 and R2 interrupt the rRNA genes of most insects. Proc. Natl. Acad. Sci. USA 88:3295-3299.
Kidd, S. J., and D. M. Glover. 1981. Drosophila melanogaster ribosomal DNA containing type II insertions is variably transcribed in different strains and tissues. J. Mol. Biol. 151:645-662.
Kidwell, M. G., and D. R. Lisch. 2001. Perspective: transposable elements, parasitic DNA, and genome evolution. Evolution 55:1-24.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kumar, S., K. Tamura, and M. Nei. 1993. MEGA: molecular evolutionary genetics analysis. Version 1.02. The Pennsylvania State University, University Park, Pa.
Lathe, W. C., III, W. D. Burke, D. G. Eickbush, and T. H. Eickbush. 1995. Evolutionary stability of the R1 retrotransposable element in the genus Drosophila. Mol. Biol. Evol. 12:1094-1105.
Leaver, M. J. 2001. A family of Tc1-like transposons from the genomes of fishes and frogs: evidence for horizontal transfer. Gene 271:203-214.
Legendre, P., Y. Desdevises, and E. Bazin. 2002. A statistical test for host-parasite coevolution. Syst. Biol. 51:217-234.
Lobo, N., X. Li, and M. J. Fraser. 1999. Transposition of the piggyBac element in embryos of Drosophila melanogaster, Aedes aegypti and Trichoplusia ni. Mol. Gen. Genet. 261:803-810.
Lohe, A. R., E. N. Moriyama, D. A. Lidholm, and D. L. Hartl. 1995. Horizontal transmission, vertical inactivation, and stochastic loss of mariner-like transposable elements. Mol. Biol. Evol. 12:62-72.
Long, E. O., and I. B. Dawid. 1979. Expression of ribosomal DNA insertions in Drosophila melanogaster. Cell 18:1185-1196.
Malik, H. S., and T. H. Eickbush. 1999. Retrotransposable elements R1 and R2 in the rDNA units of Drosophila mercatorum: abnormal abdomen revisited. Genetics 151:653-665.
Malik, H. S., W. D. Burke, and T. H. Eickbush. 1999. The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 16:793-805.
Nei, M., and S. Kumar. 2000. Molecular evolution and phylogenetics. Oxford University Press, New York.
Omilian, A. R., and D. J. Taylor. 2001. Rate acceleration and long-branch attraction in a conserved gene of cryptic Daphniid (Crustacea) species. Mol. Biol. Evol. 18:2201-2212.
Page, R. D. M. 1990. Temporal congruence and cladistic analysis of biogeography and cospeciation. Syst. Zool. 39:205-226.
Penton, E. H., P. D. N. Hebert, and T. J. Crease. 2004. Mitochondrial DNA variation in North American populations of Daphnia obtusa: continentalism or cryptic endemism? Mol. Ecol. 13:97-107.
Penton, E. H., B. W. Sullender, and T. J. Crease. 2002. Pokey, a new DNA transposon in Daphnia (Cladocera: Crustacea). J. Mol. Evol. 55:664-673.
Pérez-González, C. E., W. D. Burke, and T. H. Eickbush. 2003. R1 and R2 retrotransposition and deletion in the rDNA loci on the X and Y chromosomes of Drosophila melanogaster. Genetics 165:675-685.
Pérez-González, C. E., and T. H. Eickbush. 2001. Dynamics of R1 and R2 elements in the rDNA locus of Drosophila simulans. Genetics 158:1557-1567.
Pérez-González, C. E., and T. H. Eickbush. 2002. Rates of R1 and R2 retrotransposition and elimination from the rDNA locus of Drosophila melanogaster. Genetics 162:799-811.
Pearson, W. R., T. Wood, Z. Zhang, and W. Miller. 1997. Comparison of DNA sequences with protein sequences. Genomics 46:24-36.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817-818.
Robertson, H. M. 1993. The mariner element is widespread in insects. Nature 362:241-245.
Robertson, H. M., and D. J. Lampe. 1995. Recent horizontal transfer of a mariner transposable element among and between Diptera and Neuroptera. Mol. Biol. Evol. 12:850-862.
Robertson, H. M., and E. G. MacLeod. 1993. Five major subfamilies of mariner transposable elements in insects, including the Mediterranean fruit fly, and related arthropods. Insect Mol. Biol. 2:125-139.
Saitou, N., and M. Nei. 1987. The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sarkar, A., C. Sim, Y. S. Hong, J. R. Hogan, M. J. Fraser, H. M. Robertson, and F. H. Collins. 2003. Molecular evolutionary analysis of the widespread piggyBac transposon family and related "domesticated" sequences. Mol. Genet. Genomics 270:173-180.
Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with application to phylogenetic inference. Mol. Biol. Evol. 16:1114-1116.
Stuart-Rogers, C., and A. J. Flavell. 2001. The evolution of Ty1-copia group retrotransposons in gymnosperms. Mol. Biol. Evol. 18:155-163.
Sullender, B. W. 1993. Preliminary characterization and population genetic survey of the Daphnia rDNA transposable element, Pokey. Ph.D. dissertation, University of Oregon, Eugene.
Sullender, B. W., and T. J. Crease. 2001. The behavior of a Daphnia pulex transposable element in cyclically and obligately parthenogenetic populations. J. Mol. Evol. 53:63-69.
Swofford, D. L. 2002. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.
Templeton, A. R., H. Hollocher, S. Lawler, and J. S. Johnston. 1989. Natural selection and ribosomal DNA in Drosophila. Genome 31:296-303.
Wang, H. H., M. J. Fraser, and L. C. Cary. 1989. Transposon mutagenesis of baculoviruses: analysis of TFP3 lepidopteran insertions at the FP locus of nuclear polyhedrosis viruses. Gene 81:97-108.
Zupunski, V., F. Gubensek, and D. Kordis. 2001. Evolutionary dynamics and evolutionary history in the RTE clade of non-LTR retrotransposons. Mol. Biol. Evol. 18:1849-1863.
(Erin H. Penton1 and Teres)
Some transposons undergo stable vertical transmission, and, as a result, they are maintained within a species over long evolutionary periods (Eickbush and Eickbush 1995). Such cases illustrate not only that transposons can be a stable component of a host's genome but also that horizontal transfer is not always required to evade extinction. Often, cases of stable vertical transmission involve retrotransposable elements that have high copy numbers per genome, most of which are defective. R1 and R2, which insert into a conserved region of the arthropod large subunit ribosomal RNA (LSU rRNA) gene, are two such examples. They are both non–long terminal repeat (non-LTR) retrotransposons that have been found in a multitude of divergent arthropods (Jakubczak, Burke, and Eickbush 1991; Burke et al. 1993, 1998). Moreover, their phylogenetic relationships are consistent with those of their host species (Eickbush and Eickbush 1995), which suggests that they have been present in arthropods from the origin of the phylum, some 500 MYA (Burke et al. 1999) and are stable components of their genomes. These two elements insert in a sequence-specific manner within the LSU rRNA gene, at sites only 74 bp apart (Eickbush 2002).
Several other highly divergent families of transposons have been documented in arthropods within the conserved region of the LSU rRNA gene occupied by R1 and R2 (Burke, Muller, and Eickbush 1995; Sullender 1993). The high frequency of elements specific to this gene could be explained if rRNA genes are a suitable "habitat" for the persistence of insertion sequences. Indeed, Kidwell and Lisch (2001) suggested that genomes are composed of various "ecological niches" that are exploited by different types of transposons. The LSU rRNA gene occurs in tandemly repeated ribosomal DNA (rDNA) units, which form a multigene family that undergoes concerted evolution through gene conversion and unequal crossing over (Coen, Thoday, and Dover 1982; Arnheim 1983; Dvorak, Jue, and Lassner 1987). As a result, the transposons that occupy this niche are subjected to the same homogenization forces as are the genes themselves (Eickbush and Eickbush 1995). In fact, Eickbush and Eickbush (1995) suggested that it is these characteristics that make the rDNA "niche" ideal for the propagation and long-term persistence of transposable elements.
The presence of another site-specific transposon, Pokey, has been detected in the "hot spot" region of the LSU rRNA gene (Sullender 1993). In fact, Pokey inserts only base pairs away from R1 and R2 (fig. 1). Pokey is a class II DNA transposon that possesses terminal inverted repeats of 16 bp and a single 1.5-kb open reading frame (ORF) that codes for a putative transposase (Penton, Sullender, and Crease 2002). It creates a 4-bp target-site duplication (TTAA) on insertion, which makes it a member of the piggyBac family of transposons (Cary et al. 1989; Wang, Fraser, and Cary 1989; Beames and Summers 1990; Penton, Sullender, and Crease 2002; Sarkar et al. 2003). Pokey was originally found within a specific TTAA site in approximately 10% of the LSU rRNA genes of the cladoceran crustacean Daphnia pulex, although Pokey also inserts at many other genomic locations (Sullender 1993; Sullender and Crease 2001).
FIG. 1. Location of the arthropod transposable elements in the LSU rRNA gene. A portion of the Daphnia pulicaria gene is shown. Arrows indicate the insertion site of the various elements based on the 3'-end junction of the element within the rRNA gene. Vertical lines represent top and bottom strand cleavage sites generated by the endonuclease encoded by each element based on the R2 model of integration. The diagram is modified from Burke, Muller, and Eickbush (1995). ETS = external transcribed spacer, ITS = internal transcribed spacer, IGS = intergenic spacer, 18S = small subunit ribosomal RNA gene, and 28S = large subunit ribosomal RNA gene
Here, we report the results of a phylogenetic analysis of the 3' end of Pokey elements that were amplified from rDNA in species from the nominate subgenus of the genus Daphnia (Crustacea: Cladocera). In addition, we used sequences from two genes, the mitochondrial small subunit (mtSSU) rRNA gene and the nuclear LSU rRNA gene, to construct phylogenies of the "host" Daphnia species from which the elements are derived. We show that Pokey is widely distributed in this subgenus and that the phylogeny of the elements is consistent with that of the host species, which suggests that Pokey has been stably inherited within rDNA over long periods of evolutionary time. Furthermore, we find that Pokey has diverged into multiple sublineages that have persisted across speciation events.
Methods
Daphnia Samples
We analyzed 32 isolates representing 14 species of the subgenus Daphnia (table 1). Total genomic DNA was extracted by use of the Isoquick kit (Orca Research) from single animals that were flash frozen in liquid nitrogen in the field or from multiple animals that were propagated parthenogenetically from a single female in the laboratory by the standard culture technique of Hebert and Crease (1980).
Table 1 Daphnia Samples Used in This Study.
DNA Amplification and Sequencing
We used the polymerase chain reaction (PCR) to amplify an approximately 1,820-bp fragment of the 3' end of Pokey elements located in rDNA by use of an internal Pokey primer, Pok5026F (5'-TCGAACCTGCAGCCGGACGAATTTGCAG-3'), and a primer located in the LSU rRNA gene about 200 bp downstream of the element insertion site, 28SBR (5'-CGTCTCCCACTTATGCTACACCTC-3'). The Pokey primer is located in the ORF, which may code for a transposase (Penton, Sullender, and Crease 2002). A few samples that did not amplify well were amplified using an alternate reverse primer, 28SR (5'-TCCATTCGTGCGCGTCACTAATTAGATGAC-3'), which is located only 46 bp downstream of the TTAA target site. All PCRs were of 50 μl total volume and contained 1.5 mM MgCl2, 5 pmol of each primer, 40 μM dNTPs, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 10 to 50 ng of genomic DNA, and 1 unit of Taq DNA polymerase (Roche). The amplification reactions were performed in an MJ PTC-100 thermal cycler (MJ Research Inc.). The thermocycling profile consisted of 1 cycle of 1 min at 94°C; 35 cycles of 30 sec at 94°C, 30 sec at 55°C, and 2 min at 72°C; and a final incubation of 5 min at 72°C.
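As a quick arithmetic check on the protocol above, the following minimal Python sketch converts the stated primer amount into a final concentration and sums the programmed hold times of the thermocycling profile. The function and variable names are illustrative and are not part of the original methods; ramp times between temperatures are ignored, so an actual run would take longer.

```python
def final_concentration_uM(amount_pmol: float, volume_ul: float) -> float:
    """Return the concentration in micromolar (1 pmol/ul equals 1 uM)."""
    return amount_pmol / volume_ul


# Each entry is (hold times in seconds for one cycle, number of cycles).
PROFILE = [
    ([60], 1),             # 1 cycle of 1 min at 94 C
    ([30, 30, 120], 35),   # 35 cycles: 94 C, 55 C, 72 C
    ([300], 1),            # final incubation of 5 min at 72 C
]


def programmed_seconds(profile) -> int:
    """Sum the programmed hold times, ignoring temperature ramps."""
    return sum(sum(steps) * cycles for steps, cycles in profile)


primer_uM = final_concentration_uM(5, 50)   # 5 pmol in a 50-ul reaction
total_s = programmed_seconds(PROFILE)
print(primer_uM, total_s / 60)
```

Under these assumptions, each primer is present at 0.1 μM and the programmed holds total 6,660 sec (111 min).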
To verify the taxon identification of each Daphnia DNA sample, we amplified a 596-bp region of the mtSSU rRNA gene by use of primers 12SA (5'-CCAGTACATCTACTTTGTTACGAC-3') and 12SB (5'-AAATCGTGCCAGCCGTCGCGG-3') (Colbourne and Hebert 1996) and sequenced the PCR product directly. The sequences were compared with those obtained by Colbourne and Hebert (1996) in their phylogenetic analysis of the genus Daphnia in North America. In addition, we amplified two regions of the LSU rRNA gene. A 694-bp region that spans the insertion site of Pokey was amplified by use of primers 28SF (5'-CTGCCCAGTGCTCTGAATGTCAAAGTGAAG-3') and 28SCR (5'-GATGTACCGCCCCAGTCAAACTCC-3'). A 1134-bp region upstream of the Pokey insertion site was amplified by use of primers 28S77F (5'-AACCTCGCGCCCGGTTGAGC-3') and 28S1211R (5'-TCCGACGATCGATTTGCACG-3'). The PCR conditions were identical to those described above.
All PCR products were electrophoresed on 0.8% TAE agarose gels, stained with ethidium bromide, and visualized under UV light. The DNA fragments were excised from the gel and purified using either a QIAEX II Agarose Gel Extraction kit (Qiagen) or a freeze/thaw method. The freeze/thaw protocol was as follows: after excision, the agarose slice was frozen in the top of a filter-plugged pipette tip (50 μL), thawed, and then spun at maximum speed for 10 min in a 1.5 mL microfuge tube. The resulting eluate was precipitated in ethanol. All purified samples were sequenced by use of 20 to 50 ng of template with 5 pmol of the primers PokF, 28SBR, 28SR, 12SA, 28SF, 28SCR, 28S77F, and 28S1211R as appropriate, using the ABI Prism TaqFS dye terminator kit (PerkinElmer). The sequences were resolved on an ABI 377 automated sequencer. Sequences that were well resolved on the electropherograms were sequenced in only one direction. However, sequences longer than 700 bp were sequenced from both ends so that overlapping data were available for the middle region of the fragments. Samples that provided low-quality sequence data were cloned by use of the TOPO TA Cloning kit for Sequencing (Invitrogen). For this process we used high-fidelity JumpStart Taq DNA polymerase (Sigma-Genosys) to generate the cloning template to avoid error associated with the misincorporation of nucleotides by the Taq DNA polymerase. When necessary, we sequenced these clones with an internal primer, Pok5338F (5'-TGTCTRGTGAAYAGCTGGATATGC-3').
Sequences of Pokey from the rDNA of Daphnia pulicaria (D. pulicaria-SK-C1 and D. pulicaria-SK-C2) were taken from GenBank (accession numbers AY115589 and AY115590). The Pokey sequence D. pulex-IL+IN-consensus is the consensus of 20 cloned sequences from two isolates of D. pulex from IL and IN. In addition, the Pokey sequences D. pulex-PQ-C1, D. obtusa NA1-TN-C11, and European D. pulex-GR2-C9 were taken from full-length rDNA Pokey clones generated for another study (Penton, unpublished data). Sequences of the LSU rRNA gene for D. pulicaria (accession number AF346514) and D. ambigua (accession number AF346513) were taken from Omilian and Taylor (2001).
Alignment and Phylogenetic Analyses
Pokey, mtSSU rRNA gene, and LSU rRNA gene sequences were all aligned by the Align program (Person et al. 1997) and/or by eye with the aid of Bioedit (Hall 1999). Use of the complete gene sequences reported by Omilian and Taylor (2001) in their phylogenetic reconstruction of daphniids facilitated alignment of the LSU rRNA gene sequences. Highly variable regions of expansion helices were removed from this alignment before phylogenetic analysis.
We performed a Bayesian phylogenetic analysis for all three data sets by application of MrBayes (http://morphbank.ebc.uu.sc/mrbayes) (Huelsenbeck 2000). We first chose the model of DNA substitution by characterizing the sequences with Modeltest version 3.0 (http://inbio.byu.edu/Faculty/kac/crandall_lab/modeltest.htm) (Posada and Crandall 1998). The results of these analyses were then incorporated into the Bayesian phylogenetic analyses. The number of generations run for all three data sets was 500,000. All trees constructed before confluence (10,000 generations) were discarded as "burn-in" (Huelsenbeck 2000).
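The burn-in step can be illustrated with a short Python sketch. The run length (500,000 generations) and burn-in point (10,000 generations) come from the text; the sampling interval of 100 generations and the tree labels are assumptions made only for illustration, because the sampling frequency is not stated here.

```python
def discard_burn_in(samples, burn_in_generations):
    """Keep (generation, tree) samples taken at or after the burn-in point."""
    return [(gen, tree) for gen, tree in samples if gen >= burn_in_generations]


TOTAL_GENERATIONS = 500_000   # from the text
BURN_IN = 10_000              # from the text
SAMPLE_EVERY = 100            # assumption: sampling interval is not given

# Placeholder strings stand in for the sampled trees.
samples = [
    (gen, f"tree_{gen}")
    for gen in range(0, TOTAL_GENERATIONS + 1, SAMPLE_EVERY)
]
kept = discard_burn_in(samples, BURN_IN)
burn_in_fraction = BURN_IN / TOTAL_GENERATIONS
print(len(samples), len(kept), burn_in_fraction)
```

Whatever the true sampling interval, the discarded portion corresponds to the first 2% of the run.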
We performed cladistic analyses by implementation of the maximum-parsimony criterion in PAUP* version 4.0b10 (Swofford 2002) and use of the program's heuristic search algorithm and tree-bisection and reconnection (TBR) feature. The results of the Modeltest analyses were also incorporated into these analyses. Sequences were added randomly in 50 replicate trials, with one tree held at each step. Bootstrap values for maximum-parsimony trees were based upon 1,000 pseudoreplicates.
We used the Kimura two-parameter model (Kimura 1980) to estimate sequence divergence for all three genes in MEGA version 2.1 (Kumar, Tamura, and Nei 1993). Phenetic analysis of the resulting distance matrices was performed by use of the neighbor-joining (NJ) method in MEGA (Saitou and Nei 1987), with pairwise deletion of missing sites. The bootstrap percentages from 1,000 pseudoreplicates were calculated in MEGA. We constructed NJ trees separately for the different domains (coding versus noncoding) of the Pokey transposon to determine the presence of any region-specific differences in the pattern of Pokey sequence divergence.
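To make the distance calculation concrete, here is a minimal Python sketch of the Kimura two-parameter distance with pairwise deletion of missing sites. It is an independent illustration of the standard formula, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences; it is not the MEGA implementation, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    for this pair only (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # pairwise deletion of missing or ambiguous sites
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")  # one transition in ten sites
```

A neighbor-joining tree would then be built from the matrix of such pairwise values.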
We rooted all phylogenetic trees through the sequence of D. ambigua based on Colbourne and Hebert's (1996) phylogenetic analysis of the genus Daphnia from North America. We performed tests of neutrality on the ORF region of the Pokey sequences by application of the Z-test (Nei and Kumar 2000) as implemented in MEGA. The Nei-Gojobori p-distance model with pairwise deletion of missing data was used to estimate the number of the synonymous and nonsynonymous substitutions between pairs of sequences.
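For reference, the codon-based Z-test compares the estimated numbers of synonymous (d_S) and nonsynonymous (d_N) substitutions per site. A common form of the statistic for a test of purifying selection, given here as a sketch and not necessarily the exact expression evaluated by MEGA, is:

```latex
Z = \frac{d_S - d_N}{\sqrt{\operatorname{Var}(d_S) + \operatorname{Var}(d_N)}}
```

Under neutrality, d_S and d_N are expected to be equal. With this sign convention, a significantly positive Z indicates that d_S exceeds d_N, as expected under purifying selection. The variances are usually estimated by bootstrap resampling of codons.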
Statistical Analysis of Phylogenetic Congruency
We used three statistical approaches to test whether the topologies constructed from the Pokey sequences and the gene sequences from the Daphnia "host" species are congruent. The first approach, ParaFit (Legendre, Desdevises, and Bazin 2002), which is available on-line at http://www.fas.umontreal.ca/biol/casgrain/en/labo/parafit.html, evaluates the correlation between sequence divergence in the "parasite" Pokey and its host species. Both the overall coevolutionary structure (ParaFitGlobal statistic) and the significance of individual host-parasite links (ParaFitLink1 statistic) are evaluated. We first generated sequence divergence matrices for the Pokey sequences and the mtSSU rRNA gene sequences from the Daphnia hosts in MEGA and then translated them into principal coordinate data with the program DistPCoA (http://www.fas.umontreal.ca/biol/casgrain/en/labo/distpcoa.html). The principal coordinate data were then used in the ParaFit analysis. Correction for any negative eigenvalues was obtained by the Lingoes method. The number of permutations in the ParaFit analysis was 999.
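The role of the 999 permutations can be sketched in Python. This assumes the common convention in which the observed statistic is counted as one member of the reference distribution; whether ParaFit uses exactly this estimator is not stated in the text, and the function name is hypothetical.

```python
from typing import Sequence


def permutation_p_value(observed: float, permuted: Sequence[float]) -> float:
    """One-tailed (upper) permutation P value.

    The observed statistic is included in the reference distribution,
    hence the +1 in both numerator and denominator.
    """
    n_at_least = sum(1 for value in permuted if value >= observed)
    return (n_at_least + 1) / (len(permuted) + 1)


# If none of 999 permuted statistics reaches the observed value,
# the P value is at its minimum for this number of permutations.
smallest_p = permutation_p_value(1.0, [0.0] * 999)
print(smallest_p)
```

Under this convention, 999 permutations cannot yield a P value smaller than 1/1,000 = 0.001.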
The divergence estimates between Daphnia LSU rRNA gene sequences were too small to be evaluated with ParaFit. Thus, we also used the tree-mapping procedure of Page (1990) and the Shimodaira-Hasegawa (SH) test (Shimodaira and Hasegawa 1999) to evaluate congruence between the topology of the Pokey tree and the trees generated from the mtSSU rRNA and LSU rRNA genes of the Daphnia hosts. The tree-mapping analysis was done by the RECONCILE WITH TREE routine in the MAPPING program from the software package COMPONENT, available on-line at http://taxonomy.zoology.gla.ac.uk/rod/cpw/index.html. The program provides a host tree that is fully reconciled with the parasite tree under the assumption that no host switching by the parasite has occurred. When complete congruence exists between the host and parasite trees, the number of terminal nodes on the reconciled host tree will equal the number of host taxa. However, incongruencies with the parasite tree will require that host taxa be "gained" or "lost" multiple times on the reconciled tree. To evaluate the statistical significance of the association between host and parasite topologies, we used the RANDOM option to generate 1,000 random Daphnia trees, and then mapped each of these trees onto the Pokey tree as above. We compared the distribution of "gains" and "losses" for these 1,000 reconciled Daphnia trees to the number of "gains" and "losses" required for the observed Daphnia tree.
To perform the SH test, which was implemented in PAUP*, we compared the NJ tree generated from the Pokey sequences to constrained topologies generated from the mtSSU rRNA and the LSU rRNA gene sequences from the Daphnia hosts.
Results
Phylogenetic Analyses
All sequences generated for this study are available on GenBank. Accession numbers for the mtSSU rRNA gene sequences are AY626352 to AY626366, accession numbers for the LSU rRNA gene sequences are AY630599 to AY630618, and accession numbers for the Pokey sequences are AY630579 to AY630598. All of the alignments are available at the Molecular Biology and Evolution Web site as Supplementary Material online.
Pairwise sequence divergence between mtSSU rRNA gene sequences ranges from 0.2% to 21.2%. Of the total 502 nucleotide positions, 177 are polymorphic, 119 of which are phylogenetically informative. The phylogeny (fig. 2) produced from these sequences is consistent with that of Colbourne and Hebert (1996). However, support is low for a few nodes that define the relationships between major groups. Nonetheless, the monophyly of these major groups is well supported.
FIG. 2. Neighbor-joining phylogenies based on variation in Pokey and the mtSSU rRNA gene from species in the subgenus Daphnia. Solid lines denote significant coevolutionary relationships between parasite (Pokey) and host (Daphnia), whereas dotted lines represent nonsignificant host-parasite links. Numbers beside major nodes represent bootstrap support (1,000 pseudoreplicates) from neighbor-joining and maximum-parsimony analyses and clade credibility values from Bayesian analysis (NJ/MP/BA). Asterisks indicate no support in the MP or BA analyses for the topology shown. The scale bar indicates sequence divergence. Taxon labels are as shown in table 1. Pokey sequences obtained from cloned PCR products are denoted with a clone number as in D. pileata-OK-C1. Significant rate variation occurs across sites in Pokey based on the HKY85 model of DNA substitution (number of substitution types = 2) and the gamma model (P < 0.000001). The gamma distribution shape parameter is 0.5513 and the Ti/tv ratio is 0.9671. Significant rate variation occurs across sites in the mtSSU rRNA gene based on the TVM model of DNA substitution (number of substitution types = 6) and the gamma model (P < 0.000001). The gamma distribution shape parameter is 0.2382 and the substitution rate matrix is R[A-C] = 1.0834, R[A-G] = 11.7885, R[A-T] = 2.7280, R[C-G] = 0.1182, R[C-T] = 11.7885, R[G-T] = 1.0000
Because of the slow evolution of reproductive isolation in the genus Daphnia, interspecific hybridization is not uncommon (Colbourne and Hebert 1996). As mitochondria exhibit uniparental transmission, cases of hybridization could go undetected. In addition, support is low for some nodes in the mitochondrial trees. Thus, we also constructed a phylogeny based on a nuclear gene. For the combined LSU rRNA gene fragments, 142 nucleotides of a total 1,481 are polymorphic, 55 of which are phylogenetically informative. Pairwise sequence divergence for these sequences ranges from 0.1% to 6.4%. Phylogenies produced from the LSU rRNA gene (fig. 3) are generally consistent with the trees constructed from the mtSSU rRNA gene. Although support is low for a few nodes that define relationships among major groups, the monophyly of those major groups is well supported. The only major difference between the two Daphnia phylogenies concerns the sister group relationship between D. parvula/D. retrocurva and the D. obtusa group in the LSU rRNA gene tree. In the mtSSU rRNA gene tree, D. parvula/D. retrocurva clusters with D. catawba/D. minnehaha (fig. 2).
FIG. 3. Neighbor-joining phylogenies based on variation in Pokey and the LSU rRNA gene from species in the subgenus Daphnia. Dotted lines connect Pokey isolates to corresponding host genomes. These relationships were not tested statistically and are included for display only. Numbers beside major nodes represent bootstrap support (1,000 pseudoreplicates) from neighbor-joining and maximum-parsimony analyses and clade credibility values from Bayesian analysis (NJ/MP/BA). Asterisks indicate no support in the MP or BA analyses for the topology shown. The scale bar indicates sequence divergence. Significant rate variation occurs across sites in the LSU rRNA gene based on the TrN model of DNA substitution (number of substitution types = 6) and the invariant-gamma model (P = 0.000002). The gamma distribution shape parameter is 0.7347, the proportion of invariable sites is 0.7437, and the substitution rate matrix is R[A-C] = 1.0000, R[A-G] = 3.9170, R[A-T] = 1.0000, R[C-G] = 1.0000, R[C-T] = 7.9080, R[G-T] = 1.0000. Details for the Pokey tree are given in figure 2
In the Pokey sequence alignment, 824 nucleotides of 1,711 are polymorphic, 525 of which are phylogenetically informative. The pairwise divergence between unique sequences ranges from 0.1% to 44.0%. This sequence contains approximately 355 bp of the 3' end of the ORF, which may encode a transposase, and approximately 1,350 bp of noncoding DNA. The pairwise sequence divergence for the ORF is much smaller than that of the noncoding sequence and ranges from 0% to 18% (average of 6.5%). That of the noncoding region ranges from 0.1% to 54.7% (average of 23.2%).
Most Pokey sequences share the same stop codon. However, D. obtusa (NA1 and NA2) and D. catawba possess different deletions that both lead to the same downstream stop codon, which adds 7 and 8 amino acids (aa) to the proteins, respectively. On the other hand, D. pileata has a 1-bp insertion that leads to a premature stop codon, which truncates the protein by 3 aa. Finally, D. ambigua has a large deletion at the 3' end of the ORF, which leads to the absence of a nearby stop codon. Because of these differences, an ORF data set was constructed that contained the 315 nucleotide positions up to and including the codon directly upstream of the large deletion in the sequence of D. ambigua. This region contains 78 variable nucleotide sites leading to 24 aa changes. The majority of the nucleotide changes (66.67%) are at third codon positions (52 of 78), whereas 17 (21.79%) and 9 (11.54%) are at first and second codon positions, respectively. The high proportion of third position differences suggests that the ORF is conserved because of functional constraint. Indeed, Penton, Sullender, and Crease (2002) and Sarkar et al. (2003) have shown that the putative protein encoded by this ORF is similar to the functional transposase of the piggyBac element. Moreover, Sarkar et al. (2003) found a putative DDD amino acid motif located in the middle of a conserved core region (D268, D346, and D447) of the transposases from the piggyBac family. The widely documented DDD motif is believed to be the functional catalytic domain of the transposase proteins from the Tc1 and mariner groups (Doak et al. 1994; Robertson and Lampe 1995). As this study only examined the 3' end of the ORF, we could not detect the entire DDD motif. However, the third aspartic acid (D) residue is present in all Pokey sequences examined here.
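The codon-position tally reported above can be reproduced in outline with the Python sketch below. The first function counts variable alignment columns by codon position, assuming that every sequence is in frame from the first column; the toy alignment is invented for illustration and is not Pokey data. The second function recomputes the reported percentages from the counts given in the text (17, 9, and 52 of 78 sites).

```python
from collections import Counter


def variable_sites_by_codon_position(alignment):
    """Count variable columns at codon positions 1, 2, and 3.

    Assumes equal-length sequences that are in frame from column 0.
    Gaps and ambiguity symbols are ignored when judging variability.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")
    counts = Counter({1: 0, 2: 0, 3: 0})
    for i in range(length):
        column = {seq[i].upper() for seq in alignment} - {"-", "N", "?"}
        if len(column) > 1:
            counts[i % 3 + 1] += 1
    return counts


def percentages(counts):
    """Express each codon-position count as a percentage of the total."""
    total = sum(counts.values())
    return {pos: round(100 * n / total, 2) for pos, n in counts.items()}


# Counts taken from the text: first, second, and third codon positions.
reported = percentages({1: 17, 2: 9, 3: 52})

# Toy in-frame alignment (hypothetical sequences, two codons each).
toy = variable_sites_by_codon_position(["ATGGCT", "ATGGCC", "ATAGCA"])
print(reported, dict(toy))
```

An excess of third-position changes, as in the reported counts, is the pattern expected when most substitutions are synonymous.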
An overall Z-test of neutrality that included all Pokey sequences indicated that purifying selection, rather than positive selection, is responsible for the pattern of nucleotide substitution in the ORF across species (Z = 5.445, P ≪ 0.001). Because some species were represented by multiple sequences, we repeated the analysis but only included one sequence for each species (D. pulex-ON for D. pulex/D. pulicaria, Euro D. pulex-GR2-C9, D. obtusa NA1-OK-C1, D. obtusa NA2-IL2-C4, and D. parvula-ON). Again, the results strongly suggest that purifying selection is responsible for the pattern of nucleotide substitution observed among the species (Z = 6.53, P ≪ 0.001).
Trees constructed from the Pokey sequences by different phylogenetic methods are generally consistent with one another, and the support is high for most nodes (fig. 2). Two additional Pokey trees were constructed: one based entirely on the coding regions from the ORF (see below) and the other based on the noncoding DNA (data not shown). These two phylogenies are generally consistent with the one shown in figure 2, which indicates a lack of region-specific differences in the pattern of Pokey sequence divergence.
Statistical Analysis of Phylogenetic Congruency
The ParaFit global test of an association between Pokey sequence divergence and that of its host's mtSSU rRNA gene is highly significant (i.e., coevolutionary) (ParaFitGlobal = 0.02092, P = 0.001). The analysis of individual host-parasite links (fig. 2) shows only the D. cheraphila link to be nonsignificant (i.e., random) (ParaFitLink1 = 0.00065, P = 0.115), even though the topology of the two trees is congruent in this region.
Fourteen Daphnia sequences appear on the mtSSU rRNA gene tree (Euro D. pulex was only included once in the topology) and 17 sequences appear on the Pokey tree used in the tree-mapping analysis. The reconciled Daphnia tree requires the addition of 33 nodes and the loss of 11 nodes. The frequency distribution of nodes added in 1,000 random Daphnia trees ranges from 66 to 151, and the distribution of nodes lost ranges from 31 to 74. Both of these ranges are significantly higher than the number of gains and losses observed in the reconciled Daphnia tree. Thus, the association between the host mtSSU rRNA gene tree and the Pokey tree is highly nonrandom. Twelve Daphnia sequences appear on the LSU rRNA gene tree (this gene was not analyzed in D. cheraphila and D. arenata), and, thus, the Pokey sequences were correspondingly reduced to 15. The reconciled Daphnia tree requires the addition of 23 nodes and the loss of 11 nodes. The frequency distribution of nodes added in 1,000 random Daphnia trees ranges from 38 to 105, and the distribution of nodes lost ranges from 21 to 60. Again, the association between the host LSU rRNA gene tree and the Pokey tree is highly nonrandom.
The only major difference between the Pokey tree and the Daphnia LSU rRNA gene tree is the position of D. pileata, but two major differences exist between the Pokey tree and the Daphnia mtSSU rRNA gene tree: (1) the relationships among D. pulex/D. pulicaria/D. arenata, Euro D. pulicaria, and Euro D. pulex and (2) the position of the D. parvula/D. retrocurva group. A separate SH test was used to determine whether each of these three differences was significant. A single representative of Pokey from each species was used in this analysis (D. pulex-ON for D. pulex/D. pulicaria/D. arenata, Euro D. pulex-GR2-C9, D. obtusa NA1-OK-C1, D. obtusa NA2-IL2-C4, and D. parvula-ON). The difference between the LSU rRNA gene topology and the Pokey topology is not significant, but both differences between the mtSSU rRNA gene topology and the Pokey topology are significant (table 2).
Table 3 Mean Sequence Divergence (Above the Diagonal) Among Pokey Elements from Groups of Daphnia Identified on the Neighbor-Joining Phylogram.
Intraspecific Variation in Pokey
To determine the level of Pokey sequence divergence within a geographically widespread species, we analyzed a smaller segment (763 bp) of the 3' end of Pokey from additional isolates of D. obtusa from across its North American range (table 1 and fig. 4). Based on analysis of the mitochondrial cytochrome c oxidase subunit I gene, Penton, Hebert, and Crease (2004) showed that D. obtusa in the United States diverged into two morphologically cryptic species, denoted NA1 and NA2, at least 12 MYA. Furthermore, D. obtusa NA1 radiated into four lineages with largely allopatric distributions during the Pleistocene (< 1 MYA). Even so, the mean sequence divergence between Pokey sequences from all D. obtusa isolates, including the two species, is only 0.7%. However, cloned sequences from two isolates, NA1-TN and NA2-IL2, are substantially more divergent from the others with a mean sequence divergence of 1.2%. When these two isolates are removed from the analysis, the mean divergence between the remaining Pokey sequences decreases to 0.4%.
FIG. 4. Map showing the 13 collection sites of North American Daphnia obtusa used in this study
We cloned and sequenced the larger Pokey fragment (1,618 bp) from isolates NA1-NV, NA1-OK, NA1-TN, NA2-IL2, and NA2-OH. The mean divergence between these larger Pokey fragments is 3.9%. This increase reflects the fact that the additional sequence includes a hypervariable region. In this case, the NA1-TN and NA2-IL2 Pokey fragments show 6.6% sequence divergence from the other three sequences, whereas the mean divergence among the other sequences is only 0.9%. These results indicate the existence of at least two lineages of Pokey elements in D. obtusa and that both occur in each of the two species.
Inspection of the electropherograms generated by direct sequencing of PCR fragments from several of the other D. obtusa isolates shows that "double peaks" often occur at nucleotide positions where the two types of Pokey elements differ. In fact, the sequence in one highly variable region where most of the differences occur was impossible to read because of the presence of both of these lineages within these individuals. This situation is similar to the one seen in D. pulicaria, where two different lineages of Pokey elements, differing by 6% sequence divergence, were PCR amplified and cloned from the genome of a single individual (Penton, Sullender, and Crease 2002). These two sequences are included in the Pokey tree in figures 2 and 3 (D. pulicaria-SK-C1 and D. pulicaria-SK-C2). Note that the sequences from D. pulex are clearly more closely related to the SK-C1 sequence. To date, no indication has emerged that the second lineage (SK-C2) is present in the D. pulex isolates that have been sampled, but a conclusion that it has been lost from this species altogether is clearly premature.
Two isolates of D. parvula (table 1) were also included in the Pokey analysis, one from Ontario (ON) and a second from Arkansas (AR), and the divergence between them is 0.8%. In contrast, the sequence divergence between Pokey elements from different species is often substantial (table 3). For example, sequence divergence ranges from 39.1% to 42.6% (average of 41.0%) between Pokey from the most divergent taxon, D. ambigua, and from all of the other taxa. Overall, this finding suggests that each species contains one or more highly homogeneous lineages of rDNA Pokey elements whose interspecific sequence divergence is highly correlated with that of their hosts.
A New Paralogous Pokey Lineage
Two of the cloned Pokey elements from D. obtusa NA2-OH that we sequenced were so divergent from the others that we were not able to completely align their noncoding regions. An NJ phylogeny produced from the ORF region (fig. 5) shows that these elements form a sister group to the sequences obtained from all of the other species in the subgenus, and, thus, they represent a new paralogous Pokey family in the rDNA of the subgenus Daphnia. This new family is hereafter denoted as PokeyB, and the original family is denoted as PokeyA.
FIG. 5. Neighbor-joining phylogeny based on a portion of the open reading frame from Pokey elements in the rDNA of species from the subgenus Daphnia. The sequence alignment includes a 315-bp segment from the 3' end of the putative Pokey transposase protein. Numbers beside nodes represent bootstrap support from 1,000 pseudoreplicates. The scale bar indicates nucleotide sequence divergence
We incorporated the ORF sequence of PokeyB into the alignment of known piggyBac elements (which includes PokeyA) generated by Sarkar et al. (2003), calculated a pairwise matrix of the number of amino acid differences, and then used it to generate an NJ tree (data not shown). PokeyA and PokeyB cluster with one another with strong bootstrap support (999/1,000 replicates) on this tree, which indicates that they are both members of the same major group of transposons.
The D. obtusa NA2-OH isolate from which PokeyB was isolated also contains PokeyA elements (fig. 5), so clearly these two families can coexist in the same genome. Analysis of the complete nucleotide fragment for PokeyB (fig. 6) reveals that the 3' end of its ORF is intact, although the location of its stop codon would produce a protein that is 11 aa shorter than the one produced by the PokeyA elements from most species. Sequence divergence between the consensus ORF of PokeyA and PokeyB is only 16%, whereas that of the noncoding DNA is greater than 50%. This finding suggests that the ORF is conserved across Pokey families because of functional constraint, and that PokeyB is likely to be an autonomous transposon family. A hypervariable region that is not alignable across the two Pokey families is also highly variable within PokeyA and is, thus, not likely to be necessary for transposon function.
FIG. 6. Schematic diagram of the alignment between the paralogous PokeyA and PokeyB families from species in the subgenus Daphnia. Percentages represent the amount of sequence divergence between different regions of the two families. Solid boxes represent the 16-bp 3'-terminal inverted repeats, and the vertically striped boxes denote a hypervariable region that is not alignable between the two families. Stippled boxes represent a segment of the 3' end of a putative transposase open reading frame. The total number of amino acid substitutions found between the consensus transposase sequences of PokeyB and PokeyA is given.
Discussion
Colbourne and Hebert (1996) found weak support for the relationships among major groups in their phylogenetic reconstruction of North American Daphnia. Similarly, the relationships between major groups are not well supported in either the mtSSU rRNA gene or the LSU rRNA gene phylogenies constructed here (figs. 2 and 3). A comparison between these two host phylogenies reveals one major difference, the sister group relationship of D. parvula/D. retrocurva to the D. obtusa group in the LSU rRNA gene tree. Hybridization is unlikely to be the explanation for this occurrence, as these events have only been documented between species possessing mtSSU rRNA gene sequence divergence of less than 14% (Colbourne and Hebert 1996, and references therein). Other than this difference, and despite the low support of relationships among major groups on both trees, phylogenies based on the mtSSU rRNA gene and the LSU rRNA gene are consistent with one another. In addition, trees based on Pokey sequences are generally congruent with the phylogenies based on its host genome (figs. 2 and 3). Furthermore, the sister group relationship of D. parvula/D. retrocurva to the D. obtusa group is strongly supported on the Pokey tree. Indeed, the Pokey elements from all members of this monophyletic clade except D. cheraphila contain the same 74-bp insertion.
Two significant differences exist between the Pokey tree and the mtSSU rRNA gene tree, but the position of D. pileata is not well resolved in the latter. Moreover, the discrepancy in relationships among members of the D. pulex group could also be explained by hybridization, which is known to occur in this group (Colbourne and Hebert 1996). Overall, no conclusive evidence of horizontal transfer was detected, and the results of the ParaFit and Component analyses revealed a coevolutionary association between Pokey and its host. Altogether, these results suggest that Pokey has persisted in the rDNA locus of Daphnia species via stable vertical transmission.
Rare horizontal transfers of Pokey may occur, but they appear not to be necessary for the long-term survival of this element in the genomes of Daphnia species. The age of the genus Daphnia is on the order of 50 to 100 Myr (Colbourne and Hebert 1996). Thus, we can infer that Pokey has existed in the rDNA of species in the subgenus Daphnia for long periods of evolutionary time. Whether this pattern of long-term persistence, or even Pokey itself, occurs outside of this subgenus is unknown. Although PCR-based attempts to detect Pokey in species from the other two subgenera of Daphnia, Hyalodaphnia and Ctenodaphnia, yielded negative results (unpublished data), the failure to detect the element could reflect a failure of the PCR primers to bind to sequences from such distantly related taxa. Studies that use Southern hybridization are now underway to determine the distribution of Pokey beyond the subgenus Daphnia.
The pattern of evolution in Pokey contrasts with the pattern documented in the two best-known DNA transposons, P and mariner. These two elements belong to vastly different superfamilies of class II elements, yet both exhibit short life cycles within a species and horizontal transfer of element copies between different species. For example, a horizontal transfer event of a P element from the genome of Drosophila willistoni to Drosophila melanogaster has been documented (Daniels et al. 1990; Clark and Kidwell 1997). In this case, P elements from the genomes of the two fly species differ by only a single nucleotide position despite the 30 to 50 Myr of divergence between them. Likewise, numerous horizontal transfer events have been described for mariner, a member of the DD34D variant of the DD35E motif group. For example, a mariner-like element from the earwig Forficula auricularia is more than 90% similar to that from the honeybee despite more than 265 Myr of divergence between these insect species (Robertson 1993; Robertson and MacLeod 1993). Horizontal transfer of mariner elements has even been demonstrated between species from distant phyla (Hartl, Lohe, and Lozovskaya 1997). Similarly, Leaver (2001) found evidence of horizontal transfer of another member of the DDD motif group, a Tc1-like element, between the genomes of frogs and fishes. Indeed, cases of horizontal transfer have been documented for Drosophila elements Minos (Arca and Savakis 2000) and Mos1-like (Brunet et al. 1999) as well as the Fusarium element Fot1 (Daboussi et al. 2002), all of which are members of the Tc1-mariner superfamily of elements.
Recent studies have questioned the prevalence of horizontal transfer and indeed, Capy, Anxolabehere, and Langin (1994) argue that phylogenetic inconsistencies may have been inaccurately attributed to horizontal transfer when they could be explained by other factors that are consistent with vertical transmission. For example, comparison of paralogous elements (i.e., different lineages), varying rates of sequence evolution of elements between species, and retention of ancestral polymorphisms (Capy, Anxolabehere, and Langin 1994) can lead to a transposon tree that is incongruent with that of the host species, even in the complete absence of horizontal transfer. The few discrepancies between the Pokey tree and the Daphnia gene trees in the present study are likely the result of one or more of the above factors, especially given the occurrence of multiple lineages of rDNA Pokey elements in the genome of both D. pulicaria (Penton, Sullender, and Crease 2002) and D. obtusa.
The occurrence of multiple, long-lived lineages such as we observed in Pokey has also been observed in the arthropod rDNA-specific non-LTR retrotransposons R1 and R2. For example, Gentile, Burke, and Eickbush (2001) detected two paralogous lineages of R1, A and B, in five species groups of Drosophila. All 35 species surveyed contained the A lineage, whereas 11 species, from three of the five species groups, also contained the B lineage. In addition, the A lineage had diverged into two sublineages, A1 and A2, in the melanogaster species group. All the species in this group had the A1 lineage, whereas the five species in the takahashii subgroup (which is within the melanogaster species group) also contained the A2 lineage. Results obtained by Burke et al. (1993) in an analysis of elements from five divergent insect species (Bombyx mori [Lepidoptera], D. melanogaster [Diptera], Sciara coprophila [Diptera], Popillia japonica [Coleoptera], and Nasonia vitripennis [Hymenoptera]) suggest that multiple paralogous lineages of R1 can persist for even longer periods of time. Four lineages of R1 were detected in N. vitripennis, but they were not monophyletic. One of them was most closely related to the element obtained from B. mori, whereas the other three formed a monophyletic cluster that grouped with the elements from D. melanogaster and S. coprophila (figure 6 in Burke et al. [1993]). Based on this limited sampling of taxa, at least two paralogous lineages of R1 appear to exist in insects.
As is the case for Pokey elements in Daphnia, intraspecific sequence divergence between copies of R1 and R2 elements from the same lineage in Drosophila is generally less than 1%, whereas sequence divergence between lineages is much higher (Gentile, Burke, and Eickbush 2001). Originally, such homogeneity of elements was thought to be maintained by concerted evolution within rDNA, but this hypothesis made it difficult to explain how new lineages could evolve. However, Pérez-González and Eickbush (2001, 2002) showed that the sequence homogeneity within R1 and R2 lineages is most likely a function of rapid turnover of elements, which includes both elimination and insertion. Individual variants are rapidly eliminated by concerted evolution, which is primarily a consequence of intrachromosomal crossing over (Pérez-González, Burke, and Eickbush 2003), and then replaced by new copies so that sequence homogeneity is mainly a function of recent retrotransposition. Consequently, active elements give rise to progeny that diverge from one another independently, which provides opportunities for lineage sorting. The presence of a transposable element in an LSU rRNA gene results in a nonfunctional copy (Long and Dawid 1979; Kidd and Glover 1981). Thus, the number of LSU rRNA genes that can be inactivated by R1 and R2 without affecting the fitness of the host creates a limiting resource, especially at particular times during development when a very high proportion of active genes is required to meet the requirements of rRNA transcription. The resulting competition leads to the persistence of only a small number of lineages within any one species (Pérez-González and Eickbush 2001, 2002).
Gentile, Burke, and Eickbush (2001) extracted DNA from single stocks of each of the Drosophila species that they analyzed, so they were not able to obtain information on geographic variation among R1 elements. The present study shows that sequence divergence within a lineage of Pokey is extremely low even across very broad geographic areas. This condition is difficult to attribute to common ancestry from a small pool of active elements, given the substantial divergence in allozymes (Hebert and Finston 1996) and mtDNA (Penton, Hebert, and Crease 2004) that is known to have occurred among the D. obtusa populations that were surveyed. More likely, concerted evolution accounts for such homogeneity, as it does for the rDNA itself. If this conclusion is correct, how then can new lineages of Pokey evolve and persist for such long periods of time?
Nothing is known about rates of Pokey transposition or its precise mechanism. However, if Pokey transposes by the same "cut and paste" mechanism used by piggyBac (Elick, Bauser, and Fraser 1996; Lobo, Li, and Fraser 1999), then transposition is not replicative, as retrotransposition is. Increases in Pokey's copy number in rDNA could therefore be facilitated by concerted evolution or by transposition of copies from other genomic locations, as active copies of Pokey are known to occur outside of rDNA (Sullender and Crease 2001). The possibility of such exchange would also help explain how different lineages of Pokey are able to evolve despite the homogenizing effect of concerted evolution. Genomic copies of Pokey could act as a reservoir of new variants that have evolved independently of lineages currently occupying rDNA. A survey of sequence variation among rDNA and non-rDNA copies of Pokey screened from a cosmid library produced from a single individual of D. pulicaria is currently underway to determine if transposition does occur between rDNA and other genomic locations.
rDNA has been suggested to possess characteristics that make it ideal for the evolution of insertion sequences. Although the presence of a transposon within the LSU rRNA gene results in a nonfunctional copy, rDNA is highly repetitive, so individuals have more units than they need for survival. Thus, elements can exist to a certain threshold level in this specific location within a genome without any noticeable phenotypic effects at the organismal level (Eickbush and Eickbush 1995; Malik and Eickbush 1999). High levels of insertion of R1 and R2 have been exhibited in Drosophila mercatorum and D. melanogaster/D. hydei, where they are associated with the abnormal abdomen and bobbed phenotypes, respectively (Franz and Kunz 1981; Templeton et al. 1989; Lathe et al. 1995; Malik and Eickbush 1999). Overall, however, targeting a site within a multigene family reduces the risk that the element will insert into single-copy genes and adversely affect the phenotype. In addition, the rRNA genes are actively transcribed, ensuring expression of the element's transposase, and highly conserved, ensuring a population of uniform future target sites (Eickbush and Eickbush 1995). This feature is a definite advantage as transposons with nonspecific insertion sites are at risk of transposing into areas of the genome that are not transcriptionally active, such as heterochromatin.
Overall, the characteristics of rDNA combine to define a genomic location that is advantageous for the propagation of elements. Indeed, phylogenetic reconstruction that involves complete R2 elements from various arthropod species revealed that they have been present in these lineages from the origin of the phylum, some 500 Myr (Burke et al. 1999). A subsequent analysis of the RT domain from non-LTR retrotransposons identified 11 distinct clades and dated the origin of this superfamily of elements to the Precambrian era (Malik, Burke, and Eickbush 1999). No evidence for horizontal transfer was found either within or between clades. Malik, Burke, and Eickbush (1999) suggested that the lack of horizontal transfer and subsequent vertical nature of inheritance of non-LTR retrotransposons could be caused by their target-primed reverse transcription mechanism. However, Zupunski, Gubensek, and Kordis (2001) recognized cases of horizontal transfer in the RTE clade of non-LTR retrotransposons, which indicates that this superfamily of elements is indeed capable of horizontal transfer. Thus, the lack of horizontal transfer in R1 and R2 may be more a function of their occupation of rDNA than of their transposition mechanism. Indeed, Pokey is a very different type of element than R1 and R2, and yet its pattern of evolution in rDNA is very similar to theirs, which suggests that the rDNA location itself has a major impact on the evolution of the elements that insert within it.
The presence of multiple, divergent elements within a specific region of the rDNA unit suggests that it is a "hot spot" for insertional mutations. This region is small, only about 100 bp in length (fig. 1), and occurs in a core region (domain V) upstream of the D9 expansion segment (Gray and Schnare 1990) in a highly conserved section of the LSU rRNA gene, about one-third of the way upstream from its 3' end. The rDNA-specific transposons insert into both helices and into unpaired regions between helices. However, which specific features of this particular region of the rDNA unit account for its susceptibility to insertion by transposons is still unclear. Sullender (1993) showed that Pokey inserts into one particular TTAA target site in this region, even though other target sites exist in the LSU rRNA gene; in fact, another TTAA is just downstream of the actual insertion site. If Pokey did insert there, we would be able to detect it with the PCR primers that we use.
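The observation that several candidate TTAA sites exist but only one is used can be made concrete by listing every TTAA in a sequence. The Python sketch below is illustrative only: the function name and the short test sequence are hypothetical and do not represent the Daphnia LSU rRNA gene. Because TTAA is its own reverse complement, scanning one strand is sufficient.

```python
def find_target_sites(sequence, motif="TTAA"):
    """Return the 0-based start positions of every occurrence of the motif."""
    seq = sequence.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one base further along so adjacent sites are not missed.
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical sequence (assumption); not the actual rDNA target region.
toy_region = "GCGTTAACCGATTAAGGC"
sites = find_target_sites(toy_region)
print(sites)
```

Comparing such a list of potential sites with the positions at which insertions are actually recovered is one way to express the site preference described above.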
Before the discovery of Pokey, no DNA transposon was known to have such site specificity (Eickbush and Malik 2002), although Pokey does not seem to be site specific in other genomic locations beyond the specificity for TTAA. We have sequenced several hundred base pairs of the 3' flanking region of four non-rDNA Pokey elements screened from the D. pulicaria cosmid library (unpublished data) and found no primary sequence similarity among these sequences and the LSU rRNA gene immediately downstream from the TTAA target site. However, further analysis is required to determine whether these sequences share other attributes to which Pokey is attracted. Even so, the fact that Pokey does occur at other genomic locations may have provided the advantage that it required to establish itself in the rDNA "hot spot," despite the occupation of this region by the highly successful R1 and R2. In fact, the insertion site of R2 begins one nucleotide downstream from Pokey's TTAA site (fig. 1), which suggests the possibility of competition between these elements. Preliminary PCR analysis suggests that R2 does occur in Daphnia (unpublished data). Thus, future study of the interaction between these elements within the rDNA of a single species may provide important insights into competition and survival patterns of transposable elements.
Supplementary Material
Both sequential (FastA) and interleaved Nexus files of the sequences used in this study are available at the Molecular Biology and Evolution Web site as Supplementary data.
Supplementary Data File 1. supplement-mbe040092–01.txt: Nucleotide FASTA file of 15 Daphnia mtSSU rRNA gene sequences and Interleaved NEXUS alignment file of 15 Daphnia nucleotide mtSSU rRNA gene sequences.
Supplementary Data File 2. supplement-mbe040092–02.txt: Nucleotide FASTA file of 12 Daphnia LSU rRNA gene sequences and Interleaved NEXUS alignment file of 12 Daphnia nucleotide LSU rRNA gene sequences.
Supplementary Data File 3. supplement-mbe040092–03.txt: Nucleotide FASTA file of 22 Pokey sequences from Daphnia and Interleaved NEXUS alignment file of 22 nucleotide Pokey sequences from Daphnia.
Table 2 Comparison of Different Pokey Topologies by the Shimodaira-Hasegawa Test.
Acknowledgements
Financial support for this work was provided by a research grant from the Natural Sciences and Engineering Research Council to T.J.C. J. Colbourne, P. Hebert, C. Prokopowich, D. Taylor, and L. Weider generously provided samples of Daphnia. We thank A. Holliss for sequencing, Jeremy DeWaard for his assistance with the SH tests, and Hope Hollocher, Denis Lynn, and two anonymous reviewers for comments on earlier drafts of the manuscript.
Literature Cited
Arca, B., and C. Savakis. 2000. Distribution of the transposable element Minos in the genus Drosophila. Genetica 108:263-267.
Arnheim, N. 1983. Concerted evolution of multigene families. Pp. 38–61 in M. Nei and R. K. Koehn, eds. Evolution of genes and proteins. Sinauer Associates, Sunderland, Mass.
Beames, B., and M. D. Summers. 1990. Sequence comparison of cellular and viral copies of host cell DNA insertions found in Autographa californica nuclear polyhedrosis virus. Virology 174:354-363.
Brunet, F., F. Godin, C. Bazin, and P. Capy. 1999. Phylogenetic analysis of Mos1-like transposable elements in the Drosophilidae. J. Mol. Evol. 49:760-768.
Burke, W. D., D. G. Eickbush, Y. Xiong, J. Jakubczak, and T. H. Eickbush. 1993. Sequence relationship of retrotransposable elements R1 and R2 within and between divergent insect species. Mol. Biol. Evol. 10:163-185.
Burke, W. D., H. S. Malik, J. P. Jones, and T. H. Eickbush. 1999. The domain structure and retrotransposition mechanism of R2 elements are conserved throughout arthropods. Mol. Biol. Evol. 16:502-511.
Burke, W. D., H. S. Malik, W. C. Lathe, III, and T. H. Eickbush. 1998. Are retrotransposons long-term hitchhikers? Nature 392:141-142.
Burke, W. D., F. Muller, and T. H. Eickbush. 1995. R4, a non-LTR retrotransposon specific to the large subunit rRNA genes of nematodes. Nucleic Acids Res. 23:4628-4634.
Capy, P., D. Anxolabehere, and T. Langin. 1994. The strange phylogenies of transposable elements: Are horizontal transfers the only explanation? Trends Genet. 10:7-12.
Cary, L. C., M. Goebel, H. H. Corsaro, H. H. Wang, E. Rosen, and M. J. Fraser. 1989. Transposon mutagenesis of baculovirus: analysis of Trichoplusia ni transposon IFP2 insertions within the FP-locus of nuclear polyhedrosis virus. Virology 172:156-169.
Clark, J. B., and M. G. Kidwell. 1997. A phylogenetic perspective on P transposable element evolution in Drosophila. Proc. Natl. Acad. Sci. USA 94:11428-11433.
Coen, E. S., J. M. Thoday, and G. Dover. 1982. Rate of turnover of structural variants in the rDNA gene family of Drosophila melanogaster. Nature 295:564-568.
Colbourne, J. K., and P. D. N. Hebert. 1996. The systematics of North American Daphnia (Crustacea: Anomopoda): a molecular phylogenetic approach. Philos. Trans. R. Soc. Lond. B Biol. Sci. 351:349-360.
Daboussi, M. J., J. M. Daviere, S. Graziani, and T. Langin. 2002. Evolution of the Fot1 transposons in the genus Fusarium: discontinuous distribution and epigenetic inactivation. Mol. Biol. Evol. 19:510-520.
Daniels, S. B., K. R. Peterson, L. D. Strausbaugh, M. G. Kidwell, and A. Chovnick. 1990. Evidence for horizontal transmission of the P transposable element between Drosophila species. Genetics 124:339-355.
Doak, T. G., F. P. Doerder, C. L. Jahn, and G. Herrick. 1994. A proposed superfamily of transposase genes: transposon-like elements in ciliated protozoa and a common "D35E" motif. Proc. Natl. Acad. Sci. USA 91:942-946.
Dvorak, J., D. Jue, and M. Lassner. 1987. Homogenization of tandemly repeated nucleotide sequences by distance dependent nucleotide sequence conversion. Genetics 116:487-498.
Eickbush, T. H. 2002. R2 and related site-specific non-long terminal repeat retrotransposons. Pp. 813–835 in N. Craig, R. Craigie, M. Gellert, and A. Lambowitz, eds. Mobile DNA II. American Society of Microbiology Press, Washington, DC.
Eickbush, D. G., and T. H. Eickbush. 1995. Vertical transmission of the retrotransposable elements R1 and R2 during the evolution of the Drosophila melanogaster species subgroup. Genetics 139:671-684.
Eickbush, T. H., and H. S. Malik. 2002. Origins and evolution of retrotransposons. Pp. 1111–1144 in N. Craig, R. Craigie, M. Gellert, and A. Lambowitz, eds. Mobile DNA II. American Society of Microbiology Press, Washington, DC.
Elick, T. A., C. A. Bauser, and M. J. Fraser. 1996. Excision of the piggyBac transposable element in vitro is a precise event that is enhanced by the expression of its encoded transposase. Genetica 98:33-41.
Franz, G., and W. Kunz. 1981. Intervening sequences in ribosomal RNA genes and bobbed phenotype in Drosophila hydei. Nature 292:638-640.
Gentile, K. L., W. D. Burke, and T. H. Eickbush. 2001. Multiple lineages of R1 retrotransposable elements can coexist in the rDNA loci of Drosophila. Mol. Biol. Evol. 18:235-245.
Gonzalez, P., and H. A. Lessios. 1999. Evolution of sea urchin retroviral-like (SURL) elements: evidence from 40 echinoid species. Mol. Biol. Evol. 16:938-952.
Gray, M. W., and M. N. Schnare. 1990. Evolution of the modular structure of rRNA. Pp. 589–597 in W. Hill, A. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger, and J. R. Warner, eds. The ribosome: structure, function and evolution. American Society of Microbiology Press, Washington, D.C.
Hall, T. 1999. BioEdit: a user friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids. Symp. Ser. 41:95-98.
Hartl, D. L., A. R. Lohe, and E. R. Lozovskaya. 1997. Modern thoughts on an ancyent marinere: function, evolution, regulation. Annu. Rev. Genet. 31:337-358.
Hebert, P. D. N., and T. J. Crease. 1980. Clonal coexistence in Daphnia pulex (Leydig): another planktonic paradox. Science 207:1363-1365.
Hebert, P. D. N., and T. L. Finston. 1996. Genetic differentiation in Daphnia obtusa: a continental perspective. Freshwater Biol. 35:311-321.
Huelsenbeck, J. P. 2000. MrBayes: Bayesian inference of phylogeny. Distributed by the author, Department of Biology, University of Rochester.
Jakubczak, J. L., W. D. Burke, and T. H. Eickbush. 1991. Retrotransposable elements R1 and R2 interrupt the rRNA genes of most insects. Proc. Natl. Acad. Sci. USA 88:3295-3299.
Kidd, S. J., and D. M. Glover. 1981. Drosophila melanogaster ribosomal DNA containing type II insertions is variably transcribed in different strains and tissues. J. Mol. Biol. 151:645-662.
Kidwell, M. G., and D. R. Lisch. 2001. Perspective: transposable elements, parasitic DNA, and genome evolution. Evolution 55:1-24.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kumar, S., K. Tamura, and M. Nei. 1993. MEGA: molecular evolutionary genetics analysis. Version 1.02. The Pennsylvania State University, University Park, Pa.
Lathe, W. C., III, W. D. Burke, D. G. Eickbush, and T. H. Eickbush. 1995. Evolutionary stability of the R1 retrotransposable element in the genus Drosophila. Mol. Biol. Evol. 12:1094-1105.
Leaver, M. J. 2001. A family of Tc1-like transposons from the genomes of fishes and frogs: evidence for horizontal transfer. Gene 271:203-214.
Legendre, P., Y. Desdevises, and E. Bazin. 2002. A statistical test for host-parasite coevolution. Syst. Biol. 51:217-234.
Lobo, N., X. Li, and M. J. Fraser. 1999. Transposition of the piggyBac element in embryos of Drosophila melanogaster, Aedes aegypti and Trichoplusia ni. Mol. Gen. Genet. 261:803-810.
Lohe, A. R., E. N. Moriyama, D. A. Lidholm, and D. L. Hartl. 1995. Horizontal transmission, vertical inactivation, and stochastic loss of mariner-like transposable elements. Mol. Biol. Evol. 12:62-72.
Long, E. O., and I. B. Dawid. 1979. Expression of ribosomal DNA insertions in Drosophila melanogaster. Cell 18:1185-1196.
Malik, H. S., and T. H. Eickbush. 1999. Retrotransposable elements R1 and R2 in the rDNA units of Drosophila mercatorum: abnormal abdomen revisited. Genetics 151:653-665.
Malik, H. S., W. D. Burke, and T. H. Eickbush. 1999. The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 16:793-805.
Nei, M., and S. Kumar. 2000. Molecular evolution and phylogenetics. Oxford University Press, New York.
Omilian, A. R., and D. J. Taylor. 2001. Rate acceleration and long-branch attraction in a conserved gene of cryptic Daphniid (Crustacea) species. Mol. Biol. Evol. 18:2201-2212.
Page, R. D. M. 1990. Temporal congruence and cladistic analysis of biogeography and cospeciation. Syst. Zool. 39:205-226.
Penton, E. H., P. D. N. Hebert, and T. J. Crease. 2004. Mitochondrial DNA variation in North American populations of Daphnia obtusa: continentalism or cryptic endemism? Mol. Ecol. 13:97-107.
Penton, E. H., B. W. Sullender, and T. J. Crease. 2002. Pokey, a new DNA transposon in Daphnia (Cladocera: Crustacea). J. Mol. Evol. 55:664-673.
Pérez-González, C. E., W. D. Burke, and T. H. Eickbush. 2003. R1 and R2 retrotransposition and deletion in the rDNA loci on the X and Y chromosomes of Drosophila melanogaster. Genetics 165:675-685.
Pérez-González, C. E., and T. H. Eickbush. 2001. Dynamics of R1 and R2 elements in the rDNA locus of Drosophila simulans. Genetics 158:1557-1567.
Pérez-González, C. E., and T. H. Eickbush. 2002. Rates of R1 and R2 retrotransposition and elimination from the rDNA locus of Drosophila melanogaster. Genetics 162:799-811.
Pearson, W. R., T. Wood, Z. Zhang, and W. Miller. 1997. Comparison of DNA sequences with protein sequences. Genomics 46:24-36.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817-818.
Robertson, H. M. 1993. The mariner element is widespread in insects. Nature 362:241-245.
Robertson, H. M., and D. J. Lampe. 1995. Recent horizontal transfer of a mariner transposable element among and between Diptera and Neuroptera. Mol. Biol. Evol. 12:850-862.
Robertson, H. M., and E. G. MacLeod. 1993. Five major subfamilies of mariner transposable elements in insects, including the Mediterranean fruit fly, and related arthropods. Insect Mol. Biol. 2:125-139.
Saitou, N., and M. Nei. 1987. The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sarkar, A., C. Sim, Y. S. Hong, J. R. Hogan, M. J. Fraser, H. M. Robertson, and F. H. Collins. 2003. Molecular evolutionary analysis of the widespread piggyBac transposon family and related "domesticated" sequences. Mol. Genet. Genomics 270:173-180.
Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with application to phylogenetic inference. Mol. Biol. Evol. 16:1114-1116.
Stuart-Rogers, C., and A. J. Flavell. 2001. The evolution of Ty1-copia group retrotransposons in gymnosperms. Mol. Biol. Evol. 18:155-163.
Sullender, B. W. 1993. Preliminary characterization and population genetic survey of the Daphnia rDNA transposable element, Pokey. Ph.D. dissertation, University of Oregon, Eugene.
Sullender, B. W., and T. J. Crease. 2001. The behavior of a Daphnia pulex transposable element in cyclically and obligately parthenogenetic populations. J. Mol. Evol. 53:63-69.
Swofford, D. L. 2002. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.
Templeton, A. R., H. Hollocher, S. Lawler, and J. S. Johnston. 1989. Natural selection and ribosomal DNA in Drosophila. Genome 31:296-303.
Wang, H. H., M. J. Fraser, and L. C. Cary. 1989. Transposon mutagenesis of baculoviruses: analysis of TFP3 lepidopteran insertions at the FP locus of nuclear polyhedrosis viruses. Gene 81:97-108.
Zupunski, V., F. Gubensek, and D. Kordis. 2001. Evolutionary dynamics and evolutionary history in the RTE clade of non-LTR retrotransposons. Mol. Biol. Evol. 18:1849-1863.
(Erin H. Penton1 and Teres)