Functional Evolution of a cis-Regulatory Module
PLoS Biology
1 Department of Ecology and Evolution, University of Chicago, Illinois, United States of America; 2 Department of Genetics, University of Cambridge, United Kingdom
Lack of knowledge about how regulatory regions evolve in relation to their structure–function may limit the utility of comparative sequence analysis in deciphering cis-regulatory sequences. To address this we applied reverse genetics to carry out a functional genetic complementation analysis of a eukaryotic cis-regulatory module—the even-skipped stripe 2 enhancer—from four Drosophila species. The evolution of this enhancer is non-clock-like, with important functional differences between closely related species and functional convergence between distantly related species. Functional divergence is attributable to differences in activation levels rather than spatiotemporal control of gene expression. Our findings have implications for understanding enhancer structure–function, mechanisms of speciation and computational identification of regulatory modules.
Introduction
The annotation of genes from comparative sequence data rests on a fundamental evolutionary dictum, first elaborated by M. Kimura, that the rate of molecular evolution will be inversely related to the level of functional constraint. But the application of this principle would not be interpretable without a corresponding understanding of gene structure and organization (i.e., the genetic code and its degeneracy, the signals for initiation and termination of translation, intron/exon junction sequences, etc.). Knowledge of equivalent scope and depth does not exist for cis-regulatory sequences. These sequences often contain docking sites for transcription factors (TFs), but the number of binding sites and the spacing between them vary, and binding-site sequences are often degenerate to the point that they can only be characterized probabilistically. Even more striking is the lack of data relating functional evolution of gene expression to cis-regulatory sequence evolution. There are good reasons to expect the two may be only weakly correlated [1,2]: De novo binding sites can readily evolve [3]; individual TFs often bind at multiple locations and may be exchangeable, and the spacing between binding sites can rapidly evolve. Thus, despite recent progress [4,5], rules have yet to be elucidated for the functional molecular evolution of this critically important component of the genome.
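To make concrete what it means for a degenerate binding site to be "characterized probabilistically," the following minimal sketch builds a log-odds weight matrix from a few aligned sites and scans a sequence with it. The aligned sites, the background base frequencies, the pseudocount, and the scanned sequence are all hypothetical values chosen for illustration; they are not footprinting data from this study, and this is not the scoring scheme of any particular program.

```python
import math

# Hypothetical aligned binding sites (illustrative only, not real footprint data).
SITES = ["TAATCC", "TAATCC", "TAAGCT", "TAATCG", "GAATCC"]
# Assumed background base composition and pseudocount.
BACKGROUND = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
PSEUDOCOUNT = 0.5


def build_log_odds(sites, background, pseudocount):
    """Return one {base: log2-odds} dict per motif position."""
    width = len(sites[0])
    matrix = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background[base])
             for base in "ACGT"}
        )
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence; return (offset, score) pairs."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


matrix = build_log_odds(SITES, BACKGROUND, PSEUDOCOUNT)
hits = scan("GGCTAATCCAGT", matrix)
best_offset, best_score = max(hits, key=lambda hit: hit[1])
print(best_offset, round(best_score, 2))
```

A site is then "predicted" wherever the score exceeds a chosen threshold, which is why weakly matching or unconventional motifs can be missed or overcalled depending on where that threshold is set.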
The Drosophila gene even-skipped (eve) produces seven transverse stripes along the anterior–posterior (A–P) axis of a blastoderm embryo (Figure 1). Expression of these early stripes is regulated by five distinct cis-elements (Figure 2A). The best studied of them, the stripe 2 enhancer (S2E), contains multiple binding sites for five TFs, the activators bicoid and hunchback, and the repressors giant, Kruppel, and sloppy-paired [6,7,8]. Maternal deposition of bicoid mRNA in the anterior pole of the egg regulates expression of the other gap genes, which are expressed in broad A–P diffusion gradients. Spatiotemporal control of eve stripe 2 expression is brought about through the integration of these graded signals by the S2E.
We previously used a reporter transgene assay to investigate eve S2E functional evolution in three Drosophila species in addition to D. melanogaster. The sister taxa D. yakuba and D. erecta [9] diverged approximately 5 million years ago (MYA), while the ancestor they share with D. melanogaster existed approximately 10–12 MYA. In contrast, D. pseudoobscura is a member of a different group and is believed to have split from the melanogaster clade approximately 40–60 MYA. As expected for a trait as ontogenetically important as primary pair-rule stripe formation, the temporal progression of eve stripe expression is nearly identical among the species (see Figure 1A–1D). This functional conservation of gene expression, however, is not reflected in patterns of sequence conservation (see Figures 2B, S1, and S2). Instead, S2E sequences from these species are substantially diverged, including large insertions and deletions in the spacers between known factor-binding sites, single nucleotide substitutions in binding sites, and even gains or losses of binding sites for the activators bicoid and hunchback.
Yet despite these evolved differences, reporter transgene analysis showed that spatiotemporal patterns of gene expression driven by S2Es of all four species are indistinguishable when placed in D. melanogaster [10], indicating that evolved changes in the enhancer have had little or undetectable impact on spatiotemporal control of gene expression. But further experiments with native and chimeric S2Es of D. melanogaster and D. pseudoobscura showed that this functional conservation required coevolved changes in the 5′ and 3′ halves of the enhancer [11], suggesting compensatory (i.e., adaptive) evolution. This functional evidence for adaptive substitution, together with indications that levels of gene expression might also differ among the four species' S2Es, raises questions about whether these orthologous enhancers are indeed functionally identical. To overcome limitations inherent in functionally interpreting the overlap of a reporter and native gene expression, here we report results of an in vivo complementation assay to investigate S2E performance. This approach allows us to put the functional equivalency hypothesis to a rigorous test.
Figure 1. (A–D) Embryos of four Drosophila species at early cellular blastoderm stage. EVE stained with immunoperoxidase DAB reaction enhanced by nickel.
(E–H) Df(eve) D. melanogaster embryos with two copies of transgenes containing eve S2E from four species fused to D. melanogaster eve coding region (0.9 to +1.85 kb) at blastoderm stage. Immunofluorescence-labeled EVE. The S2Eere-EVE (G) produces consistently weaker stripes than lines carrying S2Es from the other three species. (A and E) D. melanogaster, (B and F) D. yakuba, (C and G) D. erecta, and (D and H) D. pseudoobscura.
Results
Strategy and Proof of Principle
First, we created a fly line, EVEΔS2E, in which the native eve S2E was deleted (see Figure 2A). We then attempted to complement, that is, rescue this lethal mutation with the introduction of a transgene, denoted S2E-EVE, containing an eve S2E from one of the four species (D. melanogaster, D. yakuba, D. erecta, or D. pseudoobscura) linked to a functional eve promoter and coding region (Figure 2B). This allowed us to compare both viabilities and developmental consequences among lines differing only in the evolutionary source of their S2E. By genetically manipulating rescue-transgene copy number (Figure 2C), effects of EVE abundance on viability and development could also be investigated.
We created the eve S2E deficiency mutant by removing a 480-bp fragment corresponding to the minimal stripe 2 element (MSE; see Figure S1) from a 15-kb cloned copy of the eve locus [12]. A transgene containing the complete fragment is capable of rescuing eve null mutant flies to fertile adulthood [12]. EVEΔS2E is functionally a null allele for stripe 2, as evidenced by the expression of the segment polarity gene, engrailed (en). Establishment of the en 14-stripe pattern is a complex process that includes involvement by eve early stripes [13,14]. Eve stripe 2 corresponds to parasegment 3, which is bordered by en stripes 3 and 4. We hypothesized that these en stripes might be developmental indicators of early eve stripe 2 expression. Indeed, EVEΔS2E embryos lacking a functional S2E (Figure 3A–3F) produce a short parasegment 3 and vestigial en stripe 4 (Figure 3F). This defect alone is almost certainly a lethal condition.
Transgenes containing precisely orthologous S2Es from each of the four species linked to the D. melanogaster eve promoter and coding region were introduced onto the third chromosome. The fragment we chose to investigate is 692 bp in length in D. melanogaster (see Figure S1). It contains the central MSE, and every other previously identified TF-binding site in the S2E region. Notably, this fragment contains completely conserved sequences at its 5′ and 3′ ends in all four species, thus ensuring that we could compare precisely orthologous fragments. As expected, all four S2E-EVE transgenes express a single early eve stripe in the expected spatial location (see Figure 1E–1H).
Having created the EVEΔS2E chromosome line and the S2E-EVE rescue third chromosome lines, we could then produce flies carrying EVEΔS2E; S2E-EVE in a doubly balanced configuration (see Figure 2C). Crossing this line with itself or with another line carrying an independent copy of the same S2E allowed us to estimate relative survival to adulthood of offspring carrying one or two copies of the rescue transgene. EVEΔS2E homozygotes are embryonic lethal, whereas two copies of the D. melanogaster S2Emel-EVE transgene in an EVEΔS2E genetic background rescue approximately 34% of flies to adulthood (Figure 4). This is approximately the same rescue percentage found for the same genotype (P[EVEG84], R13), which contains the wild-type eve locus (including the native S2E) [12]. This implies that the fragment we used to drive stripe 2 eve expression is complete and that it can function normally when removed from its native context. Importantly, our negative control, S2E0-EVE, does not rescue, indicating that the rescue transgene requires this enhancer to drive eve stripe 2 expression.
Functional Equivalence of the D. melanogaster and D. pseudoobscura S2Es
We evaluated the ability of S2E-EVE rescue constructs to complement the embryonic lethal EVEΔS2E deletion by estimating survival to adulthood, based on a genetic design used extensively in Drosophila evolutionary genetics [15]. Viability measurements were made by crossing two independent lines of each rescue transgene to reduce potential recessive fitness effects caused by the site of rescue-transgene insertion. Offspring with two copies of the transgene are doubly hemizygous; few deleterious effects of transgene insertion were observed in these flies (compare, for example, EVEΔS2E, R13/CyO; S2E-EVE/S2E-EVE versus EVEΔS2E, R13/CyO; S2E-EVE/TM3 survivors in Table S1). Rescue abilities of S2Es from different species can be compared quantitatively because the viability of each S2E-EVE transgene is calculated relative to a standard genotype present in every cross.
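As an illustration of why a standard genotype present in every cross makes rescue abilities comparable, the sketch below expresses the count of a rescued genotype relative to the count of the standard class, corrected for the Mendelian expectation. The function name, the offspring counts, and the expected ratios are hypothetical; the estimator actually used follows the design cited in [15] and may differ in detail.

```python
def relative_viability(test_count, standard_count, expected_ratio=1.0):
    """Viability of a test genotype relative to the standard genotype of the same cross.

    expected_ratio is the Mendelian expectation of test:standard offspring
    (for example 0.5 if the test class is expected half as often as the standard).
    """
    if standard_count <= 0:
        raise ValueError("the standard genotype must be observed at least once")
    if expected_ratio <= 0:
        raise ValueError("expected_ratio must be positive")
    return (test_count / standard_count) / expected_ratio


# Hypothetical offspring counts from two crosses (not data from Table S1).
crosses = {
    "enhancer_A": {"test": 45, "standard": 150, "expected_ratio": 0.5},
    "enhancer_B": {"test": 3, "standard": 150, "expected_ratio": 0.5},
}

viabilities = {
    name: relative_viability(c["test"], c["standard"], c["expected_ratio"])
    for name, c in crosses.items()
}

for name, value in viabilities.items():
    print(f"{name}: {value:.2f}")
```

Because each estimate is scaled by the standard class scored in the same vials, differences in culture conditions between crosses largely cancel out of the comparison.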
Figure 2. (A) Summary map of the eve locus and eve S2E deletion transgene (EVEΔS2E). Adam and Apple are adjacent open reading frames [40]. The late element (Auto) and early stripe enhancers are shown.
(B) S2E-EVE transgenes used to rescue eve function. The rescue EVE locus used is the D. melanogaster eve flanked by 0.9 kb of 5′ and approximately 0.6 kb of 3′ of endogenous sequence. The S2Eo-EVE does not have any S2E sequences and is a negative control. The known trans-factor-binding sites in the S2E from D. melanogaster: five bicoid (circles), three hunchback (ovals), six Kruppel (squares), three giant (rectangles), and one sloppy-paired (triangle) binding site. Symbols representing sites 100% conserved compared to D. melanogaster are open, while those diverged are shaded gray. Note the evolutionary gain of novel but functionally necessary [6] activator (bicoid and hunchback) binding sites (red) in D. melanogaster lineage. Full sequences are shown in Figures S1 and S2.
(C) Example of a cross between independent rescue lines and relevant offspring genotypes for the viability assay (see Materials and Methods for details). Genetic notation b: mutant black; yellow box: native eve; R13 and X'd out yellow box: eveR13 lethal mutant; P(S2EΔEVE): eve 6.4 to 8.4 kb without S2E; P(S2E A1-EVE) and P(S2EA2-EVE) are two independent rescue-transgene inserts with S2E from species A.
Figure 3. (A–E) Immunofluorescence labeling of time-staged early EVEΔS2E homozygous embryos. This developmental sequence, which corresponds roughly from the initialization of cellularization (A) to its completion (E), takes approximately 45 min at 25 °C in wild-type flies [41].
(F) Expression of en in same genotype at stage 10. Arrows mark third and fourth en stripes. Note the short interval between en stripes 3 and 4 (parasegment 3) and the reduced fourth stripe.
(G) EVE expression in stripe 2 during the developmental series around cellularization, where times 1–5 correspond to pictures in A–E. Stage 1 is early cellularization, while the process has been completed for embryos in class 5. The series is comparable to time classes 4–8 on the FlyEx Web site (http://flyex.ams.sunysb.edu/flyex/) [34]. Estimated least square means (± SE) for EVEΔS2E/Cy stock and wild-type line w1118; note the Cy/Cy homozygote is essentially wild-type. Early eve pair-rule expression is not known to be autoregulated (as occurs in postcellularization stages), and we observe a 2-fold difference in early stripe expression, with an additive component (a) of 0.62 and negligible dominance deviation (d/a) = 0.01, for the first two stages. This dosage dependency is lost after the cellularization stage (3), presumably because all embryos carry two copies of the autoregulatory element.
Figure 4. Rescue percentages to adulthood of EVEΔS2E homozygotes with one or two copies of rescue construct from the four species, and the negative control, denoted on the x-axis. Each bar represents percentages summarized over sexes and reciprocal crosses (full data in Table S1).
S2Es from the four species exhibited large differences in rescue abilities that follow neither a phylogenetic trend nor net sequence divergence (Figure 4). The S2E of the most distantly related species, D. pseudoobscura, is completely conserved at only three of 18 TF-binding sites identified in D. melanogaster and is missing two of them entirely (see Figures 2B and S2). It is also nearly 25% longer due to insertions and deletions in the spacers between binding sites. Yet in terms of rescue ability it is indistinguishable from the D. melanogaster S2E.
Functional Divergence of S2Es from Closely Related Species
Given the complete functional conservation of the D. pseudoobscura S2E, we were surprised to discover the failure of the D. erecta transgene to restore viability in EVEΔS2E homozygotes (see Figure 4). The inability of the doubly hemizygous S2Eere-EVE genotype to rescue cannot be due to deleterious effects of transgene insertion, because the presence of each single transgene has minimal impact on viability (see Table S1). Two additional independent transformants were also investigated, neither of which produced viable adult flies. We conclude, therefore, that the D. erecta sequence, although precisely orthologous to the D. melanogaster and D. pseudoobscura S2E fragments, is nonfunctional when placed in a D. melanogaster embryonic context.
The D. yakuba S2E also exhibits a rescue defect in that two copies of the rescue transgene are required for robust rescue. Flies carrying a single copy of the D. yakuba rescue transgene are less than half as viable as flies carrying one copy of either the D. melanogaster or D. pseudoobscura rescue transgene. A smaller dosage effect on viability of approximately 20% is seen with the S2Es of D. melanogaster and D. pseudoobscura. Since the spatiotemporal expression of eve stripe 2 must be the same for flies carrying one or two copies of a transgene, eve stripe 2 expression level alone must have a measurable influence on fitness.
As expected, embryos carrying one or two copies of either the D. melanogaster or the functionally equivalent D. pseudoobscura S2E rescue transgene exhibit a wild-type en staining pattern, indicating a normal parasegment 3 (Figure 5A–5I). In contrast, the D. erecta S2E exhibits an en pattern defect similar to the one produced in embryos lacking eve stripe 2 expression (i.e., EVEΔS2E homozygote). The inability to drive normal en expression provides further evidence that the D. erecta S2E is a weak (or nonfunctional) enhancer in the D. melanogaster genetic background.
Figure 5. (A, C–I) The en pattern in homozygous EVEΔS2E and (B) wild-type (w1118) specimens at stages 9–11. All strains (except [B]) are homozygous for Df(eve) P(EVEΔS2E) second chromosomes, with the third chromosome differing only by rescue transgenes: (A) no rescue transgenes; (C) P(mel 36)/P(mel 36) is a S2Emel-EVE stock; (D) P(yak 74)/TM3 Sb and (E) P(yak 74)/P(yak 74) are S2Eyak-EVE stocks; (F) P(S2Eo-EVE)/P(S2Eo-EVE) has no S2E; both (G) P(ere 41)/P(ere 41) and (H) P(ere 21)/P(ere 21) are S2Eere-EVE transgenic stocks; and (I) P(pse 91)/P(pse 91) is a S2Epse-EVE stock. Note the variation in distance between third and fourth en stripes (arrows) and relative level of en expression in the fourth stripe. Only the first seven parasegments of the en pattern are shown (except in [A]). The en protein was visualized by an immunoperoxidase DAB reaction enhanced by nickel. mel: D. melanogaster; yak: D. yakuba; ere: D. erecta; pse: D. pseudoobscura. S2Eo-EVE lacks a S2E.
The D. yakuba S2E also exhibits an en phenotype that correlates with its ability to rescue (Figure 5D and 5E). With two copies of the enhancer present, embryos exhibit a robust en stripe 4, indistinguishable from wild-type. But with only one copy present, en stripe 4 expression is shifted anteriorly relative to its neighbors, an indication that parasegment 3 is not forming properly. Some of these embryos survive to adulthood since we do observe one-copy adults in our viability experiment, albeit at a lower than expected percentage. Although adult flies are superficially “normal,” we can observe subtle morphological defects (mouthparts and thoracic structures) in the segments corresponding to parasegment 3.
Differences in eve S2E Expression Levels
To test whether differential gene expression might be the critical functional difference between the S2Es, we quantified eve stripe 2 protein in early embryos. The experimental design allowed us to normalize eve stripe 2 expression in individually stained embryos relative to stripe 3, thus facilitating comparison across embryos and genotypes. We also developed a PCR method to ascertain the genotype of individually stained embryos.
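The normalization described above can be sketched as a simple ratio of stripe intensities within one embryo. The function name, the intensity values, and the optional background subtraction are assumptions made for illustration; the source does not specify how raw staining intensities were processed.

```python
def normalized_stripe2(stripe2_peak, stripe3_peak, background=0.0):
    """Express stripe 2 signal as a fraction of stripe 3 signal in the same embryo.

    Subtracting a common background before taking the ratio is an assumption
    of this sketch, not a step documented in the source.
    """
    signal2 = stripe2_peak - background
    signal3 = stripe3_peak - background
    if signal3 <= 0:
        raise ValueError("stripe 3 signal must exceed background")
    return signal2 / signal3


# Hypothetical peak intensities (arbitrary units) for two embryos stained on different days.
embryo_a = normalized_stripe2(stripe2_peak=120.0, stripe3_peak=200.0, background=20.0)
embryo_b = normalized_stripe2(stripe2_peak=65.0, stripe3_peak=101.0, background=11.0)
print(round(embryo_a, 3), round(embryo_b, 3))
```

Because stripe 3 is driven by its own enhancer in every genotype, it serves as an internal reference: two embryos with different overall staining intensity can still yield similar normalized values.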
We validated the quantification procedure by comparing eve stripe 2 expression levels in embryos carrying zero, one, or two copies of the S2E in its native position in a wild-type eve locus—that is, EVEΔS2E/EVEΔS2E, EVEΔS2E/Cy, and Cy/Cy embryos, respectively, and a homozygous w1118 line (see Figure 3G). The expected dose dependence is observed in response to EVEΔS2E copy number prior to cellularization, followed by a shift to dose independence as control of eve stripe expression is transferred to the late (autoregulatory) element. Unexpectedly, a weak early stripe 2 (estimated to be approximately 20% of the wild-type level) can be detected in EVEΔS2E homozygotes; we do not know what drives this stripe.
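The figure legends summarize dose dependence with an additive component (a) and a dominance deviation (d/a). The sketch below shows how these quantities are conventionally derived from mean expression at zero, one, and two enhancer copies, assuming the standard quantitative-genetics definitions; the source does not state its formulas, and the means used here are hypothetical rather than the least square means of the study.

```python
def additive_and_dominance(mean_zero, mean_one, mean_two):
    """Return (a, d, d/a) from genotype means at 0, 1, and 2 copies.

    a   : half the difference between the two homozygous classes
    d   : deviation of the one-copy class from the homozygote midpoint
    d/a : degree of dominance (0 means purely additive dosage)
    """
    a = (mean_two - mean_zero) / 2.0
    midpoint = (mean_two + mean_zero) / 2.0
    d = mean_one - midpoint
    if a == 0:
        raise ValueError("no difference between 0 and 2 copies; d/a is undefined")
    return a, d, d / a


# Hypothetical normalized stripe 2 means (not values from Figure 3G or Figure 6).
a, d, ratio = additive_and_dominance(mean_zero=0.2, mean_one=0.8, mean_two=1.4)
print(round(a, 2), round(d, 2), round(ratio, 2))
```

Under these definitions a d/a near zero indicates that one copy yields expression midway between zero and two copies, which is the pattern expected when expression is not autoregulated.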
Normalized stripe 2 expression in early embryos carrying S2Es from D. erecta and D. pseudoobscura is consistent with adult viability (Figure 6). The D. erecta S2E-driven eve expression is too weak for statistically significant differences to be detected among embryos containing zero, one, or two copies of the rescue transgene. Note, however, that this transgene does drive weak eve stripe 2 expression in a fully eve null background (see Figure 1G). Formally, we observe statistically significant effects of gene “dose,” S2E “species” of origin, and most notably a “dose × species” interaction on stripe 2 expression by a mixed-model analysis of variance (ANOVA) (see Tables 1 and S2). Therefore, the major functional evolutionary difference between these enhancers is likely to reside in their activation strengths.
Discussion
Evolution of Enhancer Structure–Function
The D. melanogaster S2E rescue transgene, and its considerably diverged D. pseudoobscura ortholog, each restore complete eve stripe 2 biological activity when placed in a genetic background lacking a native S2E. The DNA fragment we investigated, therefore, entails both the biological and evolutionary units of enhancer function. We chose this fragment based on its extensive prior characterization, including genetic, reverse genetic, and footprinting analyses [6,16,17,18]. In particular, Stanojevic et al.'s [18] TF footprinting data appear to have nicely delineated the functional enhancer.
Figure 6. Fluorescence-labeled antibody staining of EVE in embryos with zero (A, C, and E) or two (B, D, and F) copies of rescue transgene. A dose effect is seen in D. pseudoobscura line 91 (A and B), while none is observed in D. erecta line 41 (C and D) or 21 (E and F). (G) These effects are significant when comparing EVE protein quantity (least square means ± SE) in stripe 2 (Dose × Species, F = 4.69(2, 100.44), p = 0.01; see Tables 1 and S2) for D. pseudoobscura (black circles, n = 59) and D. erecta embryos (open circles, n = 71). For D. pseudoobscura the estimated additive component (a) = 0.37 and dominance deviation (d/a) = 0.17.
Our previous experiments with S2Es of these two species demonstrated that both intact enhancers, but not the chimeras between them, drive the correct spatiotemporal pattern of reporter gene expression [11]. The rescue experiments reported here extend this finding by showing that the two orthologs are in fact biologically indistinguishable. These new results reinforce our contention that the phenotypic character—early stripe 2 expression—must be under stabilizing selection. The character itself remains unchanged over evolutionary time despite substitutions in nearly all the TF-binding sites, the gain and loss of some of them, and considerable change in the spacing between sites. This suggests to us that unlike proteins, where functional conservation usually means selective constraint on important amino acids (such as the active site of an enzyme), enhancers have a more flexible architecture that allows modification, and perhaps even turnover, of their “active” sites. Dissimilarities in the structure–function of enhancers and proteins result in different emergent “rules” of molecular evolution.
But the fact that the D. melanogaster and D. pseudoobscura S2Es are biologically indistinguishable does not necessarily imply that enhancer function has been evolutionarily static. Rather, the similar biological activities appear to be the result of convergence. In particular, phylogenetic analysis of S2E sequences indicates that the bcd-3 binding site in D. melanogaster was acquired only recently in the lineage leading to D. melanogaster (see Figure 2B). (There are also lineage-specific deletions in the spacers flanking both sides of the bcd-3 site in the D. melanogaster lineage, which shift the proximal and distal repressors giant and Kruppel binding sites, respectively, closer to this bicoid site. These length changes may have coevolved to enable or increase local repression of this novel activator site.) The bcd-3 site was shown by Small et al. [6] to be required for MSE stripe expression. It seems likely, therefore, that the ancestral S2E lacking this binding site would not properly activate stripe 2 expression in D. melanogaster. Perhaps the sensitivity of the enhancer (or more precisely, the fragment investigated) to activator signals has oscillated over evolutionary time, in which case the similarity between the two distantly related species' S2Es would be an example of functional convergence.
The fact that the S2E fragment from D. erecta is essentially unresponsive to the D. melanogaster morphogen-gradient environment, but the precisely orthologous segment from D. melanogaster (and D. pseudoobscura) responds properly, proves that this fragment must contain evolved differences of functional significance between the species. The lack of biological activity of the D. erecta transgene in D. melanogaster should perhaps come as no surprise, however: Its lower sensitivity to activation may represent the ancestral state of the enhancer. What is surprising is the rapidity with which these functional differences evolve.
Phylogenetic footprinting of distantly related species can readily identify strongly conserved motifs [19] but runs the risk of not detecting enhancers that have retained their function but have evolved structurally. To overcome this, a technique called phylogenetic shadowing—the comparison of noncoding sequences among closely related species—has recently emerged [9]. Our results show that there is no necessary relationship between enhancer phylogenetic (or sequence) relatedness and functional similarity. Closely related species cannot be assumed to be more functionally conserved than distantly related species in enhancer structure–function.
Why Is the D. erecta S2E Transgene Not Functional in D. melanogaster?
D. erecta produces a native early eve stripe 2. Why then does the S2E fragment from this species not produce a robust early stripe when placed in D. melanogaster? The first possibility is that the fragment we investigated no longer contains a functional enhancer and has been replaced by an equivalent enhancer somewhere else in the eve locus. This possibility can easily be ruled out: The overall architecture of the eve locus, including all of its 5′ and 3′ enhancers, is well conserved, and there is no new cluster of the appropriate TF-binding sites that could act as a S2E. Another unlikely possibility is that the locus has been duplicated, and the fragment we investigated has become functionally inert (i.e., equivalent to a pseudogene). There is no indication of a duplicated eve locus in the D. erecta genome, and all features of the eve locus (including its S2E) are intact and do not indicate any degeneration.
This leads us to conclude that the D. erecta fragment used in our experiments contains the S2E. We can consider three additional possibilities. The first is that this fragment is no longer the complete biological unit, that is, novel binding sites have evolved in this species distal or proximal to this fragment, which have become assimilated into the active enhancer by a process we call accretion. As Figure 7 shows, patser, a binding-site prediction program [20], identifies a single potential bicoid-binding site 135 bp upstream of Block-A (Figure S1), the distal end of the D. erecta S2E transgene. This potential site contains an unconventional bicoid-binding motif, T
(Michael Z. Ludwig, Arnar )
Lack of knowledge about how regulatory regions evolve in relation to their structure–function may limit the utility of comparative sequence analysis in deciphering cis-regulatory sequences. To address this we applied reverse genetics to carry out a functional genetic complementation analysis of a eukaryotic cis-regulatory module—the even-skipped stripe 2 enhancer—from four Drosophila species. The evolution of this enhancer is non-clock-like, with important functional differences between closely related species and functional convergence between distantly related species. Functional divergence is attributable to differences in activation levels rather than spatiotemporal control of gene expression. Our findings have implications for understanding enhancer structure–function, mechanisms of speciation and computational identification of regulatory modules.
Introduction
The annotation of genes from comparative sequence data rests on a fundamental evolutionary dictum, first elaborated by M. Kimura, that the rate of molecular evolution will be inversely related to the level of functional constraint. But the application of this principle would not be interpretable without a corresponding understanding of gene structure and organization (i.e., the genetic code and its degeneracy, the signals for initiation and termination of translation, intron/exon junction sequences, etc.). Knowledge of equivalent scope and depth does not exist for cis-regulatory sequences. These sequences often contain docking sites for transcription factors (TFs), but the number of binding sites and the spacing between them vary, and binding-site sequences are often degenerate to the point that they can only be characterized probabilistically. Even more striking is the lack of data relating functional evolution of gene expression to cis-regulatory sequence evolution. There are good reasons to expect the two may be only weakly correlated [1,2]: De novo binding sites can readily evolve [3]; individual TFs often bind at multiple locations and may be exchangeable, and the spacing between binding sites can rapidly evolve. Thus, despite recent progress [4,5], rules have yet to be elucidated for the functional molecular evolution of this critically important component of the genome.
The Drosophila gene even-skipped (eve) produces seven transverse stripes along the anterior–posterior (A–P) axis of a blastoderm embryo (Figure 1). Expression of these early stripes is regulated by five distinct cis-elements (Figure 2A). The best studied of them, the stripe 2 enhancer (S2E), contains multiple binding sites for five TFs, the activators bicoid and hunchback, and the repressors giant, Kruppel, and sloppy-paired [6,7,8]. Maternal deposition of bicoid mRNA in the anterior pole of the egg regulates expression of the other gap genes, which are expressed in broad A–P diffusion gradients. Spatiotemporal control of eve stripe 2 expression is brought about through the integration of these graded signals by the S2E.
We previously used a reporter transgene assay to investigate eve S2E functional evolution in three Drosophila species in addition to D. melanogaster. The sister taxa D. yakuba and D. erecta [9] are separated by approximately 5 million years ago (MYA), while the ancestor they share with D. melanogaster existed approximately 10–12 MYA. In contrast, D. pseudoobscura is a member of a different group and is believed to have split from the melanogaster clade approximately 40—60 MYA. As expected for a trait as ontogenetically important as primary pair-rule stripe formation, the temporal progression of eve stripe expression is nearly identical among the species (see Figure 1A–1D). This functional conservation of gene expression, however, is not reflected in patterns of sequence conservation (see Figures 2B, S1, and S2). Instead, S2E sequences from these species are substantially diverged, including large insertions and deletions in the spacers between known factor-binding sites, single nucleotide substitutions in binding sites, and even gains or losses of binding sites for the activators bicoid and hunchback.
Yet despite these evolved differences, reporter transgene analysis showed that spatiotemporal patterns of gene expression driven by S2Es of all four species are indistinguishable when placed in D. melanogaster [10], indicating that evolved changes in the enhancer have had little or undetectable impact on spatiotemporal control of gene expression. But further experiments with native and chimeric S2Es of D. melanogaster and D. pseudoobscura showed that this functional conservation required coevolved changes in the 5′ and 3′ halves of the enhancer [11], suggesting compensatory (i.e., adaptive) evolution. This functional evidence for adaptive substitution, together with indications that levels of gene expression might also differ among the four species' S2Es, raises questions about whether these orthologous enhancers are indeed functionally identical. To overcome limitations inherent in functionally interpreting the overlap of a reporter and native gene expression, here we report results of an in vivo complementation assay to investigate S2E performance. This approach allows us to put the functional equivalency hypothesis to a rigorous test.
(A–D) Embryos of four Drosophila species at early cellular blastoderm stage. EVE stained with immunoperoxidase DAB reaction enhanced by nickel.
(E–H) Df(eve) D. melanogaster embryos with two copies of transgenes containing eve S2E from four species fused to D. melanogaster eve coding region (0.9 to +1.85 kb) at blastoderm stage. Immunofluorescence-labeled EVE. The S2Eere-EVE (G) produces consistently weaker stripes than lines carrying S2Es from the other three species. (A and E) D. melanogaster, (B and F) D. yakuba, (C and G) D. erecta, and (D and H) D. pseudoobscura.
Results
Strategy and Proof of Principle
First, we created a fly line, EVEΔS2E, in which the native eve S2E was deleted (see Figure 2A). We then attempted to complement, that is, rescue this lethal mutation with the introduction of a transgene, denoted S2E-EVE, containing an eve S2E from one of the four species (D. melanogaster, D. yakuba, D. erecta, or D. pseudoobscura) linked to a functional eve promoter and coding region (Figure 2B). This allowed us to compare both viabilities and developmental consequences among lines differing only in the evolutionary source of their S2E. By genetically manipulating rescue-transgene copy number (Figure 2C), effects of EVE abundance on viability and development could also be investigated.
We created the eve S2E deficiency mutant by removing a 480-bp fragment corresponding to the minimal stripe 2 element (MSE; see Figure S1) from a 15-kb cloned copy of the eve locus [12]. A transgene containing the complete fragment is capable of rescuing eve null mutant flies to fertile adulthood [12]. EVEΔS2E is functionally a null allele for stripe 2, as evidenced by the expression of the segment polarity gene, engrailed (en). Establishment of the en 14-stripe pattern is a complex process that includes involvement by eve early stripes [13,14]. Eve stripe 2 corresponds to parasegment 3, which is bordered by en stripes 3 and 4. We hypothesized that these en stripes might be developmental indicators of early eve stripe 2 expression. Indeed, EVEΔS2E embryos lacking a functional S2E (Figure 3A–3F) produce a short parasegment 3 and vestigial en stripe 4 (Figure 3F). This defect alone is almost certainly a lethal condition.
Transgenes containing precisely orthologous S2Es from each of the four species linked to the D. melanogaster eve promoter and coding region were introduced onto the third chromosome. The fragment we chose to investigate is 692 bp in length in D. melanogaster (see Figure S1). It contains the central MSE, and every other previously identified TF-binding site in the S2E region. Notably, this fragment contains completely conserved sequences at its 5′ and 3′ ends in all four species, thus ensuring that we could compare precisely orthologous fragments. As expected, all four S2E-EVE transgenes express a single early eve stripe in the expected spatial location (see Figure 1E–1H).
Having created the EVEΔS2E chromosome line and the S2E-EVE rescue third chromosome lines, we could then produce flies carrying EVEΔS2E; S2E-EVE in a doubly balanced configuration (see Figure 2C). Crossing this line with itself or with another line carrying an independent copy of the same S2E allowed us to estimate relative survival to adulthood of offspring carrying one or two copies of the rescue transgene. EVEΔS2E homozygotes are embryonic lethal, whereas two copies of the D. melanogaster S2Emel-EVE transgene in an EVEΔS2E genetic background rescue approximately 34% of flies to adulthood (Figure 4). This is approximately the same rescue percentage found for the same genotype (P[EVEG84], R13), which contains the wild-type eve locus (including the native S2E) [12]. This implies that the fragment we used to drive stripe 2 eve expression is complete and that it can function normally when removed from its native context. Importantly, our negative control, S2E0-EVE, does not rescue, indicating that the rescue transgene requires this enhancer to drive eve stripe 2 expression.
Functional Equivalence of the D. melanogaster and D. pseudoobscura S2Es
We evaluated the ability of S2E-EVE rescue constructs to complement the embryonic lethal EVEΔS2E deletion by estimating survival to adulthood, based on a genetic design used extensively in Drosophila evolutionary genetics [15]. Viability measurements were made by crossing two independent lines of each rescue transgene to reduce potential recessive fitness effects caused by the site of rescue-transgene insertion. Offspring with two copies of the transgene are doubly hemizygous; few deleterious effects of transgene insertion were observed in these flies (compare, for example, EVEΔS2E, R13/CyO; S2E-EVE/S2E-EVE versus EVEΔS2E, R13/CyO; S2E-EVE/TM3 survivors in Table S1). Rescue abilities of S2Es from different species can be compared quantitatively because the viability of each S2E-EVE transgene is calculated relative to a standard genotype present in every cross.
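The relative-viability calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name, the counts, and the expected Mendelian ratio are all hypothetical, and the actual genotype classes and expectations are those given in Materials and Methods and Table S1.

```python
def relative_viability(n_test, n_standard, expected_ratio):
    """Viability of a test genotype relative to a standard genotype from the same cross.

    n_test         -- adults counted for the test (rescued) genotype
    n_standard     -- adults counted for the standard genotype
    expected_ratio -- Mendelian expectation for n_test / n_standard if both
                      genotypes were equally viable (an assumed value here)
    """
    if n_standard <= 0 or expected_ratio <= 0:
        raise ValueError("standard count and expected ratio must be positive")
    return n_test / (n_standard * expected_ratio)


# Hypothetical counts, for illustration only (not data from Table S1).
rescue = relative_viability(n_test=34, n_standard=200, expected_ratio=0.5)
print(f"relative viability: {rescue:.2f}")
```

Because the standard genotype is scored in the same vials as the test genotype, dividing by its count removes cross-to-cross differences in total offspring number, which is what makes values from different S2E-EVE lines comparable.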
(A) Summary map of the eve locus and eve S2E deletion transgene (EVEΔS2E). Adam and Apple are adjacent open reading frames [40]. The late element (Auto) and early stripe enhancers are shown.
(B) S2E-EVE transgenes used to rescue eve function. The rescue EVE locus used is the D. melanogaster eve flanked by 0.9 kb of 5′ and approximately 0.6 kb of 3′ of endogenous sequence. The S2Eo-EVE does not have any S2E sequences and is a negative control. The known trans-factor-binding sites in the S2E from D. melanogaster: five bicoid (circles), three hunchback (ovals), six Kruppel (squares), three giant (rectangles), and one sloppy-paired (triangle) binding site. Symbols representing sites 100% conserved compared to D. melanogaster are open, while those diverged are shaded gray. Note the evolutionary gain of novel but functionally necessary [6] activator (bicoid and hunchback) binding sites (red) in the D. melanogaster lineage. Full sequences are shown in Figures S1 and S2.
(C) Example of a cross between independent rescue lines and relevant offspring genotypes for the viability assay (see Materials and Methods for details). Genetic notation b: mutant black; yellow box: native eve; R13 and X'd out yellow box: eveR13 lethal mutant; P(S2EΔEVE): eve 6.4 to 8.4 kb without S2E; P(S2E A1-EVE) and P(S2EA2-EVE) are two independent rescue-transgene inserts with S2E from species A.
(A–E) Immunofluorescence labeling of time-staged early EVEΔS2E homozygous embryos. This developmental sequence, which corresponds roughly from the initialization of cellularization (A) to its completion (E), takes approximately 45 min at 25 °C in wild-type flies [41].
(F) Expression of en in same genotype at stage 10. Arrows mark third and fourth en stripes. Note the short interval between en stripes 3 and 4 (parasegment 3) and the reduced fourth stripe.
(G) EVE expression in stripe 2 during the developmental series around cellularization, where times 1–5 correspond to pictures in A–E. Stage 1 is early cellularization, while the process has been completed for embryos in class 5. The series is comparable to time classes 4–8 on the FlyEx Web site (http://flyex.ams.sunysb.edu/flyex/) [34]. Estimated least square means (± SE) for EVEΔS2E/Cy stock and wild-type line w1118; note the Cy/Cy homozygote is essentially wild-type. Early eve pair-rule expression is not known to be autoregulated (as occurs in postcellularization stages), and we observe a 2-fold difference in early stripe expression, with an additive component (a) of 0.62 and negligible dominance deviation (d/a) = 0.01, for the first two stages. This dosage dependency is lost after the cellularization stage (3), presumably because all embryos carry two copies of the autoregulatory element.
Rescue percentages to adulthood of EVEΔS2E homozygotes with one or two copies of rescue construct from the four species, and the negative control, denoted on x-axis. Each bar represents percentages summarized over sexes and reciprocal crosses (full data in Table S1).
S2Es from the four species exhibited large differences in rescue abilities that follow neither a phylogenetic trend nor net sequence divergence (Figure 4). The S2E of the most distantly related species, D. pseudoobscura, is completely conserved at only three of 18 TF-binding sites identified in D. melanogaster and is missing two of them entirely (see Figures 2B and S2). It is also nearly 25% longer due to insertions and deletions in the spacers between binding sites. Yet in terms of rescue ability it is indistinguishable from the D. melanogaster S2E.
Functional Divergence of S2Es from Closely Related Species
Given the complete functional conservation of the D. pseudoobscura S2E, we were surprised to discover the failure of the D. erecta transgene to restore viability in EVEΔS2E homozygotes (see Figure 4). The inability of the doubly hemizygous S2Eere-EVE genotype to rescue cannot be due to deleterious effects of transgene insertion, because the presence of each single transgene has minimal impact on viability (see Table S1). Two additional independent transformants were also investigated, neither of which produced viable adult flies. We conclude, therefore, that the D. erecta sequence, although precisely orthologous to the D. melanogaster and D. pseudoobscura S2E fragments, is nonfunctional when placed in a D. melanogaster embryonic context.
The D. yakuba S2E also exhibits a rescue defect in that two copies of the rescue transgene are required for robust rescue. Flies carrying a single copy of the D. yakuba rescue transgene are less than half as viable as flies carrying one copy of either the D. melanogaster or D. pseudoobscura rescue transgene. A smaller dosage effect on viability of approximately 20% is seen with the S2Es of D. melanogaster and D. pseudoobscura. Since the spatiotemporal expression of eve stripe 2 must be the same for flies carrying one or two copies of a transgene, eve stripe 2 expression level alone must have a measurable influence on fitness.
As expected, embryos carrying one or two copies of either the D. melanogaster or the functionally equivalent D. pseudoobscura S2E rescue transgene exhibit a wild-type en staining pattern, indicating a normal parasegment 3 (Figure 5A–5I). In contrast, the D. erecta S2E exhibits an en pattern defect similar to the one produced in embryos lacking eve stripe 2 expression (i.e., EVEΔS2E homozygote). The inability to drive normal en expression provides further evidence that the D. erecta S2E is a weak (or nonfunctional) enhancer in the D. melanogaster genetic background.
(A, C–I) The en pattern in homozygous EVEΔS2E and (B) wild-type (w1118) specimens at stages 9–11. All strains (except [B]) are homozygous for Df(eve) P(EVEΔS2E) second chromosomes, with the third chromosome differing only by rescue transgenes: (A) no rescue transgenes; (C) P(mel 36)/P(mel 36) is a S2Emel-EVE stock; (D) P(yak 74)/TM3 Sb and (E) P(yak 74)/P(yak 74) are S2Eyak-EVE stocks; (F) P(S2Eo-EVE)/P(S2Eo-EVE) has no S2E; both (G) P(ere 41)/P(ere 41) and (H) P(ere 21)/P(ere 21) are S2Eere-EVE transgenic stocks; and (I) P(pse 91)/P(pse 91) is a S2Epse-EVE stock. Note the variation in distance between third and fourth en stripes (arrows) and relative level of en expression in the fourth stripe. Only the first seven parasegments of the en pattern are shown (except in [A]). The en protein was visualized by an immunoperoxidase DAB reaction enhanced by nickel. mel: D. melanogaster; yak: D. yakuba; ere: D. erecta; pse: D. pseudoobscura. S2Eo-EVE lacks a S2E.
The D. yakuba S2E also exhibits an en phenotype that correlates with its ability to rescue (Figure 5D and 5E). With two copies of the enhancer present, embryos exhibit a robust en stripe 4, indistinguishable from wild-type. But with only one copy present, en stripe 4 expression is shifted anteriorly relative to its neighbors, an indication that parasegment 3 is not forming properly. Some of these embryos survive to adulthood since we do observe one-copy adults in our viability experiment, albeit at a lower than expected percentage. Although adult flies are superficially “normal,” we can observe subtle morphological defects (mouthparts and thoracic structures) in the segments corresponding to parasegment 3.
Differences in eve S2E Expression Levels
To test whether differential gene expression might be the critical functional difference between the S2Es, we quantified eve stripe 2 protein in early embryos. The experimental design allowed us to normalize eve stripe 2 expression in individually stained embryos relative to stripe 3, thus facilitating comparison across embryos and genotypes. We also developed a PCR method to ascertain the genotype of individually stained embryos.
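The per-embryo normalization described above can be sketched roughly as follows. This is an illustrative sketch only: the function names, the optional background subtraction, and the intensity values are assumptions introduced here, not details taken from the study's quantification procedure.

```python
from statistics import mean


def normalized_stripe2(stripe2, stripe3, background=0.0):
    """Stripe 2 signal expressed relative to stripe 3 in the same embryo.

    The background term is an assumption of this sketch (default: none).
    """
    denom = stripe3 - background
    if denom <= 0:
        raise ValueError("stripe 3 signal must exceed background")
    return (stripe2 - background) / denom


def mean_by_genotype(embryos):
    """Average the normalized values within each genotype.

    embryos -- iterable of (genotype, stripe2_signal, stripe3_signal) tuples
    """
    groups = {}
    for genotype, s2, s3 in embryos:
        groups.setdefault(genotype, []).append(normalized_stripe2(s2, s3))
    return {genotype: mean(values) for genotype, values in groups.items()}


# Hypothetical intensity readings, for illustration only.
embryos = [
    ("two_copies", 90.0, 100.0),
    ("two_copies", 110.0, 100.0),
    ("zero_copies", 20.0, 100.0),
]
means = mean_by_genotype(embryos)
print(means)
```

Using stripe 3 from the same embryo as the reference means that embryo-to-embryo differences in staining intensity cancel in the ratio, so values can be pooled across embryos once each has been genotyped.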
We validated the quantification procedure by comparing eve stripe 2 expression levels in embryos carrying zero, one, or two copies of the S2E in its native position in a wild-type eve locus—that is, EVEΔS2E/EVEΔS2E, EVEΔS2E/Cy, and Cy/Cy embryos, respectively, and a homozygous w1118 line (see Figure 3G). The expected dose dependence is observed in response to EVEΔS2E copy number prior to cellularization, followed by a shift to dose independence as control of eve stripe expression is transferred to the late (autoregulatory) element. Unexpectedly, a weak early stripe 2 (estimated to be approximately 20% of the wild-type level) can be detected in EVEΔS2E homozygotes; we do not know what drives this stripe.
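The additive component (a) and dominance deviation (d/a) reported in the figure legends can be computed from the zero-, one-, and two-copy means. The sketch below assumes the standard quantitative-genetic definitions, with a as half the difference between the two homozygous classes and d as the departure of the one-copy class from their midpoint; the source does not spell out its formulas, and the input values are hypothetical.

```python
def additive_and_dominance(m0, m1, m2):
    """Return (a, d, d/a) from mean expression at zero, one, and two copies.

    a -- additive component: half the difference between two and zero copies
    d -- dominance deviation: one-copy mean minus the zero/two-copy midpoint
    """
    a = (m2 - m0) / 2.0
    midpoint = (m2 + m0) / 2.0
    d = m1 - midpoint
    ratio = d / a if a != 0 else float("nan")
    return a, d, ratio


# Hypothetical normalized stripe 2 means, for illustration only.
a, d, ratio = additive_and_dominance(m0=0.2, m1=0.8, m2=1.4)
print(f"a = {a:.2f}, d = {d:.2f}, d/a = {ratio:.2f}")
```

A d/a value near zero indicates that the one-copy class sits at the midpoint of the other two, which is the pattern expected when expression scales linearly with enhancer copy number.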
Normalized stripe 2 expression in early embryos carrying S2Es from D. erecta and D. pseudoobscura is consistent with adult viability (Figure 6). The D. erecta S2E-driven eve expression is too weak for statistically significant differences to be detected among embryos containing zero, one, or two copies of the rescue transgene. Note, however, that this transgene does drive weak eve stripe 2 expression in a fully eve null background (see Figure 1G). Formally, we observe statistically significant effects of gene “dose,” S2E “species” of origin, and most notably a “dose × species” interaction on stripe 2 expression by a mixed-model analysis of variance (ANOVA) (see Tables 1 and S2). Therefore, the major functional evolutionary difference between these enhancers is likely to reside in their activation strengths.
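What a “dose × species” interaction measures can be illustrated with a simple contrast of cell means. The sketch below is not the mixed-model ANOVA used in the study and performs no significance test; it only shows the quantity that an interaction term asks about, namely whether the dose response differs between species. All names and values are hypothetical.

```python
def dose_effect(mean_zero, mean_two):
    """Change in mean stripe 2 expression from zero to two transgene copies."""
    return mean_two - mean_zero


def interaction_contrast(species_a, species_b):
    """Difference between the dose effects of two species.

    Each argument is a (mean at zero copies, mean at two copies) pair.
    A value near zero means both enhancers respond to dose equally.
    """
    return dose_effect(*species_a) - dose_effect(*species_b)


# Hypothetical normalized means, for illustration only.
responsive = (0.20, 0.90)    # enhancer with a clear dose response
unresponsive = (0.20, 0.25)  # enhancer with almost no dose response
contrast = interaction_contrast(responsive, unresponsive)
print(f"interaction contrast: {contrast:.2f}")
```

A large contrast, as in this made-up example, corresponds to the situation described in the text: one enhancer raises expression as copies are added, while the other barely responds.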
Discussion
Evolution of Enhancer Structure–Function
The D. melanogaster S2E rescue transgene, and its considerably diverged D. pseudoobscura ortholog, each restore complete eve stripe 2 biological activity when placed in a genetic background lacking a native S2E. The DNA fragment we investigated, therefore, entails both the biological and evolutionary units of enhancer function. We chose this fragment based on its extensive prior characterization, including genetic, reverse genetic, and footprinting analyses [6,16,17,18]. In particular, Stanojevic et al.'s [18] TF footprinting data appear to have nicely delineated the functional enhancer.
Fluorescence-labeled antibody staining of EVE in embryos with zero (A, C, and E) or two (B, D, and F) copies of rescue transgene. A dose effect is seen in D. pseudoobscura line 91 (A and B), while none is observed in D. erecta line 41 (C and D) or 21 (E and F). (G) These effects are significant when comparing EVE protein quantity (least square means ± SE) in stripe 2 (Dose × Species, F(2, 100.44) = 4.69, p = 0.01; see Tables 1 and S2) for D. pseudoobscura (black circles, n = 59) and D. erecta embryos (open circles, n = 71). For D. pseudoobscura the estimated additive component (a) = 0.37 and dominance deviation (d/a) = 0.17.
Our previous experiments with S2Es of these two species demonstrated that both intact enhancers, but not the chimeras between them, drive the correct spatiotemporal pattern of reporter gene expression [11]. The rescue experiments reported here extend this finding by showing that the two orthologs are in fact biologically indistinguishable. These new results reinforce our contention that the phenotypic character—early stripe 2 expression—must be under stabilizing selection. The character itself remains unchanged over evolutionary time despite substitutions in nearly all the TF-binding sites, the gain and loss of some of them, and considerable change in the spacing between sites. This suggests to us that unlike proteins, where functional conservation usually means selective constraint on important amino acids (such as the active site of an enzyme), enhancers have a more flexible architecture that allows modification, and perhaps even turnover, of their “active” sites. Dissimilarities in the structure–function of enhancers and proteins result in different emergent “rules” of molecular evolution.
But the fact that the D. melanogaster and D. pseudoobscura S2Es are biologically indistinguishable does not necessarily imply that enhancer function has been evolutionarily static. Rather, the similar biological activities appear to be the result of convergence. In particular, phylogenetic analysis of S2E sequences indicates that the bcd-3 binding site in D. melanogaster was acquired only recently in the lineage leading to D. melanogaster (see Figure 2B). (There are also lineage-specific deletions in the spacers flanking both sides of the bcd-3 site in the D. melanogaster lineage, which shift the proximal and distal repressors giant and Kruppel binding sites, respectively, closer to this bicoid site. These length changes may have coevolved to enable or increase local repression of this novel activator site.) The bcd-3 site was shown by Small et al. [6] to be required for MSE stripe expression. It seems likely, therefore, that the ancestral S2E lacking this binding site would not properly activate stripe 2 expression in D. melanogaster. Perhaps the sensitivity of the enhancer (or more precisely, the fragment investigated) to activator signals has oscillated over evolutionary time, in which case the similarity between the two distantly related species' S2Es would be an example of functional convergence.
The fact that the S2E fragment from D. erecta is essentially unresponsive to the D. melanogaster morphogen-gradient environment, but the precisely orthologous segment from D. melanogaster (and D. pseudoobscura) responds properly, proves that this fragment must contain evolved differences of functional significance between the species. The lack of biological activity of the D. erecta transgene in D. melanogaster should perhaps come as no surprise, however: Its lower sensitivity to activation may represent the ancestral state of the enhancer. What is surprising is the rapidity with which these functional differences evolve.
Phylogenetic footprinting of distantly related species can readily identify strongly conserved motifs [19] but runs the risk of not detecting enhancers that have retained their function but have evolved structurally. To overcome this, a technique called phylogenetic shadowing—the comparison of noncoding sequences among closely related species—has recently emerged [9]. Our results show that there is no necessary relationship between enhancer phylogenetic (or sequence) relatedness and functional similarity. Closely related species cannot be assumed to be more functionally conserved than distantly related species in enhancer structure–function.
Why Is the D. erecta S2E Transgene Not Functional in D. melanogaster?
D. erecta produces a native early eve stripe 2. Why then does the S2E fragment from this species not produce a robust early stripe when placed in D. melanogaster? The first possibility is that the fragment we investigated no longer contains a functional enhancer and has been replaced by an equivalent enhancer somewhere else in the eve locus. This possibility can easily be ruled out: The overall architecture of the eve locus, including all of its 5′ and 3′ enhancers, is well conserved, and there is no new cluster of the appropriate TF-binding sites that could act as a S2E. Another unlikely possibility is that the locus has been duplicated, and the fragment we investigated has become functionally inert (i.e., equivalent to a pseudogene). There is no indication of a duplicated eve locus in the D. erecta genome, and all features of the eve locus (including its S2E) are intact and do not indicate any degeneration.
This leads us to conclude that the D. erecta fragment used in our experiments contains the S2E. We can consider three additional possibilities. The first is that this fragment is no longer the complete biological unit, that is, novel binding sites have evolved in this species distal or proximal to this fragment, which have become assimilated into the active enhancer by a process we call accretion. As Figure 7 shows, patser, a binding-site prediction program [20], identifies a single potential bicoid-binding site 135 bp upstream of Block-A (Figure S1), the distal end of the D. erecta S2E transgene. This potential site contains an unconventional bicoid-binding motif, T(Michael Z. Ludwig, Arnar )