Non-African Populations of Drosophila melanogaster Have a Unique Origin
     Ecole Pratique des Hautes Etudes, Laboratoire d'Ecologie, Université Pierre et Marie Curie, Paris, France

    E-mail: emmanuelle.baudry@ese.u-psud.fr.

    Abstract

    Drosophila melanogaster is widely used as a model in DNA variation studies. Patterns of polymorphism have, however, been affected by the history of this species, which is thought to have recently spread out of Africa to the rest of the world. We analyzed DNA sequence variation in 11 populations, including four continental African and seven non-African samples (including Madagascar), at four independent X-linked loci. Variation patterns at all four loci followed neutral expectations in all African populations, but departed from them in all non-African ones due to a marked haplotype dimorphism at three out of four loci. We also found that all non-African populations show the same major haplotypes, though at various frequencies. A parsimonious explanation for these observations is that all non-African populations are derived from a single ancestral population that underwent a substantial reduction of polymorphism, probably through a bottleneck. Less likely alternatives involve either selection at all four loci simultaneously (including balancing selection at three of them), or admixture between two divergent populations. Small but significant structure was observed among African populations, and there were indications of differentiation across Eurasia for non-African ones. Since population history may result in non-equilibrium variation patterns, our study confirms that the search for footprints of selection in the D. melanogaster genome must include a sufficient understanding of its history.

    Key Words: Drosophila melanogaster; DNA polymorphism; bottleneck; African; geographic subdivision

    Introduction

    Understanding the forces that shape the variation of natural populations is a major goal of evolutionary biology. Intraspecific polymorphism data are usually analyzed in reference to the neutral theory (Kimura 1983) within the framework of the Wright-Fisher model, which assumes a random-mating population of constant size. Departure from the predictions of this model can be caused by demographic events including population bottlenecks and population structure, or by the action of natural selection. It is notoriously difficult to distinguish these two factors. For instance, a reduced level of polymorphism at a given gene can be caused by the action of natural selection, through a selective sweep at this gene or at a neighboring gene (Maynard Smith and Haigh 1974; Kaplan, Hudson, and Langley 1989; Stephan, Wiehe, and Lenz 1992), or by a recent decrease in population size (Rogers and Harpending 1992). Identifying the forces that shape DNA polymorphism, and especially ascertaining the importance of natural selection, therefore makes it necessary to determine as exactly as possible the demographic history of the species under study.

    The wealth of information available in D. melanogaster through genomics and functional genetics has made it a convenient model that is widely used in population genetics. However, our understanding of its population history still largely rests on studies using characters that are potential targets of natural selection, including biometrical traits (Teissier 1957), enzyme variation (Singh, Choudhary, and David 1987), chromosome inversions (Lemeunier et al. 1985; Aulard, David, and Lemeunier 2002), and mitochondrial polymorphism (Hale and Singh 1987, 1991). Biogeographical and systematic studies suggest that the cosmopolitan D. melanogaster originated from Africa (Lachaise et al. 1988) and colonized the rest of the world relatively recently (David and Capy 1988). Estimated times of colonization range from about ten thousand years ago for Europe (Bénassi and Veuille 1995) to a few hundred years ago for American and Australian populations (David and Capy 1988).

    The history of this species may confound population genetics studies in a number of ways. For instance, we still do not know for sure whether population structure exists within Africa (Lemeunier et al. 1994). If so, studies using different African samples of fruit flies may lead to conflicting results. This would also compromise combining African samples into a single set. Using different non-African populations may also be questioned, if the spread of D. melanogaster out of Africa involved different sets of invaders. Increasing interest in population genetic programs based on the sequencing of DNA variation at an unprecedented scale makes it necessary to specify more thoroughly the main features of the D. melanogaster population model. This includes defining which D. melanogaster populations are ancestral and which are not, whether African populations of this species are structured, whether non-African populations were derived from African ones through a single or several colonizing events, and whether structure exists between derived populations.

    To document these points, we analyzed DNA sequence variation at several genes in several populations. We chose to conduct this study on X-linked genes, because the autosomes of D. melanogaster are highly polymorphic for chromosomal inversions. Inversion frequencies vary extensively across populations between West Africa and East Africa (Aulard, David, and Lemeunier 2002) and may be responsible for the structure that is sometimes observed (Bénassi and Veuille 1995; Michalakis and Veuille 1996; Veuille et al. 1998) and sometimes not (Aguadé 1998, 1999) between Ivory Coast (West Africa) and Malawi (East Africa), depending on the locus. Of course, conclusions derived from X-linked loci are strictly valid for such loci only, and should be confirmed by independent evidence for autosomal loci.

    Material and Methods

    Data Collection

    We used eleven samples, originating from Ivory Coast, Niger, Kenya, Zimbabwe, Madagascar, China, India, Saudi Arabia, Russia, France, and the United States. The collector, place, and date of collection are given below for each sample: Ivory Coast, D. Lachaise, Lamto, 1989; Niger, J. David, Niamey, 2000; Kenya, C. Montchamp, 2002; Zimbabwe, D. Lachaise, Harare, 1996; Madagascar, M. Veuille, Tananarive, 2000; China, J. David, Beijing, 1992; India, J. David, New Delhi, 1993; Saudi Arabia, D. Lachaise, At-Ta'if National Wildlife Research Center, 1993; Russia, J. David, Ghergebil, 1992; France, M. Veuille, Cognac, 1990; United States, M. Veuille, Cambridge, MA, 1999.

    All samples were frozen as soon as possible after collection and stored at –80°C until DNA extraction, to minimize the loss of lines, cross-contamination, or mislabeling. Each sample comprised 14 D. melanogaster males, except the Zimbabwe sample, which consisted of 13 males. A D. simulans male from Tanzania was used as an outgroup. Genomic DNA was prepared from single flies using a QIAGEN DNeasy kit (Valencia, Calif.), following the manufacturer's instructions. For each individual, we collected sequence data at four loci: Sex-lethal, vermilion, sevenless, and runt (table 1). They are located in highly recombining regions, to minimize the probability of hitchhiking (Maynard Smith and Haigh 1974; Kaplan, Hudson, and Langley 1989; Stephan, Wiehe, and Lenz 1992) or background selection (Charlesworth, Morgan, and Charlesworth 1993; Hudson and Kaplan 1995; Charlesworth 1996). All four studied regions include both intron and exon sequence (table 1). PCR primers were designed using the reference D. melanogaster sequence and are available from the authors. Genomic DNA was PCR-amplified, sequenced directly using a Big-Dye sequencing kit, and run on an ABI 377 automatic sequencer (Perkin-Elmer). All sequences were proofread and aligned manually using Proseq Version 2.9 (Filatov 2002). All polymorphic sites were independently checked by two authors. Aligned sequences were deposited into GenBank under accession numbers AY459615-AY459768 (Sex-lethal), AY458248-AY458401 (vermilion), AY459769-AY459922 (sevenless), and AY459923-AY460076 (runt).

    Table 1 X-Linked Loci Used in This Study.

    Data Analysis

    The DnaSP program, version 3.99 (Rozas and Rozas 1999), was used for most analyses. For each locus in each population we estimated the population mutation parameter per nucleotide site, θ = 4Neu (where Ne is the effective population size and u is the mutation rate), using Watterson's estimator θW (1975), based on the number of segregating sites, and the nucleotide diversity π (Tajima 1983). Both were calculated on silent sites only (i.e., synonymous and noncoding sites), to allow comparisons between loci. Indels were excluded from all analyses.
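
    As an illustration of the two estimators named above, the following minimal sketch computes Watterson's θW and the nucleotide diversity π per site from a small alignment. It is not the DnaSP implementation: the function names and the toy alignment are hypothetical, and the sketch assumes gap-free sequences of equal length in which every site counts as silent.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that carry more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    length = len(seqs[0])
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * length)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all pairs of sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / (len(pairs) * length)


# Hypothetical toy alignment (not data from this study): 4 sequences, 10 sites.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGAACGTAC"]
theta_w = watterson_theta(toy)
pi = nucleotide_diversity(toy)
print(f"S = {segregating_sites(toy)}, theta_W = {theta_w:.4f}, pi = {pi:.4f}")
```

    Under neutral equilibrium both quantities estimate the same parameter, so a marked difference between them is what frequency-spectrum tests such as Tajima's D pick up.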

    We calculated the number of haplotypes per sample and we used the four-gamete rule (Hudson and Kaplan 1985) to infer the minimum number of recombination events in the history of each sample. Several tests have been developed to determine significant departures of sequence data from neutral evolution. We used Tajima's D (1989), which summarizes the frequency spectrum, and Fu's FS (1997), which compares the observed number of haplotypes in a sample to the expected number under neutrality. These authors originally proposed to calculate the values of the test statistics expected under the neutral model without taking recombination into account. Although conservative, this leads to a substantial reduction of the power of the tests (Wall 1999). We therefore performed Tajima's and Fu's tests using the coalescent simulations implemented in DnaSP with a recombination rate of 5 × 10⁻⁹ events/generation/bp. This value is much lower than the estimates that we obtained by fitting a polynomial curve between the standard genetic distance and the amount of DNA along the X-chromosome (e.g., Kliman and Hey 1993; Comeron, Kreitman, and Aguadé 1999), which range from 2.7 × 10⁻⁸ to 4.9 × 10⁻⁸ events/generation/bp. Using a recombination rate of 5 × 10⁻⁹ events/generation/bp should therefore be conservative while markedly increasing the power of the tests compared to the no-recombination case (Wall 1999). Given this underestimation of the recombination rate, we performed a one-sided Fu's FS test, i.e., we tested the data for a reduced number of haplotypes only, and not for an excess, as the latter would not have been a conservative test.
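
    For readers unfamiliar with Tajima's D, the sketch below computes the statistic from the sample size, the number of segregating sites, and the mean number of pairwise differences, using the standard constants of Tajima (1989). It only returns the value of D; it does not reproduce the coalescent simulations with recombination that were used here to assess significance. The function name and the two example configurations (10 segregating sites in 14 chromosomes) are hypothetical.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, number of segregating sites S,
    and mean number of pairwise differences k (all per locus, not per site).

    Returns NaN when S == 0 or n < 4, where the statistic is undefined
    (the variance term vanishes for n = 2 and n = 3).
    """
    if seg_sites == 0 or n < 4:
        return float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


# Hypothetical configurations for n = 14 chromosomes and S = 10 sites.
n_pairs = 14 * 13 // 2  # 91 pairwise comparisons
# Two haplotype classes at 7:7, all 10 sites separating the two classes.
k_even = 10 * (7 * 7) / n_pairs
# All 10 variants carried by a single chromosome each (singletons).
k_rare = 10 * (1 * 13) / n_pairs

d_even = tajimas_d(14, 10, k_even)
d_rare = tajimas_d(14, 10, k_rare)
print(f"D (two balanced haplotype classes) = {d_even:.2f}")
print(f"D (singletons only) = {d_rare:.2f}")
```

    The first configuration gives a positive D and the second a negative one, which is the contrast between intermediate-frequency and rare variants that the test is designed to detect.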

    Genetic differentiation among populations was analyzed by means of the FST (Hudson, Slatkin, and Maddison 1992) statistic. It was calculated for each gene separately (results not shown) and for the four genes simultaneously by treating each polymorphic site as an independent locus (Hudson, Slatkin, and Maddison 1992). We used 10,000 permutations to determine whether the observed values of the statistics were statistically significant (Hudson, Boos, and Kaplan 1992).
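
    The following sketch illustrates the kind of calculation described above: a sequence-based FST of the form 1 − Hw/Hb (Hudson, Slatkin, and Maddison 1992), where Hw and Hb are the mean numbers of differences within and between populations, with a permutation test obtained by reshuffling chromosomes between the two samples. It is a simplified two-population illustration, not the procedure actually run for this study; the function names, the toy sequences, and the number of permutations are assumptions.

```python
import random
from itertools import combinations, product


def n_diff(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def fst_hsm(pop1, pop2):
    """FST = 1 - Hw/Hb for two samples of aligned sequences."""
    within = mean(
        mean(n_diff(a, b) for a, b in combinations(pop, 2)) for pop in (pop1, pop2)
    )
    between = mean(n_diff(a, b) for a, b in product(pop1, pop2))
    if between == 0:
        return 0.0  # no differences at all: treat as undifferentiated
    return 1.0 - within / between


def permutation_p_value(pop1, pop2, n_perm=1000, seed=0):
    """Fraction of random reassignments with FST at least as large as observed."""
    rng = random.Random(seed)
    observed = fst_hsm(pop1, pop2)
    pooled = list(pop1) + list(pop2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = fst_hsm(pooled[: len(pop1)], pooled[len(pop1):])
        if permuted >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / n_perm


# Hypothetical toy samples (not data from this study).
pop_a = ["AAAA", "AAAA", "AAAT"]
pop_b = ["TTTT", "TTTT", "TTTA"]
fst = fst_hsm(pop_a, pop_b)
p_value = permutation_p_value(pop_a, pop_b)
print(f"FST = {fst:.4f}, permutation P = {p_value:.3f}")
```

    With samples this small the permutation P value cannot become very low, which is one reason why real analyses rely on larger samples and several loci.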

    Relationships between haplotypes were illustrated using neighbor-joining trees (Saitou and Nei 1987) built by running MEGA version 2.1 (Kumar, Tamura, and Nei 1993) on Kimura's two-parameter distance (Kimura 1980). We did not use character-based analyses, as the presence of recombination in intraspecific data sets is misinterpreted as homoplasy by parsimony or maximum likelihood methods (Posada and Crandall 2002). Bootstrap values were generated from 10,000 replicates.
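
    As an illustration of the distance on which the trees are built, the sketch below computes Kimura's two-parameter distance, d = −(1/2) ln(1 − 2P − Q) − (1/4) ln(1 − 2Q), where P and Q are the proportions of sites showing transitions and transversions. It is not the MEGA implementation; the function name and example sequences are hypothetical, and the sketch assumes gap-free sequences containing only A, C, G, and T.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned, gap-free sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    p = transitions / len(seq1)
    q = transversions / len(seq1)
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Hypothetical examples: one transition, then one transversion, in 10 sites.
d_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
d_transversion = k2p_distance("AAAAAAAAAA", "CAAAAAAAAA")
print(f"{d_transition:.4f} {d_transversion:.4f}")
```

    At the low divergences typical of intraspecific data the corrected distance stays close to the raw proportion of differing sites.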

    Results

    DNA Polymorphism in African and Non-African Populations

    We aligned 583, 621, 526, and 604 bp of Sex-lethal, vermilion, sevenless, and runt sequences, respectively, in 153 X-chromosomes (see online Supplementary Material for alignments of polymorphic sites). Summary statistics of molecular variation and neutrality tests are shown in table 2 and figure 1. It will appear below that the Madagascar sample shows a variation pattern largely similar to that of non-African populations. "African" will therefore apply below to strictly continental African samples, while Madagascar will be included in "non-African" ones. Levels of variability differ sharply between African and non-African populations. At the four loci, θW ranges from 1.43% to 3.29% in African populations (mean = 2.13%, weighting each of the loci and each of the populations equally), and from 0.00% to 2.02% in non-African ones (mean = 0.69%). The level of polymorphism differs markedly over loci in non-African populations (fig. 1), since vermilion, sevenless, and runt show an average polymorphism above 0.72%, whereas Sex-lethal shows a mean θW of only 0.19%. The variation observed in non-African populations appears to be largely a subset of African variation, since the 39 polymorphic sites shared between African and non-African populations represent 76.0% of the total non-African variation but only 23.4% of the African one.

    Table 2 Summary Statistics of Molecular Variation and Neutrality Tests in Eleven D. melanogaster Populations.

    FIG. 1. Nucleotide diversity at silent sites, Tajima's D, and Fu's FS statistics at four loci in 11 D. melanogaster populations. The Sex-lethal, vermilion, sevenless, and runt loci are represented with light grey, black, white, and dark grey bars, respectively.

    Very few amino acid polymorphisms were observed. None of the four loci presented fixed nonsynonymous differences between D. melanogaster and D. simulans. The sevenless and vermilion genes showed two and three intraspecific low frequency replacement polymorphisms, respectively, whereas runt and Sex-lethal showed none. These observations are consistent with the action of purifying selection on most nonsynonymous sites in the four gene regions.

    Levels of Linkage Disequilibrium and Neutrality Tests

    The pattern of variability in non-African populations of D. melanogaster suggests strong haplotype structure at three of the four loci. Within each population, almost all sequences for vermilion, sevenless, and runt can be grouped into two haplotype families (see online Supplementary Material). At the runt locus, the two classes of haplotypes are almost equally represented, whereas at vermilion and sevenless one of the haplotype families is predominant. In contrast, Sex-lethal presents a very low level of polymorphism in non-African populations. In agreement with this observation, the observed number of haplotypes is usually very small given the sample size and the high number of segregating sites in the first three loci. Similarly, the minimum number of recombination events, Rm, is zero for most loci in non-African populations. In contrast, no grouping into haplotype families appears at any locus in any of the African populations, and recombination estimates are consistently high (table 2).

    To determine whether the data were in agreement with the expectations of the neutral model, we performed Tajima's D test and Fu's FS test (table 2 and fig. 1). No significant values were observed for either test in African populations, though Fu's FS values were generally negative. In contrast, non-African populations showed numerous significant departures from neutrality. Tajima's D showed six positive and four negative significant values. Both types of results are usually produced by the presence of two classes of haplotypes, as mentioned above. Positive values are observed when the two haplotype classes are evenly represented in a sample, whereas negative values are observed when one of the classes is rare. Fu's FS test presents 14 significantly positive values in non-African populations, which confirms a general deficit of haplotypes given the number of polymorphic sites.

    Relationships Between Haplotypes

    We built neighbor-joining trees (Saitou and Nei 1987) to visualize relationships between haplotypes (fig. 2). The most noticeable characteristic of the trees is that a large majority of non-African chromosomes belong to one (Sex-lethal) or two (runt, vermilion, and sevenless) haplotype families (hereafter called "major haplotypes"), which are shared between populations. For example, 39 out of 84 chromosomes from non-African populations show the same haplotype at runt (e.g., Fra1), 27 chromosomes present another one (e.g., Rus51), seven chromosomes differ from one of these two haplotypes by only one or two substitutions, which are shared by no other population (e.g., Chi8 and Chi13), and two chromosomes present a haplotype that probably arose by recombination between the two frequent haplotypes (e.g., Chi19). The nine remaining chromosomes show either a haplotype identical to an African one (e.g., Chi12) or a haplotype of undetermined origin. A similar pattern is observed at the other three genes, except that Sex-lethal shows only one frequent haplotype in non-African populations. The population from Madagascar presents an intermediate situation. At each of the four loci, major non-African haplotypes are at a high frequency but between three and six chromosomes show different haplotypes, presumably of African origin.

    FIG. 2. Neighbor-joining trees of the 153 sampled D. melanogaster X-chromosomes based on Sex-lethal, vermilion, sevenless, and runt sequences. Bootstrap values in percent are given above nodes when greater than 50%. Numbers beside each haplotype indicate how many individuals possess it, if greater than one. For easier visualization, non-African individuals are underlined and the frequent haplotypes present in these populations (see text) are marked by a curly bracket

    Genetic Differentiation Between Populations

    We estimated genetic differentiation among populations using FST (Hudson, Slatkin, and Maddison 1992) and the number of shared polymorphic sites between populations. Table 3 summarizes the values of the statistics for the pooled genes. Note that significance values must be taken cautiously, given the number of tests performed. The populations can be divided into two highly differentiated groups: African populations and non-African populations (including Madagascar). Pairwise comparisons between African and non-African samples showed a mean FST value of 0.259 and were all significant at the 0.01 level. Within the African group, we observed a moderate but highly significant differentiation (FST = 0.047, P = 0.0021). The differentiation level seems higher within the non-African group (FST = 0.149, P < 10⁻⁴). Note, however, that FST values are strongly influenced by the level of within-population diversity (Charlesworth 1998). The higher values of the statistics observed in comparisons involving one or several non-African populations might partly result from their lower level of polymorphism. Interestingly, the non-African group is heterogeneous. The "western" populations from the U.S., France, Saudi Arabia, and Russia are not genetically differentiated (FST = 0.034, P = 0.15; table 4). In contrast, the population from India is genetically different from all other non-African populations, except Madagascar. The Indian population shows the same haplotypes as the other non-African populations but at different frequencies. For example, two haplotypes of runt are observed at frequencies of 71.4% and 28.6% in France and of 7.1% and 71.4% in India, respectively, resulting in a very high differentiation (FST = 0.492) between these two populations at this locus. The population from China seems more closely related to the western group than to the Indian population (however, it is significantly different from the population of France).

    Table 3 Pairwise Estimation of FST (Upper Right Matrix) and Number of Shared Polymorphic Sites (Lower Left Matrix) at the Four Loci.

    Table 4 Tests of Differentiation Among Groups of Populations at the Four Loci.

    Note that when the four loci are considered separately (results not shown), similar global tendencies are observed but individual comparisons may vary. For example, a significant differentiation is observed at each locus between almost all African and non-African populations. In contrast, the population from Ivory Coast is significantly different from all other African populations when the four loci are considered simultaneously (table 3), but this pattern is observed only at the sevenless locus when the loci are considered separately (not shown). Data on more loci would be needed for a better appraisal of differentiation in African populations.

    Discussion

    Demographic Versus Selective Effects in Non-African Populations

    We found three important features in non-African populations of D. melanogaster. One of them is well documented; the other two are mostly new. First, consistent with previous reports (Begun and Aquadro 1993, 1995; Schlötterer, Vogl, and Tautz 1997; Langley et al. 2000; Andolfatto 2001; Kauer et al. 2001), we found that non-African populations of D. melanogaster are less variable than African ones and that most segregating sites in non-African populations are also polymorphic in Africa. This is consistent with the accepted hypothesis that D. melanogaster originated from Africa and expanded its range to the rest of the world (David and Capy 1988; Lachaise et al. 1988). The differences between African and non-African populations have been interpreted in two major ways: the polymorphism of the non-African populations was either reduced through selective adaptation to new environments, or through demographic events having taken place during range expansion, including bottlenecks and founder events (David and Capy 1988; Begun and Aquadro 1993; Langley et al. 2000; Odgers et al. 2001).

    Second, we observed that two haplotypic classes were present in most populations at three of the four loci. This is unlikely to be a result of a bias in our sampling. A similar haplotype dimorphism had been mentioned by Teeter et al. (2000). They analyzed a worldwide sample of D. melanogaster mainly composed of non-African individuals and observed two divergent classes of haplotypes at 13 out of 20 loci. Single locus studies have also often reported haplotype dimorphism in non-African populations (Begun and Aquadro 1994; Begun and Aquadro 1995; Kirby and Stephan 1995; Cirera and Aguadé 1997; Andolfatto and Kreitman 2000; Lazzaro and Clark 2001; Zurovcova and Ayala 2002; Balakirev et al. 2003). This feature was not always easily recognized, and it was not easy to interpret in single-locus studies that were often designed to detect selection in candidate genes. However, its constancy across a number of case studies makes it a salient trait of non-African populations that is especially apparent in our study.

    Third, the frequent haplotypes found in non-African populations (either one or two) are the same in all such populations, even though they occur in various proportions. This observation is new, because, to our knowledge, our study is the first documenting multilocus DNA sequence variation in a large number of population samples.

    Demographic or selective effects could, in principle, both explain haplotype dimorphism and the presence of the same haplotypes in all derived populations. At least two demographic explanations can account for these features. A non-African population may have undergone a very strong bottleneck or series of bottlenecks, resulting in the survival of only a small number of haplotypes, before colonizing the rest of the world. Some loci of the non-African population therefore appear to be monomorphic while other ones are dimorphic. On the other hand, the non-African populations may have formed relatively recently by admixture of two divergent populations, as suggested by Teeter et al. (2000). This mixed stock then happened to be the common ancestor of all non-African populations surveyed in our study. This second scenario is more complicated than the former, but it does not explain more facts and is therefore not the most parsimonious explanation of the data.

    Alternatively, natural selection alone could have caused the observed pattern of polymorphism. It seems reasonable to suppose that colonizing populations of D. melanogaster should have genetically adapted to their new environmental conditions. If selective sweeps have been common during this adaptation, then we also expect to frequently detect the footprint of partial sweeps. Such incomplete sweeps are for example expected when recombination takes place between the region surveyed and the selected site during the selective stage, and they can produce a haplotype structure (Depaulis, Mousset, and Veuille, in press). However, Fay and Wu's H test (2000), which was designed specifically to detect such events, is almost never significant in our dataset (not shown). Furthermore, we have analyzed populations from a wide geographical range, extending from the Far East to North America. It seems unlikely that the environmental conditions prevailing in these areas would select the same alleles in all populations, both because the environmental conditions probably vary spatially and because similar selection pressures in different populations would not necessarily select the same alleles at the same loci. Finally, we could hypothesize that the dominant mode of selection in D. melanogaster is balancing selection. Long-term coexistence of two alleles has been documented in D. melanogaster Adh, where it leaves a variation pattern distinct from that of typical selective sweeps (Kreitman and Hudson 1991). Balancing selection could similarly have produced the haplotype dimorphism observed at runt, vermilion, and sevenless in our non-African samples. However, balancing selection has been actively searched for in Drosophila populations for years (reviewed in Lewontin 1974) and was found to be uncommon. In conclusion, even though we cannot exclude natural selection at any of the surveyed loci, this hypothesis seems unnecessary as a framework for explaining the overall pattern observed in our data. While other explanations cannot be ruled out, a severe bottleneck occurring during the out-of-Africa colonization is a parsimonious explanation of these X-chromosome polymorphism data.

    When the autosomes are considered, a more complex pattern becomes apparent. Analyses based on sequence polymorphism (reviewed in Andolfatto 2001) and a large number of microsatellite loci (Kauer et al. 2001) have demonstrated that X-chromosomes are more variable than autosomes in African populations of D. melanogaster, whereas non-African populations show a markedly lower level of polymorphism on the X-chromosome compared to the autosomes. Similarly, non-African populations of D. simulans, a closely related species that also recently spread out of Africa, present lower levels of variability on the X-chromosome (Begun and Whitley 2000). Three main hypotheses, not mutually exclusive, have been proposed to explain these unexpected features. First, the lower levels of diversity of African autosomes might be related to the presence of chromosomal inversions (e.g., Andolfatto 2001; Mousset et al. 2003). The autosomes of D. melanogaster show numerous inversions that are frequent in Africa and rare in temperate regions, whereas inversions are uncommon on the X-chromosome. Navarro et al. (2000) demonstrated that variability levels within inversions could be reduced until the inversion reaches equilibrium. Alternatively, the dissimilar polymorphism pattern observed in African and non-African populations might indicate that different selective forces have been operating on them. For example, Kauer et al. (2001) suggest that background selection shapes the neutral variability in the ancestral populations, whereas the adaptation of non-African populations to a new environment led to multiple selective sweeps. Finally, the polymorphism data suggest that the X-chromosomes of D. melanogaster suffered a much more severe reduction of polymorphism than the autosomes during the spread out of Africa. Wall, Andolfatto, and Przeworski (2002) have shown that a similar discrepancy in D. simulans is consistent with a simple bottleneck model, under a restricted range of parameter values. In conclusion, it will be necessary to gather more polymorphism data, particularly from the autosomes of African populations, to confirm the present findings.

    Age of Non-African Populations

    Based on biogeographical and ecological evidence, it is commonly thought that D. melanogaster expanded its range out of Africa in conjunction with the rise of agriculture after the Neolithic revolution, not earlier than 10,000 years ago (David and Capy 1988; Lachaise et al. 1988; Bénassi and Veuille 1995). This hypothesis can be tested using our data. For this purpose, we considered a simple scenario where an African population underwent a severe bottleneck T years ago and subsequently colonized various non-African habitats. We determined how many of the mutations observed in the non-African populations are likely to have appeared after this bottleneck. We identified these neomutations using the following criteria: they should be present in one non-African population and absent from the African populations; they should be derived, that is, not present in D. simulans; and they should be carried by a haplotype that is, except for the neomutation itself, identical to one of the frequent non-African haplotypes. Using these three criteria, we found that the six non-African populations (we excluded Madagascar, since its variation pattern suggests important admixture between African and non-African stocks) show a total of 12 mutations that likely took place after the bottleneck. If we suppose that the bottleneck was immediately followed by exponential growth, then the genealogy of the sample resembles a star phylogeny (Slatkin and Hudson 1991). In this particular case, the post-bottleneck length of the genealogy of a sample composed of n individuals is n times T. Given that we examined a total of 1,446 silent sites in 6 × 14 = 84 individuals, and assuming a mean silent mutation rate of 15.4 × 10⁻⁹ substitutions/year/bp (Li 1997), we obtain an estimated age of T = 12/(1,446 × 84 × 15.4 × 10⁻⁹) = 6,415 years for the bottleneck. Note that if the assumption of exponential growth immediately after the bottleneck does not hold, then the genealogies of the samples will be shorter and the real age will consequently be older. As a result, about 6,400 years is the minimum age of the bottleneck, although with considerable uncertainty. Our estimate is therefore in agreement with the hypothesis of a post-Neolithic expansion.
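
    The arithmetic behind this estimate can be checked with a few lines of code. Under the star-phylogeny assumption the expected number of new mutations is (silent sites) × (chromosomes) × (mutation rate) × T, so T is the observed count divided by the product of the other three terms. The function and variable names below are hypothetical; the numerical values are those given in the text.

```python
def bottleneck_age(n_new_mutations, n_silent_sites, n_chromosomes, mu_per_site_year):
    """Age T (years) under a star genealogy of total length n_chromosomes * T."""
    return n_new_mutations / (n_silent_sites * n_chromosomes * mu_per_site_year)


age = bottleneck_age(
    n_new_mutations=12,        # post-bottleneck mutations identified
    n_silent_sites=1446,       # silent sites surveyed over the four loci
    n_chromosomes=6 * 14,      # six non-African samples of 14 males each
    mu_per_site_year=15.4e-9,  # silent mutation rate assumed in the text
)
print(f"T = {age:.0f} years")
```

    Because T scales inversely with the assumed mutation rate and linearly with the mutation count, modest changes in either input shift the estimate proportionally.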

    Differentiation Within Groups of Populations

    The fact that African samples do not deviate from neutral mutation-drift equilibrium suggests that these populations were established a long time ago and have undergone little or no change in demographic regime since. It is tempting to consider the hypothesis of a differentiation between African populations. Such a differentiation has been found with microsatellites from chromosome 2 between Ivory Coast and Malawi (Michalakis and Veuille 1996) but could possibly be ascribed to changes in inversion frequencies across populations (Veuille et al. 1998). However, this factor should be negligible here, since we considered only X-linked loci (see above). In this study, a global test on all four populations is significant. In pairwise comparisons, the Ivory Coast population is the furthest from the other populations, both geographically and genetically. It shows significant differentiation from the other three populations when considering all four loci. When considering the loci separately, however, each locus presents a different pattern. Our study thus suggests that some differentiation exists for the X-chromosome within Africa, but a conclusive study on this subject will require more loci and probably more populations.

    Though the vast majority of non-African individuals carry the same frequent haplotypes, rare alleles are also shared with African populations: at the runt locus for the U.S. and China; at the runt, Sex-lethal, and vermilion loci for India; and at all loci for Madagascar. They can have two origins: either they survived the bottleneck that gave birth to non-African populations, or they represent subsequent immigrants of African origin. Note that the second possibility would agree with the hypothesis of Caracristi and Schlötterer (2003) that the U.S. population results from the admixture of European and African flies, perhaps originating from South America or the Caribbean.

    In this study, we included the Madagascar sample in "non-African" populations. This result clearly indicates that D. melanogaster is not native to Madagascar, as postulated by Lachaise et al. (1988) on the basis of biogeographical data. This is in contrast with results obtained in D. simulans (Baudry, Derome, Huet, and Veuille, unpublished results), which suggest that the latter species was established long ago on this island. An unexpected observation in derived populations is that the Chinese and Indian populations depart somewhat from the other derived populations, since FST is usually significant in comparisons involving them (including the comparison between them).

    Finally, an interesting question concerns the origin of the stock that invaded the rest of the world. Assuming that our sample is representative of African populations, the geographical origin of non-African D. melanogaster can be estimated using as a criterion the number of haplotypes shared between African and non-African samples. Out of seven frequent non-African haplotypes (two each at vermilion, sevenless, and runt, and one at Sex-lethal), six are also present in the Kenya sample, whereas only one or two are present in each of the other three African populations. This suggests that the non-African populations originated from East Africa, although, here too, more data are needed to confirm this hypothesis.
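    The shared-haplotype criterion described above can be sketched as a simple set intersection. The following is only a minimal illustration: the function name, the population names, and the haplotype labels are hypothetical placeholders, not the data or software used in this study.

```python
from typing import Dict, Set


def shared_haplotype_counts(
    frequent_non_african: Set[str],
    african_samples: Dict[str, Set[str]],
) -> Dict[str, int]:
    """For each African sample, count how many of the frequent
    non-African haplotypes are also observed in that sample."""
    return {
        name: len(frequent_non_african & haplotypes)
        for name, haplotypes in african_samples.items()
    }


# Hypothetical placeholder labels (NOT the haplotypes observed in the study).
frequent = {"ver_A", "ver_B", "sev_A", "sev_B", "run_A", "run_B", "sxl_A"}
samples = {
    "pop1": {"ver_A", "sev_A", "run_A", "other_1"},
    "pop2": {"ver_B", "other_2"},
}

counts = shared_haplotype_counts(frequent, samples)
# The sample sharing the most frequent haplotypes is the candidate source region.
best = max(counts, key=counts.get)
print(counts, best)
```

    Under this criterion, the African sample with the highest count is the most likely source, subject to the assumption stated above that the sampling is representative of African populations.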

    Conclusion

    A number of molecular population genetics studies have been carried out in D. melanogaster to check the selective neutrality of polymorphism patterns at given loci. Departure from neutrality is often encountered in non-African populations (see, e.g., Kirby and Stephan 1996; Lazzaro and Clark 2001; Balakirev, Balakirev, and Ayala 2002; Zurovcova and Ayala 2002). Polymorphism data are usually interpreted within the framework of the neutral Wright-Fisher model, which assumes random mating and constant population size. Our data, like those of previous studies, indicate that non-African populations are not at mutation-drift equilibrium but have probably undergone a severe bottleneck, followed in some cases by admixture. The effect of natural selection on polymorphism patterns cannot, therefore, be determined simply by using the standard neutral Wright-Fisher model. Its assessment should rely on a model allowing for past demographic changes, which can be developed only when DNA sequence data on many loci become available. We also found significant structure in African populations using a small number of loci. This suggests that pooling lines from different African samples to represent a so-called "African population" should be avoided.

    Acknowledgements

    We thank Frantz Depaulis, Sylvain Mousset, and two anonymous reviewers for helpful comments on a previous version of the manuscript. E.B. was supported by a grant from Ecole Pratique des Hautes Etudes. The research was supported by a grant from the CNRS Programme des séquençage to M.V. and Christophe Terzian.

    Literature Cited

    Aguadé, M. 1998. Different forces drive the evolution of the Acp26Aa and Acp26Ab accessory gland genes in the Drosophila melanogaster species complex. Genetics 150:1079-1089.

    Aguadé, M. 1999. Positive selection drives the evolution of the Acp29AB accessory gland protein in Drosophila. Genetics 152:543-551.

    Andolfatto, P. 2001. Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18:279-290.

    Andolfatto, P., and M. Kreitman. 2000. Molecular variation at the In(2L)t proximal breakpoint site in natural populations of Drosophila melanogaster and D. simulans. Genetics 154:1681-1691.

    Aulard, S., J. David, and F. Lemeunier. 2002. Chromosomal inversion polymorphism in Afrotropical populations of Drosophila melanogaster. Genet. Res. 79:49-63.

    Balakirev, E., V. Chechetkin, V. Lobzin, and F. Ayala. 2003. DNA polymorphism in the beta-Esterase gene cluster of Drosophila melanogaster. Genetics 164:533-544.

    Balakirev, E. S., E. I. Balakirev, and F. J. Ayala. 2002. Molecular evolution of the Est-6 gene in Drosophila melanogaster: contrasting patterns of DNA variability in adjacent functional regions. Gene 288:167-177.

    Begun, D., and C. Aquadro. 1993. African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548-550.

    Begun, D., and C. Aquadro. 1994. Evolutionary inferences from DNA variation at the 6-phosphogluconate dehydrogenase locus in natural populations of Drosophila: selection and geographic differentiation. Genetics 136:155-171.

    Begun, D., and C. Aquadro. 1995. Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and D. simulans. Genetics 140:1019-1032.

    Begun, D., C. Aquadro, and P. Whitley. 2000. Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc. Natl. Acad. Sci. U.S.A. 97:5960-5965.

    Bénassi, V., and M. Veuille. 1995. Comparative population structuring of molecular and allozyme variation of Drosophila melanogaster Adh between Europe, west Africa and east Africa. Genet. Res. 65:95-103.

    Caracristi, G., and C. Schlötterer. 2003. Genetic differentiation between American and European Drosophila melanogaster populations could be attributed to admixture of African alleles. Mol. Biol. Evol. 20:792-799.

    Charlesworth, B. 1996. Background selection and patterns of genetic diversity in Drosophila. Genet. Res. 68:131-149.

    Charlesworth, B. 1998. Measures of divergence between populations and the effect of forces that reduce variability. Mol. Biol. Evol. 15:538-543.

    Charlesworth, B., M. T. Morgan, and D. Charlesworth. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.

    Cirera, S., and M. Aguadé. 1997. Evolutionary history of the sex-peptide (Acp70A) gene region in Drosophila melanogaster. Genetics 147:189-197.

    Comeron, J. M., M. Kreitman, and M. Aguadé. 1999. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics 151:239-249.

    David, J. R., and P. Capy. 1988. Genetic variation of Drosophila melanogaster natural populations. Trends Genet. 4:106-111.

    Depaulis, F., S. Mousset, and M. Veuille. 2003. Power of neutrality tests to detect bottlenecks and hitchhiking. J. Mol. Evol. 57(Suppl. 1):S190-S200.

    Fay, J. C., and C. I. Wu. 2000. Hitchhiking under positive Darwinian selection. Genetics 155:1405-1413.

    Filatov, D. A. 2002. ProSeq: A software for preparation and evolutionary analysis of DNA sequence data sets. Mol. Ecol. Notes 2:621-624.

    Fu, Y. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915-925.

    Hale, L. R., and R. S. Singh. 1987. Mitochondrial DNA variation and genetic structure in populations of Drosophila melanogaster. Mol. Biol. Evol. 4:622-637.

    Hale, L. R., and R. S. Singh. 1991. A comprehensive study of genic variation in natural populations of Drosophila melanogaster. IV. Mitochondrial DNA variation and the role of history vs. selection in the genetic structure of geographic populations. Genetics 129:103-117.

    Hudson, R. R., D. D. Boos, and N. L. Kaplan. 1992. A statistical test for detecting geographic subdivision. Mol. Biol. Evol. 9:138-151.

    Hudson, R. R., and N. L. Kaplan. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147-164.

    Hudson, R. R., and N. L. Kaplan. 1995. Deleterious background selection with recombination. Genetics 141:1605-1617.

    Hudson, R. R., M. Slatkin, and W. P. Maddison. 1992. Estimation of levels of gene flow from DNA sequence data. Genetics 132:583-589.

    Kaplan, N. L., R. R. Hudson, and C. H. Langley. 1989. The "hitchhiking effect" revisited. Genetics 123:887-899.

    Kauer, M., B. Zangerl, D. Dieringer, and C. Schlötterer. 2001. Chromosomal patterns of microsatellite variability contrast sharply in African and non-African populations of Drosophila melanogaster. Genetics 160:247-256.

    Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.

    Kirby, D. A., and W. Stephan. 1995. Haplotype test reveals departures from neutrality in a segment of the white gene of Drosophila melanogaster. Genetics 141:1483-1490.

    Kirby, D. A., and W. Stephan. 1996. Multi-locus selection and the structure of variation at the white gene of Drosophila melanogaster. Genetics 144:635-645.

    Kliman, R. M., and J. Hey. 1993. Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10:1239-1258.

    Kreitman, M., and R. R. Hudson. 1991. Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127:565-582.

    Kumar, S., K. Tamura, and M. Nei. 1993. MEGA: Molecular evolutionary genetics analysis.

    Lachaise, D., M. Cariou, J. R. David, F. Lemeunier, L. Tsacas, and M. Ashburner. 1988. Historical biogeography of the Drosophila melanogaster species subgroup. Pp. 159–225 in M.K. Hecht, B. Wallace, and G. T. Prance, eds. Evolutionary biology, Vol. 22. Plenum Press, New York.

    Langley, C., B. Lazzaro, W. Phillips, E. Heikkinen, and J. Braverman. 2000. Linkage disequilibria and the site frequency spectra in the su(s) and su(w(a)) regions of the Drosophila melanogaster X chromosome. Genetics 156:1837-1852.

    Lazzaro, B., and A. Clark. 2001. Evidence for recurrent paralogous gene conversion and exceptional allelic divergence in the Attacin genes of Drosophila melanogaster. Genetics 159:659-671.

    Lemeunier, F., S. Aulard, V. Benassi, and M. Veuille. 1994. Fruitfly origins. Nature 365:548-550.

    Lemeunier, F., J. David, L. Tsacas, and L. Ashburner. 1985. The Drosophila melanogaster species group. Pp. 147–256 in M. Ashburner and H. Carson, Jr., eds. The genetics and biology of Drosophila, Vol. 3. Academic Press, New York.

    Lewontin, R. C. 1974. The genetic basis of evolutionary change. Columbia University Press, New York.

    Li, W. H. 1997. Molecular evolution. Sinauer Associates, Sunderland, Mass.

    Maynard Smith, J., and J. Haigh. 1974. The hitch-hiking effect of a favourable gene. Genet. Res. 23:23-35.

    Michalakis, Y., and M. Veuille. 1996. Length variation of CAG/CAA trinucleotide repeats in natural populations of Drosophila melanogaster and its relation to the recombination rate. Genetics 143:1713-1725.

    Mousset, S., L. Brazier, M. M. Cariou, F. Chartois, F. Depaulis, and M. Veuille. 2003. Evidence of a high rate of selective sweeps in African Drosophila melanogaster. Genetics 163:599-609.

    Navarro, A., A. Barbadilla, and A. Ruiz. 2000. Effect of inversion polymorphism on the neutral nucleotide variability of linked chromosomal regions in Drosophila. Genetics 155:685-698.

    Odgers, W. A., C. F. Aquadro, C. W. Coppin, M. J. Healy, and J. G. Oakeshott. 2001. Nucleotide polymorphism in the Est6 promoter, which is widespread in derived populations of Drosophila melanogaster, changes the level of Esterase 6 expressed in the male ejaculatory duct. Genetics 162:785-797.

    Posada, D., and K. Crandall. 2002. The effect of recombination on the accuracy of phylogeny estimation. J. Mol. Evol. 54:396-402.

    Rogers, A. R., and H. Harpending. 1992. Population growth makes waves in the distribution of pairwise genetic differences. Mol. Biol. Evol. 9:552-569.

    Rozas, J., and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.

    Saitou, N., and M. Nei. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

    Schlötterer, C., C. Vogl, and D. Tautz. 1997. Polymorphism and locus-specific effects on polymorphism at microsatellite loci in natural Drosophila melanogaster populations. Genetics 146:309-320.

    Singh, R., M. Choudhary, and J. David. 1987. Contrasting patterns of geographic variation in the cosmopolitan sibling species Drosophila melanogaster and Drosophila simulans. Biochem. Genet. 4:27-40.

    Slatkin, M., and R. R. Hudson. 1991. Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129:555-562.

    Stephan, W., T. H. E. Wiehe, and M. W. Lenz. 1992. The effect of strongly selected substitutions on neutral polymorphism—analytical results based on diffusion theory. Theor. Popul. Biol. 41:237-254.

    Tajima, F. 1983. Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437-460.

    Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

    Teeter, K., M. Naeemuddin, R. Gasperini, E. Zimmerman, K. White, R. Hoskins, and G. Gibson. 2000. Haplotype dimorphism in a SNP collection from Drosophila melanogaster. J. Exp. Zool. 288:63-75.

    Teissier, G. 1957. Discriminative biometrical characters in French and Japanese Drosophila melanogaster. Proceedings of the XIVth International Genetics Symposia, 502–506.

    Veuille, M., V. Benassi, S. Aulard, and F. Depaulis. 1998. Allele-specific population structure of Drosophila melanogaster alcohol dehydrogenase at the molecular level. Genetics 149:971-981.

    Wall, J. D. 1999. Recombination and the power of statistical tests of neutrality. Genet. Res. 74:65-79.

    Wall, J. D., P. Andolfatto, and M. Przeworski. 2002. Testing models of selection and demography in Drosophila simulans. Genetics 162:203-216.

    Watterson, G. A. 1975. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7:256-276.

    Zurovcova, M., and F. Ayala. 2002. Polymorphism patterns in two tightly linked developmental genes, Idgf1 and Idgf3, of Drosophila melanogaster. Genetics 162:177-188.

    (Emmanuelle Baudry, Barbar)