Molecular Biology and Evolution, 2005, Issue 3
Nonhomogeneous Model of Sequence Evolution Indicates Independent Origins of Primary Endosymbionts Within the Enterobacteriales (γ-Proteobacteria)
     Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts

    Correspondence: E-mail: herbeck@u.washington.edu.

    Abstract

    Standard methods of phylogenetic reconstruction are based on models that assume homogeneity of nucleotide composition among taxa. However, this assumption is often violated in biological data sets. In this study, we examine possible effects of nucleotide heterogeneity among lineages on the phylogenetic reconstruction of a bacterial group that spans a wide range of genomic nucleotide contents: obligately endosymbiotic bacteria and free-living or commensal species in the γ-Proteobacteria. We focus on AT-rich primary endosymbionts to better understand the origins of obligately intracellular lifestyles. Previous phylogenetic analyses of this bacterial group point to the importance of accounting for base compositional variation in estimating relationships, particularly between endosymbiotic and free-living taxa. Here, we develop an approach to compare susceptibility of various phylogenetic reconstruction methods to the effects of nucleotide heterogeneity. First, we identify candidate trees of γ-Proteobacteria groEL and 16S rRNA using approaches that assume homogeneous and stationary base composition, including Bayesian, maximum likelihood, parsimony, and distance methods. We then create permutations of the resulting candidate trees by varying the placement of the AT-rich endosymbiont Buchnera. These permutations are evaluated under the nonhomogeneous and nonstationary maximum likelihood model of Galtier and Gouy, which allows equilibrium base content to vary among examined lineages. Our results show that commonly used phylogenetic methods produce incongruent trees of the Enterobacteriales, and that the placement of Buchnera is especially unstable. However, under a nonhomogeneous model, various groEL and 16S rRNA phylogenies that separate Buchnera from other AT-rich endosymbionts (Blochmannia and Wigglesworthia) have consistently and significantly higher likelihood scores. Blochmannia and Wigglesworthia appear to have evolved from secondary endosymbionts, and represent an origin of primary endosymbiosis that is independent from Buchnera. This application of a nonhomogeneous model offers a computationally feasible way to test specific phylogenetic hypotheses for taxa with heterogeneous and nonstationary base composition.

    Key Words: nucleotide composition • Buchnera • insect endosymbionts • Enterobacteriales • phylogeny

    Introduction

    Phylogenetic inference can be confounded by various evolutionary factors, including unequal substitution rates among sites (Yang 1993), unequal transition and transversion rates (Kimura 1980), and substitution saturation among excessively divergent taxa. In addition, most nucleotide substitution models assume homogeneity of nucleotide composition among taxa (Felsenstein 1988), although this assumption is easily violated in nonsimulated data sets. While it is known that variable base composition among taxa can distort estimations of substitution rates (Tourasse and Li 1999), the effects of variable base composition on phylogenetic analysis are not entirely clear (Mooers and Holmes 2000; Conant and Lewis 2001). Simulation studies (Conant and Lewis 2001; Rosenberg and Kumar 2003) suggest that nucleotide heterogeneity among taxa does not negatively affect parsimony, distance, and likelihood methods. Conant and Lewis (2001) found that only extreme nucleotide bias and long branch lengths can lead parsimony to incorrect phylogenetic inference. Yet, analyses of biological data sets have shown that parsimony (Loomis and Smith 1990; Steel, Lockhart, and Penny 1995), distance (Lockhart et al. 1994; Galtier and Gouy 1995), and maximum likelihood methods (Chang and Campbell 2000) can mistakenly group unrelated species with similar GC contents. Thus, base composition heterogeneity is considered a potential problem for phylogenetic reconstruction.

    Several approaches have been proposed to account for variable base content among taxa. Hasegawa (Hasegawa and Hashimoto 1993; Hasegawa et al. 1993) found that nucleotide heterogeneity in 16S rRNA biased the phylogeny of deep diverging eukaryotes and suggested that amino acid sequences are more reliable. However, amino acid composition is also affected by nucleotide compositional bias (Foster, Jermiin, and Hickey 1997; Singer and Hickey 2000). Attempts to correct for this bias include LogDet (Lockhart et al. 1994), a distance method that transforms the substitution matrix to produce additive distances. The LogDet method does not consider rate variation among sites, and, similar to other distance methods, it performs poorly in analyses of taxa with moderate amounts of substitution saturation (Mooers and Holmes 2000). Galtier and Gouy (1995, 1998) have developed a nonhomogeneous Markov model of nucleotide substitution that allows equilibrium base composition to vary among lineages. This method modifies Tamura's (1992) substitution model of unequal transition and transversion rates and unequal nucleotide content (GC and AT), to include variable substitution rates among sites and variable GC content among branches. This model has been used in a maximum likelihood framework to estimate ancestral GC content of thermophilic organisms (Galtier, Tourasse, and Gouy 1999), and to infer phylogenies of Drosophilidae taxa (Tarrio, Rodriguez-Trelles, and Ayala 2001) and weevil endosymbiotic bacteria (Lefèvre et al. 2004).
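
    To make the idea of lineage-specific equilibrium base composition concrete, the following is a minimal sketch of a Tamura (1992)-style rate matrix parameterized by GC content. It is only an illustration, not the NHML implementation: the names `t92_rate_matrix`, `stationary_residual`, `theta`, and `kappa`, as well as the two branch labels and the value of `kappa`, are assumptions made for this example, and rate variation among sites is omitted.

```python
# Tamura (1992)-style rate matrix with a GC-content parameter theta.
# Nucleotide order: A, C, G, T. kappa is the transition/transversion rate ratio.
BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def t92_rate_matrix(theta, kappa):
    """Return (Q, freq): a 4x4 rate matrix and the base frequencies implied by theta."""
    # G and C share theta equally; A and T share 1 - theta equally.
    freq = {"A": (1 - theta) / 2, "T": (1 - theta) / 2,
            "C": theta / 2, "G": theta / 2}
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                rate = kappa if (x, y) in TRANSITIONS else 1.0
                q[i][j] = rate * freq[y]
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q, freq


def stationary_residual(q, freq):
    """Largest absolute entry of pi * Q; a value near zero means pi is the equilibrium."""
    pi = [freq[b] for b in BASES]
    return max(abs(sum(pi[i] * q[i][j] for i in range(4))) for j in range(4))


# Hypothetical branch-specific GC contents (illustrative values only):
# an AT-rich endosymbiont branch and a compositionally balanced free-living branch.
branch_theta = {"endosymbiont_branch": 0.26, "free_living_branch": 0.50}
branch_models = {name: t92_rate_matrix(theta, kappa=2.0)
                 for name, theta in branch_theta.items()}
```

    In a homogeneous model a single `theta` would apply to the whole tree; in the sketch each branch carries its own matrix, so sequences on different lineages drift toward different GC contents.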

    Such nonhomogeneous models may be particularly important in inferring phylogenies for bacteria, which show an exceptionally wide range of base compositional biases (ranging from 25% to 75% GC content; Sueoka 1962). Intracellular bacterial mutualists and pathogens have the lowest known genomic GC contents, and they represent several phylogenetically independent lineages (Moran and Wernegreen 2000; Shigenobu et al. 2000; Charles, Heddi, and Rahbe 2001; Moran 2002). Their AT-richness is most extreme at third codon positions and intergenic spacers, suggesting a strong effect of directional mutational pressure, or biased changes between GC and AT pairs (Muto and Osawa 1987; Sueoka 1988; Sueoka 1992). The Enterobacteriales includes free-living or gut-associated species (Escherichia coli, Salmonella typhimurium, Shigella flexneri, and Yersinia pestis) with moderate base compositions, as well as endosymbionts that form primary (obligate) and secondary (facultative, transient) associations with insects and that are relatively AT-biased (Moran and Telang 1998; Baumann, Moran, and Baumann 2000). The AT-bias of several primary endosymbionts within the Enterobacteriales is quite severe, at 26% GC (Shigenobu et al. 2000) in the aphid endosymbiont Buchnera, 22% GC in the tsetse fly endosymbiont Wigglesworthia (Akman et al. 2002), and 27% GC in the ant endosymbiont Blochmannia (Gil et al. 2003).

    In addition to their extreme AT-bias, primary endosymbionts are characterized by severe genome reduction. Among Buchnera, Blochmannia, and Wigglesworthia, genome sizes range from 450 kb (Gil et al. 2002) to 800 kb (Wernegreen, Lazarus, and Degnan 2002) compared to the 4.6 to 5.3 Mb for Escherichia coli genomes (Bergthorsson and Ochman 1995). Like other intracellular bacteria, these endosymbionts also experience fast rates of sequence evolution (Moran 1996; Woolfit and Bromham 2003), especially at nonsynonymous sites (Clark, Moran, and Baumann 1999; Wernegreen and Moran 1999), deleterious changes at the 16S rRNA gene (Lambert and Moran 1998), and AT-biased amino acid changes (Moran 1996; Clark, Moran, and Baumann 1999; Palacios and Wernegreen 2002; Herbeck, Wall, and Wernegreen 2003; Rispe et al. 2004). Mechanisms driving these shared features of endosymbiont genomes may include a combination of relaxed selection in the host intracellular environment, strong genetic drift resulting in part from the repeated vertical transmission through insect host generations and decreased effective population sizes (Moran 1996; Funk, Wernegreen, and Moran 2001; Abbot and Moran 2002; Mira and Moran 2002; Herbeck et al. 2003), and increased background mutation rates resulting from the loss of DNA repair loci during genome reduction (Mira, Ochman, and Moran 2001).

    The phylogenies of the γ-Proteobacteria and Enterobacteriales specifically have received considerable attention, owing in part to their ecological importance, the medical relevance of several species, and the diverse lifestyles this group represents (Lawrence, Ochman, and Hartl 1991; Sproer et al. 1999; Wertz et al. 2003; Canbäck, Tamas, and Andersson 2004). Of particular interest for comparative genomic studies is the relative position of Buchnera, Blochmannia, and Wigglesworthia, as the full genome sequences of these taxa are now available (Shigenobu et al. 2000; Akman et al. 2002; Tamas et al. 2002; Van Ham et al. 2003; Gil et al. 2003). However, the phylogenetic position of these and other AT-rich endosymbionts has proved difficult to recover and often varies among studies. Published phylogenies that include Buchnera, Wigglesworthia, and/or Blochmannia often group them as sister taxa or within a clade that includes only endosymbionts (e.g., Schröder et al. 1996; Heddi et al. 1998; Spaulding and von Dohlen 1998; Gil et al. 2003; Lerat, Daubin, and Moran 2003; Woolfit and Bromham 2003; Canbäck, Tamas, and Andersson 2004). One exception is a phylogenetic study that considered heterogeneity in nucleotide biases in this group and that suggested a paraphyletic relationship among several primary endosymbionts, grouping Buchnera apart from Blochmannia and Wigglesworthia (Charles, Heddi, and Rahbe 2001). In addition, a tree estimated under a nonhomogeneous model (NJ-nh) in Lerat, Daubin, and Moran (2003; fig. 2) positioned Buchnera and Wigglesworthia in separate clades, prompting these authors to consider alternative topologies in which the two endosymbionts were not sister taxa. However, these candidate topologies were rejected as significantly less likely than topologies in which the two endosymbionts are sister taxa, based on an SH test (Shimodaira and Hasegawa 1999) implemented with a homogeneous model.

    Table 2 The 32-Taxa 16S rRNA Data Set Used for Phylogenetic Analysis, with GenBank Accession Numbers and Nucleotide Content for All Sites (1320 bp) and Variable Sites (504 bp)

    Phylogenetic Analysis

    Our approach was to estimate candidate starting trees using standard methods of phylogeny reconstruction that search tree space extensively. We then varied the placement of Buchnera across each starting tree and evaluated all possible permutations under a nonhomogeneous maximum likelihood model of sequence evolution (Galtier and Gouy 1995) (fig. 1).
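
    The placement step described above can be sketched as a simple enumeration. The code below is a simplified illustration rather than the procedure actually used: it treats trees as rooted, nested tuples of taxon names, and the function names `placements` and `to_newick` and the five-taxon backbone are hypothetical. Each yielded tree would then be scored under the nonhomogeneous model by an external program.

```python
def placements(tree, clade):
    """Yield every rooted tree formed by grafting `clade` onto one branch of `tree`.

    Trees are nested tuples of taxon names; a leaf is a plain string.
    """
    # Graft onto the branch leading to this node.
    yield (tree, clade)
    if isinstance(tree, tuple):
        # Recurse into each child and rebuild the parent around the modified child.
        for i, child in enumerate(tree):
            for grafted in placements(child, clade):
                yield tree[:i] + (grafted,) + tree[i + 1:]


def to_newick(tree):
    """Serialize a nested-tuple tree as a Newick string without branch lengths."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Hypothetical backbone with Buchnera already pruned (taxon choice is illustrative).
backbone = (("Escherichia", "Salmonella"), ("Wigglesworthia", "Blochmannia"))
candidates = list(placements(backbone, "Buchnera"))
newick_strings = [to_newick(t) + ";" for t in candidates]
```

    A rooted binary backbone with n leaves has 2n - 1 nodes, so the sketch yields 2n - 1 candidate trees (7 for the four-taxon backbone here). For unrooted trees some of these placements coincide, so the count of distinct topologies is smaller.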

    Table 4 Top Log Likelihood Scores, under the Galtier and Gouy Maximum Likelihood Model of Nonhomogeneous Nucleotide Substitution (ML-nh), of 234 Possible groEL Phylogenies

    We then compared –ln(L)nh scores of trees developed under a given method, noting the placement of Buchnera as polyphyletic or monophyletic relative to the Blochmannia and Wigglesworthia clade. This comparison determined whether various placements of Buchnera on the starting topologies significantly improved –ln(L)nh. Separation of Buchnera from Blochmannia and Wigglesworthia improved –ln(L)nh scores for MP, ML-gtr, and NJ-nh relative to scores for the original trees (which group all endosymbionts together). Notably, the tree with the best –ln(L)nh (the original MB tree) is polyphyletic. We also explored whether alternative placements of Buchnera had distinct likelihood distributions, regardless of the underlying topology. For MB, MP, ML-gtr, and NJ-nh, the distribution of –ln(L)nh for monophyletic permutations was typically lower than that of polyphyletic permutations of the same starting tree, and it was significantly different for MB and NJ-nh (Mann-Whitney U-test, P < 0.001). This indicates that, across various starting topologies, trees that position Buchnera apart from other primary endosymbionts have better likelihood scores under the nonhomogeneous model than do trees that group primary endosymbionts together.
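
    The comparison of likelihood distributions between the two classes of permutations can be illustrated with a small rank-sum calculation. The sketch below computes a two-sided Mann-Whitney U test with the normal approximation and no tie or continuity correction; the function name `mann_whitney_u` and the score values are invented for illustration and are not taken from the tables of this study.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided P) using the normal approximation."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    # Assign midranks so tied values share the average of their ranks.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p


# Hypothetical negative log likelihoods (smaller means a better fit).
mono_scores = [10250.1, 10248.7, 10251.3, 10249.9, 10252.0]
poly_scores = [10240.2, 10238.5, 10241.1, 10239.4, 10242.6]
u_stat, p_value = mann_whitney_u(mono_scores, poly_scores)
```

    With samples this small an exact test would be preferable to the normal approximation; the sketch is meant only to show how the ranks of the two groups are compared.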

    Of the 234 groEL topologies considered, those with the top 10 –ln(L)nh scores included nine trees based on the initial MB phylogeny and one permutation of the NJ-nh topology. Resampling estimated log likelihood analysis shows no statistical support for the single top tree over the next nine topologies, but any one of the top 10 topologies is statistically supported over every 10th tree of the top 100 sampled. (The 10th tree is not significantly better than the 11th, but is significantly better than the 20th, 30th, etc.). The majority of topologies with the greatest nonhomogeneous likelihood place Buchnera separate from Blochmannia and Wigglesworthia, including 24 of the best-scoring 25 and 47 of the best-scoring 50 topologies. The majority-rule consensus of the top 10 trees under the NH model places the Buchnera clade sister to the free-living Erwinia herbicola, while Wigglesworthia and Blochmannia group with the Sitophilus oryzae primary endosymbiont and Sodalis (fig. 3). Consensus of groEL trees with the top 50 –ln(L)nh scores does not change the placement of Buchnera relative to Blochmannia and Wigglesworthia, and it differs only in the level of resolution given to the placement of Buchnera within the free-living enteric clade (see figure 3 for consensus of top 10 and top 25 trees).
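
    As an illustration of how a majority-rule consensus summarizes a set of top-scoring trees, the following sketch counts how often each clade appears and keeps those present in more than half of the trees. It assumes rooted trees written as nested tuples; the function names `clades` and `majority_rule` and the three small example trees are hypothetical and much smaller than the data sets analyzed here.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set of `tree`, set of clades defined by its internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found |= child_found
    found.add(leaves)
    return leaves, found


def majority_rule(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of the trees to its frequency."""
    counts = Counter()
    for tree in trees:
        all_leaves, found = clades(tree)
        counts.update(found - {all_leaves})  # the full taxon set is uninformative
    return {c: n / len(trees) for c, n in counts.items()
            if n / len(trees) > threshold}


# Three hypothetical top-scoring trees (illustrative topologies only).
top_trees = [
    ((("Buchnera", "Erwinia"), "Escherichia"), ("Wigglesworthia", "Blochmannia")),
    (("Buchnera", ("Erwinia", "Escherichia")), ("Wigglesworthia", "Blochmannia")),
    ((("Buchnera", "Erwinia"), "Escherichia"), ("Wigglesworthia", "Blochmannia")),
]
consensus = majority_rule(top_trees)
```

    The frequencies returned by `majority_rule` play the same role as the clade frequencies reported on a consensus figure: a clade found in every tree has frequency 1.0, and clades found in half of the trees or fewer are collapsed.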

    FIG. 3.— Majority-rule consensus tree of the top 10 and top 25 topologies with the highest –ln(L)nh score under the ML-nh model, for groEL. Underlined taxa are insect endosymbionts, boldface taxa are primary endosymbionts. Frequencies are shown first for the top 10 trees, then for the top 25 trees (10 trees/25 trees).

    Phylogeny of 16S rRNA

    The 16S rRNA analysis shows many of the same trends as groEL. First, the starting topologies estimated under the five homogeneous methods and NJ-nh vary considerably, and methods differ in their placement of Buchnera with or apart from Wigglesworthia and Blochmannia (fig. 4). Interestingly, some methods that group primary endosymbionts together at groEL (e.g., MP, ML-gtr), show polyphyletic relationships at 16S rRNA, and vice versa (e.g., MB). For 16S rRNA, the MB and ML-gtr topologies are identical except for the placement of Buchnera. The endosymbiont of Bemisia tabaci is basal in all 16S rRNA phylogenies, as expected.

    Table 5 Top Log Likelihood Scores, under the Galtier and Gouy Maximum Likelihood Model of Nonhomogeneous Nucleotide Substitution (ML-nh), of All Possible Buchnera Placements on the 16S rRNA Phylogeny, of 342 Total Topologies

    FIG. 5.— Majority-rule consensus tree of the top 10 and top 25 topologies with the highest –ln(L)nh scores for 16S rRNA. Underlined taxa are insect endosymbionts, boldface taxa are primary endosymbionts. Frequencies are shown first for the top 10 trees, then for the top 25 trees (10 trees/25 trees).

    We also analyzed a 1,280-bp, 30-taxa, 16S rRNA data set that differed from the 32-taxa data set by lacking two endosymbionts and replacing one free-living taxon. Because the results from this 30-taxa data set only corroborate the results and interpretation of the 32-taxa data set described above, we have placed this information in an online supplement.

    Discussion

    We have focused on the evolutionary relationships of AT-rich primary endosymbionts, using groEL and 16S rRNA genes and a nonhomogeneous substitution model that accounts for variable AT-bias among taxa. Our goal in this study was to develop an approach that takes advantage of (1) the tree-searching algorithms implemented by several standard phylogenetic methods, and (2) the utility of nonhomogeneous models to evaluate relationships among taxa with widely different base compositions. We first demonstrated that variable base composition strongly affects estimates of relationships among free-living and endosymbiotic Enterobacteriales, as the likelihood ratio tests show ML-nh is the best fit to both groEL and 16S rRNA data sets. Our main result is that a nonhomogeneous likelihood model supports the separation of Buchnera from Blochmannia and Wigglesworthia. Three lines of evidence presented here support that conclusion. First, homogeneous models gave incongruent topologies and sometimes placed Buchnera, Blochmannia, and Wigglesworthia as monophyletic, as found in previous studies. However, the nonhomogeneous likelihood scores (–ln(L)nh) of such "monophyletic" trees were always improved in a permutation that separated Buchnera from Blochmannia and Wigglesworthia. Second, across all standard phylogenetic methods, various permutations that separate Buchnera had overall better –ln(L)nh, an improvement that was often statistically significant. Third, for both 16S rRNA and groEL, the consensus trees of phylogenies with the best –ln(L)nh scores separate Buchnera from a clade that includes Wigglesworthia, Blochmannia, and other insect endosymbionts.

    The results of this study also shed light on which standard phylogenetic methods may be most robust for data sets with base composition variation. The consensus trees under the nonhomogeneous model are most similar to the original phylogenies estimated by MB (for 16S) and ML-gtr (for groEL). This observed robustness of MB and ML-gtr is consistent with a recent simulation study showing that a maximum-likelihood approach is best able to handle heterogeneity (Rosenberg and Kumar 2003). Interestingly, NJ-nh separated Buchnera from Blochmannia and Wigglesworthia at both groEL and 16S rRNA, suggesting that this method successfully accounts for base compositional variation. However, other aspects of this tree were apparently quite poor, as the NJ-nh tree (and permutations thereof) had lower –ln(L)nh than phylogenies based on most other methods. This result suggests that implementation of a nonhomogeneous model should not be limited to only an NJ approach.

    Comparison with Previously Published Phylogenies

    Our results do not provide a single best Enterobacteriales phylogeny, as the groEL and 16S rRNA consensus trees show slight differences (e.g., in the placement of Erwinia spp.) that may reflect the different taxon sets. The lability of Buchnera observed across our original phylogenies (figs. 2 and 4) is also reflected by the varying placements of Buchnera across published trees (Schröder et al. 1996; Charles, Heddi, and Rahbe 2001; Gil et al. 2003; Canbäck, Tamas, and Andersson 2004). Despite the variable position of Buchnera in our starting trees, evaluation of these placements (and many more, in permutations of these trees) under the nonhomogeneous models supports the grouping of Buchnera with a clade of enteric bacteria, apart from an endosymbiont clade that includes Blochmannia and Wigglesworthia.

    The fact that our results conflict with those of several previous studies warrants a comparison of the phylogenetic models and taxon sampling employed. First, previous studies suggesting monophyly of Buchnera, Wigglesworthia, and/or Blochmannia often employ homogeneous models. Similarly, we also found that the best trees of some homogeneous models group these endosymbionts closely together (e.g., ML-gtr and MP for 16S rRNA, and MP and MB for groEL). Previous studies that implemented nonhomogeneous models have typically used NJ-nh, and they either produced trees that separated Buchnera and Wigglesworthia but were statistically rejected by other methods (Lerat, Daubin, and Moran 2003), or they produced a tree in which these two endosymbionts were sister taxa (Canbäck, Tamas, and Andersson 2004). In contrast, our use of a nonhomogeneous model is likelihood-based rather than exclusively NJ-based.

    Second, several previous analyses have taken advantage of the numerous loci available in full genome sequences, but generally they included fewer taxa in the analysis (16 or fewer). The inclusion of numerous genes or entire genomes has obvious benefits for studies of phylogenomics (Canbäck, Tamas, and Andersson 2004) and tests of lateral gene transfer (Daubin, Moran, and Ochman 2003; Lerat, Daubin, and Moran 2003). However, for phylogenetic reconstruction per se, the trade-off between more genes or more taxa is not always clear, especially when taxa evolve at different rates and exhibit different compositional biases. Increased taxon sampling has been shown to be of greater benefit to phylogenetic accuracy than increased sequence length (Graybeal 1998; Pollock et al. 2002; Hillis et al. 2003). That is, more genes per taxon may actually cause convergence upon, and increased confidence in, the wrong tree. This is particularly true for data sets subject to long branch attraction (Hillis 1998), a potential issue in analyses of rapidly evolving, AT-rich taxa. For this reason, we included many endosymbionts (both primary and secondary) that are closely related to free-living and commensal enterics. Of course, the choice of loci is critical when few genes are considered. Notably, the inclusion of a genome-wide sample allowed Canbäck, Tamas, and Andersson (2004) to examine the correlation of gene conservation to particular tree topologies. They show that relatively conserved (and relatively GC-rich) genes have more reliable phylogenetic signals and support the recent divergence of Buchnera from E. coli and Salmonella (Canbäck, Tamas, and Andersson 2004). This result supports our use of the highly conserved genes groEL and 16S rRNA. Our results further suggest that even highly conserved genes will benefit from the use of nonhomogeneous models in phylogeny reconstruction.

    Implications for the Evolution of Endosymbiosis

    Our results have two implications for the evolution of endosymbiosis within the Enterobacteriales. First, Blochmannia and Wigglesworthia apparently represent an origin of primary endosymbiosis that is independent from Buchnera. Genome sequence data may also support this independent transition to primary endosymbiosis, as the three fully sequenced endosymbiont genomes share only 50% of their genes, or 70% for any pairwise comparisons between genera (Gil et al. 2003). Because transition to primary endosymbiosis is thought to impose immediate, severe genome reduction through large genome deletion events, such endosymbionts may rapidly become constrained to their particular host association (Moran and Mira 2001; Van Ham et al. 2003). This genome reduction may impose severe constraints on extracellular existence, and it may limit switching among hosts with different nutritional physiologies. The phylogenies presented here cannot distinguish whether Blochmannia and Wigglesworthia acquired the primary endosymbiotic lifestyle independently of each other. However, given that these two genomes share just 70% of their genes and are highly specialized to the nutritional physiology of their respective hosts, independent acquisition of obligate endosymbiosis is the more likely possibility.

    Second, Blochmannia and Wigglesworthia are part of a diverse clade consisting of secondary endosymbionts of insects, suggesting that primary endosymbionts may evolve from secondaries. Koga, Tsuchida, and Fukatsu (2003) provide experimental evidence that secondary endosymbionts may move into the symbiotic niche of primaries. Specifically, they showed that a facultative endosymbiotic γ-Proteobacterium infected the cytoplasm of bacteriocytes of aphid hosts from which Buchnera had been eliminated, and that it thus compensated for the essential roles of Buchnera. A second example of the potential for primary endosymbionts to evolve from secondary endosymbionts is that of Sodalis glossinidius and the Sitophilus oryzae primary endosymbiont. The close phylogenetic association of this secondary endosymbiont of tsetse flies and this obligate mutualist of weevils shown here (figs. 3 and 5) is further corroborated by their shared maintenance and expression of a type III secretion system, which likely was acquired prior to their divergence (Dale et al. 2002). Although these endosymbionts are associated with distinct insect hosts, Sodalis is still capable of horizontal transmission (Aksoy, Chen, and Hypsa 1997; Dale and Maudlin 1999). Understanding specific routes by which diverse endosymbioses are established will require more extensive taxon sampling of facultative and primary endosymbiont lineages; however, the current study suggests that primary endosymbiosis has originated more often than previously thought, and it may represent the end of an evolutionary spectrum between the facultative and obligate intracellular lifestyles.

    Acknowledgements

    The authors thank N. Galtier and M. Gouy for providing the NHML program for implementation of nonhomogeneous models. We are grateful to F. Rodriguez-Trelles for invaluable assistance applying these models and sharing modified source codes. We thank A.G. McArthur for providing clusterpaup and other phylogenetic analysis programs; D. M. Hillis for helpful discussion of taxon sampling and related issues; and two anonymous reviewers for useful comments on an earlier version of this manuscript. This work was supported by grants to J.J.W. from the National Institutes of Health (NIH R01 GM62626–01), the National Science Foundation (DEB 0089455), and the Josephine Bay Paul and C. Michael Paul Foundation.

    References

    Abbot, P., and N. A. Moran. 2002. Extremely low levels of genetic polymorphism in endosymbionts (Buchnera) of aphids (Pemphigus). Mol. Ecol. 11:2649–2660.

    Akman, L., A. Yamashita, H. Watanabe, K. Oshima, T. Shiba, M. Hattori, and S. Aksoy. 2002. Genome sequence of the endocellular obligate symbiont of tsetse flies, Wigglesworthia glossinidia. Nat. Genet. 32:402–407.

    Aksoy, S., X. Chen, and V. Hypsa. 1997. Phylogeny and potential transmission routes of midgut-associated endosymbionts of tsetse (Diptera: Glossinidae). Insect. Mol. Biol. 6:183–190.

    Baumann, P., N. A. Moran, and L. Baumann. 2000. Bacteriocyte-associated endosymbionts of insects. in M. Dworkin, ed. The prokaryotes, a handbook on the biology of bacteria; ecophysiology, isolation, identification, applications, Springer-Verlag, New York.

    Bergthorsson, U., and H. Ochman. 1995. Heterogeneity of genome sizes among natural isolates of Escherichia coli. J. Bacteriol. 177:5784–5789.

    Canbäck, B., I. Tamas, and S. G. E. Andersson. 2004. A phylogenomic study of endosymbiotic bacteria. Mol. Biol. Evol. 21:1110–1122.

    Chang, B. S., and D. L. Campbell. 2000. Bias in phylogenetic reconstruction of vertebrate rhodopsin sequences. Mol. Biol. Evol. 17:1220–1231.

    Charles, H., A. Heddi, and Y. Rahbe. 2001. A putative insect intracellular endosymbiont stem clade, within the Enterobacteriaceae, inferred from phylogenetic analysis based on a heterogeneous model of DNA evolution. C. R. Acad. Sci. III 324:489–494.

    Chenna, R., H. Sugawara, T. Koike, R. Lopez, T. J. Gibson, D. G. Higgins, and J. D. Thompson. 2003. Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 31:3497–3500.

    Clark, M. A., N. A. Moran, and P. Baumann. 1999. Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol. Biol. Evol. 16:1586–1598.

    Cole, J., B. Chai, T. Marsh, R. Farris, Q. Wang, S. Kulam, S. Chandra, D. McGarrell, T. Schmidt, G. Garrity, and J. Tiedje. 2003. The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res. 31:442–443.

    Conant, G. C., and P. O. Lewis. 2001. Effects of nucleotide composition bias on the success of the parsimony criterion in phylogenetic inference. Mol. Biol. Evol. 18:1024–1033.

    Dale, C., and I. Maudlin. 1999. Sodalis gen. nov. and Sodalis glossinidius sp. nov., a microaerophilic secondary endosymbiont of the tsetse fly Glossina morsitans morsitans. Int. J. Syst. Bacteriol. 49(Pt 1):267–275.

    Dale, C., G. R. Plague, B. Wang, H. Ochman, and N. A. Moran. 2002. Type III secretion systems and the evolution of mutualistic endosymbiosis. Proc. Natl. Acad. Sci. USA 99:12397–12402.

    Daubin, V., N. A. Moran, and H. Ochman. 2003. Phylogenetics and the cohesion of bacterial genomes. Science 301:829–832.

    Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368–376.

    ———. 1988. Phylogenies from molecular sequences: inference and reliability. Annu. Rev. Gen. 22:212–219.

    Foster, P. G., L. S. Jermiin, and D. A. Hickey. 1997. Nucleotide composition bias affects amino acid content in proteins coded by animal mitochondria. J. Mol. Evol. 44:282–288.

    Funk, D. J., J. J. Wernegreen, and N. A. Moran. 2001. Intraspecific variation in symbiont genomes: bottlenecks and the aphid-Buchnera association. Genetics 157:477–489.

    Galtier, N., and M. Gouy. 1995. Inferring phylogenies from DNA sequences of unequal base compositions. Proc. Natl. Acad. Sci. USA 92:11317–11321.

    ———. 1998. Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol. Biol. Evol. 15:871–879.

    Galtier, N., M. Gouy, and C. Gautier. 1996. SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12:543–548.

    Galtier, N., N. Tourasse, and M. Gouy. 1999. A nonhyperthermophilic common ancestor to extant life forms. Science 283:220–221.

    Gil, R., B. Sabater-Munoz, A. Latorre, F. J. Silva, and A. Moya. 2002. Extreme genome reduction in Buchnera spp.: toward the minimal genome needed for symbiotic life. Proc. Natl. Acad. Sci. USA 99:4454–4458.

    Gil, R., F. J. Silva, E. Zientz, F. Delmotte, F. Gonzalez-Candelas, A. Latorre, C. Rausell, J. Kamerbeek, J. Gadau, B. Hölldobler, R. C. van Ham, R. Gross, and A. Moya. 2003. The genome sequence of Blochmannia floridanus: comparative analysis of reduced genomes. Proc. Natl. Acad. Sci. USA 100:9388–9393.

    Graybeal, A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. 47:9–17.

    Hasegawa, M., and T. Hashimoto. 1993. Ribosomal RNA trees misleading? Nature 361:23.

    Hasegawa, M., T. Hashimoto, J. Adachi, N. Iwabe, and T. Miyata. 1993. Early branchings in the evolution of eukaryotes: ancient divergence of Entamoeba that lacks mitochondria revealed by protein sequence data. J. Mol. Evol. 36:380–388.

    Heddi, A., H. Charles, C. Khatchadourian, G. Bonnot, and P. Nardon. 1998. Molecular characterization of the principal symbiotic bacteria of the weevil Sitophilus oryzae: a peculiar G + C content of an endocytobiotic DNA. J. Mol. Evol. 47:52–61.

    Herbeck, J. T., D. J. Funk, P. H. Degnan, and J. J. Wernegreen. 2003. A conservative test of genetic drift in the endosymbiotic bacterium Buchnera: slightly deleterious mutations in the chaperonin groEL. Genetics 165:1651–1660.

    Herbeck, J. T., D. P. Wall, and J. J. Wernegreen. 2003. Gene expression level influences amino acid usage, but not codon usage, in the tsetse fly endosymbiont Wigglesworthia. Microbiology 149:2585–2596.

    Hillis, D. M. 1998. Taxonomic sampling, phylogenetic accuracy, and investigator bias. Syst. Biol. 47:3–8.

    Hillis, D. M., D. D. Pollock, J. A. McGuire, and D. J. Zwickl. 2003. Is sparse taxon sampling a problem for phylogenetic inference? Syst. Biol. 52:124–126.

    Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.

    Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170–179.

    Koga, R., T. Tsuchida, and T. Fukatsu. 2003. Changing partners in an obligate symbiosis: a facultative endosymbiont can compensate for loss of the essential endosymbiont Buchnera in an aphid. Proc. R. Soc. Lond. Ser. B Biol. Sci. 270:2543–2550.

    Lambert, J. D., and N. A. Moran. 1998. Deleterious mutations destabilize ribosomal RNA in endosymbiotic bacteria. Proc. Natl. Acad. Sci. USA 95:4458–4462.

    Lefèvre, C., H. Charles, A. Vallier, B. Delobel, B. Farrell, and A. Heddi. 2004. Endosymbiont phylogenesis in the Dryophthoridae weevils: evidence for bacterial replacement. Mol. Biol. Evol. 21:965–973.

    Lerat, E., V. Daubin, and N. A. Moran. 2003. From gene trees to organismal phylogeny in the Prokaryotes: the case of the gamma-Proteobacteria. PLoS Biol. 1:E19.

    Lawrence, J. G., H. Ochman, and D. L. Hartl. 1991. Molecular and evolutionary relationships among enteric bacteria. J. Gen. Microbiol. 137:1911–1921.

    Lockhart, P. J., M. A. Steel, M. D. Hendy, and D. Penny. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605–612.

    Loomis, W. F., and D. W. Smith. 1990. Molecular phylogeny of Dictyostelium discoideum by protein sequence comparison. Proc. Natl. Acad. Sci. USA 87:9093–9097.

    Maddison, D., and W. Maddison. 2002. MacClade: analysis of phylogeny and character evolution. Sinauer Associates, Sunderland, Mass.

    Mira, A., H. Ochman, and N. A. Moran. 2001. Deletional bias and the evolution of bacterial genomes. Trends Genet. 17:589–596.

    Mira, A., and N. A. Moran. 2002. Estimating transmission size and population bottlenecks in maternally transmitted endosymbiotic bacteria. Microb. Ecol. 44:137–143.

    Mooers, A. O., and E. C. Holmes. 2000. The evolution of base composition and phylogenetic inference. Trends Ecol. Evol. 15:365–369.

    Moran, N. A. 1996. Accelerated evolution and Muller's rachet in endosymbiotic bacteria. Proc. Natl. Acad. Sci. USA 93:2873–2878.

    ———. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

    Moran, N., and A. Telang. 1998. Bacteriocyte-associated symbionts of insects. Bioscience 48:295–304.

    Moran, N. A., and A. Mira. 2001. The process of genome shrinkage in the obligate symbiont Buchnera aphidicola. Genome Biol. 2:RESEARCH0054.

    Moran, N. A., and J. J. Wernegreen. 2000. Lifestyle evolution in symbiotic bacteria: insights from genomics. Trends Ecol. Evol. 15:321–326.

    Muto, A., and S. Osawa. 1987. The guanine and cytosine content of genomic DNA and bacterial evolution. Proc. Natl. Acad. Sci. USA 84:166–169.

    Palacios, C., and J. J. Wernegreen. 2002. A strong effect of AT mutational bias on amino acid usage in Buchnera is mitigated at high expression genes. Mol. Biol. Evol. 19:1575–1584.

    Pollock, D. D., D. J. Zwickl, J. A. McGuire, and D. M. Hillis. 2002. Increased taxon sampling is advantageous for phylogenetic inference. Syst. Biol. 51:664–671.

    Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.

    Rispe, C., F. Delmotte, R. C. van Ham, and A. Moya. 2004. Mutational and selective pressures on codon and amino acid usage in Buchnera, endosymbiotic bacteria of aphids. Genome Res. 14:44–53.

    Ronquist, F., and J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.

    Rosenberg, M. S., and S. Kumar. 2003. Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference. Mol. Biol. Evol. 20:610–621.

    Schröder, D., H. Deppisch, M. Obermayer, G. Krohne, E. Stackebrandt, B. Hölldobler, W. Goebel, and R. Gross. 1996. Intracellular endosymbiotic bacteria of Camponotus species (carpenter ants): systematics, evolution and ultrastructural characterization. Mol. Microbiol. 21:479–489.

    Shigenobu, S., H. Watanabe, M. Hattori, Y. Sakaki, and H. Ishikawa. 2000. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407:81–86.

    Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116.

    Singer, G. A., and D. A. Hickey. 2000. Nucleotide bias causes a genomewide bias in the amino acid composition of proteins. Mol. Biol. Evol. 17:1581–1588.

    Spaulding, A. W., and C. D. von Dohlen. 1998. Phylogenetic characterization and molecular evolution of bacterial endosymbionts in psyllids (Hemiptera: Sternorrhyncha). Mol. Biol. Evol. 15:1506–1513.

    Sproer, C., U. Mendrock, J. Swiderski, E. Lang, and E. Stackebrandt. 1999. The phylogenetic position of Serratia, Buttiauxella and some other genera of the family Enterobacteriaceae. Int. J. Syst. Evol. Microbiol. 49:1433–1438.

    Steel, M. A., P. J. Lockhart, and D. Penny. 1995. A frequency-dependent significance test for parsimony. Mol. Phylogenet. Evol. 4:64–71.

    Sueoka, N. 1962. On the genetic basis of heterogeneity of DNA content. Proc. Natl. Acad. Sci. USA 48:582–592.

    ———. 1988. Directional mutation pressure and neutral molecular evolution. Proc. Natl. Acad. Sci. USA 85:2653–2657.

    ———. 1992. Directional mutation pressure, selective constraints, and genetic equilibria. J. Mol. Evol. 34:95–114.

    Swofford, D. L. 2002. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.

    Tamas, I., L. Klasson, B. Canbäck, A. K. Naslund, A. S. Eriksson, J. J. Wernegreen, J. P. Sandstrom, N. A. Moran, and S. G. Andersson. 2002. 50 million years of genomic stasis in endosymbiotic bacteria. Science 296:2376–2379.

    Tamura, K. 1992. Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Mol. Biol. Evol. 9:678–687.

    Tarrio, R., F. Rodriguez-Trelles, and F. J. Ayala. 2001. Shared nucleotide composition biases among species and their impact on phylogenetic reconstructions of the Drosophilidae. Mol. Biol. Evol. 18:1464–1473.

    Tourasse, N. J., and W. H. Li. 1999. Performance of the relative-rate test under nonstationary models of nucleotide substitution. Mol. Biol. Evol. 16:1068–1078.

    Van Ham, R. C., J. Kamerbeek, C. Palacios, C. Rausell, F. Abascal, U. Bastolla, J. M. Fernandez, L. Jimenez, M. Postigo, F. J. Silva et al. (16 co-authors). 2003. Reductive genome evolution in Buchnera aphidicola. Proc. Natl. Acad. Sci. USA 100:581–586.

    Wernegreen, J. J., A. B. Lazarus, and P. H. Degnan. 2002. Small genome of Candidatus Blochmannia, the bacterial endosymbiont of Camponotus, implies irreversible specialization to an intracellular lifestyle. Microbiology 148:2551–2556.

    Wernegreen, J. J., and N. A. Moran. 1999. Evidence for genetic drift in endosymbionts (Buchnera): analyses of protein-coding genes. Mol. Biol. Evol. 16:83–97.

    Wertz, J. E., C. Goldstone, D. M. Gordon, and M. A. Riley. 2003. A molecular phylogeny of enteric bacteria and implications for a bacterial species concept. J. Evol. Biol. 16:1236–1248.

    Woolfit, M., and L. Bromham. 2003. Increased rates of sequence evolution in endosymbiotic bacteria and fungi with small effective population sizes. Mol. Biol. Evol. 20:1545–1553.

    Yang, Z. 1993. Maximum-likelihood estimation of phylogeny from DNA when substitution rates differ over sites. Mol. Biol. Evol. 10:1396–1401.

    (Joshua T. Herbeck1, Patri)