Physicochemical Evolution and Molecular Adaptation of the Cetacean and Artiodactyl Cytochrome b Proteins
Source: 《分子生物学进展》, 2005, issue 3 (http://www.100md.com)
     * Department of Integrative Biology, Department of Microbiology and Molecular Biology, Department of Physiology and Developmental Biology, and Department of Computer Science, Brigham Young University, Provo, Utah

    Correspondence: E-mail: david_mcclellan@byu.edu.

    Abstract

    Cetaceans have most likely experienced metabolic shifts since evolutionarily diverging from their terrestrial ancestors, shifts that may be reflected in the proteins such as cytochrome b that are responsible for metabolic efficiency. However, accepted statistical methods for detecting molecular adaptation are largely biased against even moderately conservative proteins because the primary criterion involves a comparison of nonsynonymous and synonymous substitution rates (dN/dS); they do not allow for the possibility that adaptation may come in the form of very few amino acid changes. We apply the MM01 model to the possible molecular adaptation of cytochrome b among cetaceans because it does not rely on a dN/dS ratio, instead evaluating positive selection in terms of the amino acid properties that comprise protein phenotypes that selection at the molecular level may act upon. We also apply the codon-degeneracy model (CDM), which focuses on evaluating overall patterns of nucleotide substitution in terms of base exchange, codon position, and synonymy to estimate the overall effect of selection. Using these relatively new models, we characterize the molecular adaptation that has occurred in the cetacean cytochrome b protein by comparing revealed amino acid replacement patterns to those found among artiodactyls, the modern terrestrial mammals found to be most closely related to cetaceans. Our findings suggest that several regions of the cetacean cytochrome b protein have experienced molecular adaptation. Also, these adaptations are spatially associated with domain structure, protein function, and the structure and function of the cytochrome bc1 complex and its constituents. We also have found a general correlation between the results of the analytical software programs TreeSAAP (which implements the MM01 model) and CDM (which implements the codon-degeneracy model).

    Key Words: Molecular adaptation • cytochrome b • Artiodactyla • Cetacea • protein evolution • physicochemical amino acid properties

    Introduction

    The standard method for detecting positive selection is to apply one of the many variations of the nonsynonymous to synonymous rate ratio (dN/dS) model. The current paradigm maintains that positive selection has been detected when the rate of nonsynonymous nucleotide change exceeds the rate of synonymous nucleotide change. Furthermore, dN/dS will fail to exceed 1.0 if, on average, nonsynonymous substitutions "do not offer a selective advantage" (Yang et al. 2000). Recent methods that measure selection on individual codon/amino acid sites (e.g., Suzuki and Gojobori 1999; Siltberg and Liberles 2002; Yang and Nielsen 2002; Yang and Swanson 2002) or within structural domains or on individual phylogenetic branches (e.g., Gaucher, Miyamoto, and Benner 2003) subdivide the data and thereby help avoid the problems associated with combining sites under positive selection with those that are highly conserved. Furthermore, the use of sliding windows (e.g., Fares et al. 2002; Lynn et al. 2004) and multiple models (e.g., Koshi, Mindell, and Goldstein 1999; Gaucher, Miyamoto, and Benner 2003; Ronquist and Huelsenbeck 2003) allows for independent estimates of local molecular evolutionary trends, resulting in more realistic characterizations of evolutionary dynamics across the primary protein structure. However, one cannot conclude that positive selection has not taken place if dN/dS < 1.0, because even single amino acid changes can be adaptive if they are biochemically superior to extant alternatives. Using dN/dS as the sole method by which to detect positive selection is too conservative to detect single adaptive amino acid changes and is, thus, extremely limited in scope. This is especially true for the evolutionary characterization of adaptation in generally conservative gene sequences. The purpose of this study is, thus, to explore other methods by which to characterize molecular adaptation that bypass the use of dN/dS in favor of alternative information content, using the selectively conservative cetartiodactyl cytochrome b gene/protein as a model.

    Putative Adaptation of the Cetacean Cytochrome b Protein

    Integral membrane proteins of the mitochondrial cytochrome bc1 complex carry out electron transfer in the Q-cycle mechanism of mitochondria (Degli Esposti et al. 1993; Zhang et al. 1998). Cytochrome b (cyt-b), a key transmembrane structure of this complex, is its central catalytic protein and, thus, a key component of the cellular respiratory function. The active sites of cyt-b, to a great extent, establish the proton gradient that fuels the production of ATP, thus greatly influencing the overall metabolism of the organism. The amino acid evolution of cyt-b is generally conservative and is not expected to be affected by significant levels of positive selection for nonsynonymous change in mammalian carnivores, rodents, cetaceans, or artiodactyls (Andrews, Jermiin, and Easteal 1998; Grossman et al. 2001). However, in organisms where the demands of metabolic processes drastically change or shift during cladogenesis, the hypothesis that key functional cyt-b amino acid sites have experienced positive selection may be viable even though the overall evolution of the gene is conservative. Such a shift has been demonstrated among cetacean globins (Naylor and Gerstein 2000) and also is likely to have occurred in cetacean cyt-b proteins.

    The mitochondrial cyt-b gene and associated protein have been studied extensively in the past several years. Sequences of cyt-b have been utilized in character sets for phylogenetic reconstruction in a wide variety of organisms, including cetacean groups (e.g., Milinkovitch, Meyer, and Powell 1994; Arnason and Gullberg 1996). The importance of its domain structure and biochemical function has been determined in great detail (Degli Esposti et al. 1993; Zhang et al. 1998; Iwata et al. 1998). The cyt-b protein is composed of three functional domains: the intermembrane domain, which is composed of four loops (designated ab, cd, ef, and gh; the two central loops are much longer than the others) that extend into the mitochondrial cristae; the transmembrane domain, which is composed of eight α-helices (designated A to H) that traverse the inner mitochondrial membrane from the matrix into the cristae; and the matrix domain, which is composed of the two termini (amino [N] and carboxyl [C]) and three loops (designated bc, de, and fg), all of which extend into the mitochondrial matrix (Irwin, Kocher, and Wilson 1991; Degli Esposti et al. 1993; Zhang et al. 1998). The intermembrane domain has been found to be primarily involved in the creation of a proton gradient and the transfer of electrons to the cytochrome c protein. The transmembrane domain is primarily responsible for anchoring the protein securely within the inner mitochondrial membrane, but it also has been implicated in the creation of the proton gradient, and it provides ligation sites for protein-protein interactions within the cytochrome bc1 complex. The matrix domain, however, has been implicated in very few functional activities, the exception being a few amino acid residues close to the N-terminus that assist in the creation of the proton gradient (Degli Esposti et al. 1993).

    Other studies have established a correlation between the rate of nucleotide/amino acid evolution and the structure/function of the domains of cyt-b (Irwin, Kocher, and Wilson 1991; DeWalt et al. 1993; Griffiths 1997; McClellan and McCracken 2001). These studies have found that the transmembrane domain evolves most quickly overall, followed by the matrix domain. The intermembrane domain, however, evolves extremely slowly. The importance of maintaining the function of the Q-cycle mechanism by selection is almost always a predominant consideration in interpreting these differential rates of change.

    One model that has been successfully used to determine the sites affected by positive selection in terms of quantitative biochemical properties is that presented by McClellan and McCracken (2001). In their limited study of cetartiodactyl cyt-b, they found that although the intermembrane loops experienced fewer changes than expected overall, there were unexpected changes associated with polar requirement in one intermembrane loop in the combined data, including along the phylogenetic branch basal to all cetaceans. However, this study included only four cetacean species relative to just six amino acid properties, which had been shown to be indicative of purifying selection among other mammal species (Xia and Li 1998). Inclusion of additional biochemical, physical, energetic, and conformational amino acid properties, therefore, may lead to the discovery of the properties, or suites of properties, that may have been involved in the hypothesized molecular adaptation experienced by the cetacean cyt-b protein.

    One vital requirement of the McClellan and McCracken (2001) method (MM01) is a well-corroborated phylogenetic tree for both artiodactyls and cetaceans for comparison purposes. The backbones of such trees have been provided by Nikaido, Rooney, and Okada (1999) and Nikaido et al. (2001), respectively. These trees, built using synapomorphic SINE and LINE retroposon insertions, showed conclusively that (among other things) cetaceans share sister taxonomic status with the artiodactyl family Hippopotamidae (Nikaido, Rooney, and Okada 1999) and that river dolphins are not monophyletic (Nikaido et al. 2001). The MM01 model uses independent estimations of phylogeny such as these to establish a chronology of observable molecular evolutionary events. The frequencies of these events are analyzed to identify (1) amino acid properties that may have radically changed more often than expected by chance (presumably because of selection promoting the occurrence of radical amino acid replacements) and (2) amino acid sites associated with selection, thus establishing a correlation between the sites of positive selection and the structure and function of the protein. A phylogeny reliably reconstructed from data that are independent of the sequence data being analyzed for selective influences is preferable to avoid any semblance of circularity. The trees constructed using SINE and LINE insertion events thus fulfill this preference relative to cetacean and artiodactyl cyt-b data.

    Using a comparative approach to implement the MM01 and other models, we seek to evaluate the molecular adaptation of cyt-b and identify those sites in the functional regions of the protein that may have experienced a shift in genic selection strategy. We also evaluate the major differences between cetacean and artiodactyl cyt-b evolution among the different functional domains and pay particular attention to any shifts among amino acid sites or physicochemical properties being affected by positive selection for radical change since the divergence of early cetaceans from their artiodactyl ancestors.

    We found shifts in both the amino acid residue loci and physicochemical properties influenced by positive selection in the cetacean cytochrome b protein when compared with that of artiodactyls. We also found shifts in the magnitude of selection influencing the evolution of functional subdomains of the protein. We also correlate these shifts with the known function of many of the amino acid residues (determined via site-directed mutagenesis) and the crystal structures of the subdomains in which they reside. Finally, we evaluate correlations found between the results of the MM01 analysis (which exclusively implements information about changes in physicochemical amino acid properties) and other models (that detect and measure selection in terms of patterns of nucleotide change). The general conclusions of this study are (1) selection models that implement dN/dS ratios are generally not sensitive enough to detect more subtle molecular adaptations; (2) there has been appreciable molecular adaptation in the cetacean cyt-b protein relative to that found in their terrestrial relatives, results that correlate closely to the known function of the protein and its domain structure; and (3) there is a close correlation between the results of the MM01 model and at least one other model that utilizes information other than the dN/dS ratio, which verifies our results and presents evidence that the MM01 model may be successfully applied to detecting specific amino acid property adaptations within even relatively conservative proteins.

    Materials and Methods

    Phylogenetics

    Cytochrome b (cyt-b) sequences were obtained for 15 cetaceans and 15 artiodactyls from GenBank (table 1). Taxa were chosen in an effort to emphasize interfamilial synapomorphies. Alignments were unambiguous because these cyt-b sequences contained no obvious insertions or deletions. Modeltest version 3.06 (Posada and Crandall 1998) was used to determine that GTR+Γ+I was the most likely model of molecular evolution. Phylogenetic analyses were performed using a maximum-likelihood optimality criterion, implementing a GTR+Γ+I model in PAUP* version 4.0b10 (Swofford 2001). Trees previously produced using parsimony reconstructions of inferred SINE and LINE insertion events were used to constrain tree searches for both cetacean (Nikaido et al. 2001) and artiodactyl (Nikaido, Rooney, and Okada 1999) phylogenetic reconstructions. Greater taxonomic sampling could have easily been implemented for use in this study, but smaller trees emphasizing interfamilial relationships were found to be preferable to avoid any undue statistical influence of autapomorphic or intrafamilial changes inferred along unduly numerous terminal branches.

    Table 1 Common Name, Genus, Species, and Family of Organisms from Which the Cytochrome b DNA Sequences Used in This Study Originated

    Codon and Substitution Analyses

    Estimates of dN/dS ratios for individual sites were calculated using MrBayes version 3.0b4 (Nielsen and Huelsenbeck 2002; Ronquist and Huelsenbeck 2003) and the codeml algorithm found in the software package PAML version 3.13 (Yang 1997) (more exact methods are described below). The cyt-b sequences and PAUP*-generated phylogenetic topologies were further analyzed using CDM version 2.0 (McClellan, Sailsbery, and Christensen 2003; McClellan et al. 2004), a computer package that implements the codon-degeneracy model (McClellan 2000; McClellan et al. 2004). Reconstructions of ancestral character states were accomplished by implementing a GTR+Γ+I model of molecular evolution. Discrete patterns of nucleotide substitution and amino acid replacement were inferred using the SUBST algorithm in CDM by directly comparing the terminal and basal nodes of each branch of the tree to estimate the number and type of individual nucleotide changes, including information as to the synonymy, codon position, type of base exchange (transition or transversion), and exact genic locus and tree branch on which it occurred. The fit of overall nonsynonymous substitution patterns to neutral expectations per functional domain was evaluated using the codon-degeneracy model (McClellan 2000; McClellan et al. 2004), and a sliding window was implemented to analyze the significance of this fit within local regions of the protein. Windows of varying sizes (i.e., 1, 5, 10, 20, 30, and 40 codons in width) were used to verify results.
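
    The bookkeeping described above (synonymy, codon position, and type of base exchange for each inferred change, plus overlapping windows along the sequence) can be sketched as follows. This is a minimal illustration and not the SUBST algorithm itself: the function names `classify_change` and `sliding_windows` are hypothetical, the codon table is assumed to be the vertebrate mitochondrial code (NCBI translation table 2), and only single-nucleotide differences between the two nodes of a branch are handled.

```python
from itertools import product

# Vertebrate mitochondrial genetic code (NCBI translation table 2).
# Codons are ordered with T, C, A, G at each position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSS**"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)
}
PURINES = {"A", "G"}


def classify_change(ancestral, derived):
    """Describe a single-nucleotide difference between two codons."""
    diffs = [pos for pos in range(3) if ancestral[pos] != derived[pos]]
    if len(diffs) != 1:
        raise ValueError("codons must differ at exactly one position")
    pos = diffs[0]
    # A transition keeps the base class (purine or pyrimidine); a
    # transversion switches it.
    same_class = (ancestral[pos] in PURINES) == (derived[pos] in PURINES)
    return {
        "codon_position": pos + 1,
        "exchange": "transition" if same_class else "transversion",
        "synonymous": CODON_TABLE[ancestral] == CODON_TABLE[derived],
    }


def sliding_windows(n_codons, width):
    """Yield (start, end) codon indices, 0-based and end-exclusive."""
    for start in range(n_codons - width + 1):
        yield start, start + width
```

    For example, `classify_change("CTA", "CTG")` describes a synonymous third-position transition, whereas `classify_change("ATA", "ATT")` describes a nonsynonymous third-position transversion (Met to Ile under this code).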

    Statistical significance for each CDM window setting was determined using a Bonferroni correction for multiple comparisons, assuming complete independence of individual comparisons. This approach produces conservative results because the overlapping windows of the sliding window analysis implemented in CDM yield comparisons that are not entirely independent. Conservative analytical results, however, are desirable to ensure greater confidence. Therefore, statistical significance of the computational output of the CDM sliding window analysis (in the form of successive log-likelihood ratios) was determined using α = 0.005 (99.5% confidence) rather than the more usual α = 0.05 (95.0% confidence).
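
    As a small illustration of the correction, a Bonferroni threshold divides the family-wise significance level by the number of comparisons. The sketch below is not taken from CDM; the function names are hypothetical, and the figure of ten comparisons is an assumption chosen only because it maps a family-wise level of 0.05 onto a per-comparison level of 0.005. The source does not state how many comparisons were counted.

```python
def bonferroni_threshold(family_alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("at least one comparison is required")
    return family_alpha / n_comparisons


def significant_windows(p_values, family_alpha=0.05):
    """Indices of windows whose p-value falls below the corrected threshold."""
    threshold = bonferroni_threshold(family_alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < threshold]


# ASSUMPTION: ten comparisons, for illustration only.
per_window_alpha = bonferroni_threshold(0.05, 10)
```

    Because overlapping windows are positively correlated, the true family-wise error rate under this threshold is lower than the nominal level, which is the sense in which the results are conservative.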

    Identifying Selective Influences

    Data were analyzed with the intention of identifying sites historically affected by positive selection using the dN/dS model M3 implemented in MrBayes version 3.0b4 (Ronquist and Huelsenbeck 2003) and models M0, M1, M2, M3, M7, and M8 in the PAML algorithm codeml version 3.13 (Yang 1997). MrBayes selection analysis (Nielsen and Huelsenbeck 2002) was accomplished using a GTR+Γ+I model for one million tree generations and a 65,100-generation burn-in for cetaceans and a 113,800-generation burn-in for artiodactyls.

    The magnitudes of changes in physicochemical amino acid properties were evaluated using a modification of the MM01 model (McClellan and McCracken 2001), which measures selective influences based on the magnitude of changes in 31 physicochemical amino acid properties (table 2). The modifications, along with the components of the model that were not modified, are briefly outlined below to establish context. Information necessary for this evaluation was generated by the computer package TreeSAAP version 2.2 (Woolley et al. 2003).

    Table 2 Physicochemical Amino Acid Properties Used for Selection Analysis in This Study

    The MM01 Model with Modifications

    For the purposes of this study, each range of possible one-step changes in a particular amino acid property (table 2) as governed by the structure of the governing genetic code was divided into eight magnitude categories of equal magnitude range, with lower category numbers denoting more conservative changes and higher category numbers denoting more radical changes. Each of the nine possible nucleotide changes in every codon (three at each codon position) of every DNA sequence within the data set was evaluated and characterized. Each potentially nonsynonymous change (or evolutionary pathway) was assigned to one of the magnitude categories for each amino acid property independently by considering differences in each property for the corresponding change in amino acid residue. Summed across the length of the sequences and taxonomic breadth of the data set, the relative frequencies of evolutionary pathways assigned to the eight magnitude categories establish an expected distribution (or null hypothesis) if it is assumed that each possible pathway is equally likely under selectively neutral conditions (McClellan and McCracken 2001). If distributions of observed changes fail to fit the expected distribution based on the combined evolutionary pathways, the null hypothesis of selective neutrality is rejected, and the alternative hypothesis, that the sequences have been significantly influenced by selection, is accepted. The goodness-of-fit of observed to expected distributions may also be used as an optimality criterion by which to directly compare the relative magnitude of selection on alternative physicochemical amino acid properties.
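
    The construction of the expected distribution can be sketched as follows. This is an illustration under stated assumptions rather than the TreeSAAP implementation: the Kyte-Doolittle hydropathy index stands in for one of the 31 properties, the eight categories are assumed to span the interval from zero to the largest possible one-step change, a change that falls exactly on a category boundary is assigned to the higher category, changes to stop codons are ignored, and all function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Vertebrate mitochondrial code (NCBI translation table 2); "*" is a stop.
AMINO_ACIDS = ("FFLLSSSSYY**CCWW" "LLLLPPPPHHQQRRRR"
               "IIMMTTTTNNKKSS**" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)
}

# ASSUMPTION: Kyte-Doolittle hydropathy, used only as a stand-in property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def nonsynonymous_pathways(codon):
    """Return (from_aa, to_aa) for each one-step nonsynonymous change."""
    aa = CODON_TABLE[codon]
    if aa == "*":
        return []
    pathways = []
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            new_aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if new_aa != "*" and new_aa != aa:
                pathways.append((aa, new_aa))
    return pathways


def magnitude_category(delta, max_delta, n_categories=8):
    """Map a change of size delta to a category from 1 to n_categories."""
    width = max_delta / n_categories
    return min(n_categories, int(delta / width) + 1)


def expected_distribution(codon_counts, prop=HYDROPATHY, n_categories=8):
    """Relative frequency of pathways per category, weighted by codon use."""
    max_delta = max(
        abs(prop[a] - prop[b])
        for codon in CODON_TABLE
        for a, b in nonsynonymous_pathways(codon)
    )
    totals = [0.0] * n_categories
    for codon, count in codon_counts.items():
        for a, b in nonsynonymous_pathways(codon):
            cat = magnitude_category(abs(prop[a] - prop[b]), max_delta, n_categories)
            totals[cat - 1] += count
    grand_total = sum(totals)
    return [t / grand_total for t in totals]
```

    Calling `expected_distribution({"CTA": 10, "ATG": 5})` returns eight proportions that sum to one; in practice the codon counts would come from the aligned sequences, and the calculation would be repeated for every property.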

    To further test the hypothesis of neutrality, the number of inferred amino acid replacements per magnitude category for a given property is divided by the number of evolutionary pathways assigned to that partition to calculate a proportion of fixed pathways, $p_i$, where

    $$p_i = \frac{D_i}{\sum_{j=1}^{60} r_{ij} n_j} \qquad (1)$$

    $D_i$ is the number of inferred nonsynonymous substitutions of magnitude range $i$ ($i = 1, 2, \ldots, m$, where $m$ is the greatest magnitude range in which the expected frequency of replacement is greater than zero [McClellan and McCracken 2001]), $r_{ij}$ is the number of possible evolutionary pathways of the same magnitude category across all possible nonsynonymous substitutions of genetic codon $j$ ($j = 1, 2, \ldots, 60$), and $n_j$ is the mean abundance of codon $j$ in the extant DNA sequences. Under selectively neutral conditions, it is expected that every categorical proportion will be statistically equal to the overall mean. Therefore, when a violation of this null hypothesis is identified (using a normal distribution), it suggests that selection may be influencing the frequency of amino acid replacement at that magnitude of change. The test statistic (a z-score taken as a two-tailed test) for testing this more specific hypothesis is calculated as

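    $$z_i = \frac{p_i - \hat{p}}{s_{p_i - \hat{p}}} \qquad (2)$$

    where $\hat{p}$ is the likelihood estimate of the mean proportion of nonsynonymous substitutions per pathway

    $$\hat{p} = \frac{D_T}{\sum_{i=1}^{m} \sum_{j=1}^{60} r_{ij} n_j} \qquad (3)$$

    $s_{p_i - \hat{p}}$ is the standard error of the difference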

    (4)

    and

    (5)

    DT is the total number of nonsynonymous substitutions. These calculations differ from the version of the model presented previously (McClellan and McCracken 2001) in that the former model statistically compared contiguous magnitude categories rather than comparing each to the overall mean for the system. Comparison to the mean renders the analytical results more interpretable and less ambiguous. Groups of substitutions implicated as being influenced by positive selection for radical amino acid changes (which we refer to as positive-destabilizing selection) for a particular amino acid property were subsequently mapped onto the phylogeny and protein sequence to establish a temporal and spatial context for each change.
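
    The comparison of each categorical proportion with the overall mean can be sketched numerically as follows. This is a hedged illustration, not the TreeSAAP code: the function name is hypothetical, and the standard error used here is an assumed binomial form, sqrt(p(1 - p) / N) with p the pooled proportion and N the number of pathways in the category, which may differ from the standard error defined in the original equations.

```python
import math


def categorical_z_scores(substitutions, pathways):
    """Return (p_hat, z_scores) for one amino acid property.

    substitutions[i] is the number of inferred replacements in magnitude
    category i + 1; pathways[i] is the abundance-weighted number of possible
    evolutionary pathways in that category. A category with no possible
    pathways, or an undefined standard error, yields None.
    """
    p_hat = sum(substitutions) / sum(pathways)
    z_scores = []
    for d_i, n_i in zip(substitutions, pathways):
        if n_i == 0:
            z_scores.append(None)
            continue
        p_i = d_i / n_i
        # ASSUMPTION: binomial standard error around the pooled proportion.
        se = math.sqrt(p_hat * (1.0 - p_hat) / n_i)
        z_scores.append((p_i - p_hat) / se if se > 0 else None)
    return p_hat, z_scores
```

    With hypothetical counts of 30 and 10 replacements in two categories of 100 pathways each, the pooled proportion is 0.2, the assumed standard error is 0.04, and the two z-scores are 2.5 and -2.5.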

    The average influence of selection on amino acid properties per magnitude category is evaluated using a likelihood estimation of a proportion similar to that calculated in equation 1 but taken across all amino acid properties under consideration, where

    (6)

    is the mean number of inferred nonsynonymous substitutions across all amino acid properties of magnitude range i, and the product rijnj is summed across codons, j, and amino acid properties, k (k = 1, 2, ..., h). Here again, the null hypothesis under consideration is that the local proportion (in this case Pi) is equal to the over all mean (equation 3). The test statistic (taken as a two-tailed test) used for testing this hypothesis is calculated

    (7)

    where the standard error is

    (8)

    and

    (9)

    Graphical representations of these test statistics may be interpreted according to figure 1A. Significant positive z-scores indicate that nonsynonymous substitutions of magnitude range i are more frequent than expected by chance and are, thus, influenced by positive selection (these amino acid replacements are being preferred by selection), whereas significant negative z-scores indicate they are less frequent than expected by chance and are influenced by negative or purifying selection. Furthermore, when positive selection is detected in lower, more conservative magnitude ranges (categories 1, 2, or 3) as outlined above, the amino acid property (equations 1, 2, 4, and 5) or properties (equations 6, 7, 8, and 9) are considered to be under a type of stabilizing selection (here defined as selection that tends to maintain the overall biochemistry of the protein, despite a rate of change that exceeds the rate expected under conditions of chance, as in figure 1B). Conversely, when positive selection is detected in greater, more radical magnitude ranges (categories 6, 7, or 8), the amino acid property or properties are considered to be under destabilizing selection (here defined as selection that results in radical structural or functional shifts in local regions of the protein, as in figure 1D). We make the assumption that positive-destabilizing selection represents the unambiguous signature of molecular adaptation because when radical changes are favored by selection, they result in local directional shifts in biochemical function, structure, or both. For such changes to be favored by selection (i.e., for such changes to be more abundant than expected by chance), they must instill an increased level of survival and/or reproductive success in the individuals who possess and propagate them.
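
    The interpretation rules above can be expressed as a small lookup. The sketch below is illustrative only: the function names and label strings are hypothetical, and the critical value of 1.96 (a two-tailed test at the 0.05 level) is an assumption, since the threshold applied to these z-scores is not stated in this passage. Categories 4 and 5, which the text assigns to neither class, are labeled "intermediate" here.

```python
def describe_category(category, z, critical=1.96):
    """Label one magnitude category (1 to 8) given its z-score."""
    if z is None or abs(z) < critical:
        return "neutral"
    direction = "positive" if z > 0 else "negative"
    if category <= 3:
        kind = "stabilizing"       # conservative magnitude ranges
    elif category >= 6:
        kind = "destabilizing"     # radical magnitude ranges
    else:
        kind = "intermediate"      # ASSUMPTION: not classified in the text
    return f"{direction}-{kind}"


def describe_property(z_scores, critical=1.96):
    """Label every magnitude category of one amino acid property."""
    return [
        describe_category(category, z, critical)
        for category, z in enumerate(z_scores, start=1)
    ]
```

    Under these assumptions, a property with a significant positive z-score only in category 8 would be reported as positive-destabilizing, the pattern the text treats as the signature of molecular adaptation.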

    FIG. 1.— Diagrams that illustrate the algorithm used to characterize selection in the software package TreeSAAP. (A) Positive selection is detected when the number of inferred amino acid replacements significantly exceeds the number expected by chance alone, resulting in positive z-scores. Negative selection is detected when the expected number of amino acid replacements significantly exceeds the number inferred, resulting in negative z-scores. Selection is stabilizing if the magnitude of change is low (conservative) but destabilizing if the magnitude of change is high. (B) An example of an amino acid property that has been affected by positive-stabilizing selection (dotted lines define a zone of near neutrality; z-scores within this zone do not deviate significantly from zero). For the purposes of this study, an amino acid property is said to be influenced by positive-stabilizing selection when the frequency of changes in magnitude categories 1, 2, and/or 3 exceeds the frequency (or frequencies) expected by chance. (C) An example of an amino acid property that has evolved nearly neutrally (none of the z-scores deviate significantly from zero). (D) An example of an amino acid property that has been affected by positive-destabilizing selection (radical amino acid replacements are more frequent than expected by chance). For the purposes of this study, an amino acid property is said to be affected by positive-destabilizing selection when the frequency of changes in magnitude categories 6, 7, and/or 8 exceeds the frequency (or frequencies) expected by chance.

    Both the CDM and MM01 produce goodness-of-fit scores that indicate the degree of fit between observed distributions of genetic change and neutral expectations. However, the CDM log-likelihood ratio estimates the extent to which the pattern of nucleotide substitution corresponds to the pattern expected under neutral conditions relative to partitions defined by synonymy, base-exchange characteristics, and codon structure. The CDM is, therefore, based on the mechanics of nucleotide substitution as filtered through the governing genetic code. The MM01 chi-square score estimates departure from a neutral pattern of amino acid replacement based on expected frequency distributions of magnitude changes in amino acid properties under completely random conditions. These distributions are property specific and are generated by equally weighting all possible single-step evolutionary pathways also determined by the structure of the governing genetic code. These two goodness-of-fit scores are generated utilizing quite different genetic information, although they may originate from the same data. The relationship between these test statistics has not been explored.

    The relationship between MM01 goodness-of-fit scores and categorical z-scores, however, is easily explained. Whereas the goodness-of-fit (GF) score estimates a global goodness-of-fit of the data to random expectations and tests the hypothesis that observed and expected distributions are equal, the z-score, as defined in this study, estimates local goodness-of-fit of the data in a particular magnitude category to random expectations and tests the hypothesis that the rate of amino acid replacement per evolutionary pathway within that magnitude category is equal to the overall mean rate. Whereas an individual categorical z-score within a distribution of amino acid property changes may indicate rejection of the null hypothesis locally, the goodness-of-fit for that amino acid property may indicate global acceptance of the null. Thus, a particular magnitude of amino acid change may be favored or disfavored by selection, although the average effect across all magnitudes of change may appear (deceptively) nearly neutral. Therefore, both statistical analyses are necessary for appropriate interpretation of the data.
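
    The global score can be illustrated with a Pearson chi-square statistic computed over the magnitude categories. This is a generic sketch rather than the TreeSAAP calculation: the function name is hypothetical, and the assumption is that the expected counts are the total number of observed replacements multiplied by the expected proportion of pathways in each category.

```python
def chi_square_gof(observed, expected_proportions):
    """Pearson chi-square statistic for observed counts per category.

    observed[i] is the number of inferred replacements in category i + 1;
    expected_proportions[i] is the neutral expectation for that category
    and should sum to one. Categories with zero expectation are skipped.
    """
    total = sum(observed)
    statistic = 0.0
    for count, proportion in zip(observed, expected_proportions):
        expected = total * proportion
        if expected > 0:
            statistic += (count - expected) ** 2 / expected
    return statistic
```

    Because this statistic pools the deviations of all categories, a single strongly deviating category can be diluted when the remaining categories fit well, which is why the per-category z-scores are examined alongside it.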

    Results and Discussion

    Phylogenetic analyses (with SINE/LINE insertion-mediated constraints) produced cetacean and artiodactyl trees that conform to previous hypotheses (Nikaido, Rooney, and Okada 1999; Nikaido et al. 2001) (fig. 2). The hypothesis that the three functional domains of the cytochrome b protein are influenced differentially by selection (Irwin, Kocher, and Wilson 1991; DeWalt et al. 1993; Griffiths 1997) was confirmed by the codon-degeneracy model (CDM) analysis; the cetacean transmembrane domain is affected most, followed by the matrix domain and then the intermembrane domain (table 3). The CDM goodness-of-fit scores further indicate that cetaceans have been more strongly influenced by selection than have artiodactyls. This conclusion is consistent with the hypothesis that the cetacean cyt-b has experienced a greater selective influence. However, the principal mode of selection affecting conservative proteins, such as cyt-b, is positive selection on stabilizing changes and negative selection on destabilizing changes (discussed in greater detail with regard to cetaceans and artiodactyls hereafter); thus, an overall increase in selective influences as measured by the CDM may indicate that the majority of changes experienced by the cetacean cyt-b were most likely stabilizing in nature rather than consisting of radical global shifts in biochemical or structural properties, which are much more likely to be deleterious.

    FIG. 2.— Phylogenetic trees for artiodactyls and cetaceans (see Nikaido, Rooney, and Okada [1999] and Nikaido et al. [2001] for the topologies of the constraint trees). Dots superimposed on the branches of these trees represent suites of amino acid properties that were historically affected by positive-destabilizing selection within the cyt-b intermembrane functional domain. Light-gray dots correspond to P; dark-gray dots correspond to Pc, c, and Ht; and black dots correspond to pK' and Ra. Of the three suites of amino acid properties, one (light gray) is influenced by positive-destabilizing selection in both clades, whereas the other two suites of properties change only in cetaceans (black) or artiodactyls (dark gray) exclusively.

    Table 3 Observed and Expected Patterns of Nucleotide Substitution Categorized by Synonymy, Codon Position, and Base-Exchange with Associated Probabilities That Observed Data Fit the Selectively Neutral Predictions of the Codon-Degeneracy Model

    The CDM sliding window was executed for six alternative scales (1, 5, 10, 20, 30, and 40 codons in width [see figure 3; results for sliding windows of 1, 5, and 40 codons in width are not illustrated]). Coarser scales exhibit results that generally correspond to cyt-b domain structure (e.g., see figure 4). However, specific subdomains fit random expectations differently in these two groups. For example, the cetacean A-helix, D-helix, and E-helix and de-loop fit predictions better than the artiodactyl domain homologs, whereas the artiodactyl G-helix and gh-loop fit neutral expectations better than cetacean homologs. As suggested above, a poor goodness-of-fit is most probably an indication of increased stabilizing selection on a generally conservative gene. Conversely, a better goodness-of-fit may indicate that (1) functional constraints may have been relaxed, resulting in an increased probability of fixation among mutations in these subdomains, or (2) the overall rate of amino acid replacement slowed as a result of purifying selection, resulting in a greater statistical "cushion" because of smaller sampling. Thus, whereas constraints became more rigid in the cetacean G-helix and gh-loop subdomains, constraints either became relaxed in the A-helix, D-helix, and E-helix and de-loop or purifying selection slowed the overall rate in these regions such that the analytical results were not statistically significant. This dynamic likely corresponds to an evolutionary demand for strategic biochemical changes in some domains of the cetacean cyt-b protein and an increase in selective constraints in others as habitat conditions influencing the adaptive landscape of optimal cellular respiratory mechanisms drastically altered over a relatively short period of evolutionary time among early cetaceans.

    FIG. 3.— Results of CDM sliding window analysis for cetaceans and artiodactyls, with window sizes of 10, 20, and 30 codons. Resolution generally improves (becomes more continuous) with an increase in window size. This phenomenon may be the result of domain structure, the subdomains of which generally exceed 10 codons in size.

    FIG. 4.— Results of CDM sliding window analysis for cetaceans and artiodactyls with a window size of 20 codons and domain structure superimposed. Green areas correspond to matrix subdomains, blue areas correspond to transmembrane subdomains, and yellow areas correspond to intermembrane subdomains (see Zhang et al. [1998]; domain boundaries inferred from crystal-structure images). Results of cetacean cyt-b sequence analysis are represented by darker colors, whereas results of artiodactyl cyt-b sequence analysis are represented by lighter colors. (A) and (B) are the front and back of the graph, respectively. The x-axis corresponds to the site of the last codon of each sliding window.

    The dN/dS ratio analyses produced by codeml (Yang 1997) and MrBayes (Ronquist and Huelsenbeck 2003) found only one site in artiodactyls (but none in cetaceans) historically affected by positive selection using the criterion dN/dS > 1.0 (amino acid site 241, codeml model M8 only [Yang and Bielawski 2000; Yang et al. 2000]; no particular model was any better than any alternative using a likelihood ratio test). These results support our opinion that selection models that implement dN/dS ratios as a criterion for detecting selection are generally not sensitive enough to detect subtle molecular adaptations. Furthermore, they underscore the need to employ alternative criteria for the detection of positive selection among sites within generally conservative protein-coding genes. Although dN/dS > 1.0 conditions most certainly indicate significant levels of historical positive selection, it is largely unreasonable to assume that conservative genes, such as the cyt-b sequences analyzed in this study, do not adapt via selection. The stringent constraints imposed by the function of cyt-b proteins within the electron-transport chain would greatly preclude the obvious effects of positive selection by traditional criteria. However, if nonsynonymous substitutions are partitioned by the molecular-phenotypic effects of each, positive selection for radical amino acid changes that may have a slower rate but occur more frequently than expected by chance may be more easily detected. In this respect, the MM01 model holds great potential because it differentiates molecular changes along a gradient from conservative to radical amino acid differences among homologous protein sequences, allowing rigorous testing of the null hypothesis that all possible amino acid replacements are equally likely over evolutionary time.
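    The likelihood ratio test mentioned above can be sketched as follows. This is a minimal illustration rather than the codeml procedure. It assumes a comparison of two nested site models that differ by two free parameters (as in the commonly used M7 versus M8 comparison) and a chi-square approximation for the test statistic. The log-likelihood values are hypothetical and are not results from this study.

```python
import math


def lrt_p_value_df2(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test with 2 degrees of freedom.

    The statistic is 2 * (lnL_alt - lnL_null). For a chi-square
    distribution with 2 degrees of freedom the survival function has
    the closed form exp(-x / 2), so no statistics library is needed.
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.exp(-statistic / 2.0)


# Hypothetical log-likelihoods for a null model and a model that adds
# a class of sites with dN/dS > 1.
p = lrt_p_value_df2(lnl_null=-5000.0, lnl_alt=-4999.0)
print(f"P = {p:.4f}")  # a P-value above 0.05 means the richer model is not favored
```

    When the richer model does not fit significantly better, as reported here, sites it assigns to a dN/dS > 1.0 class carry little statistical weight.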

    MM01 analysis confirms that positive selection on stabilizing changes coupled with negative selection on destabilizing changes is the predominant mode of selective influence among amino acid replacements (as mentioned above) affecting both cetacean and artiodactyl cyt-b proteins within all three functional domains. The proportion of amino acid properties being influenced by positive selection in conservative magnitude categories 1, 2, and 3 is greater than the proportion of properties influenced by negative selection (frequencies of amino acid properties affected by positive [p] and negative [n] selection in artiodactyls and cetaceans combined: intermembrane, f(p) = 0.328, f(n) = 0.258; matrix, f(p) = 0.333, f(n) = 0.210; and transmembrane, f(p) = 0.355, f(n) = 0.253). The opposite is true (and much more pronounced) for the more radical magnitude categories 6, 7, and 8 (intermembrane, f(p) = 0.054, f(n) = 0.145; matrix, f(p) = 0.097, f(n) = 0.194; and transmembrane, f(p) = 0.065, f(n) = 0.575), especially in the transmembrane domain. Despite this, near neutrality dominates in every functional domain (on average), except the transmembrane (the frequency of amino acid properties that are nearly neutral across all magnitude categories: intermembrane = 0.597; matrix = 0.595; and transmembrane = 0.325). The relationship between positive and negative selection among stabilizing and destabilizing amino acid replacements is also apparent in figure 5, although all but a few magnitude categories resulting from analyses of the artiodactyl and cetacean transmembrane domains are within the statistical zone of near neutrality, as defined by the critical value at α = 0.05. Thus, even though several individual amino acid properties have been shown to be influenced by selection, the average effect is of near neutrality. This underscores the importance of appropriately designing analytical experiments meant to describe the mode of molecular evolution such that biologically meaningful information is not obscured by combining conflicting results, such as is the case with these amino acid properties and dN/dS ratios in general.
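    The way z-scores are sorted into positive, negative, and nearly neutral influences can be sketched as follows. This is an illustration only, not the TreeSAAP implementation. The text does not state whether the critical value at α = 0.05 is one-tailed or two-tailed, so the sketch exposes that choice as a parameter, and the z-scores are hypothetical.

```python
from collections import Counter
from statistics import NormalDist
from typing import Dict, Iterable


def critical_z(alpha: float = 0.05, two_tailed: bool = True) -> float:
    """Standard-normal critical value for the chosen significance level."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


def classify(z: float, crit: float) -> str:
    """Label one z-score relative to the zone of near neutrality."""
    if z > crit:
        return "positive"
    if z < -crit:
        return "negative"
    return "near neutral"


def selection_frequencies(z_scores: Iterable[float], crit: float) -> Dict[str, float]:
    """Fraction of z-scores falling in each class."""
    scores = list(z_scores)
    counts = Counter(classify(z, crit) for z in scores)
    labels = ("positive", "negative", "near neutral")
    return {label: counts[label] / len(scores) for label in labels}


# Hypothetical z-scores for four amino acid properties in one magnitude category.
print(selection_frequencies([2.5, -3.0, 0.1, 1.0], critical_z()))
```

    Averaging the four hypothetical z-scores would give a value inside the neutral zone even though two of them are individually significant, which is the masking effect described above.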

    FIG. 5.— Mean effect of selective pressures across 31 amino acid properties (taken as z-scores that correspond to the maximum-likelihood estimation of the mean proportion of observed amino acid replacements per evolutionary pathway per magnitude category [see table 2 for a list of physicochemical amino acid properties considered in this study]) for the three functional domains of artiodactyl and cetacean cytochrome b proteins. Although several of these properties were found to be under either positive or negative selection within the intermembrane and matrix domains in this study, the average overall effect is nearly neutral (dashed lines indicate the zone within which the calculated proportions fail to significantly deviate from the mean [see equations 6–9]). The transmembrane domain, however, has been affected by significant levels of positive-stabilizing selection (on average) across all amino acid properties in both artiodactyls and cetaceans (see figure 1 for x-axis categorical definitions).

    The amino acid properties found to be influenced by positive selection for destabilizing amino acid replacements are summarized by amino acid residue site in tables 4 and 5. All three functional domains are affected by positive-destabilizing selection for α-helical tendencies (P) and turn tendencies (Pt). These conformational amino acid properties (Prabhakaran and Ponnuswamy 1979) may well be important to the overall optimization of cyt-b function in cetartiodactyls and have been periodically adjusted during cladogenesis to maximize the biochemical effect of the spatial relationships between transmembrane α-helices and the primary functional amino acid residues that influence the proton input/output function of the cytochrome bc1 complex. Such changes may have served to optimize the size and spatial interaction of intermembrane loops that contain smaller functional helices, constituting inhibitor sites that chemically interact directly with other proteins in the complex (Degli Esposti et al. 1993; Zhang et al. 1998; Iwata et al. 1998). Evidence for these claims is presented in the following sections.

    Table 4 Amino Acid Sites and Physicochemical Amino Acid Properties That Have Been Affected by Positive-Destabilizing Selection Among Artiodactyl or Cetacean Taxa During Cladogenesis

    Table 5 Amino Acid Properties by Domain That Have Been Affected by Positive Destabilizing Selection in Artiodactyls or Cetaceans During Cladogenesis

    Intermembrane Domain

    The cetacean and artiodactyl intermembrane domains share only two positively selected amino acid properties that favor destabilizing changes: P and Pt (table 5; see table 2 for list of abbreviations), both of which influence the overall conformational characteristics of domain components. These two amino acid properties represent the only positive-destabilizing selective influences that cetaceans and artiodactyls historically have had in common within this domain. Cetaceans, however, also have had an additional suite of amino acid properties that historically also have been affected by positive-destabilizing selection.

    The biochemical shift in cyt-b resulting from the transition of cetacean ancestors from a terrestrial to an aquatic habitat is illustrated phylogenetically in figure 2 and graphically in figure 6. The artiodactyl cyt-b intermembrane has historically been influenced by positive-destabilizing selection relative to a suite of properties that includes c (p > 0.999, p7), Ht (p > 0.999, p6), and Pc (p > 0.999, p8). As soon as cetaceans diverged from artiodactyls, this selection regime was replaced by selection for radical change in another suite of properties, including pK' (p > 0.999, p8), and Ra (p = 0.955, p7). Further evidence for cetacean molecular adaptation in this domain is illustrated in figure 7. Of all the amino acid residue loci influenced by positive-destabilizing selection, only three are exclusive to artiodactyls, and only two are shared. The remaining eight amino acid residue sites are under selection exclusively in cetaceans, most of which are closely associated with key functional regions (within helices or immediately adjacent to them) within the domain. Notably, all of the highlighted residue sites in the cd-loop and all but one in the ef-loop in figure 7 are exclusive to cetaceans. Derived characters at these loci underscore the molecular evolutionary shift experienced in this domain and represent, without exception, sites that are spatially associated with either other functional regions in the domain (as is the case with the three helices of the ab-loops and cd-loops) or other proteins in the complex (as are those in the cd1-helix and ef-loop) (Degli Esposti et al. 1993; Zhang et al. 1998; Iwata et al. 1998).

    FIG. 6.— Amino acid properties found to be affected by differential levels of positive-destabilizing selection (taken as z-scores that correspond to the proportion of observed amino acid replacements per evolutionary pathway per magnitude category [see table 2 for abbreviation definitions]) in either artiodactyl or cetacean cyt-b intermembrane domains, but not both. Artiodactyls have had three unique properties influenced: Pc (category 8), c (category 7), and Ht (category 6). Cetaceans have had two unique properties affected: pK' (category 8) and Ra (category 7). TreeSAAP failed to detect positive-destabilizing selection among corresponding properties in the other group of organisms (whether cetaceans or artiodactyls). Dashed lines indicate the zone within which the selective influence is nearly neutral (calculated proportions fail to significantly deviate from the mean [see equations 1–5]).

    FIG. 7.— Schematic of the cyt-b intermembrane domain secondary amino acid sequence structure (residue sites illustrated as circles; every 10 residues are annotated by position number of the site). Shaded circles indicate amino acid sites that have been significantly influenced by positive-destabilizing selection. Gray circles are unique to artiodactyls, black circles are unique to cetaceans, and circles colored with gray-to-black gradients indicate that both artiodactyls and cetaceans have been affected. The two black circles connected with a dashed line indicate sites that are consistently correlated (i.e., the circle with the solid outline indicates a radical increase in pK', whereas the circle with the dashed gray outline indicates a radical decrease in pK').

    The close evolutionary relationship between the ab-helix and cd2-helix (see figure 7) may indicate that the cd-loop is twisted such that these two helices are in relatively close physical proximity to one another. This conclusion is consistent with the finding that the cd1-helix and ef-loop (along with the end of the C-helix) form a quinone-binding site for hydroquinone while ISP ("Rieske" [2Fe-2S] protein) is in its "Int" positional state (Iwata et al. 1998). This close physical relationship between the cd1-helix and ef-loop during the initial phase of electron transport places the ab-loop and cd2-helix in close proximity. These changes may also represent an evolutionary positional adjustment of the ab-loop and cd2-helix relative to the cd1-helix, all three of which have been implicated in the proton-output function of the complex that creates the proton gradient necessary for the production of ATP and, thus, cellular respiration (Degli Esposti et al. 1993).

    Finally, in artiodactyls, residues 57 and 247 (both of which are near the transmembrane-intermembrane boundary and are in either direct or close proximity to cytochrome c1 when ISP is in its "c1" state [Iwata et al. 1998]) have been influenced by selection on a single suite of properties: P, Pc, c, Ht, and Pt (no other amino acid sites in the intermembrane domain were affected by positive-destabilizing selection on more than one property [see table 4]). Radical changes in these properties (three of which are conformational properties) at these sites may have constituted minor overall adjustments in the length and relative orientation of membrane-spanning transmembrane subdomains. Combined, these two sites experienced 25 amino acid replacements, all but two of which may have been influenced by positive-destabilizing selection.

    Matrix Domain

    Amino acid residue sites where positively selected radical changes took place in the matrix domain are illustrated in figure 8. Five of these residue loci are unique to artiodactyls, six are unique to cetaceans, and eight are shared. The five residues highlighted between residues 10 and 20 all are within the α-helix, which is located in an area implicated in the proton input function of the protein (Degli Esposti et al. 1993). Three of the five residues are common to both artiodactyls and cetaceans, suggesting that an optimization process of this function has continued uninterrupted from early artiodactyls in both clades. These three residues have been influenced by positive selection for radical change relative to the same amino acid properties in both clades (pK', residue 14; P, residue 17; and pK', residue 19). Selection on the other two residues, however, appears to be differentially optimizing c in this local region of the protein.

    FIG. 8.— Schematic of the cyt-b matrix domain secondary amino acid sequence structure (residue sites illustrated as circles; every 10 residues are annotated by position number of the site). Shaded circles indicate amino acid sites that have been significantly influenced by positive-destabilizing selection. Gray circles are unique to artiodactyls, black circles are unique to cetaceans, and circles colored with gray-to-black gradients indicate that both artiodactyls and cetaceans have been affected.

    Another area of interest is the de-loop. Four contiguous residues in this region have each experienced selection for a different amino acid property (c, residue 214; P, residue 215; Pt, residue 216; and h, residue 217), resulting in at least nine radical amino acid replacements occurring on eight different phylogenetic branches (internal and terminal) among cetaceans, affecting 11 of the 15 taxa considered in this study, with none being reversals. This tight cluster is unusual because the function of this region of the matrix is unknown beyond keeping the D-helix and E-helix in the correct spatial orientation. However, the de-loop possesses more residues (three) where mutations result in a depressed sensitivity of the cyt-b proton-input function than any other matrix subdomain (Degli Esposti et al. 1993).

    In contrast to the intermembrane domain, artiodactyls and cetaceans share all but two positively selected amino acid properties that favor destabilizing changes; P, pK', h, c, Ra, Ht, and Pt are common to both, but positive-destabilizing selection relative to Hp (p = 0.982, p6) is unique to artiodactyls, whereas F (p = 0.971, p8) is unique to cetaceans. Several residues exhibit evidence for suites of properties being affected simultaneously. However, this characteristic is expressed differentially in artiodactyls and cetaceans as well. Four of the five residues where suites of properties appear to be affected are clustered in the artiodactyl N-terminus (suite h/Ra/Ht/Pt in residue 2; pK'/h/Ra/Ht/Pt in residue 4; h/c/Pt in residue 11; and c/Ht in residue 25) and include eight amino acid replacements on five branches (all terminal) of the artiodactyl phylogenetic tree, affecting only five of the 15 taxa.

    Transmembrane Domain

    A schematic indicating the orientation of the transmembrane domain components to the inner mitochondrial membrane is illustrated in figure 9. In all, 60 residues were found to have been influenced by significant levels of positive-destabilizing selection: three in the A-helix, seven in the B-helix, seven in the C-helix, 10 in the D-helix, nine in the E-helix, five in the F-helix, six in the G-helix, and 13 in the H-helix. Of these, 16 are unique to artiodactyls, 19 are unique to cetaceans, and the remainder (25) are shared by both clades. There are two noteworthy clusters of these residues found to be affected by positive-destabilizing selection: (1) between residues 188 and 200 in the D-helix (within which is the heme-group ligation site H197) and (2) between residues 228 and 244 in the E-helix. Curiously, both of these clusters occupy protein regions that are densely populated with residues that have been found to be involved with the protein's proton-input function (Degli Esposti et al. 1993). Thus, the radical changes in these clusters may be the result of molecular adaptation just as changes associated with proton-input in the de-loop of the matrix domain also may be (as discussed above).

    FIG. 9.— Schematic of the cyt-b transmembrane domain secondary amino acid sequence structure (residue sites illustrated as circles; every 10 residues are annotated by position number of the site). Shaded circles indicate amino acid sites that have been significantly influenced by positive-destabilizing selection. Gray circles are unique to artiodactyls, black circles are unique to cetaceans, and circles colored with gray-to-black gradients indicate that both artiodactyls and cetaceans have been affected.

    In the first cluster (in the D-helix), few changes were influenced by positive-destabilizing selection along the more basal branches of the artiodactyl tree. However, one such change (A195T, a synapomorphic radical decrease in P for all taxa but Camelidae) was reversed twice (T195M, a synapomorphy for Ruminantia and Hippopotamidae; T195A, an autapomorphy in the pig). Another (T191A, a synapomorphic radical increase in P for all Ruminantia) was reversed only once (A191T, an autapomorphy in the chevrotain) but was duplicated via convergence at the base of the cetacean clade (T191M) and reversed twice more (M191T, a synapomorphy among Monodontidae, Phocoenidae, and Delphinidae; M191T, an autapomorphy in the La Plata river dolphin).

    The selection dynamics of amino acid properties influenced by positive-destabilizing selection in the second cluster (in the E-helix) are much more complex than those of the D-helix. The properties pK' and P are of particular interest because they exhibit differential patterns of radical change in the two taxonomic groups. Unlike the amino acid replacement cluster of the D-helix, the vast majority of the radical amino acid changes in the E-helix cluster are inferred to have taken place on terminal branches in both the cetacean and artiodactyl phylogenetic trees (89% and 91%, respectively). This may be caused, at least in part, by relatively recent fixation of ancestral polymorphisms (lineage sorting).

    Spatial Correlation of CDM and MM01 Analytical Results

    Although the correlation between the results of the CDM and MM01 analyses is not completely consistent, at nearly every locus where the CDM indicates that there has been selection among nonsynonymous nucleotide substitutions, MM01 indicates that there have been amino acid replacements affected by positive-destabilizing selection hiding among a greater number of stabilizing changes (fig. 10).

    FIG. 10.— A qualitative comparison of CDM sliding window output with Bonferroni correction and the number of sites found to be affected by positive-destabilizing selection for at least one physicochemical amino acid property at each setting within each CDM window setting along cetacean and artiodactyl cytochrome b (cyt-b) gene/protein data sets. Regions found to be affected by selection via CDM analysis are represented by blue boxes for cetaceans and green boxes for artiodactyls and are interspersed by black bars (regions not affected by selection). The number of sites found to be affected by positive-destabilizing selection with each CDM sliding-window setting is indicated by the height of the small blue, green, or green-to-blue gradient boxes along the top of each row of the figure. Small blue boxes represent the number of cyt-b sites under positive-destabilizing selection in cetaceans, and green boxes represent the number of cyt-b sites similarly affected in artiodactyls. Boxes containing green-to-blue gradients represent sites affected by positive-destabilizing selection in cetaceans and artiodactyls. Protein domain structure is superimposed along the bottom of each row of the figure, and a two-dimensional schematic of the protein relative to the inner mitochondrial membrane is included in the bottom right. Light-green represents the matrix domain, light-blue represents the transmembrane domain, and yellow represents the intermembrane domain. In many cases, the results of these analyses appear correlated. Regions of disparity most likely result from the different levels of analytical specificity inherent to each test. A correlation between these analytical methods, therefore, indicates a high level of confidence in the results.
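    The Bonferroni correction mentioned in the caption of figure 10 can be illustrated with a short sketch. It assumes that each sliding-window position counts as one test and that the alignment is 379 codons long (a typical length for mammalian cyt-b that is not stated in this section). How the CDM software actually applies the correction is not described here, so both choices are assumptions.

```python
def n_windows(n_codons: int, width: int) -> int:
    """Number of positions a window of the given width can occupy."""
    return max(0, n_codons - width + 1)


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


N_CODONS = 379  # assumed alignment length, for illustration only

for width in (10, 20, 30):
    tests = n_windows(N_CODONS, width)
    print(width, tests, f"{bonferroni_alpha(0.05, tests):.2e}")
```

    Because the number of windows changes only slightly across these widths, the corrected threshold is nearly the same at every scale. Differences among the rows of figure 10 therefore reflect the pooling of sites more than the stringency of the test.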

    It is not overly surprising that CDM and MM01 exhibit at least a few regions of disagreement, because the two models utilize different information content within the genetic sequences. Also, the CDM does not differentiate between positive and negative selection, whereas MM01 does. Nevertheless, these models may be implemented in concert with confidence, each yielding largely complementary results. These analytical results also present evidence that evaluating protein-coding genes at the codon level is entirely conducive to the accurate and precise detection of selective influences, among changes in both nucleotides and amino acids, suggesting that this scale of examination, coupled with a phylogenetic perspective, may yield superior bioinformatic content. These approaches may prove instrumental to discovering the biochemical adaptations suggested by numerous molecular evolutionary analyses (Golding and Dean 1998; Grossman et al. 2001; Schmidt et al. 2001).

    As noted above, the cetacean A-helix, D-helix, E-helix, and de-loop fit CDM predictions much better than the artiodactyl homologs. This could be the result of one of two largely mutually exclusive phenomena: relaxed functional constraints or increased purifying selection. The correlation between CDM and MM01 analyses illustrated in figure 10 allows for the differentiation of these two causative influences. The near absence of adaptive changes in the cetacean A-helix is most likely caused by increased purifying selection, resulting in a better fit to random expectations. The cetacean D-helix and E-helix, however, have experienced a much greater number of adaptive amino acid replacements than have the artiodactyl homologs, suggesting that these changes were permitted by relaxed selective constraints. The regions within these subdomains that fail to fit CDM expectations correspond closely to the clusters of residues (residues 188 to 200 and 228 to 244, plus variable residues 245 and 247 in the ef-loop) found to be influenced by positive-destabilizing selection. The lack of total correspondence, however, suggests that variable residues 228, 232, and 234 may not have experienced adaptive change at all (false positives), but are merely random changes that failed to affect the overall pattern of nucleotide substitution in this region. The remaining variable residues experiencing radical amino acid replacements most likely have been historically influenced by positive-destabilizing selection.

    The possibility that MM01 may be producing false-positive results is of great concern. Analytical strategies must be derived that will allow for greater confidence in results. However, for the purposes of this study, correspondence between the analytical results and the known structure and function of the active protein provides a great deal of confidence. Furthermore, correlations between MM01 and CDM sliding window results can be considered a sign of even greater confidence.

    The results of this study may be considered evidence that a total reliance on dN/dS ratios to detect molecular adaptation will result in a lack of sensitivity for two reasons: (1) molecular adaptation may be the result of even single amino acid changes, resulting in dN/dS < 1.0, and (2) the dominant mode of molecular change may result in the majority of amino acid replacements being stabilizing rather than destabilizing, and, thus, they may not be adaptive, even when dN/dS > 1.0. This is not to say that dN/dS is not useful for detecting positive selection. Our results do not contradict this paradigm. We only suggest that "positive selection" may not always be synonymous with "molecular adaptation." It may be advisable to consider other genetic information content for the purpose of characterizing adaptation; a simple comparison of nonsynonymous and synonymous rates of change may not be adequate. It may be time for a shift in thinking relative to adaptation: Molecular adaptation is a function of changes in protein phenotype resulting from corresponding changes in suites of physicochemical amino acid properties. Such a shift will result in more detailed characterizations of molecular adaptation, such that correlations may be drawn between molecular evolution and studies that focus on the description of protein structure and function.

    Acknowledgements

    We thank Keith A. Crandall for his thoughtful review of early versions of this manuscript and his many helpful suggestions. We also thank the two anonymous reviewers of the submitted manuscript for their thoughtful insights, editing prowess, and helpful comments and suggestions. This research has been supported by the Japan Society for the Promotion of Science (D.A.M.), the Pharmaceutical Research and Manufacturers of America (D.A.M.), and the Brigham Young University Office of Research and Creative Activities (D.A.M., M.J.S., and R.G.C.).

    References

    Alff-Steinberger, C. 1969. The genetic code and error transmission. Proc. Natl. Acad. Sci. USA 64:584–591.

    Andrews, T. D., L. S. Jermiin, and S. Easteal. 1998. Accelerated evolution of cytochrome b in simian primates: adaptive evolution in concert with other mitochondrial proteins? J. Mol. Evol. 47:249–257.

    Arnason, U., and A. Gullberg. 1996. Cytochrome b nucleotide sequences and the identification of five primary lineages of extant cetaceans. Mol. Biol. Evol. 13:407–417.

    Degli Esposti, M., S. De Vries, M. Crimi, A. Ghelli, T. Paternello, and A. Meyer. 1993. Mitochondrial cytochrome b: evolution and structure of the protein. Biochim. Biophys. Acta 1143:243–271.

    DeWalt, T. S., P. D. Sudman, M. S. Hafner, and S. K. Davis. 1993. Phylogenetic relationships of pocket gophers (Cratogeomys and Pappogeomys) based on mitochondrial DNA cytochrome b sequences. Mol. Phylogenet. Evol. 2:193–204.

    Fares, M. A., S. F. Elena, J. Ortiz, A. Moya, and E. Barrio. 2002. A sliding-window method to detect selective constraints in protein-coding genes and its application to RNA viruses. J. Mol. Evol. 55:509–521.

    Gaucher, E. A., M. M. Miyamoto, and S. A. Benner. 2003. Evolutionary, structural and biochemical evidence for a new interaction site of the leptin obesity protein. Genetics 163:1549–1553.

    Golding, G. B., and A. M. Dean. 1998. The structural basis of molecular adaptation. Mol. Biol. Evol. 15:355–369.

    Grantham, R. 1974. Amino acid difference formula to help explain protein evolution. Science 185:862–864.

    Griffiths, C. S. 1997. Correlation of functional domains and rates of nucleotide substitution in cytochrome b. Mol. Phylogenet. Evol. 7:352–365.

    Gromiha, M. M., and P. K. Ponnuswamy. 1993. Relationship between amino acid properties and protein compressibility. J. Theoret. Biol. 165:87–100.

    Grossman, L. I., T. R. Schmidt, D. E. Wildman, and M. Goodman. 2001. Molecular evolution of aerobic energy metabolism in primates. Mol. Phylogenet. Evol. 18:26–36.

    Irwin, D. M., T. D. Kocher, and A. C. Wilson. 1991. Evolution of the cytochrome b gene of mammals. J. Mol. Evol. 32:128–144.

    Iwata, S., J. W. Lee, K. Okada, J. K. Lee, M. Iwata, B. Rasmussen, T. A. Link, S. Ramaswamy, and B. K. Jap. 1998. Complete structure of the 11-subunit bovine mitochondrial cytochrome bc1 complex. Science 281:64–71.

    Koshi, J. M., D. P. Mindell, and R. A. Goldstein. 1999. Using physical-chemistry-based substitution models in phylogenetic analyses of HIV-1 subtypes. Mol. Biol. Evol. 16:173–179.

    Kyte, J., and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105–132.

    Lynn, D. J., A. T. Lloyd, M. A. Fares, and C. O'Farrelly. 2004. Evidence of positively selected sites in mammalian α-defensins. Mol. Biol. Evol. 21:819–827.

    McClellan, D. A. 2000. The codon-degeneracy model of molecular evolution. J. Mol. Evol. 50:131–140.

    McClellan, D. A., and K. G. McCracken. 2001. Estimating the influence of selection on the variable amino acid sites of the cytochrome b protein functional domains. Mol. Biol. Evol. 18:917–925.

    McClellan, D. A., J. K. Sailsbery, and R. G. Christensen. 2003. CDM: codon-degeneracy model. Version 2.0. Brigham Young University, Provo, Utah.

    McClellan, D. A., D. G. Whiting, R. G. Christensen, and J. K. Sailsbery. 2004. Genetic codes as evolutionary filters: subtle differences in the structure of genetic codes result in significant differences in patterns of nucleotide substitution. J. Theoret. Biol. 226:393–400.

    Milinkovitch, M. C., A. Meyer, and J. R. Powell. 1994. Phylogeny of all major groups of cetaceans based on DNA sequences from three mitochondrial genes. Mol. Biol. Evol. 11:939–948.

    Naylor, G. J. P., and M. Gerstein. 2000. Measuring shifts in function and evolutionary opportunity using variability profiles: a case study of the globins. J. Mol. Evol. 51:223–233.

    Nielsen, R., and J. P. Huelsenbeck. 2002. Detecting positively selected amino acid sites using posterior predictive P-values. Pac. Symp. Biocomput. 2002:576–588.

    Nikaido, M., F. Matsuno, H. Hamilton, R. L. Brownell Jr., Y. Cao, W. Ding, Z. Zuoyan, A. M. Shedlock, R. E. Fordyce, M. Hasegawa, and N. Okada. 2001. Retroposon analysis of major cetacean lineages: the monophyly of toothed whales and the paraphyly of river dolphins. Proc. Natl. Acad. Sci. USA 98:7384–7389.

    Nikaido, M., A. P. Rooney, and N. Okada. 1999. Phylogenetic relationships among cetartiodactyls based on insertions of short and long interspersed elements: hippopotamuses are the closest extant relatives of whales. Proc. Natl. Acad. Sci. USA 96:10261–10266.

    Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.

    Prabhakaran, M., and P. K. Ponnuswamy. 1979. The spatial distribution of physical, chemical, energetic and conformational properties of amino acid residues in globular proteins. J. Theoret. Biol. 80:485–504.

    Ronquist, F., and J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.

    Schmidt, T. R., W. Wu, M. Goodman, and L. I. Grossman. 2001. Evolution of nuclear- and mitochondrial-encoded subunit interaction in cytochrome c oxidase. Mol. Biol. Evol. 18:563–569.

    Siltberg, J., and D. A. Liberles. 2002. A simple covarion-based approach to analyze nucleotide substitution rates. J. Evol. Biol. 15:588–594.

    Suzuki, Y., and T. Gojobori. 1999. A method for detecting positive selection at single amino acid sites. Mol. Biol. Evol. 16:1315–1328.

    Swofford, D. 2001. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sinauer Associates, Sunderland, Mass.

    Woese, C. R., D. H. Dugre, S. A. Dugre, M. Kondo, and W. C. Saxinger. 1966. On the fundamental nature and evolution of the genetic code. Cold Spring Harb. Symp. Quant. Biol. 403:304–308.

    Woolley, S., J. Johnson, M. J. Smith, K. A. Crandall, and D. A. McClellan. 2003. TreeSAAP: selection on amino acid properties using phylogenetic trees. Bioinformatics 19:671–672.

    Xia, X., and W.-H. Li. 1998. What amino acid properties affect protein evolution? J. Mol. Evol. 47:557–564.

    Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555–556.

    Yang, Z., and J. P. Bielawski. 2000. Statistical methods for detecting molecular adaptation. Trends Ecol. Evol. 15:496–503.

    Yang, Z., and R. Nielsen. 2002. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol. Biol. Evol. 19:908–917.

    Yang, Z., and W. J. Swanson. 2002. Codon-substitution models to detect adaptive evolution that account for heterogeneous selective pressures among site classes. Mol. Biol. Evol. 19:49–57.

    Yang, Z., R. Nielsen, N. Goldman, and A.-M. K. Pedersen. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.

    Zhang, Z., L. Huang, V. M. Shulmeister, Y.-I. Chi, K. K. Kim, L.-W. Hung, A. R. Crofts, E. A. Berry, and S.-H. Kim. 1998. Electron transfer by domain movement in cytochrome bc1. Nature 392:677–684.

    (D. A. McClellan*, E. J. P)