Comparison of Diverse Protein Sequences of the Nuclear-Encoded Subunits of Cytochrome C Oxidase Suggests Conservation of Structure Underlies
Progress in Molecular Biology, 2004, Issue 8
* Department of Ecology and Evolutionary Biology
Department of Molecular Biology, Princeton University, Princeton, New Jersey
E-mail: jdas@princeton.edu.
Abstract
Interspecific comparisons of protein sequences can reveal regions of evolutionary conservation that are under purifying selection because of functional constraints. Interpreting these constraints requires combining evolutionary information with structural, biochemical, and physiological data to understand the biological function of conserved regions. We take this integrative approach to investigate the evolution and function of the nuclear-encoded subunits of cytochrome c oxidase (COX). We find that the nuclear-encoded subunits evolved subsequent to the origin of mitochondria and the subunit composition of the holoenzyme varies across diverse taxa that include animals, yeasts, and plants. By mapping conserved amino acids onto the crystal structure of bovine COX, we show that conserved residues are structurally organized into functional domains. These domains correspond to some known functional sites as well as to other uncharacterized regions. We find that amino acids that are important for structural stability are conserved at frequencies higher than expected within each taxon, and groups of conserved residues cluster together at distances of less than 5 Å more frequently than do randomly selected residues. We, therefore, suggest that selection is acting to maintain the structural foundation of COX across taxa, whereas active sites vary or coevolve within lineages.
Key Words: cytochrome c oxidase • nuclear-encoded • evolution • structure • mitochondria
Introduction
Rates of amino acid substitution are dependent upon both structural and functional constraints acting at specific sites. Amino acid replacements that disrupt protein folding or functional interactions are eliminated by selection, leading to lower rates of substitution at those sites. Thus, phylogenetic comparisons can reveal conserved sites that are likely to be critical for protein function. Integrating such phylogenetic analyses with structural, biochemical, and physiological information provides a meaningful context for studying adaptive evolution and for generating new hypotheses about function (Golding and Dean 1998). Here, we apply this integrative approach to cytochrome c oxidase (COX), the terminal enzyme in the electron transport chain of oxidative phosphorylation.
COX, which catalyzes the transfer of electrons from reduced cytochrome c to molecular oxygen, is the primary determinant of cellular oxygen consumption and is thought to play a key role in regulating energy production (Poyton and McEwen 1996). COX is a multisubunit transmembrane protein located in the inner membrane of the mitochondrion. In mammals, it is composed of 13 subunits, three of which are encoded by mitochondrial DNA, and the remaining 10 are encoded in the nucleus. The three large mitochondrial-encoded subunits compose the catalytic core of the enzyme, which contains the reaction centers. The smaller nuclear-encoded subunits are arranged around the perimeter of the core enzyme (Tsukihara et al. 1996). The mitochondrial-encoded subunits are necessary and sufficient to carry out both electron transfer and proton-pumping functions (Capaldi 1990), although numerous studies of bacterial and bovine COX (Ruitenberg et al. 2002; Tsukihara et al. 2003) have yet to fully determine the mechanisms by which these processes occur.
The roles of the nuclear-encoded subunits have only recently begun to be elucidated. Experimental studies suggest that the primary function of the nuclear-encoded subunits is to regulate COX activity. For example, paralogous isoforms of subunit VII in Dictyostelium discoideum and subunit V in Saccharomyces cerevisiae are differentially expressed in response to oxygen concentration (Schiavo and Bisson 1989; Burke et al. 1997). In vertebrates, allosteric interactions that involve subunits IV and VIa are thought to mediate a phosphorylation-dependent mechanism of COX regulation (Ludwig et al. 2001). Some of the nuclear-encoded subunits are important for development and physiology of multicellular organisms, as shown by lethality and pleiotropic phenotypes caused by mutations in subunit VIc in Drosophila (Szuplewski and Terracol 2001).
There is also evidence for coevolution of the mitochondrial-encoded and nuclear-encoded subunits. When mitochondrial DNA is placed in a foreign nuclear background, the resulting disruption in mitochondrial activity suggests that functional interactions between the different subunits have coevolved. This finding has been shown in backcrosses between different intraspecific populations of the copepod Tigriopus californicus (Edmands and Burton 1998), in backcrosses between sister species of Drosophila (Sackton, Haney, and Rand 2003), and in xenomitochondrial cybrids of primate cell lines (Kenyon and Moraes 1997). In support of these experimental studies, analysis of evolutionary rates of mammalian COX subunits has shown that residues on nuclear-encoded subunits evolve more slowly when in close proximity to mitochondrial-encoded subunits, whereas the rapid evolution of mitochondrial DNA allows optimizing interactions with residues on nuclear-encoded subunits (Schmidt et al. 2001).
Because organisms have evolved to allocate their energetic resources differently based on their specific physiology and environment, understanding the regulation of COX will shed light on how this protein complex may have evolved to accommodate metabolic adaptation. Here, we describe the variation in COX subunit composition across diverse taxa and show that the nuclear-encoded subunits likely evolved subsequent to the origin of mitochondria. We use multiple sequence alignments to predict essential functional sites in these subunits and map these conserved sites onto the crystal structure of bovine COX. We conclude that these sites are structurally organized into functional regions, some of which correlate with known sites of interaction, whereas others are novel regions that warrant future experimental study.
Materials and Methods
Protein sequences for bovine COX nuclear-encoded subunits were obtained from the Entrez Protein sequence database (http://www.ncbi.nlm.nih.gov/Entrez/). Sequences for homologs in other species were obtained by use of either PSI-Blast to search the unrestricted NCBI database (Altschul et al. 1997) or protein-protein Blast programs to search organism-specific databases (Drosophila melanogaster, http://www.fruitfly.org/blast/index.html; Anopheles gambiae, http://www.ensembl.org/Anopheles_gambiae/blastview; Caenorhabditis elegans, http://www.wormbase.org/db/searches/blast; Arabidopsis thaliana, http://www.arabidopsis.org/Blast/; Saccharomyces cerevisiae, http://genome-www2.stanford.edu/cgi-bin/SGD/nph-blast2sgd; Schizosaccharomyces pombe, http://www.genedb.org/genedb/pombe/blast.sp). Whole-genome sequence trace archives were also searched when available (http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?). Default parameters were used for all database searches. The crystal structure of bovine cytochrome c oxidase was obtained from the Protein Data Bank (PDB ID 1OCC; http://www.rcsb.org/pdb/, Berman et al. 2000).
Homologous sequences were aligned by application of the ClustalW multiple sequence alignment program (Thompson, Higgins, and Gibson 1994) with default parameters. All alignments are included in Supplementary Material online. Residues identified in each alignment as being identical or physicochemically conserved across at least two taxa were mapped onto the crystal structure by the PyMOL molecular graphics system (DeLano 2002). PyMOL scripts for real-time visualization of identical and conserved residues on the COX structure are also included in Supplementary Material online.
A program was written to evaluate the spatial distribution of conserved and identical residues. Distances were calculated between all pairs of residues on the list of conserved and identical amino acids described above. Here, distance is defined as the smallest interatomic distance between any two atoms in a pair of amino acids. For a null set, a list of randomly selected residues (and their dimer symmetry-related equivalents) of equal length was generated and distances calculated as above.
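The distance calculation described above can be illustrated with a minimal Python sketch. It is not the authors' program: the function names, the toy coordinates, and the number of random trials are assumptions made for illustration, and the sketch ignores the dimer symmetry-related equivalents mentioned in the text. Each residue is represented simply as a list of atomic (x, y, z) coordinates in Å.

```python
import math
import random
from itertools import combinations

def min_distance(atoms_a, atoms_b):
    """Smallest interatomic distance between any atom of A and any atom of B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)

def fraction_in_contact(residues, names, cutoff=5.0):
    """Fraction of residue pairs in `names` whose minimum distance is below cutoff."""
    pairs = list(combinations(names, 2))
    if not pairs:
        return 0.0
    close = sum(1 for a, b in pairs
                if min_distance(residues[a], residues[b]) < cutoff)
    return close / len(pairs)

def null_fractions(residues, n_selected, n_trials=1000, cutoff=5.0, seed=0):
    """Same statistic for randomly drawn residue lists of equal length (null set)."""
    rng = random.Random(seed)
    all_names = sorted(residues)
    return [fraction_in_contact(residues, rng.sample(all_names, n_selected), cutoff)
            for _ in range(n_trials)]

# Hypothetical toy structure: residue label -> atom coordinates (not real COX data).
toy = {
    "A1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "A2": [(4.0, 0.0, 0.0)],
    "A3": [(20.0, 0.0, 0.0)],
    "A4": [(22.0, 0.0, 0.0), (23.0, 0.0, 0.0)],
}
observed = fraction_in_contact(toy, ["A1", "A2", "A3"])
null = null_fractions(toy, n_selected=3, n_trials=200)
```

Comparing `observed` with the distribution in `null` gives a rough sense of whether a chosen set of residues clusters more tightly than a random set of the same size.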
Results
Nomenclature of COX Nuclear-Encoded Subunits
The subunit composition of COX in vertebrates, plants, and yeasts has been determined primarily by resolution of the purified protein by polyacrylamide gel electrophoresis (Capaldi 1990; Geier et al. 1995; Jansch et al. 1996). Nomenclature of the individual subunits in each taxon is derived from their order of molecular mass, rather than from any sequence similarity between subunits. Variation in the size of homologous subunits has, therefore, resulted occasionally in conflicting nomenclature between taxa. Tables 1, 2, and 3 provide compilations of the parallel nomenclature of homologous subunits. We also list database accession numbers for homologous sequences found in other organisms with completed genomes. Because we have mapped conserved sites onto the bovine crystal structure, we refer to the nuclear-encoded subunits after the vertebrate nomenclature. There is no detectable sequence similarity between different nuclear-encoded subunits, nor do they appear to share any significant structural similarity beyond transmembrane helices. Although subunits IV, VIa, VIc, VIIa, VIIb, VIIc, and VIII of bovine COX are all classified under a folding pattern of a single transmembrane helix in the SCOP database (http://scop.mrc-lmb.cam.ac.uk/scop/), each subunit is subcategorized as a unique structural superfamily. The globular subunits, Va, Vb, and VIb, also do not share a fold classification. The lack of either sequence or structural similarity between different subunits suggests that none of them arose through duplication from another nuclear-encoded subunit.
Table 1 Nomenclature of Nuclear-Encoded COX Subunits IV, Va, and Vb Across Different Taxa.
Table 2 Nomenclature of Nuclear-Encoded COX Subunits VIa, VIb, and VIc Across Different Taxa.
Table 3 Nomenclature of Nuclear-Encoded COX Subunits VIIa, VIIb, VIIc, and VIII Across Different Taxa.
Comparison of COX Subunit Composition Across Taxa
Although the three mitochondrial-encoded subunits are present in all organisms that contain COX, the composition of the holoenzyme varies between taxa. Determination of the crystal structure of bovine COX (Tsukihara et al. 1996) clarified the composition of the vertebrate holoenzyme. Recent completion of whole-genome sequencing projects has now made it possible to use the vertebrate sequences to search for homologs in all eukaryotic kingdoms.
Ludwig and colleagues have postulated that the number of subunits is correlated with the regulatory complexity of the enzyme, as suggested by the presence of four subunits in the bacterium Paracoccus denitrificans, seven in Dictyostelium discoideum, 11 in S. cerevisiae, and 13 in Bos taurus (Ludwig et al. 2001). Our analysis suggests that this trend is true within the metazoans as well. Mammalian subunit VIIb has no homolog in insects (Szuplewski and Terracol 2001) or in nematodes, and Caenorhabditis elegans lacks genes homologous to mammalian subunits VIc, VIIa, and VIII as well (tables 2 and 3). However, failure to find homologs in these other species may be the result of incomplete sequencing or annotation of the genome (as is likely with subunit VIIc in Anopheles gambiae because this subunit is present in both D. melanogaster and C. elegans but not in Anopheles).
The composition of COX in plants provides further evidence that nuclear-encoded subunits may have evolved for taxon-specific regulatory functions. In plants, electrophoresis has shown that COX is composed of at least seven nuclear-encoded subunits (Jansch et al. 1996). However, only two vertebrate nuclear-encoded subunits (Vb and VIb) have orthologs in the Arabidopsis thaliana genome, which suggests that the remaining subunits are unique to plants. Of these novel subunits, only one, termed 5c, has been characterized (Hamanaka et al. 1999). Our Blast search of the A. thaliana genome also found an uncharacterized fourth paralog of subunit VIb that was not identified in a previous Southern hybridization screen (Ohtsu et al. 2001). This omission is likely the result of significant divergence from the other three paralogs in the region of the probe used for the screen, which included most of the coding region of AtCOX6b2.
Origin and Evolution of Nuclear-Encoded COX Subunits
Like the nuclear-encoded COX subunits, the majority of genes required for mitochondrial function are encoded by the nuclear genome. Some of these genes originated in the ancestral proteobacterium with subsequent transfer to the host nucleus, whereas others evolved within the eukaryotic nuclear genome (Berg and Kurland 2000). To explore the evolutionary origin of the nuclear-encoded COX subunits, we tested the first possibility by searching for homologous sequences in the α-proteobacterium Rickettsia prowazekii, the prokaryote thought to be most closely related to ancestral mitochondria (Andersson et al. 1998), and the mitochondrial genome of the protozoan Reclinomonas americana, which contains the largest number of genes of any mitochondrion studied (Lang et al. 1997). A Blast search failed to find homologs of any nuclear-encoded subunits in Rickettsia, which suggests that these proteins were not present in the ancestral endosymbiont. The Reclinomonas mitochondrial genome, although containing COX subunits I, II, and III as well as a homolog of a nuclear-encoded COX assembly protein, also does not contain any genes similar to the other nuclear-encoded subunits (Lang et al. 1997). Together, these findings suggest that the nuclear subunits did not originate in the mitochondrial genome.
To determine when the COX subunits may have arisen in the eukaryotic genome, we searched for homologs in the genome of the protozoan Giardia lamblia, an ancient amitochondriate eukaryote that diverged close to when eukaryotes and prokaryotes split. Blast searches failed to reveal any putative homologs of nuclear-encoded COX subunits in Giardia, which suggests that these genes evolved in eukaryotes after the invasion of the ancestral endosymbiont. To attempt to find other genes in the eukaryotic genome from which the nuclear-encoded subunits may have been derived, we performed PSI-Blast searches with each of the nuclear-encoded subunits. Two iterations of each search failed to identify any genes with significant homology, which suggests extensive divergence from ancestral genes.
Although the evolutionary origin of the nuclear-encoded subunits remains a mystery, they have subsequently evolved specialized functions. ClustalW sequence alignments show that each subunit is orthologous across all taxa in which it is found (data not shown), but the presence of multiple paralogs within taxa suggests evolution of regulatory functions. For example, as previously discussed, alternative paralogs are expressed under normal or hypoxic conditions in both D. discoideum and S. cerevisiae (Schiavo and Bisson 1989). In vertebrates, multiple subunits have paralogs that are expressed either ubiquitously (L, liver-type) or primarily in mature contractile muscles (H, heart-type) (Ewart, Zhang, and Capaldi 1991), whereas paralogs of subunit VIb show differential tissue expression in plants and mammals (Ohtsu et al. 2001; Huttemann, Jaradat, and Grossman 2003). Similar functional differences may explain the existence of other uncharacterized paralogs found in insects and plants (tables 1–3). There also is evidence for adaptive evolution of individual paralogs after duplication, as shown by the inactivation of subunit VIIIH in Old World monkeys and hominids, possibly linked to optimization of aerobic energy metabolism (Goldberg et al. 2003).
Conservation of COX Subunits and Implications for Function
Although limited comparative analyses within taxa have been undertaken to identify evolutionary patterns in COX (Schmidt, Goodman, and Grossman 1999; Ludwig et al. 2001), a broader analysis across diverse taxa increases the power to detect specific sites that are essential for basic enzyme function and regulation. To identify regions of potential functional importance, we aligned sequences from all taxa in tables 1, 2, and 3, as well as the bovine sequence for each subunit. Sequences of all known isoforms within a species were included in the alignment. Subunits present in only one taxon (Vc in plants, VIII in animals, and VIIb in humans only) were excluded from the analysis because they are not essential for function across taxa. For each subunit, residues were classified as "identical," "conserved," or "nonconserved," on the basis of the level of physicochemical conservation determined by a ClustalW sequence alignment across all taxa in which the subunit is present.
As expected, the three mitochondrial-encoded subunits encoding the core enzyme are conserved across diverse species at comparatively high levels that range from 36% to 63%. Subunit Va, which is 39% conserved and is found in animals and yeasts, is much more conserved than any of the other nuclear-encoded subunits. The high level of conservation suggests that subunit Va has been subject to stronger purifying selection. Supporting this implication of functional constraint is experimental evidence for ligand interaction that involves subunit Va (Arnold, Goglia, and Kadenbach 1998). The evidence is discussed in more detail in following sections. Subunits VIa and VIb are conserved at moderate levels (22% and 24%, respectively). Subunit VIb is one of the two nuclear-encoded subunits shared between animals, yeasts, and plants and, based on studies of the crystal structure (Tsukihara et al. 1996), it appears to play a key role in intermonomer contact, whereas subunit VIa contains a known interaction site with ATP (Taanman, Turina, and Capaldi 1994). In contrast, subunit VIIc is only 16% conserved and has no characterized function.
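The three-way classification of alignment columns can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it treats a column as "conserved" when all residues fall within one of the "strong" physicochemical groups that ClustalW is commonly documented as using, treats any column with a gap as nonconserved, and counts identical columns toward the conserved percentage. The text does not specify any of these choices, and the toy alignment is invented.

```python
# "Strong" conservation groups as commonly documented for ClustalW
# (assumed here; the text does not list the groups it relied on).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]

def classify_column(column):
    """Classify one alignment column as identical, conserved, or nonconserved."""
    residues = set(column)
    if "-" in residues:                      # assumption: gapped columns do not count
        return "nonconserved"
    if len(residues) == 1:
        return "identical"
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return "conserved"
    return "nonconserved"

def classify_alignment(sequences):
    """Classify every column of an alignment given as equal-length strings."""
    return [classify_column(col) for col in zip(*sequences)]

def percent_conserved(classes):
    """Percentage of columns that are identical or conserved (assumed definition)."""
    hits = sum(1 for c in classes if c != "nonconserved")
    return 100.0 * hits / len(classes)

# Hypothetical toy alignment, one string per taxon.
toy_alignment = [
    "MKWLF-",
    "MRWIFA",
    "MKWVYG",
]
classes = classify_alignment(toy_alignment)
```

Whether percentages should be computed over alignment columns or over positions of the bovine sequence is a further choice this sketch does not settle.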
A low proportion of conserved amino acids, however, does not necessarily indicate a lack of function. Subunit IV is proportionately the least conserved of the nuclear-encoded subunits (9%). Despite the low level of conservation between the animal and yeast subunit IV sequences, we concur with previous authors (Huttemann, Kadenbach, and Grossman 2001) that these subunits are homologs, albeit highly divergent, on the basis of two results from our sequence analysis. First, in a search of the NCBI Conserved Domain Database, the yeast sequences show significant homology to COX IV domains. Second, in a neighbor-joining tree of all COX nuclear-encoded subunits from all species (unpublished data), the yeast sequences cluster most closely in a monophyletic group with the subunit IV sequences of animals, in contrast with the deep divergence between monophyletic groups of any two other subunits.
Identification of Functional Domains by Comparative Structural Analysis
To detect specific functional sites in the nuclear-encoded subunits, we mapped identical and conserved amino acids onto the crystal structure of bovine COX. We then assessed whether these sites correlated with residues of known function on the basis of biochemical data or whether they might be novel sites of unknown function and, thus, targets for future experimental investigation. The crystal structure of bovine COX in dimeric form (Tsukihara et al. 1996) is shown in figure 1; all identical residues in the nuclear-encoded subunits are highlighted. (All sequence alignments, a complete list of identical and conserved residues, and stereo images of each subunit with identical and conserved residues highlighted are included in Supplementary Material online.) Specific residues are referred to by their nomenclature in accordance with the bovine sequence in the PDB structure.
FIG. 1. Identical residues highlighted on the bovine heart COX structure. Residues discussed in the text are numbered according to their position in the bovine structure. Nuclear subunits are labeled on the right monomer; hypothetical binding sites for ATP/ADP (in yeast and/or mammals), T2 (in mammals), human androgen receptor (hAR, in mammals), the regulatory subunit of protein kinase A (RI, in mammals), and Na+/Ca2+ (in mammals), as well as the location of the zinc ion, are indicated on the left. Mitochondrial subunits are in gray; indicated membrane boundaries are approximate.
Below, we discuss the structural context and putative functional significance of the identical residues in each subunit. In the following analysis, we define amino acids to be in contact if the minimum atom-atom distance between their side chains is less than 5 Å (Russell and Barton 1994).
Subunit IV
Subunit IV is the least conserved of the nuclear-encoded subunits. The identical and conserved residues fall into two regions of the subunit: one cluster in the matrix domain and five residues in the cytosolic domain. In the matrix domain, WIV48, an identical site, contacts other identical residues in a highly conserved region of subunit Va and is likely to be part of the same functional region (fig. 2). Identical site EIV55 appears to be important for maintaining a helix-turn-helix structure through its contacts with three other conserved residues (LIV40, KIV43, and LIV51) on the adjacent helix and loop. The charged side chains of EIV55 and KIV43 are only 3.0 Å apart and well positioned to form a stabilizing salt bridge. This fold allows WIV48 to be positioned to extend into a highly conserved domain of subunit Va. Other conserved residues in the matrix domain also maintain contact with subunit Va.
FIG. 2. Interaction of conserved regions of nuclear-encoded subunits IV and Va. Wall-eye stereo view of matrix domains of subunits IV (red) and Va (blue). Identical and conserved residues are labeled and shown in ball-and-stick models. Dashed line indicates a putative salt bridge between KIV43 and EIV55, whose side chains are 3.0 Å apart. The contacts between LIV40, KIV43, LIV51, and EIV55 maintain the helix-turn-helix structure in subunit IV that positions WIV48 to extend into a highly conserved region of subunit Va. This region corresponds to part of the likely binding site for thyroid hormone T2.
In mammals, two ATP binding sites in subunit IV have been proposed. One binding site has been modeled in the matrix domain and involves residues RIV20, RIV73, TIV75, EIV77, and WIV78, as well as residues from subunits I and II (Huttemann, Kadenbach, and Grossman 2001). None of these residues in subunit IV are conserved across taxa. The second ATP binding site in mammals is located in the cytosolic domain (Reimann et al. 1988), but the putative function of the conserved cytosolic residues in subunit IV is not clear. The sole identical residue in that region, GIV133, is likely conserved not because of interaction with other residues or ligands, but rather to provide the structural flexibility needed for the sharp bend in the backbone chain in that region. Indeed, analysis of the geometry of this residue by application of PROCHECK (Laskowski et al. 1993) shows that the C-N-C angle of GIV133 deviates from the ideal angle of 112.5° by 16.7°, the largest such deviation in the COX structure by some 20%.
Subunit Va
The five α-helices of subunit Va form a right-handed superhelix (Tsukihara et al. 1996), a structure that probably is maintained across animals and yeasts, as shown by the distribution of identical and conserved residues throughout the protein. However, the majority of the identical residues are clustered in the three α-helices formed by the amino acid sequence between PVa43 and GVa97. This region, which includes a number of charged residues, is 48% conserved (28% identical) and forms an exposed pocket that is also bordered by WIV48 (fig. 2). Subunit Va has been shown to bind thyroid hormone 3,5-diiodothyronine (T2) (Arnold, Goglia, and Kadenbach 1998), and we speculate that this pocket may be the T2 binding site.
Subunit Vb
The COOH-terminal of subunit Vb contains a zinc site that, in animals, involves four cysteine residues structurally arranged in a classical zinc finger motif (Tsukihara et al. 1996). Whereas three of these cysteines are identical across all taxa, CVb62 is substituted with a glycine residue in plants (Welchen, Chan, and Gonzalez 2002), as well as in yeasts and in D. discoideum. However, the sequences for subunit Vb in Arabidopsis, rice, S. cerevisiae, S. pombe, and D. discoideum contain a histidine residue at position 67. This histidine may fulfill the same function as CVb62 in animals, because histidine residues can also coordinate zinc ions in classical zinc finger domains (Matthews and Sunde 2002). We predict that this evolutionary change in animals would likely require a change in the conformation of the backbone to properly position the cysteine to coordinate the zinc ion. Identical residues SVb51 and RVb56 may be critical residues for attachment to subunit I, supported by several other conserved residues contacting core subunits I and III. Also, the β-barrel structure of the COOH-terminal domain (Tsukihara et al. 1996) may be maintained across taxa, on the basis of conservation of three residues in contact on the inner surface (VVb49, VVb58, and YVb89) that appear to stabilize the structure.
In mammals, subunit Vb interacts with the regulatory subunit of protein kinase A. This action inhibits COX activity in a cAMP-dependent process (Yang et al. 1998; Bender and Kadenbach 2000). In addition, in vitro studies have shown that subunit Vb, primarily through the COOH-terminal, binds the human androgen receptor (hAR), although the effect of hAR on COX activity is unknown (Beauchemin et al. 2001). Although these putative regulatory mechanisms have not been tested across taxa, conservation of similar interactions may underlie the maintenance of functional and structural domains of subunit Vb.
Subunit VIa
The primary conserved region of subunit VIa is a loop structure on the cytosolic side of the enzyme. This region correlates with a putative ATP binding site at residues 63 to 68, previously identified by more limited sequence comparison and based on similarity to other ATP binding-site motifs (Taanman, Turina, and Capaldi 1994). The GDGXX(T/S) motif is conserved in all taxa we analyzed, with slight modifications: G→S at position 1 in C. elegans and D→E at position 2 in A. gambiae and in D. melanogaster CG17280. Our analysis identified several other residues that may also be part of the binding site or that may help to stabilize the structure. RVIa56, KVIa58, WVIa62, FVIa70, and NVIa72 are identical amino acids that are found on either side of the ATP binding motif and that contact subunits I or III.
Subunit VIa is also adjacent to subunit VIb near the dimer interface. YVIa50 is an identical residue that contacts residues GVIb79 (identical) and FVIb81 (conserved) in subunit VIb, which suggests that it is critical for the interface of the two subunits. Of the 10 residues at the NH2-terminal of subunit VIa that contact subunit I of the opposite monomer, only AVIa4 is conserved. This arrangement indicates some variation in how the intermonomer contact in that region is maintained. Finally, six identical and conserved residues are distributed throughout the transmembrane helix of subunit VIa. Although functional studies in yeast and mammals suggest a second ATP binding site on the matrix side of VIa (Anthony, Reimann, and Kadenbach 1993; Beauvoit et al. 1999), the orientation of these conserved residues along the helix does not make them obvious candidates for a putative binding site. Instead, they may be more important for maintaining the structure of the helix or contact with subunit III.
Subunit VIb
Subunit VIb is composed of three α-helices connected by relatively tight turns. VIb is the only subunit that is situated completely on the side of the intermembrane space and is the primary nuclear subunit responsible for intermonomer contact. The importance of this role is reflected in the high level of sequence conservation across taxa. The region of contact is from residues 39 to 53, which forms the turn between two helices that extend into the intermembrane space (Tsukihara et al. 1996). GVIb47 is identical across all taxa, perhaps because of the constraints of the tight structure of the turn. The two extending helices are maintained in a near antiparallel geometry by two disulfide bridges, one between CVIb29 and CVIb64 and the other between CVIb39 and CVIb53; all four cysteines are identical. Several other identical and conserved residues likely preserve the orientation of the third helix as well, whereas conserved COOH-terminal amino acids contact subunit VIa.
Although the amino acids that are essential for the conformation of this domain are all identical, the intervening residues vary between species. For example, mammals have three extra amino acids in the loop at positions 42 to 44, and both the testis-specific isoform in mammals (Huttemann, Jaradat, and Grossman 2003) and the novel isoform in Arabidopsis are highly divergent from the common isoforms of their respective species in this region. Such lineage-specific and tissue-specific diversity suggests that this region may be evolving for some physiological function.
An interaction site for cytochrome c has been proposed on the cytosolic side of the enzyme. The proposed interaction involves acidic residues from subunits I, II, III, and VIb (Tsukihara et al. 1996). Of the amino acids from subunit VIb that are hypothesized to interact with cytochrome c, DVIb74 is conserved across taxa, whereas EVIb78 is not. Because DVIb74 is the only conserved residue on the external surface of the third helix, our analysis supports an important role for this amino acid in interacting with cytochrome c. There are also three identical sites in the NH2-terminal region. Although these residues (19 to 21) may merely stabilize the conformation of the subunit, they are on the margin of an open pocket on the cytosolic surface. This region may have functional significance because it is bordered by identical residues from subunits I and III as well.
Subunit VIc
The identical and conserved sites in subunit VIc do not correlate with any known function. Two identical sites, FVIc50 and YVIc51, are located at the cytosolic end of the transmembrane helix and contact subunits II and IV. The neighboring amino acids on subunits II and IV are not conserved, however, so there is no selection on interacting pairs. The third identical residue in subunit VIc, KVIc58, is located in the cytosolic α-helix, directly above the other two residues. As KVIc58 is near no other subunit, these three identical residues together may be important for some ligand interaction.
Subunit VIIa
Three identical sites in subunit VIIa occur at the sharp turn between the transmembrane helix and the NH2-terminal α-helix. These residues are likely to be required to maintain the structure of the subunit and its orientation with respect to other subunits; for example, PVIIa19 is located at the sharp bend in the turn, whereas QVIIa13 forms a polar interaction with the conserved residue TVb14. VVIIa14, which is also conserved, contacts an identical residue on subunit III, EIII60. Less explicable is the identity of LVIIa31, which contacts YIII55 and QIII56. These two residues in subunit III are not conserved in plants and yeasts but are identical within animals, as well as adjacent to WIII57, which is identical across taxa. Therefore, some function may be associated with this region.
Subunit VIIc
Four identical residues in subunit VIIc are paired in Pro-Phe combinations in two locations. One pair occurs in the irregular NH2-terminal domain on the matrix side; the position of PVIIc12 at a bend in this region suggests that this pair is required for the structure. The transmembrane helix of subunit VIIc changes its angle of inclination midway through the helix. The second pair, with PVIIc36, is located at this bend in the helix and is supported by another identical residue, FVIIc33, which again suggests that it is conserved for structural integrity. Other conserved residues at the COOH-terminal end of the transmembrane helix interact with conserved residues in subunit I.
Evidence for Selection on Functional Domains
Because our analysis of sequence conservation across taxa identified many residues that did not correlate with regions of known function, we performed two global analyses to assess whether these conserved residues are likely to be playing a functional or structural role and whether any selective pressures could be influencing the evolution of the enzyme as a whole. We first examined whether similar biochemical classes of amino acids were more likely to be conserved within different taxa. We then explored the spatial location of amino acids conserved across all taxa to determine whether they were positioned in a nonrandom distribution that would support selective constraints based on structure or function.
Groups of amino acids can be classified by the different roles they tend to play within the context of a protein's structure and function (Betts and Russell 2003). Variation in substitution rates of different amino acids can shed light on the selective forces acting on a protein. On the basis of the relative importance of different functional or structural constraints, amino acids that are more likely to be involved in polar interactions (for example, in salt bridges or in ligand interaction) or nonpolar interactions (such as hydrophobic packing or aromatic ring stacking) may be conserved at different frequencies. To determine whether similar functional constraints are acting on different evolutionary lineages, we analyzed within-taxon sequence alignments (human-Drosophila, S. cerevisiae–S. pombe, and Arabidopsis-rice) for similar patterns of amino acid conservation.
To predict the expected distribution of conserved residues given an equal probability of conservation for all amino acids, we calculated the average frequency of amino acid occurrence across all subunits within each lineage. The distribution of amino acids identical within each taxon was significantly different from the average distribution in yeasts and in animals but not in plants (χ² test; yeasts, P = 0.017; animals, P = 0.023; plants, P = 0.639), as might be expected on the basis of the comparatively recent divergence of monocots and eudicots. However, in all taxa, the aromatic amino acids Phe, Tyr, and Trp were overrepresented among identical residues by at least 15% compared with the number expected on the basis of overall amino acid frequency, as were Gly and Cys (table 4). All of these amino acids play key roles in structural stability: the bulky side chains of aromatic residues are often key components of hydrophobic cores and can participate in energetically favorable stacking interactions; cysteines are important for disulfide bonds and for binding metals; and glycine is conformationally flexible and occurs at tight turns in structures (Betts and Russell 2003). This pattern across independently evolving lineages suggests that, in general, hydrophobic amino acids in COX are conserved to maintain the structure of functional sites and possibly to maintain the structure of interfaces between subunits. In contrast, no strong trends are evident across taxa in the conservation of hydrophilic amino acids. However, those hydrophilic residues that are conserved are still likely to play important roles; for example, salt bridges tend to be conserved for specific functional or structural purposes rather than by any general rule based on solvent exposure, proximity to active sites, or location in the protein (Schueler and Margalit 1995).
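The expected-versus-observed comparison described above can be illustrated with a minimal sketch. It assumes that the expected count for each amino acid is the number of identical sites multiplied by that amino acid's overall frequency, and it computes a Pearson chi-square statistic and a simple observed/expected ratio. The function names and the toy sequences are hypothetical and are not taken from table 4; a P value would additionally require the chi-square distribution with the appropriate degrees of freedom.

```python
from collections import Counter


def expected_counts(background_seq, n_identical):
    """Expected identical residues per amino acid, assuming every residue
    type is equally likely to be conserved (count scaled by frequency)."""
    freqs = Counter(background_seq)
    total = sum(freqs.values())
    return {aa: n_identical * count / total for aa, count in freqs.items()}


def chi_square_statistic(observed, expected):
    """Pearson statistic: sum over amino acids of (O - E)^2 / E."""
    return sum((observed.get(aa, 0) - e) ** 2 / e for aa, e in expected.items())


def enrichment(observed, expected):
    """Observed/expected ratio; a value of 1.15 means 15% overrepresented."""
    return {aa: observed.get(aa, 0) / e for aa, e in expected.items()}


# Toy data, invented for illustration only.
background = "FFGGAAAALLLLKKKKEEEE"      # all residues in the toy "subunits"
identical = Counter("FFGGA")             # residues at toy identical sites

exp = expected_counts(background, sum(identical.values()))
stat = chi_square_statistic(identical, exp)
ratios = enrichment(identical, exp)
```

In this toy case Phe and Gly are each observed twice against an expectation of 0.5, so both appear strongly overrepresented, whereas Ala matches its expectation.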
Table 4 Distribution of Identical Amino Acids Within Taxa.
Visual inspection of the COX structure showed that identical and physicochemically conserved residues occur in spatial proximity to each other, which suggests that many of these residues may indeed be positioned to form functional sites (fig. 3A). We explored this observation by calculating whether identical and conserved residues have a tendency to cluster together. Clustering of conserved amino acids would suggest that selection acts to maintain functional or structural interactions between residues.
FIG. 3. Conserved residues are nonrandomly clustered in putative functional domains. (A) Amino acids in subunits Vb and VIIc are shown in space-filling display to illustrate how conserved residues (colored green [Vb] and blue [VIIc]) tend to be located in close proximity to identical residues (colored red [Vb] and orange [VIIc]). (B) A histogram of pairwise distances between the 284 identical and conserved residues in the COX dimer (solid bars) and between an equal number of randomly chosen residues (open bars). Compared with the randomly chosen residues, conserved sites occur more frequently at distances of less than 20 Å from each other.
We plotted the frequency of pairwise minimum atom-atom distances of less than 30 Å for all identical and conserved amino acids as well as for an equal number of randomly selected residues (fig. 3B). There is an excess of conserved amino acids at distances of less than 15 Å from each other; amino acids within this distance are likely to be part of the same structural feature or functional region. This result, along with the overrepresentation of structurally important residues among conserved amino acids, supports the premise that these conserved residues act to maintain the structural integrity of functional sites. Even when the highly conserved subunits Va and VIb are excluded from the analysis, direct contacts (defined as less than 5 Å) still occur at twofold higher frequency among conserved residues than in the randomized data set (data not shown).
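A minimal sketch of how such a comparison might be tabulated is given here. It assumes that pairwise distances (in Å) have already been computed for the conserved set and for a random set, bins them below 30 Å, and reports how much more often distances fall under a contact cutoff in one set than in the other. The 5 Å bin width, the function names, and the toy distance lists are assumptions made for illustration and are not the data behind figure 3B.

```python
def distance_histogram(distances, bin_width=5.0, max_distance=30.0):
    """Count distances in equal-width bins covering [0, max_distance)."""
    n_bins = int(max_distance / bin_width)
    counts = [0] * n_bins
    for d in distances:
        if 0 <= d < max_distance:
            counts[int(d // bin_width)] += 1
    return counts


def fraction_below(distances, cutoff):
    """Fraction of pairwise distances shorter than the cutoff."""
    return sum(1 for d in distances if d < cutoff) / len(distances)


def fold_enrichment(conserved, randomized, cutoff):
    """Ratio of the below-cutoff fractions (conserved over randomized)."""
    return fraction_below(conserved, cutoff) / fraction_below(randomized, cutoff)


# Toy pairwise distances in angstroms, invented for illustration only.
conserved_d = [2.0, 3.5, 4.0, 8.0, 12.0, 22.0, 40.0, 55.0]
random_d = [3.0, 9.0, 18.0, 27.0, 33.0, 41.0, 60.0, 72.0]

hist = distance_histogram(conserved_d)
contact_fold = fold_enrichment(conserved_d, random_d, cutoff=5.0)
```

With real data, the randomized set would typically be resampled many times so that the spread of the null distribution can be judged, a step this sketch leaves out.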
Discussion
What are the roles of the nuclear-encoded subunits in COX? Experimental approaches have revealed many intriguing functions for the nuclear-encoded subunits in model organisms, such as environmental regulation, tissue-specific function, and ATP-dependent allosteric regulation (Schiavo and Bisson 1989; Ewart, Zhang, and Capaldi 1991; Taanman, Turina, and Capaldi 1994). Other studies have employed phylogenetic techniques to measure rates of substitution in the nuclear-encoded subunits to identify residues likely to be functional or coevolving with known interacting proteins or cofactors (Schmidt et al. 2001; Goldberg et al. 2003). However, these previous studies have been confined largely to one organism or to a group of relatively closely related species, which limits our understanding of how widespread these regulatory mechanisms are and of which ones are essential for basic regulation across diverse taxa. Our broader analysis allows us to detect highly conserved sites to identify essential functional domains of the enzyme.
Early in the past decade, the focus on finding functions of the nuclear-encoded subunits shifted to lower eukaryotes, in part because experimental studies of higher eukaryotes were proving difficult (Capaldi et al. 1990). Since then, researchers have begun to recognize the diversity of regulatory mechanisms provided by the nuclear-encoded subunits, which appear to fine-tune the basic redox reaction of COX to different physiologies. With the availability of whole-genome sequences, we have been able to perform a broad search for subunit homologs across taxa that revealed different subunit compositions of the holoenzyme, even within the metazoans. The roles of evolutionarily novel subunits merit closer investigation, as preliminary evidence indicates interesting functional roles. For example, expression of subunit VIIb is depressed in the intestines of mice deficient in gastrointestinal pacemaker cells (Takayama et al. 2001), which suggests a regulatory role consistent with origin along the deuterostome lineage. Subunit VIII evolved before the protostome-deuterostome split, yet subsequent duplication in the vertebrates followed by inactivation of one paralog in humans implies relatively rapid evolution of function (Goldberg et al. 2003); this subunit remains uncharacterized in invertebrates. Plants, in particular, appear to have an almost entirely novel set of nuclear-encoded subunits. Initial characterization of plant subunits Vc and VIb shows differential tissue expression, and the novel divergent paralog of Arabidopsis subunit VIb that was identified in our analysis may reveal new regulatory properties.
The question of how this diversity of nuclear-encoded subunits evolved remains a conundrum. There is evidence for recent transfer of COX genes from the mitochondrial to the nuclear genome. During the evolution of plants, subunit II has been transferred to the nucleus in the legume lineage, with some species maintaining two active copies and other species silencing either the nuclear or the mitochondrial copy (Palmer et al. 2000). This plasticity may be facilitated by the large size of the plant mitochondrial genome. We were unable to find any support for a similar mitochondrial origin of the nuclear-encoded subunits or of any other nuclear genes from which the subunits may have been derived. Therefore, we conclude that they arose in the nuclear genome after the acquisition of mitochondria and have diverged extensively from any ancestral genes.
By aligning sequences of the nuclear-encoded subunits from highly divergent species and mapping identical and physicochemically conserved sites onto the protein structure, we have been able to classify regions of interest of the protein into three broad categories: (1) known functional sites that have diverged among species, (2) known functional sites that are conserved across species, and (3) conserved sites of unknown function. Although we do not speculate further on the third class, we discuss below the most intriguing examples of the first two.
One case of divergence of known functional regions involves the sites at which nucleotides can bind to regulate COX activity. Equilibrium analysis of COX with ATP and ADP has revealed 10 nucleotide binding sites in the bovine monomer (Napiwotzki et al. 1997). Two of these sites, the matrix and cytosolic ATP binding sites on subunit IV, have been studied experimentally in mammals (Napiwotzki et al. 1997; Huttemann, Kadenbach, and Grossman 2001). In yeasts, ATP has also been shown to inhibit COX activity through binding on both the matrix and the cytosolic sides of subunit VIa (Anthony, Reimann, and Kadenbach 1993; Taanman, Turina, and Capaldi 1994; Beauvoit et al. 1999). Our analysis shows that among the locations of these four nucleotide binding sites, only the cytosolic domain of subunit VIa is conserved across animals and yeasts. The others have either arisen independently in each lineage or have diverged through compensatory mutations that preserve the function of the region.
An example of an unpredicted conserved region is the putative binding site for thyroid hormone T2 on subunit Va in mammals (Arnold, Goglia, and Kadenbach 1998). T2 increases COX activity by interfering with allosteric inhibition by ATP. Our analysis reveals that the binding site for T2 is highly conserved across animals and yeasts, a surprising result because insects and yeasts do not produce thyroid hormones. What might be the analogous ligand in these other species? In insects, a likely functional analog is juvenile hormone (JH) (Wheeler and Nijhout 2003); L-3,5,3'-triiodothyronine (T3), an adduct of T2, can act via a putative JH receptor in insect follicle cell membranes (Kim et al. 1999). Presumably a similar analog also acts in yeasts.
On the basis of the proximity of subunits IV and Va, T2 has been proposed to affect COX activity by interacting with the ATP binding site in the matrix domain of subunit IV (Ludwig et al. 2001). Because this ATP binding site does not appear to be conserved across taxa, we argue that this mechanism is unlikely to be a widespread phenomenon, unless a matrix ATP binding site in subunit IV has been independently maintained in different taxa after duplication. Alternatively, an allosteric interaction may occur with the conserved domain in the cytosolic domain of subunit VIa.
In addition to residues in these known binding sites or those with obvious interactions with other subunits or cofactors, our analysis has identified essential structural elements as well as many identical and conserved amino acids whose functions are unclear and are worthy of future study. Some of these structural elements and amino acids may be catalytic residues in uncharacterized active sites. However, our results show that hydrophobic and nonpolar amino acids, which are more likely to be involved in structural scaffolding than in substrate interactions, tend to be conserved at frequencies higher than expected within all taxa. Because these identical and conserved residues are nonrandomly distributed in clusters throughout the protein, we suggest that these groups of interacting amino acids may be conserved to maintain the structural foundation for evolving catalytic sites.
Supplementary Material
The following supplementary material can be found online at the journal's web site:
Multiple sequence alignments in interleaved format for all nuclear-encoded subunits (HTML).
Multiple sequence alignments in sequential format for all nuclear-encoded subunits (text).
Complete list of all identical and conserved residues (tab-delimited text).
PyMOL scripts for the complete structure and all individual subunits with identical and conserved residues highlighted (HTML).
Stereo images of the complete structure and all individual subunits with identical and conserved residues highlighted (PDF).
Acknowledgements
We thank Greg Davis, Alistair McGregor, members of the Stern lab, and three anonymous reviewers for helpful comments. This work was supported by a Howard Hughes Medical Institute Predoctoral Fellowship to J.D. and a David and Lucile Packard Foundation Fellowship and a National Institutes of Health grant to D.L.S.
Literature Cited
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
Andersson, S. G., A. Zomorodipour, J. O. Andersson, T. Sicheritz-Ponten, U. C. Alsmark, R. M. Podowski, A. K. Naslund, A. S. Eriksson, H. H. Winkler, and C. G. Kurland. 1998. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396:133-140.
Anthony, G., A. Reimann, and B. Kadenbach. 1993. Tissue-specific regulation of bovine heart cytochrome-c oxidase activity by ADP via interaction with subunit VIa. Proc. Natl. Acad. Sci. USA 90:1652-1656.
Arnold, S., F. Goglia, and B. Kadenbach. 1998. 3,5-Diiodothyronine binds to subunit Va of cytochrome-c oxidase and abolishes the allosteric inhibition of respiration by ATP. Eur. J. Biochem. 252:325-330.
Beauchemin, A. M., B. Gottlieb, L. K. Beitel, Y. A. Elhaji, L. Pinsky, and M. A. Trifiro. 2001. Cytochrome c oxidase subunit Vb interacts with human androgen receptor: a potential mechanism for neurotoxicity in spinobulbar muscular atrophy. Brain Res. Bull. 56:285-297.
Beauvoit, B., O. Bunoust, B. Guerin, and M. Rigoulet. 1999. ATP regulation of cytochrome oxidase in yeast mitochondria: role of subunit VIa. Eur. J. Biochem. 263:118-127.
Bender, E., and B. Kadenbach. 2000. The allosteric ATP-inhibition of cytochrome c oxidase activity is reversibly switched on by cAMP-dependent phosphorylation. FEBS Lett. 466:130-134.
Berg, O. G., and C. G. Kurland. 2000. Why mitochondrial genes are most often found in nuclei. Mol. Biol. Evol. 17:951-961.
Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. 2000. The protein data bank. Nucleic Acids Res. 28:235-242.
Betts, M. J., and R. B. Russell. 2003. Amino acids properties and consequences of substitutions. Pp. 289–316 in M. R. Barnes, and I. C. Gray, eds. Bioinformatics for geneticists. Wiley, Chichester, UK.
Burke, P. V., D. C. Raitt, L. A. Allen, E. A. Kellogg, and R. O. Poyton. 1997. Effects of oxygen concentration on the expression of cytochrome c and cytochrome c oxidase genes in yeast. J. Biol. Chem. 272:14705-14712.
Capaldi, R. A. 1990. Structure and function of cytochrome c oxidase. Annu. Rev. Biochem. 59:569-596.
Capaldi, R. A., Y. Z. Zhang, R. Rizzuto, D. Sandona, G. Schiavo, and R. Bisson. 1990. The two oxygen-regulated subunits of cytochrome c oxidase in Dictyostelium discoideum derive from a common ancestor. FEBS Lett. 261:158-160.
DeLano, W. L. 2002. The PyMOL molecular graphics system. DeLano Scientific, San Carlos, CA, USA.
Edmands, S., and R. Burton. 1998. Variation in cytochrome-c oxidase activity is not maternally inherited in the copepod Tigriopus californicus. Heredity 80:668-674.
Ewart, G. D., Y. Z. Zhang, and R. A. Capaldi. 1991. Switching of bovine cytochrome c oxidase subunit VIa isoforms in skeletal muscle during development. FEBS Lett. 292:79-84.
Geier, B. M., H. Schagger, C. Ortwein, T. A. Link, W. R. Hagen, U. Brandt, and G. Von Jagow. 1995. Kinetic properties and ligand binding of the eleven-subunit cytochrome-c oxidase from Saccharomyces cerevisiae isolated with a novel large-scale purification method. Eur. J. Biochem. 227:296-302.
Goldberg, A., D. E. Wildman, T. R. Schmidt, M. Huttemann, M. Goodman, M. L. Weiss, and L. I. Grossman. 2003. Adaptive evolution of cytochrome c oxidase subunit VIII in anthropoid primates. Proc. Natl. Acad. Sci. USA 100:5873-5878.
Golding, G. B., and A. M. Dean. 1998. The structural basis of molecular adaptation. Mol. Biol. Evol. 15:355-369.
Hamanaka, S., K. Ohtsu, K. Kadowaki, M. Nakazono, and A. Hirai. 1999. Identification of cDNA encoding cytochrome c oxidase subunit 5c (COX5c) from rice: comparison of its expression with nuclear-encoded and mitochondrial-encoded COX genes. Genes Genet. Syst. 74:71-75.
Huttemann, M., S. Jaradat, and L. I. Grossman. 2003. Cytochrome c oxidase of mammals contains a testes-specific isoform of subunit VIb—the counterpart to testes-specific cytochrome c? Mol. Reprod. Dev. 66:8-16.
Huttemann, M., B. Kadenbach, and L. I. Grossman. 2001. Mammalian subunit IV isoforms of cytochrome c oxidase. Gene 267:111-123.
Jansch, L., V. Kruft, U. K. Schmitz, and H. P. Braun. 1996. New insights into the composition, molecular mass and stoichiometry of the protein complexes of plant mitochondria. Plant J. 9:357-368.
Kenyon, L., and C. T. Moraes. 1997. Expanding the functional human mitochondrial DNA database by the establishment of primate xenomitochondrial cybrids. Proc. Natl. Acad. Sci. USA 94:9131-9135.
Kim, Y., E. D. Davari, V. Sevala, and K. G. Davey. 1999. Functional binding of a vertebrate hormone, L-3,5,3'-triiodothyronine (T3), on insect follicle cell membranes. Insect Biochem. Mol. Biol. 29:943-950.
Lang, B. F., G. Burger, C. J. O'Kelly, R. Cedergren, G. B. Golding, C. Lemieux, D. Sankoff, M. Turmel, and M. W. Gray. 1997. An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature 387:493-497.
Laskowski, R. A., M. W. MacArthur, D. S. Moss, and J. M. Thornton. 1993. PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 26:283-291.
Ludwig, B., E. Bender, S. Arnold, M. Huttemann, I. Lee, and B. Kadenbach. 2001. Cytochrome c oxidase and the regulation of oxidative phosphorylation. Chem. Biochem. 2:392-403.
Matthews, J. M., and M. Sunde. 2002. Zinc fingers–folds for many occasions. IUBMB Life 54:351-355.
Napiwotzki, J., K. Shinzawa-Itoh, S. Yoshikawa, and B. Kadenbach. 1997. ATP and ADP bind to cytochrome c oxidase and regulate its activity. Biol. Chem. 378:1013-1021.
Ohtsu, K., M. Nakazono, N. Tsutsumi, and A. Hirai. 2001. Characterization and expression of the genes for cytochrome c oxidase subunit VIb (COX6b) from rice and Arabidopsis thaliana. Gene 264:233-239.
Palmer, J. D., K. L. Adams, Y. Cho, C. L. Parkinson, Y. L. Qiu, and K. Song. 2000. Dynamic evolution of plant mitochondrial genomes: mobile genes and introns and highly variable mutation rates. Proc. Natl. Acad. Sci. USA 97:6960-6966.
Poyton, R. O., and J. E. McEwen. 1996. Crosstalk between nuclear and mitochondrial genomes. Annu. Rev. Biochem. 65:563-607.
Reimann, A., F. J. Huther, J. A. Berden, and B. Kadenbach. 1988. Anions induce conformational changes and influence the activity and photoaffinity-labelling by 8-azido-ATP of isolated cytochrome c oxidase. Biochem. J. 254:723-730.
Ruitenberg, M., A. Kannt, E. Bamberg, K. Fendler, and H. Michel. 2002. Reduction of cytochrome c oxidase by a second electron leads to proton translocation. Nature 417:99-102.
Russell, R. B., and G. J. Barton. 1994. Structural features can be unconserved in proteins with similar folds: an analysis of side-chain to side-chain contacts secondary structure and accessibility. J. Mol. Biol. 244:332-350.
Sackton, T. B., R. A. Haney, and D. M. Rand. 2003. Cytonuclear coadaptation in Drosophila: disruption of cytochrome c oxidase activity in backcross genotypes. Evolution 57:2315-2325.
Schiavo, G., and R. Bisson. 1989. Oxygen influences the subunit structure of cytochrome c oxidase in the slime mold Dictyostelium discoideum. J. Biol. Chem. 264:7129-7134.
Schmidt, T. R., M. Goodman, and L. I. Grossman. 1999. Molecular evolution of the COX7A gene family in primates. Mol. Biol. Evol. 16:619-626.
Schmidt, T. R., W. Wu, M. Goodman, and L. I. Grossman. 2001. Evolution of nuclear- and mitochondrial-encoded subunit interaction in cytochrome c oxidase. Mol. Biol. Evol. 18:563-569.
Schueler, O., and H. Margalit. 1995. Conservation of salt bridges in protein families. J. Mol. Biol. 248:125-135.
Szuplewski, S., and R. Terracol. 2001. The cyclope gene of Drosophila encodes a cytochrome c oxidase subunit VIc homolog. Genetics 158:1629-1643.
Taanman, J. W., P. Turina, and R. A. Capaldi. 1994. Regulation of cytochrome c oxidase by interaction of ATP at two binding sites, one on subunit VIa. Biochemistry 33:11833-11841.
Takayama, I., Y. Diago, S. M. Ward, K. M. Sanders, T. Yamanaka, and M. A. Fujino. 2001. Differential gene expression in the small intestines of wildtype and W/W-V mice. Neurogastroenterol. Motil. 13:163-168.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Tsukihara, T., H. Aoyama, E. Yamashita, T. Tomizaki, H. Yamaguchi, K. Shinzawa-Itoh, R. Nakashima, R. Yaono, and S. Yoshikawa. 1996. The whole structure of the 13-subunit oxidized cytochrome c oxidase at 2.8 angstrom. Science 272:1136-1144.
Tsukihara, T., K. Shimokata, and Y. Katayama, et al. (12 co-authors). 2003. The low-spin heme of cytochrome c oxidase as the driving element of the proton-pumping process. Proc. Natl. Acad. Sci. USA 100:15304-15309.
Welchen, E., R. L. Chan, and D. H. Gonzalez. 2002. Metabolic regulation of genes encoding cytochrome c and cytochrome c oxidase subunit Vb in Arabidopsis. Plant Cell Environ. 25:1605-1615.
Wheeler, D. E., and H. F. Nijhout. 2003. A perspective for understanding the modes of juvenile hormone action as a lipid signaling system. Bioessays 25:994-1001.
Yang, W. L., L. Iacono, W. M. Tang, and K. V. Chin. 1998. Novel function of the regulatory subunit of protein kinase A: regulation of cytochrome c oxidase activity and cytochrome c release. Biochemistry 37:14175-14180.
(Jayatri Das*, Stephen T. )
Introduction
Rates of amino acid substitution are dependent upon both structural and functional constraints acting at specific sites. Amino acid replacements that disrupt protein folding or functional interactions are eliminated by selection, leading to lower rates of substitution at those sites. Thus, phylogenetic comparisons can reveal conserved sites that are likely to be critical for protein function. Integrating such phylogenetic analyses with structural, biochemical, and physiological information provides a meaningful context for studying adaptive evolution and for generating new hypotheses about function (Golding and Dean 1998). Here, we apply this integrative approach to cytochrome c oxidase (COX), the terminal enzyme in the electron transport chain of oxidative phosphorylation.
COX, which catalyzes the transfer of electrons from reduced cytochrome c to molecular oxygen, is the primary determinant of cellular oxygen consumption and is thought to play a key role in regulating energy production (Poyton and McEwen 1996). COX is a multisubunit transmembrane protein located in the inner membrane of the mitochondrion. In mammals, it is composed of 13 subunits, three of which are encoded by mitochondrial DNA, and the remaining 10 are encoded in the nucleus. The three large mitochondrial-encoded subunits compose the catalytic core of the enzyme, which contains the reaction centers. The smaller nuclear-encoded subunits are arranged around the perimeter of the core enzyme (Tsukihara et al. 1996). The mitochondrial-encoded subunits are necessary and sufficient to carry out both electron transfer and proton-pumping functions (Capaldi 1990), although numerous studies of bacterial and bovine COX (Ruitenberg et al. 2002; Tsukihara et al. 2003) have yet to fully determine the mechanisms by which these processes occur.
The roles of the nuclear-encoded subunits have only recently begun to be elucidated. Experimental studies suggest that the primary function of the nuclear-encoded subunits is to regulate COX activity. For example, paralogous isoforms of subunit VII in Dictyostelium discoideum and subunit V in Saccharomyces cerevisiae are differentially expressed in response to oxygen concentration (Schiavo and Bisson 1989; Burke et al. 1997). In vertebrates, allosteric interactions that involve subunits IV and VIa are thought to mediate a phosphorylation-dependent mechanism of COX regulation (Ludwig et al. 2001). Some of the nuclear-encoded subunits are important for development and physiology of multicellular organisms, as shown by lethality and pleiotropic phenotypes caused by mutations in subunit VIc in Drosophila (Szuplewski and Terracol 2001).
There is also evidence for coevolution of the mitochondrial-encoded and nuclear-encoded subunits. When mitochondrial DNA is placed in a foreign nuclear background, the resulting disruption in mitochondrial activity suggests that functional interactions between the different subunits have coevolved. This finding has been shown in backcrosses between different intraspecific populations of the copepod Tigriopus californicus (Edmands and Burton 1998), in backcrosses between sister species of Drosophila (Sackton, Haney, and Rand 2003), and in xenomitochondrial cybrids of primate cell lines (Kenyon and Moraes 1997). In support of these experimental studies, analysis of evolutionary rates of mammalian COX subunits has shown that residues on nuclear-encoded subunits evolve more slowly when in close proximity to mitochondrial-encoded subunits, whereas the rapid evolution of mitochondrial DNA allows optimizing interactions with residues on nuclear-encoded subunits (Schmidt et al. 2001).
Because organisms have evolved to allocate their energetic resources differently based on their specific physiology and environment, understanding the regulation of COX will shed light on how this protein complex may have evolved to accommodate metabolic adaptation. Here, we describe the variation in COX subunit composition across diverse taxa and show that the nuclear-encoded subunits likely evolved subsequent to the origin of mitochondria. We use multiple sequence alignments to predict essential functional sites in these subunits and map these conserved sites onto the crystal structure of bovine COX. We conclude that these sites are structurally organized into functional regions, some of which correlate with known sites of interaction, whereas others are novel regions that warrant future experimental study.
Materials and Methods
Protein sequences for bovine COX nuclear-encoded subunits were obtained from the Entrez Protein sequence database (http://www.ncbi.nlm.nih.gov/Entrez/). Sequences for homologs in other species were obtained by use of either PSI-Blast to search the unrestricted NCBI database (Altschul et al. 1997) or protein-protein Blast programs to search organism-specific databases (Drosophila melanogaster, http://www.fruitfly.org/blast/index.html; Anopheles gambiae, http://www.ensembl.org/Anopheles_gambiae/blastview; Caenorhabditis elegans, http://www.wormbase.org/db/searches/blast; Arabidopsis thaliana, http://www.arabidopsis.org/Blast/; Saccharomyces cerevisiae, http://genome-www2.stanford.edu/cgi-bin/SGD/nph-blast2sgd; Schizosaccharomyces pombe, http://www.genedb.org/genedb/pombe/blast.sp). Whole-genome sequence trace archives were also searched when available (http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?). Default parameters were used for all database searches. The crystal structure of bovine cytochrome c oxidase was obtained from the Protein Data Bank (PDB ID 1OCC; http://www.rcsb.org/pdb/, Berman et al. 2000).
Homologous sequences were aligned by application of the ClustalW multiple sequence alignment program (Thompson, Higgins, and Gibson 1994) with default parameters. All alignments are included in Supplementary Material online. Residues identified in each alignment as being identical or physicochemically conserved across at least two taxa were mapped onto the crystal structure by the PyMOL molecular graphics system (DeLano 2002). PyMOL scripts for real-time visualization of identical and conserved residues on the COX structure are also included in Supplementary Material online.
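As an illustration of how alignment columns might be labeled as identical or physicochemically conserved, a minimal sketch follows. The residue groupings are an assumption made for this example, because the exact classes are not listed here, and the sketch applies the simpler criterion of requiring agreement across all aligned sequences rather than across at least two taxa. Function names and the toy alignment are hypothetical.

```python
# Assumed physicochemical classes (illustrative; not the paper's definition).
GROUPS = [
    set("AVLIMC"),  # aliphatic / nonpolar
    set("FWY"),     # aromatic
    set("STNQ"),    # polar uncharged
    set("KRH"),     # basic
    set("DE"),      # acidic
    set("G"),
    set("P"),
]


def classify_column(column):
    """Label one alignment column as identical, conserved, or variable."""
    residues = set(column)
    if "-" in residues:
        return "variable"            # any gap makes the column variable
    if len(residues) == 1:
        return "identical"
    if any(residues <= group for group in GROUPS):
        return "conserved"           # all residues fall in one class
    return "variable"


def classify_alignment(sequences):
    """Classify every column of equal-length aligned sequences."""
    return [classify_column(col) for col in zip(*sequences)]


# Toy alignment, invented for illustration only.
toy_alignment = ["FKD-A", "FRE-V", "FKDGL"]
labels = classify_alignment(toy_alignment)
```

The positions labeled identical or conserved could then be translated into residue numbers of the reference structure before being highlighted in a molecular viewer.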
A program was written to evaluate the spatial distribution of conserved and identical residues. Distances were calculated between all pairs of residues on the list of conserved and identical amino acids described above. Here, distance is defined as the smallest interatomic distance between any two atoms in a pair of amino acids. For a null set, a list of randomly selected residues (and their dimer symmetry-related equivalents) of equal length was generated and distances calculated as above.
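The original program is not reproduced here, but the distance definition and the null set can be sketched as follows. The sketch assumes that each residue is stored as a list of atomic (x, y, z) coordinates keyed by chain and residue number, and it omits the addition of dimer symmetry-related equivalents for brevity. All names and coordinates are hypothetical.

```python
import math
import random
from itertools import combinations


def min_interatomic_distance(atoms_a, atoms_b):
    """Smallest distance between any atom of one residue and any atom of another."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def pairwise_min_distances(residues, keys):
    """Minimum interatomic distance for every pair of the listed residues."""
    return [
        min_interatomic_distance(residues[i], residues[j])
        for i, j in combinations(keys, 2)
    ]


def random_null_set(residues, size, seed=0):
    """Randomly chosen residue keys, equal in number to the conserved list."""
    rng = random.Random(seed)
    return rng.sample(sorted(residues), size)


# Toy structure, invented for illustration: (chain, residue number) -> atoms.
toy_residues = {
    ("A", 1): [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    ("A", 2): [(4.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    ("B", 7): [(0.0, 3.0, 4.0)],
}

toy_distances = pairwise_min_distances(toy_residues, sorted(toy_residues))
toy_null = random_null_set(toy_residues, size=2)
```

Fixing the seed keeps the random selection reproducible; in practice the coordinates would be read from the PDB entry rather than typed in by hand.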
Results
Nomenclature of COX Nuclear-Encoded Subunits
The subunit composition of COX in vertebrates, plants, and yeasts has been determined primarily by resolution of the purified protein by polyacrylamide gel electrophoresis (Capaldi 1990; Geier et al. 1995; Jansch et al. 1996). Nomenclature of the individual subunits in each taxon is derived from their order of molecular mass, rather than from any sequence similarity between subunits. Variation in the size of homologous subunits has, therefore, resulted occasionally in conflicting nomenclature between taxa. Tables 1, 2, and 3 provide compilations of the parallel nomenclature of homologous subunits. We also list database accession numbers for homologous sequences found in other organisms with completed genomes. Because we have mapped conserved sites onto the bovine crystal structure, we refer to the nuclear-encoded subunits after the vertebrate nomenclature. There is no detectable sequence similarity between different nuclear-encoded subunits, nor do they appear to share any significant structural similarity beyond transmembrane helices. Although subunits IV, VIa, VIc, VIIa, VIIb, VIIc, and VIII of bovine COX are all classified under a folding pattern of a single transmembrane helix in the SCOP database (http://scop.mrc-lmb.cam.ac.uk/scop/), each subunit is subcategorized as a unique structural superfamily. The globular subunits, Va, Vb, and VIb, also do not share a fold classification. The lack of either sequence or structural similarity between different subunits suggests that none of them arose through duplication from another nuclear-encoded subunit.
Table 1 Nomenclature of Nuclear-Encoded COX Subunits IV, Va, and Vb Across Different Taxa.
Table 2 Nomenclature of Nuclear-Encoded COX Subunits VIa, VIb, and VIc Across Different Taxa.
Table 3 Nomenclature of Nuclear-Encoded COX Subunits VIIa, VIIb, VIIc, and VIII Across Different Taxa.
Comparison of COX Subunit Composition Across Taxa
Although the three mitochondrial-encoded subunits are present in all organisms that contain COX, the composition of the holoenzyme varies between taxa. Determination of the crystal structure of bovine COX (Tsukihara et al. 1996) clarified the composition of the vertebrate holoenzyme. Recent completion of whole-genome sequencing projects has now made it possible to use the vertebrate sequences to search for homologs in all eukaryotic kingdoms.
Ludwig and colleagues have postulated that the number of subunits is correlated with the regulatory complexity of the enzyme, as suggested by the presence of four subunits in the bacterium Paracoccus denitrificans, seven in Dictyostelium discoideum, 11 in S. cerevisiae, and 13 in Bos taurus (Ludwig et al. 2001). Our analysis suggests that this trend is true within the metazoans as well. Mammalian subunit VIIb has no homolog in insects (Szuplewski and Terracol 2001) or in nematodes, and Caenorhabditis elegans lacks genes homologous to mammalian subunits VIc, VIIa, and VIII as well (tables 2 and 3). However, failure to find homologs in these other species may be the result of incomplete sequencing or annotation of the genome (as is likely with subunit VIIc in Anopheles gambiae because this subunit is present in both D. melanogaster and C. elegans but not in Anopheles).
The composition of COX in plants provides further evidence that nuclear-encoded subunits may have evolved for taxon-specific regulatory functions. In plants, electrophoresis has shown that COX is composed of at least seven nuclear-encoded subunits (Jansch et al. 1996). However, only two vertebrate nuclear-encoded subunits (Vb and VIb) have orthologs in the Arabidopsis thaliana genome, which suggests that the remaining subunits are unique to plants. Of these novel subunits, only one, termed 5c, has been characterized (Hamanaka et al. 1999). Our Blast search of the A. thaliana genome also found an uncharacterized fourth paralog of subunit VIb that was not identified in a previous Southern hybridization screen (Ohtsu et al. 2001). This omission is likely the result of significant divergence from the other three paralogs in the region of the probe used for the screen, which included most of the coding region of AtCOX6b2.
Origin and Evolution of Nuclear-Encoded COX Subunits
Like the nuclear-encoded COX subunits, the majority of genes required for mitochondrial function are encoded by the nuclear genome. Some of these genes originated in the ancestral proteobacterium with subsequent transfer to the host nucleus, whereas others evolved within the eukaryotic nuclear genome (Berg and Kurland 2000). To explore the evolutionary origin of the nuclear-encoded COX subunits, we tested the first possibility by searching for homologous sequences in the α-proteobacterium Rickettsia prowazekii, the prokaryote thought to be most closely related to ancestral mitochondria (Andersson et al. 1998), and the mitochondrial genome of the protozoan Reclinomonas americana, which contains the largest number of genes of any mitochondrion studied (Lang et al. 1997). A Blast search failed to find homologs of any nuclear-encoded subunits in Rickettsia, which suggests that these proteins were not present in the ancestral endosymbiont. The Reclinomonas mitochondrial genome, although containing COX subunits I, II, and III as well as a homolog of a nuclear-encoded COX assembly protein, also does not contain any genes similar to the other nuclear-encoded subunits (Lang et al. 1997). Together, these findings suggest that the nuclear subunits did not originate in the mitochondrial genome.
To determine when the COX subunits may have arisen in the eukaryotic genome, we searched for homologs in the genome of the protozoan Giardia lamblia, an ancient amitochondriate eukaryote that diverged close to when eukaryotes and prokaryotes split. Blast searches failed to reveal any putative homologs of nuclear-encoded COX subunits in Giardia, which suggests that these genes evolved in eukaryotes after the invasion of the ancestral endosymbiont. To attempt to find other genes in the eukaryotic genome from which the nuclear-encoded subunits may have been derived, we performed PSI-Blast searches with each of the nuclear-encoded subunits. Two iterations of each search failed to identify any genes with significant homology, which suggests extensive divergence from ancestral genes.
Although the evolutionary origin of the nuclear-encoded subunits remains a mystery, they have subsequently evolved specialized functions. ClustalW sequence alignments show that each subunit is orthologous across all taxa in which it is found (data not shown), but the presence of multiple paralogs within taxa suggests evolution of regulatory functions. For example, as previously discussed, alternative paralogs are expressed under normal or hypoxic conditions in both D. discoideum and S. cerevisiae (Schiavo and Bisson 1989). In vertebrates, multiple subunits have paralogs that are expressed either ubiquitously (L, liver-type) or primarily in mature contractile muscles (H, heart-type) (Ewart, Zhang, and Capaldi 1991), whereas paralogs of subunit VIb show differential tissue expression in plants and mammals (Ohtsu et al. 2001; Huttemann, Jaradat, and Grossman 2003). Similar functional differences may explain the existence of other uncharacterized paralogs found in insects and plants (tables 1–3). There also is evidence for adaptive evolution of individual paralogs after duplication, as shown by the inactivation of subunit VIIIH in Old World monkeys and hominids, possibly linked to optimization of aerobic energy metabolism (Goldberg et al. 2003).
Conservation of COX Subunits and Implications for Function
Although limited comparative analyses within taxa have been undertaken to identify evolutionary patterns in COX (Schmidt, Goodman, and Grossman 1999; Ludwig et al. 2001), a broader analysis across diverse taxa increases the power to detect specific sites that are essential for basic enzyme function and regulation. To identify regions of potential functional importance, we aligned sequences from all taxa in tables 1, 2, and 3, as well as the bovine sequence for each subunit. Sequences of all known isoforms within a species were included in the alignment. Subunits present in only one taxon (Vc in plants, VIII in animals, and VIIb in humans only) were excluded from the analysis because they are not essential for function across taxa. For each subunit, residues were classified as "identical," "conserved," or "nonconserved," on the basis of the level of physicochemical conservation determined by a ClustalW sequence alignment across all taxa in which the subunit is present.
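The three-way classification of alignment columns can be illustrated with a short sketch. The physicochemical groups below are an assumption chosen for illustration (they follow the "strong" conservation groups commonly associated with ClustalW output, not a list given in this article), and the function names and toy sequences are hypothetical. The percentage shown counts identical sites as conserved, which is also an assumption.

```python
# Assumed physicochemical groups; not specified in the article.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]


def classify_column(column):
    """Label one alignment column as identical, conserved, or nonconserved."""
    residues = set(column)
    if "-" in residues:
        return "nonconserved"  # a gap in any sequence breaks conservation
    if len(residues) == 1:
        return "identical"
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return "conserved"  # all residues fall within one group
    return "nonconserved"


def classify_alignment(sequences):
    """Classify every column of an alignment of equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [classify_column(col) for col in zip(*sequences)]


def percent_conserved(labels):
    """Percentage of columns that are identical or conserved."""
    hits = sum(label != "nonconserved" for label in labels)
    return 100.0 * hits / len(labels)


# Toy alignment of three hypothetical sequences.
alignment = ["MKFLW-", "MRFIWG", "MKYVWA"]
labels = classify_alignment(alignment)
share = percent_conserved(labels)
```

In this toy alignment the first and fifth columns are identical, the second through fourth are conserved, and the gapped final column is nonconserved.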
As expected, the three mitochondrial-encoded subunits that form the core enzyme are conserved across diverse species at comparatively high levels that range from 36% to 63%. Subunit Va, which is 39% conserved and is found in animals and yeasts, is much more conserved than any of the other nuclear-encoded subunits. The high level of conservation suggests that subunit Va has been subject to stronger purifying selection. Supporting this implication of functional constraint is experimental evidence for ligand interaction that involves subunit Va (Arnold, Goglia, and Kadenbach 1998). The evidence is discussed in more detail in the following sections. Subunits VIa and VIb are conserved at moderate levels (22% and 24%, respectively). Subunit VIb is one of the two nuclear-encoded subunits shared between animals, yeasts, and plants and, based on studies of the crystal structure (Tsukihara et al. 1996), it appears to play a key role in intermonomer contact, whereas subunit VIa contains a known interaction site with ATP (Taanman, Turina, and Capaldi 1994). In contrast, subunit VIIc is only 16% conserved and has no characterized function.
A low proportion of conserved amino acids, however, does not necessarily indicate a lack of function. Subunit IV is proportionately the least conserved of the nuclear-encoded subunits (9%). Despite the low level of conservation between the animal and yeast subunit IV sequences, we concur with previous authors (Huttemann, Kadenbach, and Grossman 2001) that these subunits are homologs, albeit highly divergent, on the basis of two results from our sequence analysis. First, in a search of the NCBI Conserved Domain Database, the yeast sequences show significant homology to COX IV domains. Second, in a neighbor-joining tree of all COX nuclear-encoded subunits from all species (unpublished data), the yeast sequences cluster most closely in a monophyletic group with the subunit IV sequences of animals, in contrast with the deep divergence between monophyletic groups of any two other subunits.
Identification of Functional Domains by Comparative Structural Analysis
To detect specific functional sites in the nuclear-encoded subunits, we mapped identical and conserved amino acids onto the crystal structure of bovine COX. We then assessed whether these sites correlated with residues of known function on the basis of biochemical data or whether they might be novel sites of unknown function and, thus, targets for future experimental investigation. The crystal structure of bovine COX in dimeric form (Tsukihara et al. 1996) is shown in figure 1; all identical residues in the nuclear-encoded subunits are highlighted. (All sequence alignments, a complete list of identical and conserved residues, and stereo images of each subunit with identical and conserved residues highlighted are included in Supplementary Material online.) Specific residues are referred to by their nomenclature in accordance with the bovine sequence in the PDB structure.
FIG. 1. Identical residues highlighted on the bovine heart COX structure. Residues discussed in the text are numbered according to their position in the bovine structure. Nuclear subunits are labeled on the right monomer; hypothetical binding sites for ATP/ADP (in yeast and/or mammals), T2 (in mammals), human androgen receptor (hAR, in mammals), the regulatory subunit of protein kinase A (RI, in mammals), and Na+/Ca2+ (in mammals), as well as the location of the zinc ion, are indicated on the left. Mitochondrial subunits are in gray; indicated membrane boundaries are approximate
Below, we discuss the structural context and putative functional significance of the identical residues in each subunit. In the following analysis, we define amino acids to be in contact if the minimum atom-atom distance between their side chains is less than 5 Å (Russell and Barton 1994).
Subunit IV
Subunit IV is the least conserved of the nuclear-encoded subunits. The identical and conserved residues fall into two regions of the subunit: one cluster in the matrix domain and five residues in the cytosolic domain. In the matrix domain, WIV48, an identical site, contacts other identical residues in a highly conserved region of subunit Va and is likely to be part of the same functional region (fig. 2). Identical site EIV55 appears to be important for maintaining a helix-turn-helix structure through its contacts with three other conserved residues (LIV40, KIV43, and LIV51) on the adjacent helix and loop. The charged side chains of EIV55 and KIV43 are only 3.0 Å apart and well positioned to form a stabilizing salt bridge. This fold allows WIV48 to be positioned to extend into a highly conserved domain of subunit Va. Other conserved residues in the matrix domain also maintain contact with subunit Va.
FIG. 2. Interaction of conserved regions of nuclear-encoded subunits IV and Va. Wall-eye stereo view of matrix domains of subunits IV (red) and Va (blue). Identical and conserved residues are labeled and shown in ball-and-stick models. Dashed line indicates a putative salt bridge between KIV43 and EIV55, whose side chains are 3.0 Å apart. The contacts between LIV40, KIV43, LIV51, and EIV55 maintain the helix-turn-helix structure in subunit IV that positions WIV48 to extend into a highly conserved region of subunit Va. This region corresponds to part of the likely binding site for thyroid hormone T2.
In mammals, two ATP binding sites in subunit IV have been proposed. One binding site has been modeled in the matrix domain and involves residues RIV20, RIV73, TIV75, EIV77, and WIV78, as well as residues from subunits I and II (Huttemann, Kadenbach, and Grossman 2001). None of these residues in subunit IV are conserved across taxa. The second ATP binding site in mammals is located in the cytosolic domain (Reimann et al. 1988), but the putative function of the conserved cytosolic residues in subunit IV is not clear. The sole identical residue in that region, GIV133, is likely conserved not because of interaction with other residues or ligands, but rather to provide the structural flexibility needed for the sharp bend in the backbone chain in that region. Indeed, analysis of the geometry of this residue by application of PROCHECK (Laskowski et al. 1993) shows that the C-N-C angle of GIV133 deviates from the ideal angle of 112.5° by 16.7°, the largest such deviation in the COX structure by some 20%.
Subunit Va
The five α-helices of subunit Va form a right-handed superhelix (Tsukihara et al. 1996), a structure that probably is maintained across animals and yeasts, as shown by the distribution of identical and conserved residues throughout the protein. However, the majority of the identical residues are clustered in the three α-helices formed by the amino acid sequence between PVa43 and GVa97. This region, which includes a number of charged residues, is 48% conserved (28% identical) and forms an exposed pocket that is also bordered by WIV48 (fig. 2). Subunit Va has been shown to bind thyroid hormone 3,5-diiodothyronine (T2) (Arnold, Goglia, and Kadenbach 1998), and we speculate that this pocket may be the T2 binding site.
Subunit Vb
The COOH-terminal of subunit Vb contains a zinc site that, in animals, involves four cysteine residues structurally arranged in a classical zinc finger motif (Tsukihara et al. 1996). Whereas three of these cysteines are identical across all taxa, CVb62 is substituted with a glycine residue in plants (Welchen, Chan, and Gonzalez 2002), as well as in yeasts and in D. discoideum. However, the sequences for subunit Vb in Arabidopsis, rice, S. cerevisiae, S. pombe, and D. discoideum contain a histidine residue at position 67. This histidine may fulfill the same function as CVb62 in animals, because histidine residues can also coordinate zinc ions in classical zinc finger domains (Matthews and Sunde 2002). We predict that this evolutionary change in animals would likely require a change in the conformation of the backbone to properly position the cysteine to coordinate the zinc ion. Identical residues SVb51 and RVb56 may be critical residues for attachment to subunit I, supported by several other conserved residues contacting core subunits I and III. Also, the β-barrel structure of the COOH-terminal domain (Tsukihara et al. 1996) may be maintained across taxa, on the basis of conservation of three residues in contact on the inner surface (VVb49, VVb58, and YVb89) that appear to stabilize the structure.
In mammals, subunit Vb interacts with the regulatory subunit of protein kinase A. This action inhibits COX activity in a cAMP-dependent process (Yang et al. 1998; Bender and Kadenbach 2000). In addition, in vitro studies have shown that subunit Vb, primarily through the COOH-terminal, binds the human androgen receptor (hAR), although the effect of hAR on COX activity is unknown (Beauchemin et al. 2001). Although these putative regulatory mechanisms have not been tested across taxa, conservation of similar interactions may underlie the maintenance of functional and structural domains of subunit Vb.
Subunit VIa
The primary conserved region of subunit VIa is a loop structure on the cytosolic side of the enzyme. This region correlates with a putative ATP binding site at residues 63 to 68, previously identified by more limited sequence comparison and based on similarity to other ATP binding–site motifs (Taanman, Turina, and Capaldi 1994). The GDGXX(T/S) motif is conserved in all taxa we analyzed, with slight modifications: GS at position 1 in C. elegans and DE at position 2 in A. gambiae and in D. melanogaster CG17280. Our analysis identified several other residues that may also be part of the binding site or that may help to stabilize the structure. RVIa56, KVIa58, WVIa62, FVIa70, and NVIa72 are identical amino acids that are found on either side of the ATP binding motif and that contact subunits I or III.
Subunit VIa is also adjacent to subunit VIb near the dimer interface. YVIa50 is an identical residue that contacts residues GVIb79 (identical) and FVIb81 (conserved) in subunit VIb, which suggests that it is critical for the interface of the two subunits. Of the 10 residues at the NH2-terminal of subunit VIa that contact subunit I of the opposite monomer, only AVIa4 is conserved. This arrangement indicates some variation in how the intermonomer contact in that region is maintained. Finally, six identical and conserved residues are distributed throughout the transmembrane helix of subunit VIa. Although functional studies in yeast and mammals suggest a second ATP binding site on the matrix side of VIa (Anthony, Reimann, and Kadenbach 1993; Beauvoit et al. 1999), the orientation of these conserved residues along the helix does not make them obvious candidates for a putative binding site. Instead, they may be more important for maintaining the structure of the helix or contact with subunit III.
Subunit VIb
Subunit VIb is composed of three α-helices connected by relatively tight turns. VIb is the only subunit that is situated completely on the side of the intermembrane space and is the primary nuclear subunit responsible for intermonomer contact. The importance of this role is reflected in the high level of sequence conservation across taxa. The region of contact is from residues 39 to 53, which forms the turn between two helices that extend into the intermembrane space (Tsukihara et al. 1996). GVIb47 is identical across all taxa, perhaps because of the constraints of the tight structure of the turn. The two extending helices are maintained in a near antiparallel geometry by two disulfide bridges between CVIb29 to CVIb64 and CVIb39 to CVIb53, all of which are identical. Several other identical and conserved residues likely preserve the orientation of the third helix as well, whereas conserved COOH-terminal amino acids contact subunit VIa.
Although the amino acids that are essential for the conformation of this domain are all identical, the intervening residues vary between species. For example, mammals have three extra amino acids in the loop at positions 42 to 44, and both the testis-specific isoform in mammals (Huttemann, Jaradat, and Grossman 2003) and the novel isoform in Arabidopsis are highly divergent from the common isoforms of their respective species in this region. Such lineage-specific and tissue-specific diversity suggests that this region may be evolving for some physiological function.
An interaction site for cytochrome c has been proposed on the cytosolic side of the enzyme. The proposed interaction involves acidic residues from subunits I, II, III, and VIb (Tsukihara et al. 1996). Of the amino acids from subunit VIb that are hypothesized to interact with cytochrome c, DVIb74 is conserved across taxa, whereas EVIb78 is not. Because DVIb74 is the only conserved residue on the external surface of the third helix, our analysis supports an important role for this amino acid in interacting with cytochrome c. There are also three identical sites in the NH2-terminal region. Although these residues (19 to 21) may merely stabilize the conformation of the subunit, they are on the margin of an open pocket on the cytosolic surface. This region may have functional significance because it is bordered by identical residues from subunits I and III as well.
Subunit VIc
The identical and conserved sites in subunit VIc do not correlate with any known function. Two identical sites, FVIc50 and YVIc51, are located at the cytosolic end of the transmembrane helix and contact subunits II and IV. The neighboring amino acids on subunits II and IV are not conserved, however, so there is no selection on interacting pairs. The third identical residue in subunit VIc, KVIc58, is located in the cytosolic α-helix, directly above the other two residues. As KVIc58 is near no other subunit, these three identical residues together may be important for some ligand interaction.
Subunit VIIa
Three identical sites in subunit VIIa occur at the sharp turn between the transmembrane helix and the NH2-terminal α-helix. These residues are likely to be required to maintain the structure of the subunit and its orientation with respect to other subunits; for example, PVIIa19 is located at the sharp bend in the turn, whereas QVIIa13 forms a polar interaction with the conserved residue TVb14. VVIIa14, which is also conserved, contacts an identical residue on subunit III, EIII60. Less explicable is the identity of LVIIa31, which contacts YIII55 and QIII56. These two residues in subunit III are not conserved in plants and yeasts but are identical within animals, as well as adjacent to WIII57, which is identical across taxa. Therefore, some function may be associated with this region.
Subunit VIIc
Four identical residues in subunit VIIc are paired in Pro-Phe combinations in two locations. One pair occurs in the irregular NH2-terminal domain on the matrix side; the position of PVIIc12 at a bend in this region suggests that this pair is required for the structure. The transmembrane helix of subunit VIIc changes its angle of inclination midway through the helix. The second pair, with PVIIc36, is located at this bend in the helix and is supported by another identical residue, FVIIc33, which again suggests that it is conserved for structural integrity. Other conserved residues at the COOH-terminal end of the transmembrane helix interact with conserved residues in subunit I.
Evidence for Selection on Functional Domains
Because our analysis of sequence conservation across taxa identified many residues that did not correlate with regions of known function, we performed two global analyses to assess whether these conserved residues are likely to be playing a functional or structural role and whether any selective pressures could be influencing the evolution of the enzyme as a whole. We first examined whether similar biochemical classes of amino acids were more likely to be conserved within different taxa. We then explored the spatial location of amino acids conserved across all taxa to determine whether they were positioned in a nonrandom distribution that would support selective constraints based on structure or function.
Groups of amino acids can be classified by the different roles they tend to play within the context of a protein's structure and function (Betts and Russell 2003). Variation in substitution rates of different amino acids can shed light on the selective forces acting on a protein. On the basis of the relative importance of different functional or structural constraints, amino acids that are more likely to be involved in polar interactions (for example in salt bridges or in ligand interaction) or nonpolar interactions (such as hydrophobic packing or aromatic ring stacking) may be conserved at different frequencies. To determine whether similar functional constraints are acting on different evolutionary lineages, we analyzed within-taxon sequence alignments (human-Drosophila, S. cerevisiae–S. pombe, and Arabidopsis-rice) for similar patterns of amino acid conservation.
To predict the expected distribution of conserved residues given an equal probability of conservation for all amino acids, we calculated the average frequency of amino acid occurrence across all subunits within each lineage. The distribution of amino acids identical within each taxon was significantly different from the average distribution in yeasts and in animals but not in plants (χ² test; yeasts, P = 0.017; animals, P = 0.023; plants, P = 0.639), as might be expected on the basis of the comparatively recent divergence of monocots and eudicots. However, in all taxa, the aromatic amino acids Phe, Tyr, and Trp were overrepresented among identical residues by at least 15% compared with the number expected on the basis of overall amino acid frequency, as were Gly and Cys (table 4). All of these amino acids play key roles in structural stability: the bulky side chains of aromatic residues are often key components of hydrophobic cores and can participate in energetically favorable stacking interactions; cysteines are important for disulfide bonds and for binding metals; and glycine is conformationally flexible and occurs at tight turns in structures (Betts and Russell 2003). This pattern across independently evolving lineages suggests that, in general, hydrophobic amino acids in COX are conserved to maintain the structure of functional sites and possibly to maintain the structure of interfaces between subunits. In contrast, no strong trends are evident across taxa in the conservation of hydrophilic amino acids. However, those hydrophilic residues that are conserved are still likely to play important roles; for example, salt bridges tend to be conserved for specific functional or structural purposes rather than by any general rule based on solvent exposure, proximity to active sites, or location in the protein (Schueler and Margalit 1995).
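The comparison of observed and expected counts of identical residues can be sketched as a goodness-of-fit calculation. The counts below are invented toy values, not data from table 4, and the function names are hypothetical. The sketch computes only the test statistic and the observed-to-expected ratio; a P value would be obtained by comparing the statistic with a chi-square distribution with one fewer degree of freedom than the number of amino acid classes.

```python
from collections import Counter


def expected_counts(background_counts, n_identical):
    """Expected identical residues per amino acid if all were equally likely to be conserved."""
    total = sum(background_counts.values())
    return {aa: n_identical * c / total for aa, c in background_counts.items()}


def chi_square_statistic(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over amino acids."""
    return sum((observed.get(aa, 0) - e) ** 2 / e for aa, e in expected.items())


def overrepresentation(observed, expected):
    """Observed-to-expected ratio per amino acid (values above 1 mean overrepresented)."""
    return {aa: observed.get(aa, 0) / e for aa, e in expected.items()}


# Toy counts (hypothetical): overall composition and identical residues in one lineage.
background = Counter({"F": 10, "G": 10, "K": 20, "L": 60})
identical = Counter({"F": 4, "G": 3, "K": 3, "L": 10})

expected = expected_counts(background, sum(identical.values()))
statistic = chi_square_statistic(identical, expected)
ratios = overrepresentation(identical, expected)
```

In this toy example Phe is observed at twice its expected count, which corresponds to the kind of overrepresentation described above for aromatic residues.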
Table 4 Distribution of Identical Amino Acids Within Taxa.
Visual inspection of the COX structure showed that identical and physicochemically conserved residues occur in spatial proximity to each other, which suggests that many of these residues may indeed be positioned to form functional sites (fig. 3A). We explored this observation by calculating whether identical and conserved residues have a tendency to cluster together. Clustering of conserved amino acids would suggest that selection acts to maintain functional or structural interactions between residues.
FIG. 3. Conserved residues are nonrandomly clustered in putative functional domains. (A) Amino acids in subunits Vb and VIIc are shown in space-filling display to illustrate how conserved residues (colored green [Vb] and blue [VIIc]) tend to be located in close proximity to identical residues (colored red [Vb] and orange [VIIc]). (B) A histogram of pairwise distances between the 284 identical and conserved residues in the COX dimer (solid bars) and between an equal number of randomly chosen residues (open bars). Compared with the randomly chosen residues, conserved sites occur more frequently at distances of less than 20 Å from each other.
We plotted the frequency of pairwise minimum atom-atom distances of less than 30 Å for all identical and conserved amino acids as well as for an equal number of randomly selected residues (fig. 3B). There is an excess of conserved amino acids at distances of less than 15 Å from each other; amino acids within this distance are likely to be part of the same structural feature or functional region. This result, along with the overrepresentation of structurally important residues among conserved amino acids, supports the premise that these conserved residues act to maintain the structural integrity of functional sites. Even when the highly conserved subunits Va and VIb are excluded from the analysis, direct contacts (defined as less than 5 Å) still occur at twofold higher frequency among conserved residues than in the randomized data set (data not shown).
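A comparison of this kind can be sketched by binning pairwise distances and computing the fraction of pairs in direct contact. The distance values below are invented, the 5 Å bin width is an assumption (the article does not state the bin width used for the histogram), and the function names are hypothetical.

```python
def distance_histogram(distances, bin_width=5.0, max_distance=30.0):
    """Count pairwise distances in fixed-width bins below max_distance (angstroms)."""
    n_bins = int(max_distance // bin_width)
    counts = [0] * n_bins
    for d in distances:
        if 0 <= d < max_distance:
            counts[int(d // bin_width)] += 1
    return counts


def contact_fraction(distances, cutoff=5.0):
    """Fraction of residue pairs closer than the contact cutoff (angstroms)."""
    return sum(d < cutoff for d in distances) / len(distances)


# Toy pairwise minimum atom-atom distances (hypothetical values).
conserved_d = [2.8, 3.4, 4.9, 7.0, 12.5, 14.0, 22.0, 41.0]
random_d = [4.0, 9.5, 18.0, 23.0, 27.5, 29.0, 35.0, 60.0]

conserved_hist = distance_histogram(conserved_d)
random_hist = distance_histogram(random_d)

# Ratio of contact frequencies between the conserved and the randomized set.
enrichment = contact_fraction(conserved_d) / contact_fraction(random_d)
```

A ratio above 1 indicates that direct contacts are more frequent among the conserved residues than in the randomized set; in practice the random draw would be repeated to gauge its variability.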
Discussion
What are the roles of the nuclear-encoded subunits in COX? Experimental approaches have revealed many intriguing functions for the nuclear-encoded subunits in model organisms, such as environmental regulation, tissue-specific function, and ATP-dependent allosteric regulation (Schiavo and Bisson 1989; Ewart, Zhang, and Capaldi 1991; Taanman, Turina, and Capaldi 1994). Other studies have employed phylogenetic techniques to measure rates of substitution in the nuclear-encoded subunits to identify residues likely to be functional or coevolving with known interacting proteins or cofactors (Schmidt et al. 2001; Goldberg et al. 2003). However, these previous studies have been confined largely to one organism or to a group of relatively closely related species, which limits our understanding of how widespread these regulatory mechanisms are and of which ones are essential for basic regulation across diverse taxa. Our broader analysis allows us to detect highly conserved sites to identify essential functional domains of the enzyme.
Early in the past decade, the focus on finding functions of the nuclear-encoded subunits shifted to lower eukaryotes, in part because experimental studies of higher eukaryotes were proving difficult (Capaldi et al. 1990). Since then, researchers have begun to recognize the diversity of regulatory mechanisms provided by the nuclear-encoded subunits, which appear to fine-tune the basic redox reaction of COX to different physiologies. With the availability of whole-genome sequences, we have been able to perform a broad search for subunit homologs across taxa that revealed different subunit compositions of the holoenzyme, even within the metazoans. The roles of evolutionarily novel subunits merit closer investigation, as preliminary evidence indicates interesting functional roles. For example, expression of subunit VIIb is depressed in the intestines of mice deficient in gastrointestinal pacemaker cells (Takayama et al. 2001), which suggests a regulatory role consistent with origin along the deuterostome lineage. Subunit VIII evolved before the protostome-deuterostome split, yet subsequent duplication in the vertebrates followed by inactivation of one paralog in humans implies relatively rapid evolution of function (Goldberg et al. 2003); this subunit remains uncharacterized in invertebrates. Plants, in particular, appear to have an almost entirely novel set of nuclear-encoded subunits. Initial characterization of plant subunits Vc and VIb shows differential tissue expression, and the novel divergent paralog of Arabidopsis subunit VIb that was identified in our analysis may reveal new regulatory properties.
The question of how this diversity of nuclear-encoded subunits evolved remains a conundrum. There is evidence for recent transfer of COX genes from the mitochondrial to the nuclear genome. During the evolution of plants, subunit II has been transferred to the nucleus in the legume lineage, with some species maintaining two active copies and other species silencing either the nuclear or the mitochondrial copy (Palmer et al. 2000). This plasticity may be facilitated by the large size of the plant mitochondrial genome. We were unable to find any support for a similar mitochondrial origin of the nuclear-encoded subunits or of any other nuclear genes from which the subunits may have been derived. Therefore, we conclude that they arose in the nuclear genome after the acquisition of mitochondria and have diverged extensively from any ancestral genes.
By aligning sequences of the nuclear-encoded subunits from highly divergent species and mapping identical and physicochemically conserved sites onto the protein structure, we have been able to classify regions of interest of the protein into three broad categories: (1) known functional sites that have diverged among species, (2) known functional sites that are conserved across species, and (3) conserved sites of unknown function. Although we do not speculate further on the third class, we discuss below the most intriguing examples of the first two.
One case of divergence of known functional regions involves the sites at which nucleotides can bind to regulate COX activity. Equilibrium analyses of COX with ATP and ADP have revealed 10 nucleotide binding sites in the bovine monomer (Napiwotzki et al. 1997). Two of these sites, the matrix and cytosolic ATP binding sites on subunit IV, have been studied experimentally in mammals (Napiwotzki et al. 1997; Huttemann, Kadenbach, and Grossman 2001). In yeasts, ATP has also been shown to inhibit COX activity through binding on both the matrix and the cytosolic sides of subunit VIa (Anthony, Reimann, and Kadenbach 1993; Taanman, Turina, and Capaldi 1994; Beauvoit et al. 1999). Our analysis shows that among the locations of these four nucleotide binding sites, only the cytosolic domain of subunit VIa is conserved across animals and yeasts. The others have either arisen independently in each lineage or have diverged through compensatory mutations that preserve the function of the region.
An example of an unpredicted conserved region is the putative binding site for thyroid hormone T2 on subunit Va in mammals (Arnold, Goglia, and Kadenbach 1998). T2 increases COX activity by interfering with allosteric inhibition by ATP. Our analysis reveals that the binding site for T2 is highly conserved across animals and yeasts, a surprising result because insects and yeasts do not produce thyroid hormones. What might be the analogous ligand in these other species? In insects, a likely functional analog is juvenile hormone (JH) (Wheeler and Nijhout 2003); L-3,5,3'-triiodothyronine (T3), an adduct of T2, can act via a putative JH receptor in insect follicle cell membranes (Kim et al. 1999). Presumably a similar analog also acts in yeasts.
On the basis of the proximity of subunits IV and Va, T2 has been proposed to affect COX activity by interacting with the ATP binding site in the matrix domain of subunit IV (Ludwig et al. 2001). Because this ATP binding site does not appear to be conserved across taxa, we argue that this mechanism is unlikely to be a widespread phenomenon, unless a matrix ATP binding site in subunit IV has been independently maintained in different taxa after duplication. Alternatively, an allosteric interaction may occur with the conserved region in the cytosolic domain of subunit VIa.
In addition to residues in these known binding sites or those with obvious interactions with other subunits or cofactors, our analysis has identified essential structural elements as well as many identical and conserved amino acids whose functions are unclear and are worthy of future study. Some of these structural elements and amino acids may be catalytic residues in uncharacterized active sites. However, our results show that hydrophobic and nonpolar amino acids, which are more likely to be involved in structural scaffolding than in substrate interactions, tend to be conserved at frequencies higher than expected within all taxa. Because these identical and conserved residues are nonrandomly distributed in clusters throughout the protein, we suggest that these groups of interacting amino acids may be conserved to maintain the structural foundation for evolving catalytic sites.
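One way to ask whether conserved residues cluster more than expected by chance is a simple resampling test, sketched below in Python. This is an illustrative sketch and not necessarily the test performed in this study. It assumes that each residue is represented by a single point (for example, one representative atom), that proximity is judged with a 5 Å cutoff, and that the null model draws the same number of residues at random from the whole structure. The function names and toy coordinates are hypothetical.

```python
import math
import random
from itertools import combinations


def count_close_pairs(coords, indices, cutoff=5.0):
    """Count pairs among `indices` whose points lie closer than `cutoff`."""
    return sum(
        1
        for i, j in combinations(indices, 2)
        if math.dist(coords[i], coords[j]) < cutoff
    )


def clustering_p_value(coords, conserved, cutoff=5.0, n_perm=1000, seed=0):
    """Compare observed close pairs with equally sized random residue sets.

    Returns the observed pair count and a one-sided resampling p-value.
    """
    rng = random.Random(seed)
    observed = count_close_pairs(coords, conserved, cutoff)
    all_indices = range(len(coords))
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_indices, len(conserved))
        if count_close_pairs(coords, sample, cutoff) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Made-up coordinates: five points 1 unit apart, then 45 widely spaced points.
toy_coords = [(float(i), 0.0, 0.0) for i in range(5)]
toy_coords += [(100.0 * k, 0.0, 0.0) for k in range(1, 46)]
toy_conserved = [0, 1, 2, 3, 4]  # hypothetical "conserved" residue indices

observed, p_value = clustering_p_value(toy_coords, toy_conserved)
print(observed, p_value)
```

In the toy data, all ten pairs among the five clustered points fall within the cutoff, so random draws from the 50 points should rarely match the observed count. With real coordinates, the choice of representative atom and of the null model would both affect the result.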
Supplementary Material
The following supplementary material can be found online at the journal's web site:
Multiple sequence alignments in interleaved format for all nuclear-encoded subunits (HTML).
Multiple sequence alignments in sequential format for all nuclear-encoded subunits (text).
Complete list of all identical and conserved residues (tab-delimited text).
PyMOL scripts for the complete structure and all individual subunits with identical and conserved residues highlighted (HTML).
Stereo images of the complete structure and all individual subunits with identical and conserved residues highlighted (PDF).
Acknowledgements
We thank Greg Davis, Alistair McGregor, members of the Stern lab, and three anonymous reviewers for helpful comments. This work was supported by a Howard Hughes Medical Institute Predoctoral Fellowship to J.D. and a David and Lucile Packard Foundation Fellowship and a National Institutes of Health grant to D.L.S.
Literature Cited
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
Andersson, S. G., A. Zomorodipour, J. O. Andersson, T. Sicheritz-Ponten, U. C. Alsmark, R. M. Podowski, A. K. Naslund, A. S. Eriksson, H. H. Winkler, and C. G. Kurland. 1998. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396:133-140.
Anthony, G., A. Reimann, and B. Kadenbach. 1993. Tissue-specific regulation of bovine heart cytochrome-c oxidase activity by ADP via interaction with subunit VIa. Proc. Natl. Acad. Sci. USA 90:1652-1656.
Arnold, S., F. Goglia, and B. Kadenbach. 1998. 3,5-Diiodothyronine binds to subunit Va of cytochrome-c oxidase and abolishes the allosteric inhibition of respiration by ATP. Eur. J. Biochem. 252:325-330.
Beauchemin, A. M., B. Gottlieb, L. K. Beitel, Y. A. Elhaji, L. Pinsky, and M. A. Trifiro. 2001. Cytochrome c oxidase subunit Vb interacts with human androgen receptor: a potential mechanism for neurotoxicity in spinobulbar muscular atrophy. Brain Res. Bull. 56:285-297.
Beauvoit, B., O. Bunoust, B. Guerin, and M. Rigoulet. 1999. ATP regulation of cytochrome oxidase in yeast mitochondria: role of subunit VIa. Eur. J. Biochem. 263:118-127.
Bender, E., and B. Kadenbach. 2000. The allosteric ATP-inhibition of cytochrome c oxidase activity is reversibly switched on by cAMP-dependent phosphorylation. FEBS Lett. 466:130-134.
Berg, O. G., and C. G. Kurland. 2000. Why mitochondrial genes are most often found in nuclei. Mol. Biol. Evol. 17:951-961.
Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. 2000. The protein data bank. Nucleic Acids Res. 28:235-242.
Betts, M. J., and R. B. Russell. 2003. Amino acid properties and consequences of substitutions. Pp. 289–316 in M. R. Barnes and I. C. Gray, eds. Bioinformatics for geneticists. Wiley, Chichester, UK.
Burke, P. V., D. C. Raitt, L. A. Allen, E. A. Kellogg, and R. O. Poyton. 1997. Effects of oxygen concentration on the expression of cytochrome c and cytochrome c oxidase genes in yeast. J. Biol. Chem. 272:14705-14712.
Capaldi, R. A. 1990. Structure and function of cytochrome c oxidase. Annu. Rev. Biochem. 59:569-596.
Capaldi, R. A., Y. Z. Zhang, R. Rizzuto, D. Sandona, G. Schiavo, and R. Bisson. 1990. The two oxygen-regulated subunits of cytochrome c oxidase in Dictyostelium discoideum derive from a common ancestor. FEBS Lett. 261:158-160.
DeLano, W. L. 2002. The PyMOL molecular graphics system. DeLano Scientific, San Carlos, CA, USA.
Edmands, S., and R. Burton. 1998. Variation in cytochrome-c oxidase activity is not maternally inherited in the copepod Tigriopus californicus. Heredity 80:668-674.
Ewart, G. D., Y. Z. Zhang, and R. A. Capaldi. 1991. Switching of bovine cytochrome c oxidase subunit VIa isoforms in skeletal muscle during development. FEBS Lett. 292:79-84.
Geier, B. M., H. Schagger, C. Ortwein, T. A. Link, W. R. Hagen, U. Brandt, and G. Von Jagow. 1995. Kinetic properties and ligand binding of the eleven-subunit cytochrome-c oxidase from Saccharomyces cerevisiae isolated with a novel large-scale purification method. Eur. J. Biochem. 227:296-302.
Goldberg, A., D. E. Wildman, T. R. Schmidt, M. Huttemann, M. Goodman, M. L. Weiss, and L. I. Grossman. 2003. Adaptive evolution of cytochrome c oxidase subunit VIII in anthropoid primates. Proc. Natl. Acad. Sci. USA 100:5873-5878.
Golding, G. B., and A. M. Dean. 1998. The structural basis of molecular adaptation. Mol. Biol. Evol. 15:355-369.
Hamanaka, S., K. Ohtsu, K. Kadowaki, M. Nakazono, and A. Hirai. 1999. Identification of cDNA encoding cytochrome c oxidase subunit 5c (COX5c) from rice: comparison of its expression with nuclear-encoded and mitochondrial-encoded COX genes. Genes Genet. Syst. 74:71-75.
Huttemann, M., S. Jaradat, and L. I. Grossman. 2003. Cytochrome c oxidase of mammals contains a testes-specific isoform of subunit VIb—the counterpart to testes-specific cytochrome c? Mol. Reprod. Dev. 66:8-16.
Huttemann, M., B. Kadenbach, and L. I. Grossman. 2001. Mammalian subunit IV isoforms of cytochrome c oxidase. Gene 267:111-123.
Jansch, L., V. Kruft, U. K. Schmitz, and H. P. Braun. 1996. New insights into the composition, molecular mass and stoichiometry of the protein complexes of plant mitochondria. Plant J. 9:357-368.
Kenyon, L., and C. T. Moraes. 1997. Expanding the functional human mitochondrial DNA database by the establishment of primate xenomitochondrial cybrids. Proc. Natl. Acad. Sci. USA 94:9131-9135.
Kim, Y., E. D. Davari, V. Sevala, and K. G. Davey. 1999. Functional binding of a vertebrate hormone, L-3,5,3'-triiodothyronine (T3), on insect follicle cell membranes. Insect Biochem. Mol. Biol. 29:943-950.
Lang, B. F., G. Burger, C. J. O'Kelly, R. Cedergren, G. B. Golding, C. Lemieux, D. Sankoff, M. Turmel, and M. W. Gray. 1997. An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature 387:493-497.
Laskowski, R. A., M. W. MacArthur, D. S. Moss, and J. M. Thornton. 1993. PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 26:283-291.
Ludwig, B., E. Bender, S. Arnold, M. Huttemann, I. Lee, and B. Kadenbach. 2001. Cytochrome c oxidase and the regulation of oxidative phosphorylation. ChemBioChem 2:392-403.
Matthews, J. M., and M. Sunde. 2002. Zinc fingers–folds for many occasions. IUBMB Life 54:351-355.
Napiwotzki, J., K. Shinzawa-Itoh, S. Yoshikawa, and B. Kadenbach. 1997. ATP and ADP bind to cytochrome c oxidase and regulate its activity. Biol. Chem. 378:1013-1021.
Ohtsu, K., M. Nakazono, N. Tsutsumi, and A. Hirai. 2001. Characterization and expression of the genes for cytochrome c oxidase subunit VIb (COX6b) from rice and Arabidopsis thaliana. Gene 264:233-239.
Palmer, J. D., K. L. Adams, Y. Cho, C. L. Parkinson, Y. L. Qiu, and K. Song. 2000. Dynamic evolution of plant mitochondrial genomes: mobile genes and introns and highly variable mutation rates. Proc. Natl. Acad. Sci. USA 97:6960-6966.
Poyton, R. O., and J. E. McEwen. 1996. Crosstalk between nuclear and mitochondrial genomes. Annu. Rev. Biochem. 65:563-607.
Reimann, A., F. J. Huther, J. A. Berden, and B. Kadenbach. 1988. Anions induce conformational changes and influence the activity and photoaffinity-labelling by 8-azido-ATP of isolated cytochrome c oxidase. Biochem. J. 254:723-730.
Ruitenberg, M., A. Kannt, E. Bamberg, K. Fendler, and H. Michel. 2002. Reduction of cytochrome c oxidase by a second electron leads to proton translocation. Nature 417:99-102.
Russell, R. B., and G. J. Barton. 1994. Structural features can be unconserved in proteins with similar folds: an analysis of side-chain to side-chain contacts secondary structure and accessibility. J. Mol. Biol. 244:332-350.
Sackton, T. B., R. A. Haney, and D. M. Rand. 2003. Cytonuclear coadaptation in Drosophila: disruption of cytochrome c oxidase activity in backcross genotypes. Evolution 57:2315-2325.
Schiavo, G., and R. Bisson. 1989. Oxygen influences the subunit structure of cytochrome c oxidase in the slime mold Dictyostelium discoideum. J. Biol. Chem. 264:7129-7134.
Schmidt, T. R., M. Goodman, and L. I. Grossman. 1999. Molecular evolution of the COX7A gene family in primates. Mol. Biol. Evol. 16:619-626.
Schmidt, T. R., W. Wu, M. Goodman, and L. I. Grossman. 2001. Evolution of nuclear- and mitochondrial-encoded subunit interaction in cytochrome c oxidase. Mol. Biol. Evol. 18:563-569.
Schueler, O., and H. Margalit. 1995. Conservation of salt bridges in protein families. J. Mol. Biol. 248:125-135.
Szuplewski, S., and R. Terracol. 2001. The cyclope gene of Drosophila encodes a cytochrome c oxidase subunit VIc homolog. Genetics 158:1629-1643.
Taanman, J. W., P. Turina, and R. A. Capaldi. 1994. Regulation of cytochrome c oxidase by interaction of ATP at two binding sites, one on subunit VIa. Biochemistry 33:11833-11841.
Takayama, I., Y. Diago, S. M. Ward, K. M. Sanders, T. Yamanaka, and M. A. Fujino. 2001. Differential gene expression in the small intestines of wildtype and W/W-V mice. Neurogastroenterol. Motil. 13:163-168.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Tsukihara, T., H. Aoyama, E. Yamashita, T. Tomizaki, H. Yamaguchi, K. Shinzawa-Itoh, R. Nakashima, R. Yaono, and S. Yoshikawa. 1996. The whole structure of the 13-subunit oxidized cytochrome c oxidase at 2.8 angstrom. Science 272:1136-1144.
Tsukihara, T., K. Shimokata, and Y. Katayama, et al. (12 co-authors). 2003. The low-spin heme of cytochrome c oxidase as the driving element of the proton-pumping process. Proc. Natl. Acad. Sci. USA 100:15304-15309.
Welchen, E., R. L. Chan, and D. H. Gonzalez. 2002. Metabolic regulation of genes encoding cytochrome c and cytochrome c oxidase subunit Vb in Arabidopsis. Plant Cell Environ. 25:1605-1615.
Wheeler, D. E., and H. F. Nijhout. 2003. A perspective for understanding the modes of juvenile hormone action as a lipid signaling system. Bioessays 25:994-1001.
Yang, W. L., L. Iacono, W. M. Tang, and K. V. Chin. 1998. Novel function of the regulatory subunit of protein kinase A: regulation of cytochrome c oxidase activity and cytochrome c release. Biochemistry 37:14175-14180.
(Jayatri Das*, Stephen T. )