Journal of Virology, 2006, Issue 11
Molecular Determinants of Substrate Specificity for Semliki Forest Virus Nonstructural Protease
     Estonian Biocentre University of Tartu, Tartu, Estonia; Institute of Biotechnology, University of Helsinki, Helsinki, Finland

    ABSTRACT

    The C-terminal cysteine protease domain of Semliki Forest virus nonstructural protein 2 (nsP2) regulates the virus life cycle by sequentially cleaving at three specific sites within the virus-encoded replicase polyprotein P1234. The site between nsP3 and nsP4 (the 3/4 site) is cleaved most efficiently. Analysis of Semliki Forest virus-specific cleavage sites with shuffled N-terminal and C-terminal half-sites showed that the main determinants of cleavage efficiency are located in the region preceding the cleavage site. Random mutagenesis analysis revealed that amino acid residues in positions P4, P3, P2, and P1 of the 3/4 cleavage site cannot tolerate much variation, whereas in the P5 position most residues were permitted. When mutations affecting cleavage efficiency were introduced into the 2/3 and 3/4 cleavage sites, the resulting viruses remained viable but showed defects in P1234 processing similar to those observed in the in vitro assay. Complete blockage of the 3/4 cleavage was found to be lethal. The amino acid in position P1' had a significant effect on cleavage efficiency, and in this regard the protease markedly preferred a glycine residue over the tyrosine natively present in the 3/4 site. Therefore, the cleavage sites represent a compromise between protease recognition and other requirements of the virus life cycle. The protease recognizes at least residues P4 to P1', and the P4 arginine residue plays an important role in the fast cleavage of the 3/4 site.

    INTRODUCTION

    Semliki Forest virus (SFV) is an enveloped positive-strand RNA virus of the genus Alphavirus, family Togaviridae. Alphavirus replication relies on the production of replicase proteins in the form of polyprotein precursor(s), which are then co- and posttranslationally processed (40). This finally leads to the generation of individual replicase protein subunits and ensures the formation of proper interactions between the cleavage products of a single polyprotein (35). However, it is not known whether multiple polyproteins are required to build up an individual replication complex. A large body of evidence indicates that the polyprotein expression and cleavage strategy regulates viral RNA replication (20, 24, 25, 37, 46). The full-length nonstructural (ns) polyprotein, designated P1234, is translated directly from the viral genomic RNA and becomes cleaved immediately after or during its synthesis. Its primary cleavage products, polyprotein P123 and nsP4, form the early replication complexes synthesizing negative-strand RNA. P123 is first cleaved to yield nsP1 and P23, giving an intermediate complex, nsP1-P23-nsP4, which is capable of both negative-strand and positive-strand synthesis (20, 24, 46) but which appears to be very short lived in infected cells. P23 is rapidly cleaved into nsP2 and nsP3, and the replication complex is rearranged into a stable form making exclusively positive-sense RNA with the negative strand as a template.

    Alphavirus ns proteins exhibit multiple activities during the replication cycle (17). NsP1 binds to cellular membranes and possesses methyltransferase and guanylyltransferase activities essential for viral RNA capping reactions (1, 2, 45). NsP3 is a necessary replication factor, which may also act as a poly-ADP-ribose binding protein or as an enzyme hydrolyzing ADP-ribose-1"-phosphate, similarly to its cellular and coronavirus homologs (19, 33, 46). The catalytic RNA polymerase subunit of the viral replicase is represented by nsP4 (5, 13, 28). The N-terminal domain of nsP2 participates in the replication process by providing NTPase, RNA helicase, and RNA triphosphatase activities (10, 34, 42), whereas the C-terminal domain shows proteolytic activity (7, 15, 44). The alphavirus nsP2 protease domain belongs to the papain superfamily of cysteine proteases (39; reviewed in reference 41). The proteolytic function of SFV nsP2 has been mapped to its C-terminal domain (amino acids 458 to 799). The recombinant protein Pro39, corresponding to this domain and named according to its molecular weight and activity, exhibited proteolytic activities similar to those of full-length nsP2 (44).

    There are large differences in the cleavage efficiencies of the three sites present in the nonstructural polyprotein. It was shown using short recombinant substrates that the site between nsP3 and nsP4 (the 3/4 site) was processed very readily, the 1/2 site was processed much less efficiently, and cleavage of the 2/3 site remained extremely poor even with a large excess of enzyme (44). Efficient cleavage of the 3/4 site in vitro is nicely in accord with the fact that this site is the first to be processed in vivo. The generation of mature nsP4 is absolutely required for alphavirus RNA synthesis (25, 37). The 1/2 site appears to be predominantly processed in cis, that is, by the protease present within the same polyprotein (43). However, as already mentioned, the purified protease can also cleave this site in trans with a low efficiency. When the 2/3 site is presented in short model substrates, it is virtually uncleavable by Pro39 or nsP2, in marked contrast to the other two sites. This is quite surprising, since the native P23 polyprotein is cleaved rapidly during in vitro translation and this cleavage takes place in trans (43, 44). Thus, efficient 2/3 site cleavage requires sequences that are more distant from the point of cleavage and not present in the model peptide substrate (see Discussion).

    Nonstructural polyprotein processing is directly linked to the regulation of the alphavirus replication cycle and thus represents one of the key events of the infection process. The previous body of work has described the sequential order of the processing events, but the intermolecular recognition mechanisms remain largely unknown. Here we have examined the processing of model substrates containing native, mutated, and shuffled cleavage site sequences. We find that the SFV nonstructural protease specifically recognizes at least five residues, four of them lying on the N-terminal side of the cleavage site. In most of these positions the protease prefers small residues, but the P4 position is rather specific for arginine. Mutations made in the P4 position similarly affected protease cleavage in vitro and in the infected cells.

    MATERIALS AND METHODS

    Bacterial strains and expression vectors. Escherichia coli strain DH5α (Gibco BRL) was used for propagation of plasmids. Recombinant proteins were expressed in E. coli BL21(DE3) (Novagen). The backbone of the pET41b vector (Novagen) was used for construction of plasmids for expression of recombinant substrates. In total, 98 oligonucleotide primers were used in this study. The sequences of all primers, as well as those of the expression constructs, are available from the authors upon request.

    To construct plasmids for expression of substrates fused to enhanced green fluorescent protein (EGFP), (i) the glutathione S-transferase tag in vector pET41b was replaced by the E2 tag derived from the bovine papillomavirus E2 protein (18), resulting in a vector designated pET-E2; (ii) EGFP was PCR amplified by using specific primers with appropriate adaptors and pd1EGFP-N1 (Clontech) as a template; and (iii) the PCR fragment was treated with SacII and BglII and cloned into pET-E2 digested with the same enzymes. The resulting expression vector was sequenced and designated as pET-E2-EGFP.

    The plasmid with two thioredoxin (Trx) tags, designated pET-2Trx, was kindly provided by AS Quattromed (Tartu, Estonia). The plasmid contains the E2 affinity tag fused to the first Trx sequence and a polylinker region between the two Trx modules with the sequence (Trx1)-CAT ATG CAC CAT CAC CAT CAC CTA GCC ATG GCG ATA TCG GAT CCG AAT TCG AGC TCC GTC GAC AAG CTT GAG GCC GCA CTC GAG-(Trx2), where triplets correspond to codons, the BamHI (GGATCC) and XhoI (CTCGAG) recognition sites are those used for cloning, and the His5 tag is encoded by the five codons CAC CAT CAC CAT CAC.
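
    The landmarks in this polylinker can be checked programmatically. The following is a minimal sketch, not part of the original methods: it takes the codon string exactly as printed above and locates the BamHI and XhoI recognition sequences and the longest run of histidine codons. The function names are illustrative only.

```python
# Polylinker codons exactly as printed in the text, (Trx1)- and -(Trx2) omitted.
CODONS = (
    "CAT ATG CAC CAT CAC CAT CAC CTA GCC ATG GCG ATA TCG GAT CCG "
    "AAT TCG AGC TCC GTC GAC AAG CTT GAG GCC GCA CTC GAG"
).split()
SEQ = "".join(CODONS)

# Standard recognition sequences of the two enzymes used for cloning.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}
HIS_CODONS = {"CAT", "CAC"}


def find_sites(seq, sites):
    """Return the 0-based index of the first occurrence of each site (-1 if absent)."""
    return {name: seq.find(motif) for name, motif in sites.items()}


def longest_his_run(codons):
    """Return (start codon index, length) of the longest run of His codons."""
    best = (0, 0)
    start = None
    for i, codon in enumerate(list(codons) + ["END"]):  # sentinel closes a final run
        if codon in HIS_CODONS:
            if start is None:
                start = i
        else:
            if start is not None and i - start > best[1]:
                best = (start, i - start)
            start = None
    return best


if __name__ == "__main__":
    print(find_sites(SEQ, SITES))
    print(longest_his_run(CODONS))
```

    On the sequence as printed, the BamHI site starts at nucleotide index 38 and the XhoI site at index 78 (0-based), and the five-codon His run begins at the third codon.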

    To obtain a set of EGFP-fused expression constructs with truncations at the 3/4 site, the sequences corresponding to the truncated 3/4 sites were PCR amplified using specific primers with adaptors and the infectious cDNA clone of SFV, pSP6-SFV4 (27), as a template. The PCR fragments were digested with EcoRV and XhoI and cloned into pET-E2-EGFP, treated with the same enzymes. The resulting clones were verified by sequencing.

    To clone the sequences encoding truncated, modified, or shuffled cleavage sites into the pET-2Trx vector, the corresponding sequences were first constructed by annealing pairs of complementary oligonucleotides. Each pair of oligonucleotides was designed to form a duplex with a four-base overhang at each end, corresponding to the overhangs created by BamHI (5' end of the coding strand) or XhoI (3' end) restriction enzymes. The duplexes were cloned into the pET-2Trx vector digested with the same enzymes, and the clones were verified by sequencing. The library of clones in the pET-2Trx vector, which contained variant amino acid codons in position P1', was kindly provided by AS Quattromed.
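
    The oligonucleotide design described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' primer design: it assumes that both restriction sites are regenerated after ligation and ignores any spacer bases needed to keep the reading frame. The function names and the example insert are hypothetical.

```python
# BamHI cuts G^GATCC and XhoI cuts C^TCGAG, so both leave four-base 5' overhangs
# (GATC and TCGA, respectively).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligo_pair(insert):
    """Return (top, bottom) oligonucleotides, both written 5'->3'.

    Assumption: one extra base at each end regenerates the GGATCC and CTCGAG
    sites after ligation. Frame-keeping spacer bases are not handled here.
    """
    top = "GATCC" + insert + "C"                         # BamHI overhang on the coding strand
    bottom = "TCGAG" + reverse_complement(insert) + "G"  # XhoI overhang on the opposite strand
    return top, bottom


if __name__ == "__main__":
    # Hypothetical insert, used only to show the shape of the output.
    top, bottom = design_oligo_pair("GCTGGTGCT")
    print(top)
    print(bottom)
```

    When the two oligonucleotides anneal, everything except the first four bases of each strand is base paired, which leaves the GATC and TCGA overhangs free for ligation into the digested vector.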

    The construction of libraries containing randomized codons was done as described above; the primers used had one completely randomized codon (NNN) in the selected position, corresponding to the P5, P4, P3, P2, or P1 position of the 3/4 cleavage site. The resulting libraries were designated pET-2Trx-Ran5, pET-2Trx-Ran4, pET-2Trx-Ran3, pET-2Trx-Ran2, and pET-2Trx-Ran1.

    Protein expression and purification. The Pro39 and nsP2 proteases were produced and purified essentially as described in reference 44. For expression of the recombinant substrates, E. coli BL21(DE3) containing the appropriate plasmid was grown in 2x YT medium (Difco) supplemented with the appropriate antibiotic (100 μg/ml for ampicillin and 50 μg/ml for kanamycin) at 37°C, until the optical density at 600 nm reached 0.6. Then, the culture was cooled down to 15°C, and isopropyl-β-D-thiogalactopyranoside was added to a final concentration of 0.5 to 0.8 mM. Incubation was continued for 12 h at 16 to 20°C, and cells were collected by centrifugation at 6,000 x g for 10 min. The pellet was resuspended in buffer A (50 mM Tris-HCl [pH 8.0], 5 mM imidazole, 1 mM dithiothreitol [DTT] and 0.1% Tween 20). One millimolar phenylmethylsulfonylfluoride was added to the cell suspension before sonication with six pulses for 30 s each. NaCl was added to the lysate to 0.5 to 1.0 M, and cell debris was removed by centrifugation at 38,000 x g for 30 min. The cleared supernatant was mixed with nickel-nitrilotriacetic acid agarose (QIAGEN), preequilibrated with buffer A supplemented with 20 mM imidazole, incubated for 1 h, and loaded onto a polypropylene column (Bio-Rad). The column was washed with 10 volumes of buffer A supplemented with 20 mM imidazole, and elution was performed with buffer A supplemented with 200 mM imidazole. Glycerol and DTT (at final concentrations of 20% and 1 mM, respectively) were added to the eluate, which was divided into aliquots, frozen in liquid nitrogen, and stored at –70°C. The final concentration of substrate preparations was typically in the range of 3 to 8 μg/μl.

    In vitro translation. In vitro translation reactions were carried out using the T7 TNT rabbit reticulocyte lysate system (Promega) according to the manufacturer's protocol. Reaction mixtures (10 μl) containing 10 μCi of [35S]methionine (Amersham Biosciences) and 0.5 μg of plasmid DNA were incubated for 1 h at 30°C, whereafter translation was stopped by adding cycloheximide to a final concentration of 1 mM.

    Proteolytic assays. For digestion of the products of in vitro translation, 10 μl of substrate mixture was combined with 0.5 μg of enzyme and the reaction was incubated for 1 h at 30°C. Cleavage products were separated by SDS-PAGE in 15% gels, and the proteins were visualized with a Fuji BAS 1500 bioimaging analyzer or with X-ray film. The imaging data were further processed with the Tina 2.09c software.

    For purified recombinant substrates, the proteolytic reaction was performed as follows. The substrate (5 μg) was mixed with purified recombinant Pro39 or nsP2 in a molar ratio of 50:1, 20:1, or 5:1 in buffer containing 50 mM HEPES-NaOH (pH 7.2), 50 mM NaCl, and 1 mM DTT in a final volume of 10 μl and incubated for 1 h at 30°C, if not indicated otherwise. The reaction products were subjected to SDS-PAGE in 15% gels, and the proteins were visualized by staining with Coomassie brilliant blue R-250. Quantification was performed with the Tina 2.09c software. To increase the sensitivity of the analysis of the substrates with low cleavage efficiency, Western blot analysis was performed using a mouse monoclonal antibody against the E2 tag (18). Matrix-assisted laser desorption-ionization time-of-flight (MALDI-TOF) mass spectrometry was carried out with the MALDI-TOF Voyager DE Pro instrument (Applied Biosystems).
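
    The substrate-to-enzyme molar ratios translate into enzyme amounts once molecular masses are fixed. The sketch below is a worked example only. The enzyme mass of 39 kDa follows from the name Pro39, but the substrate mass of 30 kDa is a placeholder assumption, since the masses of the substrates are not given here.

```python
def enzyme_mass_ug(substrate_ug, substrate_kda, enzyme_kda, molar_ratio):
    """Mass of enzyme (ug) needed for a given substrate:enzyme molar ratio.

    ug divided by kDa gives nmol, so the arithmetic stays in nmol throughout.
    """
    substrate_nmol = substrate_ug / substrate_kda
    enzyme_nmol = substrate_nmol / molar_ratio
    return enzyme_nmol * enzyme_kda


if __name__ == "__main__":
    ASSUMED_SUBSTRATE_KDA = 30.0  # hypothetical value, not taken from the paper
    PRO39_KDA = 39.0              # Pro39 is named according to its molecular weight
    for ratio in (50, 20, 5):
        mass = enzyme_mass_ug(5.0, ASSUMED_SUBSTRATE_KDA, PRO39_KDA, ratio)
        print(f"{ratio}:1 -> {mass:.3f} ug Pro39")
```

    With these placeholder masses, 5 μg of substrate at a 50:1 ratio corresponds to about 0.13 μg of Pro39. The figure scales inversely with the true substrate mass.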

    Recombinant virus construction. Plasmids containing sequences surrounding the 2/3 or 3/4 cleavage sites were used as templates for PCR-based mutagenesis, and the presence of the desired mutations was verified by sequencing. The mutated fragments were transferred into pSP6-SFV4 utilizing available restriction sites. The resulting clones were verified by sequencing the entire transferred fragment and designated pSFV(23/RAGC), pSFV(34/TAGA), pSFV(34/HAGA), pSFV(34/RSGA), and pSFV(34/RAEV). Capped transcripts were prepared with SP6 polymerase after linearization of the plasmids with SpeI (27) and used for transfection of BHK21 cells by electroporation. The primary virus stock was collected after 24 h at 37°C, titrated, and used in all subsequent experiments. The presence of the mutations in these stocks was verified by sequencing after reverse transcriptase PCR of isolated RNA. The infectious center assay was performed essentially as described previously (11).

    Metabolic labeling and immunoprecipitation. Polyprotein processing was studied by pulse labeling the cells with [35S]methionine. BHK21 cells (10^6) were infected with 100 PFU per cell and labeled at 3 h postinfection for 15 min. In chase samples, the pulse was followed by chase for 45 min in the presence of excess unlabeled methionine. Cells were then lysed in 1% SDS and proteins denatured by boiling. Samples were diluted 1:20 with NET buffer (50 mM Tris [pH 7.5], 150 mM NaCl, 5 mM EDTA, 0.5% NP-40) and incubated for 1 h at +4°C with combinations of rabbit polyclonal antisera against ns proteins (35). Immunocomplexes were precipitated with protein A Sepharose CL-4B (Amersham Biosciences) overnight at +4°C. The precipitates were washed four times with NET buffer containing 400 mM NaCl. The precipitated proteins were denatured by heating in Laemmli buffer, and the samples were separated by SDS-PAGE. Proteins were visualized with a Fuji BAS 1500 bioimaging analyzer. The imaging data were further processed with the Tina 2.09c software.

    RESULTS

    The nsP2 protease is responsible for the processing of all three cleavage sites within SFV P1234, but the efficiency of cleavage is quite distinct for each site (43, 44). Comparison of the amino acid sequences of the ns polyproteins encoded by various alphaviruses revealed that they all contain a Gly residue at the P2 position of every cleavage site (Table 1). It has been proposed that this residue is an important determinant of the cleavage (40). Substitution of this Gly with Val or Glu totally abolished processing at the mutated sites (20, 36, 44). The downstream regions of the 2/3 and 3/4 sites are highly conserved, whereas some variation is observed in the N terminus of nsP2 (Table 1). In SFV ns polyprotein, the amino acid residue preceding the conserved Gly is always Ala. These two are the only amino acids absolutely conserved in all three cleavage sites of SFV polyprotein (Table 1, row 1). Ala is also very commonly found in the P3 position in other alphaviruses.

    Length of the 3/4 site. It has been previously reported that a recombinant substrate, which contains the amino acid sequence of the 3/4 site consisting of 19 upstream and 17 downstream residues with respect to the cleavage position, can be efficiently processed by Pro39 (44). This indicates that the protease recognition sequence for the 3/4 site must be rather short. In order to determine the length of the important region around the cleavage point, we constructed a set of sequences originating at the 3/4 site, fused to the C-terminus of EGFP (Fig. 1A). These substrates were expressed in E. coli, purified by affinity chromatography, and subjected to proteolytic processing.

    Truncation of the downstream sequence from 20 residues to 15 or 10 had little or no effect on cleavage efficiency (Fig. 1B, lanes 2 to 4). However, when the downstream region was reduced to five amino acids, cleavage was slightly reduced (lane 5). Similarly, the upstream region could be truncated from 20 to 10 residues, and even an upstream region as short as 5 SFV-specific amino acid residues was sufficient to allow efficient cleavage (lanes 7 to 9). When the virus-specific downstream sequence was completely removed and replaced by a vector-derived Leu-Glu sequence followed by an octahistidine tag, some cleavage could still be detected, although at a much reduced efficiency (lane 6). When full-length nsP2 protease was used instead of Pro39, identical results were obtained (data not shown). As an additional indication of specificity, cleavages by both proteases were inhibited by Zn2+, as described previously (44). Thus, in this context the sequence recognized by Pro39 or nsP2 could be as short as five virus-specific amino acid residues immediately upstream of the bond cleaved by the enzyme. The downstream region of the 3/4 cleavage site was found to be dispensable for recognition and cleavage specificity, but its presence as well as length affected processing efficiency.

    Random mutagenesis of P1, P2, P3, P4, and P5 in the 3/4 site. Next, we focused on the sequence of the efficiently cleaved 3/4 site. Analysis of the amino acid composition of the upstream region of the 3/4 sites for different alphaviruses revealed that with the exception of the P2 Gly, all other positions can be occupied by different amino acid residues (Table 1). In order to discover the extent of possible variation in the positions P5 (native residue Gly), P4 (Arg), P3 (Ala), P2 (Gly), and P1 (Ala), the codons for each amino acid in turn were subjected to random mutagenesis in the context of the 2Trx-10/2' substrate (Fig. 2A). Thirty clones were randomly picked for each position, and the encoded proteins were translated in vitro. Eight clones out of 150 were found to produce truncated proteins and therefore contain a stop codon in the randomized position. These were excluded from subsequent analysis. The remaining 142 substrates were tested for protease susceptibility with Pro39. All of these clones were sequenced, and the permitted changes as well as the nonacceptable changes in the randomized positions were identified (Fig. 2B).
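
    The number of stop-codon clones can be compared with what a fully randomized NNN codon would be expected to give. The sketch below is a back-of-the-envelope check, assuming an idealized library in which all 64 codons are equally represented. The real oligonucleotide synthesis may deviate from this, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_fraction(residue):
    """Fraction of the 64 NNN codons that encode the given residue ('*' for stop)."""
    hits = sum(1 for aa in CODON_TABLE.values() if aa == residue)
    return hits / len(CODON_TABLE)


def expected_clones(n_clones, residue):
    """Expected number of clones carrying the residue, assuming uniform codon usage."""
    return n_clones * codon_fraction(residue)


if __name__ == "__main__":
    print(f"stop codons expected in 150 clones: {expected_clones(150, '*'):.2f}")
    print(f"Arg codons expected in 30 clones:   {expected_clones(30, 'R'):.2f}")
```

    Three of the 64 codons are stop codons, so about 7 of 150 clones would be expected to carry one under this assumption, which is close to the 8 observed.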

    First, the results of this experiment revealed that 11 different amino acid residues were not acceptable in position P1. Besides native Ala, only two permitted changes, Gly and Ser, were identified. Gly is natively found in the P1 position of the 3/4 site of several alphaviruses (Table 1). Second, as expected based on the absolute sequence conservation, only nonacceptable changes were identified for the P2 position. Third, the replacement of Ala in position P3 almost always resulted in an uncleavable site. The substrates derived from only two clones from the library were cleaved, and the change of the native Ala to Ser was identified as the only permitted substitution for this position, in contrast to 10 different changes which were found to be nonacceptable. This was not unexpected, given that the variation of the P3 position within alphaviruses is quite limited. Fourth, the replacement of Arg in position P4 invariably resulted in an uncleavable site; in total, 10 nonacceptable changes were identified by this analysis. The only clone from the random library producing a cleavable substrate was found to have another Arg codon in position P4. This finding was surprising, since position P4 is not strictly conserved, and Arg, Asp, Glu, and Gly residues are found in this position of the 3/4 site in different alphaviruses (Table 1). Thus, the results of mutagenesis in position P4 were quite similar to those for positions P1, P2, and P3, indicating that this position is also critical for protease cleavage. Fifth, in contrast to the clones from the previous random libraries, almost all clones with a randomized codon in position P5 gave rise to substrates which were cleaved in vitro. Sequence analysis of the corresponding clones revealed that native Gly in position P5 can be replaced with essentially all types of amino acid residues, including Trp, Asn, Gln, Phe, Thr, Ser, Ala, Cys, Ile, Leu, Asp, and even Pro, although the cleavage efficiency somewhat varied depending on the type of the residue in position P5. The only change resulting in an almost noncleavable substrate was the change of Gly to a positively charged Lys residue. Taken together, these results demonstrate that positions P1, P2, P3, and P4 in the upstream region of the 3/4 site are critical for protease recognition, in contrast to position P5, which can be altered (Fig. 2B).
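
    The outcome of the random libraries can be summarized as a small lookup table. The sketch below encodes only the substitutions named in the text as permitted in the context of the 3/4 site. A negative result therefore means "not reported as permitted" (either found nonacceptable or not sampled in the libraries), not a prediction. The names are illustrative.

```python
# Residues reported as giving a cleavable 3/4-site substrate, by position.
# The native sequence is P5 Gly, P4 Arg, P3 Ala, P2 Gly, P1 Ala.
PERMITTED = {
    "P5": set("GWNQFTSACILDP"),  # native Gly plus the listed substitutions; Lys nearly blocks cleavage
    "P4": {"R"},
    "P3": {"A", "S"},
    "P2": {"G"},
    "P1": {"A", "G", "S"},
}
POSITIONS = ["P5", "P4", "P3", "P2", "P1"]


def reported_as_permitted(upstream5):
    """Check a five-residue P5..P1 string (one-letter code) against the table.

    Returns True only if every residue is among those reported as permitted.
    """
    if len(upstream5) != len(POSITIONS):
        raise ValueError("expected exactly five residues, P5 to P1")
    return all(res in PERMITTED[pos] for pos, res in zip(POSITIONS, upstream5))


if __name__ == "__main__":
    for site in ("GRAGA", "GRSGA", "GTAGA", "KRAGA", "GRAEV"):
        print(site, reported_as_permitted(site))
```

    The table reflects single substitutions tested one position at a time, so combinations of permitted changes are an extrapolation.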

    Influence of the P1' residue. Our previous results indicated that the processing of recombinant substrates containing the 3/4 cleavage site can take place, at least to some extent, even if the downstream region of the cleavage site has no virus-specific amino acids (Fig. 1B, lane 6). However, it was not clear how large an influence the amino acid residue in position P1' has on cleavage efficiency and specificity. This analysis is also important for understanding the processing of P1234, since the residues in the position P1' in the 1/2, 2/3, and 3/4 cleavage sites of the SFV ns polyprotein are quite different in nature (Table 1). To reveal the specific role of the P1' residue for the processing of the 3/4 site, constructs containing 15 upstream residues followed by one selected residue for position P1' were analyzed (Fig. 3A). Purified recombinant substrates were incubated with Pro39, and the reaction products were analyzed by SDS-PAGE. The results suggest that a wide range of different amino acid residues could be tolerated at the P1' position of the 3/4 site (Fig. 3B). The experiment with full-length nsP2 protease instead of Pro39 led to identical results (data not shown). Kinetic characterization of the processing of all the tested substrates showed a large variation in cleavage efficiency (Fig. 3C and D). Unexpectedly, the substrate that was processed best contained a Gly residue instead of the native Tyr in position P1'. This substrate was followed in efficiency by those with either Arg or Ser in the corresponding position, but they had half-lives almost twice as long as that of the Gly substrate in the reaction (Fig. 3C). Most of the remaining substrates, including the substrate with the native Tyr in the P1' position, were grouped closely together and had half-lives approximately four times longer than that of the best substrate (Fig. 3C and D). Substrates with acidic amino acid residues (Asp or Glu) were cleaved very inefficiently, and their cleavage products showed abnormal electrophoretic mobility in SDS-PAGE (Fig. 3B and C). However, MALDI-TOF analysis confirmed that similarly to all other substrates used in this study, these substrates were also cleaved at the correct position (data not shown). The only substrate that remained completely uncleaved both by Pro39 (Fig. 3C) and by nsP2 (data not shown) contained a proline residue in the P1' position.
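
    The relative half-lives quoted above can be turned into expected fractions of uncleaved substrate. The sketch below assumes simple first-order (single-exponential) decay of the substrate, which is a modeling assumption and is not stated in the text. The half-life ratios of 1, 2, and 4 are rounded from "almost twice" and "approximately four times".

```python
def fraction_uncleaved(time, half_life):
    """Fraction of substrate remaining after `time`, assuming first-order decay."""
    return 0.5 ** (time / half_life)


# Half-lives relative to the best (P1' Gly) substrate, rounded from the text.
RELATIVE_HALF_LIFE = {
    "Gly": 1.0,
    "Arg/Ser": 2.0,
    "Tyr (native)": 4.0,
}

if __name__ == "__main__":
    # Time is expressed in units of the Gly-substrate half-life.
    for t in (1.0, 2.0, 4.0):
        row = ", ".join(
            f"{name}: {fraction_uncleaved(t, hl):.2f}"
            for name, hl in RELATIVE_HALF_LIFE.items()
        )
        print(f"t = {t:g} -> {row}")
```

    Under this assumption, at the point where half of the Gly substrate is cleaved, roughly 84% of the native Tyr substrate would still be intact.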

    Cleavage of native and shuffled sites. As indicated above, the sequences of the upstream and downstream regions of the 1/2, 2/3, and 3/4 cleavage sites of SFV P1234 are dissimilar (Table 1), with the exception of the conserved P2 Gly and P3 Ala residues. Therefore, it was essential to identify which region of the cleavage site primarily determines cleavage efficiency. To address this question, another set of recombinant substrates was constructed, expressed, and subjected to proteolysis by Pro39 essentially as described above, except that a more-sensitive Western blot analysis and detection procedure was used for visualization of the substrates and reaction products. In these substrates, the upstream and downstream regions originating from the 1/2, 2/3, and 3/4 sites had a length of 10 amino acid residues. These regions were shuffled to make all the possible combinations of the protease half-sites (Fig. 4A and B).

    First, the cleavage of these substrates fell into three clear groups according to the sequence upstream of the cleavage site. All of the substrates containing the C terminus of nsP3 were processed quite well (Fig. 4C, lanes 8 to 10). The substrates containing the C terminus of nsP1 were processed in an intermediate fashion (lanes 2 to 4), and those containing the C terminus of nsP2 showed virtually no processing by Pro39 (lanes 5 to 7) or by full-length nsP2 (data not shown). These results can be described in the following order: cleavage efficiency with respect to the C-terminal sequences, 3C>>1C>>2C. Thus, sequences upstream of the cleavage site have a dominant influence on cleavage, and in particular, they are responsible for the almost complete block of the 2/3 site cleavage.

    Second, replacement of the downstream region in 1/2 and 3/4 sites nevertheless resulted in a change of cleavage efficiency (Fig. 4C, compare lanes 2 to 4 with each other and lanes 8 to 10 with each other). Cleavage was always more complete when the N terminus of nsP2 was used as a downstream region (lanes 2 and 8). Substrates with the N terminus of nsP4 were found to be slightly less efficiently processed (lanes 4 and 10), and substrates with the N terminus of nsP3 were the worst (lanes 3 and 9). These changes were poorly visible for the 2/3 site due to its inefficient cleavage (lanes 5 to 7), but the tendencies seem to be maintained, since substitution with the N terminus of nsP2 permitted a small amount of cleavage (lane 5). In accordance with these rules, the artificial substrate 3/2 was processed even more effectively than the native 3/4 substrate (Fig. 4C, lanes 8 and 10). These results can be described in the following order: cleavage efficiency with respect to the N-terminal sequences, 2N>4N>3N. Improved cleavage with the N terminus of nsP2 compared to that of nsP4 may be explained by the difference in the P1' residue, which in nsP2 is the most favorable Gly (Fig. 3). The N-terminal residue of nsP3 is Ala, which by itself is in the same class as nsP4 Tyr (Fig. 3). However, nsP3 contains a Pro in the P2' position (Table 1), which, due to its rigid conformation, may cause some deleterious influence.
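
    The two qualitative orderings (3C>>1C>>2C for the upstream half-site and 2N>4N>3N for the downstream half-site) can be combined into a single ranking of the nine shuffled substrates. The sketch below assumes that the upstream half-site always dominates, so the downstream half-site only breaks ties within an upstream group. This is a reading of the gel results, not a quantitative model.

```python
from itertools import product

# Lower rank means more efficient cleavage.
UPSTREAM_RANK = {"3": 0, "1": 1, "2": 2}    # C termini of nsP3 >> nsP1 >> nsP2
DOWNSTREAM_RANK = {"2": 0, "4": 1, "3": 2}  # N termini of nsP2 > nsP4 > nsP3


def ranked_half_site_combinations():
    """Return all upstream/downstream combinations, best cleaved first.

    Sorting is lexicographic: upstream rank first, downstream rank second.
    """
    combos = product(UPSTREAM_RANK, DOWNSTREAM_RANK)
    ordered = sorted(
        combos, key=lambda pair: (UPSTREAM_RANK[pair[0]], DOWNSTREAM_RANK[pair[1]])
    )
    return [f"{up}/{down}" for up, down in ordered]


if __name__ == "__main__":
    print(" > ".join(ranked_half_site_combinations()))
```

    The ranking places the artificial 3/2 substrate ahead of the native 3/4 site and the native 2/3 site last, in line with the observations described above.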

    Testing the influence of P1 and P4 residues by mutagenesis. Since our findings identified the upstream region as the main contributor for processing efficiency, the next step of analysis was to identify which of the amino acid residues in this region are involved in determination of the cleavage efficiency. The obvious difference between the upstream regions of the 1/2 and 3/4 sites and that of the 2/3 site is that the latter contains Cys instead of Ala in position P1 (Table 1). To find out whether this difference was responsible for the poor cleavage of the 2/3 site, mutations changing native Ala to Cys in the sites 1/2 and 3/4 and a mutation converting native Cys to Ala in the 2/3 site were made (Fig. 5A). This was essential, since no clone with Cys at the P1 position was found among the analyzed clones of the random library. Both mutated cleavage sites 2Trx-1/2C and 2Trx-3/4C were processed by Pro39 similarly to wild-type 2Trx-1/2 and 2Trx-3/4, respectively, and only very slight decreases in cleavage efficiencies were observed (Fig. 5B, lanes 2 and 3 and 7 and 8). At the same time, the conversion of Cys to Ala in the 2/3 site did not result in detectable cleavage (Fig. 5B, lanes 4 and 5). When these substrates were processed with nsP2, identical results were obtained (data not shown).

    Another difference between the sites is position P4, which is occupied by His or Thr in the 1/2 or 2/3 cleavage site, respectively (Table 1). Their role in cleavage regulation remained unclear, since the random library did not contain clones with either of these residues in the P4 position (Fig. 2B). To find out whether P4 Thr is responsible for poor cleavage of the 2/3 site, a mutation changing native Thr to Arg in the 2/3 site and mutations changing native Arg to His or Thr in the 3/4 site were made (Fig. 5A). The mutated cleavage site 2Trx-2/3R was successfully processed by Pro39 to some extent, in contrast to noncleavable native 2Trx-2/3 (Fig. 5B, lanes 4 and 6), thus indicating that the P4 Thr played an important role in blocking the cleavage. Moreover, the reciprocal replacement of Arg with Thr in the P4 position of the 3/4 site made it almost uncleavable (Fig. 5B, lanes 7 and 10). The change of Arg to His in the P4 position of the 3/4 site also resulted in very inefficient cleavage by Pro39 (Fig. 5B, lane 9). The same results were obtained when nsP2 was used instead of Pro39 (data not shown). These last results are in good agreement with those for the random mutagenesis of the P4 position of the 3/4 site, which showed that Arg could not be replaced by other residues (Fig. 2B).

    Analysis of selected mutations in the context of the recombinant SFV genome. From the large number of mutations analyzed in vitro, several were selected for in vivo analysis. First, two mutations in the P4 position of the 3/4 site (Arg to Thr and Arg to His), which in vitro drastically reduced the cleavage of the mutated substrate, were chosen (Fig. 6A). Second, the only acceptable mutation in the P3 position of the 3/4 site (Ala to Ser) (Fig. 2B) was selected. Third, the effect of the mutation Thr to Arg for the P4 position of the 2/3 site, which activates processing, was tested. Finally, a double mutant (P2-P1 from Gly-Ala to Glu-Val) of the 3/4 cleavage site, which is known to block cleavage completely (43), was used as a negative control. No mutations were introduced into the P1' position of nsP4, since the native Tyr residue has been reported to be crucial for virus infectivity (38). Thus, altogether five recombinant genomes, designated SFV(23/RAGC), SFV(34/TAGA), SFV(34/HAGA), SFV(34/RSGA), and SFV(34/RAEV), were made and tested (Fig. 6A).

    We measured the relative infectivities of the virus clones by plaque assay directly after RNA transfection to BHK cells (11, 27). SFV(34/RAEV) was noninfectious, since it was unable to produce any plaques or other signs of replication. In contrast, all the other recombinant viruses were viable, and their infectivity was rather close to that of wild-type SFV4 (Fig. 6A). Additionally, all these recombinant viruses produced stocks with high titers similar to that of SFV4. SFV(23/RAGC) always formed larger plaques than SFV4 or other recombinant viruses, and SFV(34/TAGA) had a small delay in causing cytopathic effects in infected cells (data not shown), whereas the other viruses resembled the wild type in these properties. Since the presence of the introduced mutations in the virus stocks was confirmed by reverse transcriptase PCR and sequencing, it can be concluded that these mutations do not have major effects on the replication of SFV in BHK cells.

    To reveal how these mutations affect SFV ns polyprotein processing, a pulse-chase radiolabeling experiment was performed. This analysis did not reveal any difference in polyprotein processing between SFV4 and SFV(23/RAGC) (Fig. 6B). However, SFV(34/RSGA), SFV(34/HAGA), and SFV(34/TAGA) showed a clear defect in the processing of the 3/4 site. This defect is evident from the stabilization of the P34 precursor in chase samples and from a significant reduction in the amounts of mature nsP3 and nsP4 in both pulse and chase samples. The processing defects were more severe in SFV(34/TAGA) and SFV(34/HAGA) than in SFV(34/RSGA). Somewhat increased accumulation of the largest precursor, P1234, was also observed in the former two mutants (Fig. 6B). The results obtained with in vitro processing of purified recombinant substrates and those obtained in virus-infected cells are in accord with each other, indicating that the in vitro methods used in this study are relevant for studying the activities of the SFV nsP2 protease.

    DISCUSSION

    The SFV nonstructural protease is a highly specific enzyme, which is able to cleave short model substrates containing sequences related to its native cleavage sites (44) (Fig. 1). Here we have shown that at least residues P1 to P4 preceding the cleavage site are critical for the activity of the protease (Fig. 2). Positions P1 to P3 are relatively well conserved in various alphaviruses, but P4 shows wide variation (Table 1). Nevertheless, the P4 residue has a major influence on cleavage efficiency and thus has an important regulatory function. The amino acid in position P1' also has a significant influence on processing, although in the native sites residues preceding the site dominate over the P1' effect (Fig. 4). The results obtained with Pro39 and full-length nsP2 were identical in these respects, indicating that the N-terminal domain of nsP2 does not influence the recognition of short model substrates.

    Cleavage of the ns polyprotein in infected cells. During alphavirus infection, the three sites are cleaved in the specific order of 3/4 followed by 1/2 and finally 2/3 (36, 43), which permits the proper formation of replication complexes sequentially active in minus-strand and plus-strand syntheses, as described in the introduction. The 3/4 site is rapidly cleaved during virus infection, and this cleavage is absolutely essential for virus replication and virus-specific RNA synthesis (20, 25, 37), as also confirmed by results with the lethal mutant SFV(34/RAEV) in this study. This cleavage creates the core polymerase subunit nsP4, which requires its correct N terminus to be functional. The native tyrosine in this position can be replaced only by another aromatic amino acid or a histidine (38). We have shown here that the native 3/4 site is not optimal for the protease in vitro, since replacement of the P1' Tyr with Gly or certain other residues creates a better site (Fig. 3 and 4).
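
    The cleavage order described above can be traced with a short sketch. It only models which fragments exist after each cut (3/4, then 1/2, then 2/3) and says nothing about kinetics; the function names `label`, `cleave`, and `processing_pathway` are hypothetical helpers introduced for illustration.

```python
def label(units):
    """Name a fragment: a single unit is 'nsPn', a multi-unit precursor is 'Pnnn'."""
    digits = "".join(str(u) for u in units)
    return f"nsP{digits}" if len(units) == 1 else f"P{digits}"


def cleave(fragments, site):
    """Split whichever fragment spans the given site, e.g. '3/4' cuts between nsP3 and nsP4."""
    left, right = (int(x) for x in site.split("/"))
    result = []
    for units in fragments:
        if left in units and right in units:
            cut = units.index(right)
            result.extend([units[:cut], units[cut:]])
        else:
            result.append(units)
    return result


def processing_pathway(order=("3/4", "1/2", "2/3")):
    """Return the fragment names present before any cut and after each successive cut."""
    fragments = [(1, 2, 3, 4)]  # the full-length polyprotein P1234
    steps = [[label(f) for f in fragments]]
    for site in order:
        fragments = cleave(fragments, site)
        steps.append([label(f) for f in fragments])
    return steps


if __name__ == "__main__":
    for step in processing_pathway():
        print(" + ".join(step))
```

    With the default order, the intermediate states are P123 plus nsP4 (the early complex) and nsP1 plus P23 plus nsP4 (the short-lived intermediate), followed by the four mature proteins.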

    Other than the P1' residue, the 3/4 site appears to be optimized for the protease to allow its fast cleavage; this is especially true with regard to the P4 arginine, which cannot be altered without reducing the cleavage in vitro (Fig. 2 and 5). Mutations in the P4 position also led to increased accumulation of the precursors P34 and P1234 in cells infected with mutant viruses (Fig. 6). However, this had a relatively small effect on the infectivity of viral RNA and on virus growth in cell culture. The amount of functional nsP4 needed for the formation of replicase complexes may be rather small. The level of nsP4 is downregulated in infected cells by rapid degradation and in many alphaviruses (but not in SFV) also by a leaky translation termination codon preceding nsP4 (17, 40). Reduced levels of free nsP3 in the mutants were also sufficient for replication, suggesting that the amount of mature nsP3 is not limiting under these conditions. It should be noted that a significant fraction of the nsPs is found in subcellular locations other than the replication complexes (35).

    The 1/2 site is normally cleaved predominantly in cis (43). However, this is not an absolute requirement, since the protease can cleave 1/2 substrates with fair efficiency in trans (Fig. 5). During infection, and also in the experimental setup of in vitro translation, the increase in local substrate concentration that cis cleavage provides may make it the predominant pathway. This situation differs from the cleavage of the hepatitis C virus (HCV) polyprotein site NS3/4A, which is obligatorily cleaved in cis (4, 21). The sequence of the NS3/4A site is distinct from those of the other HCV cleavage sites and more tolerant of substitutions, due to additional interactions involved in recognition of the site and due to protein folding (47). In our case, the most apparent sequence feature of the 1/2 site that should reduce its cleavage efficiency is the presence of a histidine in the P4 position (Fig. 5).

    The 2/3 site is not cleaved by Pro39 or by nsP2 when it is present in small model substrates. The site combines several unfavorable features: the downstream region is the least preferred of the three sites (Fig. 4), the P1 residue is a cysteine, and, most importantly, the P4 residue is a threonine. Changing the latter to an arginine is by itself enough to make the site cleavable, although still not efficiently (Fig. 5). However, during infection or during in vitro translation of SFV polyproteins, the native 2/3 site is quite efficiently cleaved (43). Two features are needed to accomplish this. First, the protease must be full-length nsP2 with a correct N terminus (43). This requirement explains why cleavage of the 1/2 site always precedes that of the 2/3 site (20, 30, 36). Second, the substrate must be much larger than the model peptides used here (A. Lulla, unpublished data). Therefore, it is likely that an additional recognition of the substrate, or a modification of the conformation of the substrate and/or protease, is needed for the cleavage of this otherwise poor site. Dilution experiments indicate that the cleavage of the 2/3 site takes place in trans (43), so this recognition must be an intermolecular event. Experiments with larger substrates are ongoing to understand the special features of the native 2/3 site processing.

    It is notable that the N termini of the alphavirus replication proteins are much more conserved than the C termini (Table 1). However, the protease predominantly recognizes the regions preceding the cleavage sites (Fig. 4). Thus, the conservation of the N-terminal sequences reflects other functional requirements, at least in the cases of nsP4 and nsP2, as discussed above. Variation in the C-terminal sequences suggests that protease recognition has undergone some evolution in alphaviruses. The variation is especially common for the P4 position, which we found to be a central determinant of cleavage efficiency in SFV. There are viruses in which the P4 residues of the 2/3 and 3/4 sites are identical (Table 1). In those cases, other features in the primary sequence or conformation of the sites are likely to influence their differential recognition, since the mechanism regulating minus- and plus-strand RNA syntheses and the underlying specific order of cleavages are conserved in alphaviruses (20, 43).

    Substrate recognition of SFV protease compared with other proteases. The alphavirus nsP2 proteases form one family among at least 24 in the largest clan, CA, of cysteine proteases. The different families within the clan are usually very distantly related to each other, such that they cannot be recognized as homologous by simple sequence comparisons, but all these families are still thought to have evolved from a single ancestral peptidase, as shown by the conservation of their basic protein fold (3). Most clan CA members, such as the archetypal cysteine protease papain, several cathepsins, and the foot-and-mouth disease virus leader protease Lpro, strongly prefer large hydrophobic residues in the P2 position (3, 22, 32). Many of these proteases also favor large residues in the P3 position (22, 32). The situation is completely different for nsP2, which can recognize only substrates with small residues in the P1, P2, and P3 positions and specifically requires a glycine in the P2 position (Fig. 2). This indicates that the substrate recognition site of nsP2 has to be rather constricted. There are also other examples in clan CA of proteases that recognize substrates with small residues, notably the coronavirus papain-like protease PLP2, which usually cleaves substrates with glycines in P1 and P2, as well as a small residue in P1' and a positively charged residue in P3 (14). The role of the P4 residue in the substrate recognition of cysteine proteases has been less clear, but our comprehensive analysis shows that it is crucially important for nsP2 (Fig. 2). Other studies also indicate that P4 has a significant influence on substrate cleavage in addition to P1 to P3 (14, 32).

    The clan CA contains several families of viral cysteine proteases (3), but so far they have not been as extensively studied as inhibitor targets as have other types of viral proteases. Virus-encoded proteases have been considered attractive drug targets, and the human immunodeficiency virus aspartate protease has been very successfully targeted (29). Studies of substrate recognition have been crucial in the development of inhibitors in the case of the HCV serine protease. Inhibitors derived from the N-terminal cleavage product hexapeptide by incorporating modified amino acids already have proved very potent (16) and have since been developed into highly efficient peptidomimetic compounds (23). Our current studies can be applied in the design of inhibitors against alphavirus protease, as they suggest the types of moieties preferred at different substrate positions.

    Our study revealed that the optimal cleavage consensus for the SFV nsP2 is represented by the sequence (G)RAGA/G, which is not found in any of the virus-specific sites. This sequence is hydrophilic and rich in small amino acids, and sequence searches show that it is relatively common in the mammalian proteome. Cellular targets have been identified for poliovirus proteases 2A and 3C, and the functional significance of these interactions has been revealed (6, 12, 31). More recently, an important role in immune evasion was demonstrated for the HCV protease NS3 (9, 26). It is likely that the alphavirus protease is also involved in the inactivation or activation of certain crucial cellular proteins during infection. Interestingly, a large fraction of nsP2 is found in the nucleus of the infected cells (17), and there it seems to be involved in processes important for pathogenicity (8). It is tempting to speculate that some nuclear proteins may represent the cellular targets of this enzyme, and the identification of its preferred substrate sequences may facilitate the identification of such targets.
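
    A search for the consensus in candidate protein sequences could be sketched as follows. This is an illustrative sketch only and is not the search procedure used in the study: it treats the P5 glycine as optional, requires an exact RAGA at P4 to P1 and a glycine at P1', and the function name `find_candidate_sites` and the toy sequence are hypothetical.

```python
import re

# (G)RAGA/G: optional Gly at P5, Arg-Ala-Gly-Ala at P4-P1, Gly at P1'.
# The lookahead keeps P1' out of the match, so the match ends at the scissile bond.
CONSENSUS = re.compile(r"(G?)RAGA(?=G)")


def find_candidate_sites(sequence):
    """Return (cut_position, has_p5_gly) for each consensus match.

    cut_position is the number of residues on the N-terminal side of the
    predicted scissile bond (0-based index of the P1' residue).
    """
    return [
        (match.end(), bool(match.group(1)))
        for match in CONSENSUS.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    toy = "MKTGRAGAGLLSRAGAYPQ"  # made-up sequence, not a real protein
    for cut, has_p5_gly in find_candidate_sites(toy):
        print(f"cut after residue {cut}: {toy[:cut]} / {toy[cut:]} (P5 Gly: {has_p5_gly})")
```

    An exact-match pattern such as this one ignores the tolerated substitutions described above (for example, Ser at P3), so it would understate the number of potentially cleavable sequences; a match also says nothing about whether the site is accessible in the folded protein.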

    ACKNOWLEDGMENTS

    We thank Lidia Vasiljeva for her help and advice at the start of this project, Ursel Soomets for help with mass spectrometry, and Anna Iofik for technical assistance.

    This work was supported by grant 5055 from the Estonian Science Foundation and by the European Union 5th Framework program (Project SFvectors), as well as by Academy of Finland grant 201687 and grant 067575 from the Wellcome Trust.

    REFERENCES

    1. Ahola, T., and L. Kääriäinen. 1995. Reaction in alphavirus mRNA capping: formation of a covalent complex of nonstructural protein nsP1 with 7-methyl-GMP. Proc. Natl. Acad. Sci. USA 92:507-511.

    2. Ahola, T., A. Lampio, P. Auvinen, and L. Kääriäinen. 1999. Semliki Forest virus mRNA capping enzyme requires association with anionic membrane phospholipids for activity. EMBO J. 18:3164-3172.

    3. Barrett, A. J., and N. D. Rawlings. 2001. Evolutionary lines of cysteine peptidases. Biol. Chem. 382:727-733.

    4. Bartenschlager, R., L. Ahlborn-Laake, K. Yasargil, J. Mous, and H. Jacobsen. 1995. Substrate determinants for cleavage in cis and in trans by the hepatitis C virus NS3 proteinase. J. Virol. 69:198-205.

    5. Barton, D. J., S. G. Sawicki, and D. L. Sawicki. 1988. Demonstration in vitro of temperature-sensitive elongation of RNA in Sindbis virus mutant ts6. J. Virol. 62:3597-3602.

    6. Das, S., and A. Dasgupta. 1993. Identification of the cleavage site and determinants required for poliovirus 3CPro-catalyzed cleavage of human TATA-binding transcription factor TBP. J. Virol. 67:3326-3331.

    7. Ding, M. X., and M. J. Schlesinger. 1989. Evidence that Sindbis virus NSP2 is an autoprotease which processes the virus nonstructural polyprotein. Virology 171:280-284.

    8. Fazakerley, J. K., A. Boyd, M. L. Mikkola, and L. Kääriäinen. 2002. A single amino acid change in the nuclear localization sequence of the nsP2 protein affects the neurovirulence of Semliki Forest virus. J. Virol. 76:392-396.

    9. Foy, E., K. Li, C. Wang, R. Sumpter, Jr., M. Ikeda, S. M. Lemon, and M. Gale, Jr. 2003. Regulation of interferon regulatory factor-3 by the hepatitis C virus serine protease. Science 300:1145-1148.

    10. Gomez de Cedrón, M., N. Ehsani, M. L. Mikkola, J. A. García, and L. Kääriäinen. 1999. RNA helicase activity of Semliki Forest virus replicase protein NSP2. FEBS Lett. 448:19-22.

    11. Gorchakov, R., E. Frolova, R. G. Williams, C. M. Rice, and I. Frolov. 2004. PKR-dependent and -independent mechanisms are involved in translational shutoff during Sindbis virus infection. J. Virol. 78:8455-8467.

    12. Gradi, A., Y. V. Svitkin, H. Imataka, and N. Sonenberg. 1998. Proteolysis of human eukaryotic translation initiation factor eIF4GII, but not eIF4GI, coincides with the shutoff of host protein synthesis after poliovirus infection. Proc. Natl. Acad. Sci. USA 95:11089-11094.

    13. Hahn, Y. S., A. Grakoui, C. M. Rice, E. G. Strauss, and J. H. Strauss. 1989. Mapping of RNA– temperature-sensitive mutants of Sindbis virus: complementation group F mutants have lesions in nsP4. J. Virol. 63:1194-1202.

    14. Han, Y., G. Chang, C. Juo, H. Lee, S. Yeh, J. Hsu, and X. Chen. 2005. Papain-like protease 2 (PLP2) from severe acute respiratory syndrome coronavirus (SARS-CoV): expression, purification, characterization, and inhibition. Biochemistry 44:10349-10359.

    15. Hardy, W. R., and J. H. Strauss. 1989. Processing the nonstructural polyproteins of Sindbis virus: nonstructural proteinase is in the C-terminal half of nsP2 and functions both in cis and in trans. J. Virol. 63:4653-4664.

    16. Ingallinella, P., S. Altamura, E. Bianchi, M. Taliani, R. Ingenito, R. Cortese, R. De Francesco, C. Steinkühler, and A. Pessi. 1998. Potent peptide inhibitors of human hepatitis C virus NS3 protease are obtained by optimizing the cleavage products. Biochemistry 37:8906-8914.

    17. Kääriäinen, L., and T. Ahola. 2002. Functions of alphavirus nonstructural proteins in RNA replication. Prog. Nucleic Acid Res. Mol. Biol. 71:187-222.

    18. Kaldalu, N., D. Lepik, A. Kristjuhan, and M. Ustav. 2000. Monitoring and purification of proteins using bovine papillomavirus E2 epitope tags. BioTechniques 28:456-462.

    19. Karras, G. I., G. Kustatscher, H. R. Buhecha, M. D. Allen, C. Pugieux, F. Sait, M. Bycroft, and A. G. Ladurner. 2005. The macro domain is an ADP-ribose binding module. EMBO J. 24:1911-1920.

    20. Kim, K. H., T. Rumenapf, E. G. Strauss, and J. H. Strauss. 2004. Regulation of Semliki Forest virus RNA replication: a model for the control of alphavirus pathogenesis in invertebrate hosts. Virology 323:153-163.

    21. Kolykhalov, A. A., E. V. Agapov, and C. M. Rice. 1994. Specificity of the hepatitis C virus NS3 serine protease: effects of substitutions at the 3/4A, 4A/4B, 4B/5A, and 5A/5B cleavage sites on polyprotein processing. J. Virol. 68:7525-7533.

    22. Kuehnel, E., R. Cencic, N. Foeger, and T. Skern. 2004. Foot-and-mouth disease virus leader proteinase: specificity at the P2 and P3 positions and comparison with other papain-like enzymes. Biochemistry 43:11482-11490.

    23. Lamarre, D., P. C. Anderson, M. Bailey, P. Beaulieu, G. Bolger, P. Bonneau, M. Bös, D. R. Cameron, M. Cartier, M. G. Cordingley, et al. 2003. An NS3 protease inhibitor with antiviral effects in humans infected with hepatitis C virus. Nature 426:186-189.

    24. Lemm, J. A., A. Bergqvist, C. M. Read, and C. M. Rice. 1998. Template-dependent initiation of Sindbis virus RNA replication in vitro. J. Virol. 72:6546-6553.

    25. Lemm, J. A., T. Rumenapf, E. G. Strauss, J. H. Strauss, and C. M. Rice. 1994. Polypeptide requirements for assembly of functional Sindbis virus replication complexes: a model for the temporal regulation of minus- and plus-strand RNA synthesis. EMBO J. 13:2925-2934.

    26. Li, K., E. Foy, J. C. Ferreon, M. Nakamura, A. C. Ferreon, M. Ikeda, S. C. Ray, M. Gale, Jr., and S. M. Lemon. 2005. Immune evasion by hepatitis C virus NS3/4A protease-mediated cleavage of the Toll-like receptor 3 adaptor protein TRIF. Proc. Natl. Acad. Sci. USA 102:2992-2997.

    27. Liljeström, P., S. Lusa, D. Huylebroeck, and H. Garoff. 1991. In vitro mutagenesis of a full-length cDNA clone of Semliki Forest virus: the small 6,000-molecular-weight membrane protein modulates virus release. J. Virol. 65:4107-4113.

    28. Lin, Y. H., P. Yadav, R. Ravatn, and V. Stollar. 2000. A mutant of Sindbis virus that is resistant to pyrazofurin encodes an altered RNA polymerase. Virology 272:61-71.

    29. Magden, J., L. Kääriäinen, and T. Ahola. 2005. Inhibitors of virus replication: recent developments and prospects. Appl. Microbiol. Biotechnol. 66:612-621.

    30. Merits, A., L. Vasiljeva, T. Ahola, L. Kääriäinen, and P. Auvinen. 2001. Proteolytic processing of Semliki Forest virus-specific non-structural polyprotein by nsP2 protease. J. Gen. Virol. 82:765-773.

    31. Neznanov, N., K. M. Chumakov, L. Neznanova, A. Almasan, A. K. Banerjee, and A. V. Gudkov. 2005. Proteolytic cleavage of the p65-RelA subunit of NF-kappaB during poliovirus infection. J. Biol. Chem. 280:24153-24158.

    32. Portaro, F. C., A. B. Santos, M. H. Cezari, M. A. Juliano, L. Juliano, and E. Carmona. 2000. Probing the specificity of cysteine proteinases at subsites remote from the active site: analysis of P4, P3, P2' and P3' variations in extended substrates. Biochem. J. 347:123-129.

    33. Putics, Á., W. Filipowicz, J. Hall, A. E. Gorbalenya, and J. Ziebuhr. 2005. ADP-ribose-1"-monophosphatase: a conserved coronavirus enzyme that is dispensable for viral replication in tissue culture. J. Virol. 79:12721-12731.

    34. Rikkonen, M., J. Peränen, and L. Kääriäinen. 1994. ATPase and GTPase activities associated with Semliki Forest virus nonstructural protein nsP2. J. Virol. 68:5804-5810.

    35. Salonen, A., L. Vasiljeva, A. Merits, J. Magden, E. Jokitalo, and L. Kääriäinen. 2003. Properly folded nonstructural polyprotein directs the Semliki Forest virus replication complex to the endosomal compartment. J. Virol. 77:1691-1702.

    36. Shirako, Y., and J. H. Strauss. 1990. Cleavage between nsP1 and nsP2 initiates the processing pathway of Sindbis virus nonstructural polyprotein P123. Virology 177:54-64.

    37. Shirako, Y., and J. H. Strauss. 1994. Regulation of Sindbis virus RNA replication: uncleaved P123 and nsP4 function in minus-strand RNA synthesis, whereas cleaved products from P123 are required for efficient plus-strand RNA synthesis. J. Virol. 68:1874-1885.

    38. Shirako, Y., and J. H. Strauss. 1998. Requirement for an aromatic amino acid or histidine at the N terminus of Sindbis virus RNA polymerase. J. Virol. 72:2310-2315.

    39. Strauss, E. G., R. J. de Groot, R. Levinson, and J. H. Strauss. 1992. Identification of the active site residues in the nsP2 proteinase of Sindbis virus. Virology 191:932-940.

    40. Strauss, J. H., and E. G. Strauss. 1994. The alphaviruses: gene expression, replication, and evolution. Microbiol. Rev. 58:491-562. (Erratum, 58:806.)

    41. ten Dam, E., M. Flint, and M. D. Ryan. 1999. Virus-encoded proteinases of the Togaviridae. J. Gen. Virol. 80:1879-1888.

    42. Vasiljeva, L., A. Merits, P. Auvinen, and L. Kääriäinen. 2000. Identification of a novel function of the alphavirus capping apparatus. RNA 5'-triphosphatase activity of Nsp2. J. Biol. Chem. 275:17281-17287.

    43. Vasiljeva, L., A. Merits, A. Golubtsov, V. Sizemskaja, L. Kääriäinen, and T. Ahola. 2003. Regulation of the sequential processing of Semliki Forest virus replicase polyprotein. J. Biol. Chem. 278:41636-41645.

    44. Vasiljeva, L., L. Valmu, L. Kääriäinen, and A. Merits. 2001. Site-specific protease activity of the carboxyl-terminal domain of Semliki Forest virus replicase protein nsP2. J. Biol. Chem. 276:30786-30793.

    45. Wang, H. L., J. O'Rear, and V. Stollar. 1996. Mutagenesis of the Sindbis virus nsP1 protein: effects on methyltransferase activity and viral infectivity. Virology 217:527-531.

    46. Wang, Y. F., S. G. Sawicki, and D. L. Sawicki. 1994. Alphavirus nsP3 functions to form replication complexes transcribing negative-strand RNA. J. Virol. 68:6466-6475.

    47. Yao, N., P. Reichert, S. S. Taremi, W. W. Prosise, and P. C. Weber. 1999. Molecular views of viral polyprotein processing revealed by the crystal structure of the hepatitis C virus bifunctional protease-helicase. Structure 7:1353-1363.

    (Aleksei Lulla, Valeria Lu)