
Fast High-resolution Protein Structure Determination by Using Unassigned NMR Data

2.3 Results and Discussions

2.3.6 FastNMR tolerates spurious peaks and multiple conformations

In the various 3D experiments in which the magnetization is detected on the 1HN nuclei, one pair of 1HN and 15N chemical shifts is expected for each amino acid in the protein except for prolines and the N-terminal residue. However, when parts of the protein exchange between different conformations, backbone signals can be missing if the exchange is intermediate on the NMR time scale, or additional signals can be present if the exchange is slow on the NMR time scale. Additional signals can also be due to spectral artifacts.

The influence of multiple conformations and spurious peaks on the performance of FastNMR was tested for the 76-aa protein ubiquitin. The ten most C-terminal residues were removed from the primary sequence of ubiquitin, while all 70 experimentally observed spin systems were kept. Despite the presence of four spin systems that did not correspond to any position in the truncated primary sequence, FastNMR calculated a high-resolution structure of ubiquitin.

The tests on ubiquitin are strongly supported by the results of the de novo FastNMR structure determination of the 65-aa toxin Conk-S2. Initially, Conk-S2 NMR samples were prepared in 50 mM sodium acetate buffer, pH 5.2. In a first 2D 1H-15N HSQC spectrum recorded on Conk-S2, however, about 95 peaks potentially corresponding to 1H/15N backbone nuclei were present. Therefore, the pH was raised to 6.3 (50 mM sodium phosphate buffer). At pH 6.3, 60 backbone signals were visible in the 2D 1H-15N HSQC spectrum and in the triple-resonance spectra, roughly matching the expected 59 backbone signals (Conk-S2 contains 5 prolines). The high-resolution structure of Conk-S2 was then determined at pH 6.3 (see Fig. 1A and Table 1). FastNMR produced not only the high-resolution structure of Conk-S2 but also the assignment of backbone and side chain resonances.

Surprisingly, FastNMR assigned the backbone signals (and therefore also the side chain resonances) of only 53 residues, i.e. seven backbone signals remained unassigned. Therefore, additional triple-resonance experiments (3D HNCO, 3D HN(CA)CO, 3D HNCA and 3D HN(CO)CA) were measured to see whether the remaining seven 1H/15N backbone signals could be assigned. Manual analysis of the six 3D triple-resonance spectra finally showed that two of these signals correspond to V64 and G65. Of the other five spin systems, four arose from additional conformations of residues G65 (2x), V64 (1x) and R6 (1x), and one could not be assigned. In addition, after close inspection, some weaker spin systems (not counted among the seven) were assigned to residues Q58, Q61 and Y62. No signals were observed for residues S8, G42, T53 and N54 due to chemical exchange that is intermediate on the NMR time scale. Thus, FastNMR was able to determine a high-resolution de novo structure of Conk-S2 despite significant complications due to chemical exchange.
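The signal counts quoted for Conk-S2 can be checked with a few lines of arithmetic. The following is only a bookkeeping sketch: the function name and the dictionary labels are hypothetical, and it assumes that exactly one terminal residue and no terminal proline is excluded from the expected count.

```python
def expected_backbone_amides(n_residues: int, n_prolines: int) -> int:
    """Residues expected to give one 1HN/15N pair.

    Assumption: every residue contributes one pair, except the prolines
    and one terminal residue (which is assumed not to be a proline).
    """
    return n_residues - n_prolines - 1


# Counts quoted in the text for Conk-S2 at pH 6.3.
expected = expected_backbone_amides(n_residues=65, n_prolines=5)
observed = 60
assigned_by_fastnmr = 53
remaining = observed - assigned_by_fastnmr

# Outcome of the manual analysis of the remaining spin systems.
manual_analysis = {
    "main conformation (V64, G65)": 2,
    "additional conformations (G65 2x, V64 1x, R6 1x)": 4,
    "could not be assigned": 1,
}

# The manual breakdown must account for every remaining spin system.
assert sum(manual_analysis.values()) == remaining

print(f"expected: {expected}, observed: {observed}, remaining: {remaining}")
```

The weaker spin systems later assigned to Q58, Q61 and Y62 are deliberately left out, since the text states that they were not counted among the seven.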

These results demonstrate that FastNMR is highly robust against spurious peaks and signals arising from multiple conformations.

Cross-validation ensures that FastNMR does not produce wrong structures: for backbone assignment and fold determination only RDCs and chemical shifts are used, whereas RDCs are not used during automated NOE assignment. Thus, if the initial fold is incorrect, it is unlikely that a sufficient number of NOEs is assigned during automated, structure-based NOE assignment. Even if a large enough number of NOEs is assigned, the NOE-based structure will likely differ significantly from the initial fold and disagree with the RDCs. Therefore, in the final stage of FastNMR, when all experimental data are combined, convergence to a low-energy structure is not possible. This is clearly visible from Figure 2.5D and the additional stability tests.

Table 2.4 Deviation between different 3D structures of ubiquitin and cross-validation by 1D(CH) and 1D(C,C) RDCs.a

a Values given are rmsd values in Å and calculated for residues 2-72 between the mean structure of the NMR ensembles or the 1.8 Å X-ray structure.

b 1.8 Å X-ray structure of ubiquitin (PDB code: 1ubq).

c NMR solution structure of ubiquitin, which was determined with an extremely large number of experimental restraints, including six different types of RDCs (PDB code: 1d3z).

d Structure of ubiquitin that was recalculated from the distance restraints, dihedral angles and three types of RDCs (1DN-H, 1DC-N, 1DC-C), which are available from the PDB (PDB code: 1d3z.mr).

e Pearson’s correlation coefficient for the comparison of 62 1DC-H and 39 1DC-C RDCs with values back-calculated from the various structures using singular-value decomposition.

f Note that the very high RDC correlation is due to the fact that the 1DC-H and 1DC-C RDCs were used in the structure calculation of 1d3z.

g Note that the rms deviation of the 20 lowest-energy structures of the FastNMR ensemble with respect to the average structure of this ensemble is 0.42 Å, whereas the maximum deviation of any of the 20 structures of the ensemble with respect to the average structure is 0.58 Å.
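The cross-validation statistic named in footnote e of Table 2.4 is Pearson's correlation coefficient between measured RDCs and RDCs back-calculated from a structure. The sketch below shows only that final comparison step in plain Python; the back-calculation by singular-value decomposition is not shown, the function name is hypothetical, and the numerical values are invented for illustration.

```python
import math
from typing import Sequence


def pearson_r(measured: Sequence[float], back_calculated: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equally long value lists."""
    n = len(measured)
    if n != len(back_calculated) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_m = sum(measured) / n
    mean_b = sum(back_calculated) / n
    # Covariance and variances, all without the common 1/n factor,
    # which cancels in the ratio below.
    cov = sum((m - mean_m) * (b - mean_b) for m, b in zip(measured, back_calculated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_b = sum((b - mean_b) ** 2 for b in back_calculated)
    if var_m == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_m * var_b)


# Invented RDC values in Hz, for illustration only (not data from this work).
measured_rdcs = [12.1, -8.4, 3.3, -15.0, 7.9]
back_calculated_rdcs = [11.5, -7.9, 4.1, -14.2, 8.6]

r = pearson_r(measured_rdcs, back_calculated_rdcs)
print(f"Pearson r = {r:.3f}")
```

Because the couplings used for cross-validation are not part of the structure calculation, a correlation close to 1 indicates that the structure agrees with independent data (compare footnote f for the case where they were used).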

Only for correct, high-resolution structures does FastNMR obtain a low total energy. In addition, FastNMR structures have to pass the following checkpoints: (i) at each stage of FastNMR, the structures must have converged to a unique conformation, (ii) structural changes from the initial fold to the high-resolution structure must be less than 3.5 Å, (iii) more than 85% of the backbone resonances must have been assigned prior to starting the automated NOE assignment, and (iv) FastNMR structures have to pass the standard quality criteria such as violations of experimental restraints (Table 2.2).

Figure 2.6: Comparison of per-residue Ramachandran plot quality Z-score between FastNMR structures and conventionally determined structures. (A) PDB (2CA7) structure (red) and FastNMR structure (green) of Conk-S1. (B) FastNMR structure of ubiquitin (green) and structure of ubiquitin recalculated from manually evaluated NMR data deposited in the PDB (1D3Z.mr) (red). Average values for the twenty lowest-energy structures are shown. Additional structural statistics can be found in Table 1.1.
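The four checkpoints (i) to (iv) can be read as a simple acceptance test applied to a finished run. The following is a minimal sketch of that logic and not the FastNMR implementation: the `RunSummary` class, its field names and the example values are hypothetical, and only the two thresholds (3.5 Å and 85%) are taken from the text.

```python
from dataclasses import dataclass
from typing import List

MAX_FOLD_CHANGE_ANGSTROM = 3.5   # checkpoint (ii), from the text
MIN_ASSIGNED_FRACTION = 0.85     # checkpoint (iii), from the text


@dataclass
class RunSummary:
    """Hypothetical summary of one structure determination run."""
    converged_at_each_stage: bool       # (i) unique conformation at every stage
    fold_to_final_rmsd: float           # (ii) change from initial fold, in Å
    assigned_backbone_fraction: float   # (iii) fraction assigned before NOE assignment
    passes_quality_criteria: bool       # (iv) e.g. restraint violations


def failed_checkpoints(run: RunSummary) -> List[str]:
    """Return the labels of all checkpoints that the run does not pass."""
    failures = []
    if not run.converged_at_each_stage:
        failures.append("(i) no convergence to a unique conformation")
    if not run.fold_to_final_rmsd < MAX_FOLD_CHANGE_ANGSTROM:
        failures.append("(ii) structural change of 3.5 Å or more")
    if not run.assigned_backbone_fraction > MIN_ASSIGNED_FRACTION:
        failures.append("(iii) 85% or less of backbone resonances assigned")
    if not run.passes_quality_criteria:
        failures.append("(iv) standard quality criteria not met")
    return failures


# Invented example values, for illustration only.
good_run = RunSummary(True, 1.8, 0.90, True)
bad_run = RunSummary(True, 4.2, 0.80, True)

print(failed_checkpoints(good_run))
print(failed_checkpoints(bad_run))
```

Returning the list of failed checkpoints, rather than a single boolean, makes it visible which criterion rejected a structure.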

2.4 Conclusions

The strategy of FastNMR is based on the approach that has proven robust in manual structure determination. This includes the use of information from triple-resonance experiments for sequential backbone assignment, iterative NOE assignment and structure calculation, and structure refinement using RDCs. Key to the success of FastNMR is, however, the simultaneous determination of the backbone assignment and the protein fold prior to analysis of NOE data.

FastNMR in its current implementation is limited to domain-sized proteins. This is mainly because the only experiments used for extraction of side chain chemical shifts are CCONH- and HCCONH-TOCSY experiments. The performance of these experiments decreases with increasing molecular weight of the protein, and they do not give access to chemical shifts of aromatic groups. A larger number of chemical shifts will be available when 3D (H)CCH-COSY and 3D H(C)CH-TOCSY spectra [Kay, Xu 1993] are incorporated into FastNMR. In addition, aromatic chemical shifts can be obtained from two-dimensional (H)C(CC)H and (H)C(CCC)H spectra [Yamazaki, Forman-Kay 1993]. The incorporation of these experiments into FastNMR is in progress.

In conclusion, we have demonstrated that it is possible to determine high-resolution structures of domain-sized proteins within 24 hours starting from unassigned chemical shifts, RDCs and NOE peak lists, and have used this approach to determine the de novo structure of Conk-S2. The calculation time can be reduced even further when FastNMR is executed in parallel on several computers or when faster computers are available. FastNMR runs automatically, avoids wrong structures by cross-validation, works for experimental data, requires only a limited number of NMR spectra and produces high-resolution structures. No manual assignment of chemical shifts or inter-residue correlations is required. Interactive work is confined to the processing of NMR spectra and to the preparation of input lists, which contain the unassigned, experimental chemical shifts, residual dipolar couplings and NOE peaks.

As FastNMR is highly robust with respect to missing or wrong chemical shift assignments, we expect that peak picking and grouping can also be performed fully automatically.

FastNMR is a method for de novo structure determination, i.e. no prior structure or fold is assumed. However, with the rapid increase in the number of available 3D structures, it becomes more and more likely that a close homologue can be found in the Protein Data Bank (PDB) (www.rcsb.org). The structure of this homologue can be supplied as input to FastNMR. This will be important for larger proteins, for which the fold cannot be determined reliably using only RDCs and chemical shifts. When FastNMR is combined with methods for fast data acquisition, such as G-Matrix Fourier Transform NMR spectroscopy [Shen, Atreya 2005], it appears possible to obtain high-resolution NMR structures in less than one week after preparation of suitable NMR samples. Fast and efficient determination of high-resolution structures in solution will make biomolecular NMR a more efficient tool for structural biology.

Chapter 3

High-resolution 3D Structure Determination of Kaliotoxin