1.1 Structure-based drug design

1.1.1 The drug discovery process

Health is one of the most important aspects of life. A central aim of science is therefore to provide the medical knowledge, therapies and drugs needed to cure patients and restore their health and quality of life. Natural products that can be used as drugs have been known throughout the history of mankind.

Still, it was only about a century ago that it became possible to develop drugs in a rational way. This methodology is called drug discovery and aims to identify molecules and modify them chemically so that they best fulfil the purpose of a drug, namely to cure a disease. The first ideas about a rational drug development process came up at the end of the 19th century, when Paul Ehrlich postulated the existence of chemoreceptors [Ehrlich, 1900]. The structures of suitable molecules have to be modified and optimized to fit these receptors as well as possible. This was expressed in the phrase: ‘we have to learn to aim chemically’.

Indeed, the first screening was carried out in his laboratories: hundreds of newly synthesized organic arsenic compounds were tested, which led to the discovery of arsphenamine as a cure for syphilis [Ehrlich, 1910]. This ground-breaking study influenced the following generations and led to the discovery of many drugs, e.g. penicillin [Fleming, 1929], and to the establishment of chemotherapy [Strebhardt and Ullrich, 2008].

In 1905 the concept of receptors was refined by describing them as switches that can be activated by agonists and blocked by antagonists [Langley, 1905]. It took until the early 1950s, however, before this finding could be exploited: the different forms of adrenergic receptors had been described [Ahlquist, 1948], and drugs like adrenalin (a β-adrenoreceptor agonist), β-blockers and benzodiazepines were discovered and further developed [Drews, 2000].

Figure 1.1: Overview of the drug discovery process from target identification to approval for market. Indicated are the topics investigated in this thesis.

The driving force of drug discovery is still high-throughput screening [Macarron et al., 2011, Lahana, 1999], which is nowadays fully automated and provides the ‘hits’: binding scaffolds that are then optimized to lead structures.

Hand in hand with the improvements in structural biology, more 3-D structures of proteins became available, and the medicinal chemist can now optimize compounds in a more rational and faster manner. This structural information also provides the opportunity to screen molecules virtually, which is less resource-intensive [Kitchen et al., 2004] and still provides a good selection [Clark, 2006]. Additionally, the properties of drugs are understood more profoundly today, especially through the application of Lipinski's rule of five, which states that a drug molecule should have i) a lipophilicity (logP) of no more than 5, ii) a molecular weight below 500 Da, iii) no more than 5 hydrogen bond donors and iv) no more than 10 hydrogen bond acceptors [Lipinski et al., 2001]. Applying such rules helps to identify, at an early stage, compounds that would fail in later stages due to toxicity or insufficient bioavailability [Kubinyi, 2003]. Besides the developments in screening, the use of natural products as drugs, optimized by the addition of functional groups, has again come into focus [Koehn and Carter, 2005].
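The rule of five lends itself to a simple filter. The following is a minimal sketch, not the procedure used in this thesis: it assumes that the four descriptors have already been computed by some cheminformatics tool, and the names `Descriptors`, `rule_of_five_violations` and `passes_rule_of_five` are hypothetical. Tolerating one violation (`max_violations=1`) is a common convention and is likewise an assumption here, as are the exact threshold comparisons.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    log_p: float            # octanol-water partition coefficient (logP)
    mol_weight: float       # molecular weight in Da
    h_bond_donors: int      # number of hydrogen bond donors
    h_bond_acceptors: int   # number of hydrogen bond acceptors


def rule_of_five_violations(d: Descriptors) -> list:
    """Return the names of the rule-of-five criteria that are violated."""
    violations = []
    if d.log_p > 5:
        violations.append("logP")
    if d.mol_weight > 500:
        violations.append("molecular weight")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors")
    return violations


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a molecule if it violates at most `max_violations` criteria."""
    return len(rule_of_five_violations(d)) <= max_violations


# Illustrative, made-up descriptor values (not measured data).
small_molecule = Descriptors(log_p=2.1, mol_weight=350.0,
                             h_bond_donors=2, h_bond_acceptors=5)
large_molecule = Descriptors(log_p=6.2, mol_weight=720.0,
                             h_bond_donors=7, h_bond_acceptors=12)
```

With these illustrative values, `passes_rule_of_five(small_molecule)` accepts the first molecule, while the second violates all four criteria and is rejected.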

The overall drug discovery process (Fig. 1.1) is very long and very cost-intensive [Lombardino and Lowe, 2004]. Estimates of the time and money spent differ considerably, but on average they are in the range of 10 years and 1 billion dollars [Adams and Brantner, 2006]. To save resources in the development phase, it is even common to test known drugs on new targets [Haupt and Schroeder, 2011].

Figure 1.2: Marketed small-molecule drug targets by biochemical class [Hopkins and Groom, 2002]. Enzymes 47%, GPCRs 25%, ion channels 7%, transporters 4%, nuclear hormone receptors 4%, other receptors 4%, miscellaneous 2%, integrins 1%, DNA 1%.

An interesting question arising from the drug discovery process is: how many drug targets are there in the end? The human genome comprises around 30,000 genes, which encode a much higher number of proteins if alternative splicing, post-translational modifications and protein complex formation are also considered. Still, it is not possible to predict how many of these proteins can be targeted by drugs. The number of targets of approved drugs is approximately 324 [Overington et al., 2006], and an estimate of 600-1500 possible drug targets has been proposed in the literature [Hopkins and Groom, 2002]. As seen in Fig. 1.2, most targets of approved drugs are enzymes such as protein kinases, but more than 40% are membrane proteins such as G-protein coupled receptors, ion channels and transporters. Membrane proteins are, however, very difficult to crystallize, and therefore no 3-D crystal structures are available for most of these important targets.

1.1.2 3-D protein structures

The starting point of structure-based drug design is a 3-dimensional model of the macromolecular receptor at atomic resolution. Such a model can be provided by the free apo-form of the protein or by the holo-form, in which a ligand is bound. The latter provides even more information, especially when dealing with an induced-fit situation. However, the side chain conformations in the holo-form are adapted to the bound ligand and might change when another binder is used. 3-D protein structures are publicly available through the RCSB Protein Data Bank (PDB, www.rcsb.org/pdb/) [Berman et al., 2000], where these data can be deposited. In 2012 the PDB contained around 80,000 structures, many of them with a bound ligand.
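To illustrate what such an atomic model looks like in practice, the following minimal sketch reads atomic coordinates from the legacy fixed-column PDB text format, in which `ATOM` and `HETATM` records hold one atom per line. The names `Atom` and `parse_atoms` are hypothetical, the sample record contains made-up coordinates, and only a subset of the record fields is extracted.

```python
from typing import Iterable, List, NamedTuple


class Atom(NamedTuple):
    name: str       # atom name, e.g. "CA"
    res_name: str   # residue name, e.g. "MET"
    chain_id: str   # chain identifier
    res_seq: int    # residue sequence number
    x: float        # orthogonal coordinates in Å
    y: float
    z: float


def parse_atoms(lines: Iterable[str]) -> List[Atom]:
    """Extract atoms from ATOM/HETATM records of a legacy PDB-format file."""
    atoms = []
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue  # skip headers, remarks, connectivity records, etc.
        atoms.append(Atom(
            name=line[12:16].strip(),      # columns 13-16
            res_name=line[17:20].strip(),  # columns 18-20
            chain_id=line[21],             # column 22
            res_seq=int(line[22:26]),      # columns 23-26
            x=float(line[30:38]),          # columns 31-38
            y=float(line[38:46]),          # columns 39-46
            z=float(line[46:54]),          # columns 47-54
        ))
    return atoms


# Made-up example record for illustration only.
SAMPLE_LINE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"
sample_atoms = parse_atoms([SAMPLE_LINE, "REMARK this line is ignored"])
```

Ligands bound in a holo-structure typically appear as `HETATM` records, so the same function returns protein and ligand atoms alike.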

The standard technique to obtain a 3-D structure of a molecule is X-ray crystallography, which was used for about 90% of the structures in the PDB. The technique is well established, and its benefits and limitations are well documented [Davis et al., 2003]. Initially, X-ray crystallography was used only for the structure elucidation of small molecules. This has been done for about a century, and the world repository, the Cambridge Structural Database (CSD, www.ccdc.cam.ac.uk), comprises around half a million deposited structures.

A diffraction pattern of a protein crystal was already obtained for pepsin in 1934 [Bernal and Crowfoot, 1934]. At that time structure calculation was possible for small molecules, but due to the lack of efficient computational resources, protein structures could not be solved. The landmark event of protein X-ray crystallography came in 1958, when the structure of myoglobin was solved [Kendrew et al., 1958]. The technique is based on the observation that an X-ray photon can be scattered by an electron, producing secondary spherical waves that can be detected. In practice, X-ray crystallography relies on a crystal of the protein, which can often only be obtained in a time-consuming process; in the case of membrane proteins in particular, this is often not possible. Once a crystal is available, it is exposed to an X-ray beam and rotated so that it is probed from all orientations. The crystal must be of sufficient quality to remain stable during the measurement. The diffracted X-rays are detected and result in a distinct diffraction pattern.

Each spot of the diffraction pattern corresponds to one set of lattice planes, and its position depends on i) the unit cell, ii) the wavelength and iii) the crystal orientation in the beam. The unit cell is the imaginary smallest repeating unit inside the crystal, which is normally around 50 Å³ in size. A second challenging task, after obtaining the crystal, is to solve the phase problem: only the amplitudes can be derived directly from the measured intensities, not the phases. The technique most frequently used to solve this problem nowadays is multi-wavelength anomalous dispersion (MAD). For this purpose anomalous diffraction is recorded at different wavelengths; it is caused by a special atom such as selenium, which can be introduced into the protein during expression via the modified amino acid selenomethionine. Another method is isomorphous replacement, which relies on soaking heavy atoms into the crystal. This has the disadvantage that several crystals with different heavy atoms have to be grown, which largely increases the amount of protein needed. When the phase problem is solved, which has to be done only once for a specific protein, an electron density map is obtained and the crystallographer can fit the protein coordinates into the electron density. This task becomes easier with better resolution of the density map, which is normally around 2 Å. The structure obtained by X-ray crystallography is not a photograph in atomic detail, but a model supported by strong experimental data [Podjarny et al., 2011].
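The dependence of the diffraction spots on lattice planes and wavelength, and the nature of the phase problem, can be summarized with a few standard textbook relations. They are given here as a sketch and are not taken from this thesis; the symbols are assumptions: $d_{hkl}$ is the spacing of the lattice planes with Miller indices $(hkl)$, $\theta$ the Bragg angle, $\lambda$ the wavelength, $F(hkl)$ the structure factor with phase $\varphi(hkl)$, $V$ the unit cell volume and $(x, y, z)$ fractional coordinates.

```latex
\begin{align}
  % Bragg's law: condition for a reflection from the lattice planes (hkl)
  n\lambda &= 2\, d_{hkl} \sin\theta \\
  % The measured intensity yields only the amplitude of the structure factor
  I(hkl) &\propto |F(hkl)|^{2}, \qquad F(hkl) = |F(hkl)|\, e^{i\varphi(hkl)} \\
  % The electron density requires both amplitudes and phases
  \rho(x,y,z) &= \frac{1}{V} \sum_{h,k,l} |F(hkl)|\, e^{i\varphi(hkl)}\,
                 e^{-2\pi i (hx + ky + lz)}
\end{align}
```

Since the experiment provides $I(hkl)$ and hence $|F(hkl)|$, but not $\varphi(hkl)$, the Fourier sum for $\rho$ cannot be evaluated until the phases have been determined by other means.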

Once the structure of a protein is solved, complexes with many different ligands can be obtained very quickly. For fragment-based drug design, it is even common to perform high-throughput crystallography [Blundell et al., 2002].

Figure 1.3: 3-D structure of ubiquitin, derived as a single model by X-ray crystallography (left) and as a structure ensemble by NMR spectroscopy (right).

The second important method to reveal 3-D structures of proteins is NMR spectroscopy, which accounts for approximately 10% of the structures in the PDB. The advantage in comparison to X-ray crystallography is that no crystal is needed and the sample can be measured in solution, which also avoids the problem of crystal contact sites. The first limitation of NMR is the size of the protein, which is typically less than 40 kDa, even though much larger structures have also been solved or investigated, e.g. the 82 kDa protein malate synthase [Grishaev et al., 2008]. The second limitation is the need to express the protein in bacteria and label it with NMR-active stable isotopes such as ¹³C or ¹⁵N. Protein structures can be calculated from experimental NMR data, namely i) NOESY spectra, which yield distances, and ii) chemical shifts, which report on the chemical environment. Additional information such as residual dipolar couplings (RDC) or pseudo-contact shifts (PCS) is frequently used. Applying these experimental restraints requires an assignment of most of the protein signals in the NMR spectrum, which can be a very time-consuming task and hampers the industrial workflow. The NMR methodology is described in detail in chapter 1.2.
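The way NOESY spectra yield distances can be sketched with the isolated spin-pair approximation, in which the cross-peak intensity scales with the inverse sixth power of the proton-proton distance. The following example calibrates an unknown distance against a reference pair of known distance. It is a simplified illustration under that assumption, not the procedure used in this thesis; the function name `noe_distance` and all numbers are hypothetical.

```python
def noe_distance(intensity: float, ref_intensity: float, ref_distance: float) -> float:
    """Estimate a proton-proton distance from a NOESY cross-peak intensity.

    Assumes I ~ r**-6 (isolated spin-pair approximation), so that
    r = r_ref * (I_ref / I) ** (1/6). Distances are in the unit of ref_distance.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("cross-peak intensities must be positive")
    if ref_distance <= 0:
        raise ValueError("reference distance must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Made-up intensities: a peak 64 times weaker than the reference peak
# corresponds to twice the reference distance.
estimated = noe_distance(intensity=1.0, ref_intensity=64.0, ref_distance=2.0)
```

Because of the steep distance dependence, such estimates are usually converted into upper distance bounds rather than used as exact values when they enter a structure calculation.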