
The canonical version of the genetic code contains a total of 64 possible triplet codons. Of these, 61 are sense codons that encode the 20 canonical amino acids, whereas the remaining three nonsense (or stop) codons (UAA – Ochre, UAG – Amber, and UGA – Opal) are recognized during translation by release factors (RF-1 and RF-2 in prokaryotes and eRF-1 in eukaryotes; Nakamura et al., 1996).
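The codon counts above can be reproduced with a short enumeration. This is only an illustrative sketch: the variable names are chosen here for the example, and the stop codon set is taken directly from the text.

```python
from itertools import product

BASES = "UCAG"

# Stop codons and their traditional names, as listed in the text.
STOP_CODONS = {"UAA": "Ochre", "UAG": "Amber", "UGA": "Opal"}

# Every ordered triplet of the four RNA bases: 4 * 4 * 4 combinations.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Sense codons are all codons that are not stop codons.
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons), len(sense_codons), len(STOP_CODONS))  # 64 61 3
```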

The genetic code was initially considered to be “frozen” in its canonical form comprising 20 amino acids (Crick, 1968; Thomas, 1970) because of its universality in all organisms known at that time. Since then, the many differences observed in the genetic codes of various organisms suggest that the code is not only flexible but is also still evolving together with present-day organisms. Small deviations from the canon were discovered first in mitochondria and then also in the genomes of organisms such as Mycoplasma sp., green algae, and Candida sp. (Osawa et al., 1992). Selenocysteine and pyrrolysine are now considered the 21st and 22nd amino acids in the canonical set (Ambrogelly et al., 2007). They are incorporated in response to the Opal (UGA) and Amber (UAG) stop codons, respectively (Söll, 1988; Srinivasan et al., 2002).

The genetic code has a variable degree of degeneracy, with each amino acid being encoded by one to six different codons. The translation termination signal is also degenerate, having three different codons that are recognized by release factors. This feature of the genetic code can be exploited to reassign the least used codons to encode a non-canonical amino acid. This strategy has been optimized especially for the suppression of the Amber stop codon, both in vitro and in vivo, in prokaryotic as well as eukaryotic cells (Noren et al., 1989, 1990; Furter, 1998; Liu et al., 2007).
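The degeneracy described here can be checked against the standard codon table. The sketch below assumes the standard code written as a 64-character string in UCAG order (a common compact representation, with `*` marking stop codons); the names `AMINO_ACIDS`, `CODON_TABLE`, and `degeneracy` are illustrative only. It counts codons per amino acid and does not attempt to estimate codon usage, which is organism-specific and would require real sequence data.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"

# Standard genetic code in UCAG order (UUU, UUC, UUA, UUG, UCU, ...).
# One-letter amino acid codes; '*' marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons assigned to each amino acid (and to the stop signal).
degeneracy = Counter(CODON_TABLE.values())

stop_count = degeneracy.pop("*")  # the termination signal is degenerate too
print(len(degeneracy))            # 20 canonical amino acids
print(min(degeneracy.values()), max(degeneracy.values()))  # 1 6
print(stop_count)                 # 3
```

Methionine and tryptophan are the singly encoded amino acids in this table, while leucine, serine, and arginine each have six codons.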

Non-canonical amino acids (ncAAs) can be incorporated by exploiting their structural similarity to canonical amino acids (e.g. L-azidohomoalanine mimics methionine) and are then used as general protein labels (Johnson et al., 2010). Another way to label proteins with ncAAs is to encode them site-specifically in a protein of interest using a dedicated bioorthogonal machinery (Liu and Schultz, 2010), that is, one that does not cross-react with the endogenous components.

1.2.1 General Protein Labeling

For general protein labeling, homologs of the canonical amino acids are employed. These are similar enough to be recognized by the same synthetase(s) and used for the aminoacylation of the corresponding tRNA(s). For proper labeling to take place, only essential amino acids can be substituted because the rest of the amino acids are produced by the cells and would therefore outcompete the non-canonical amino acid.

Selenomethionine was among the first ncAAs to be introduced into proteins (Cowie and Cohen, 1957) and is used for phase determination in crystallography. More recently, the clickable non-canonical methionine analogues azidohomoalanine (AHA) and homopropargylglycine (HPG) were implemented for proteome-wide labeling assays (Link et al., 2006). These ncAAs have also been used to label all proteins either for pulse-chase experiments (Dieterich et al., 2007, 2010) or as tools to investigate the general organization of proteins in membranes (Saka et al., 2014a).

This technique requires only that the ncAA be added to a cell medium lacking its canonical analog. However, it cannot be used to label specific proteins without the help of larger probes such as antibodies or fluorescent protein tags.

1.2.2 Specific Protein Labeling

Tagging specific proteins with genetic precision is preferable to the general protein labeling approach.

Genetic code expansion involves adding new (non-canonical) amino acids to the repertoire present in the cells. Several requirements need to be fulfilled for this technique to specifically label a protein of interest (Figure 1-4). First, a pair consisting of a suppressor aminoacyl-tRNA synthetase and a tRNA (RS/tRNA) has to be introduced into the cell (via transfection). This heterologous pair has to be bioorthogonal, meaning that it should not cross-react with the machinery present in the cell. Second, a new codon (i.e. a nonsense or four-base codon) has to be assigned (or re-assigned) for the incorporation of an ncAA. Last but not least, the ncAA has to be taken up efficiently by the cells and recognized only by the orthogonal RS/tRNA pair.

Figure 1-4 The principle of ncAA incorporation

The genetic encoding or incorporation of non-canonical amino acids (ncAAs) involves expressing a bioorthogonal synthetase-tRNA couple (RS/tRNA; shown in red) and a mutagenized protein of interest that carries an Amber stop codon in its coding sequence. At the same time, the ncAA should be provided in the cell medium. The synthetase specifically aminoacylates the suppressor tRNA with the ncAA in the presence of ATP (in yellow). The endogenous amino acids, tRNA and synthetase are depicted in grey shades, while the ncAA, the Amber stop codon and the anticodon region of the suppressor tRNA are shown in blue-green. During translation of the protein of interest, the anticodon region of the suppressor tRNA recognizes the Amber stop codon on the mRNA. The ribosome (shown in light brown) then directs the incorporation of the ncAA into the primary sequence of the protein of interest.
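The logic of Amber suppression can be illustrated with a toy translation routine. This is a deliberately idealized sketch, not a model of the real process: the function `translate`, its parameters, the placeholder symbol `X` for the ncAA, and the example mRNA are all hypothetical, and suppression is treated as fully efficient, whereas in cells the suppressor tRNA competes with the release factor at each Amber codon.

```python
from itertools import product

BASES = "UCAG"

# Standard genetic code in UCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

AMBER = "UAG"


def translate(mrna, amber_suppressor=False, ncaa_symbol="X"):
    """Translate an mRNA read in frame from its first base.

    If amber_suppressor is True, every Amber codon is decoded as the ncAA
    (idealized, 100 % suppression). Otherwise Amber terminates translation
    like the other stop codons.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == AMBER and amber_suppressor:
            protein.append(ncaa_symbol)  # suppressor tRNA delivers the ncAA
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # release factor terminates translation
        protein.append(residue)
    return "".join(protein)


# Hypothetical mRNA: Met, Ala, Amber, Lys, followed by an Ochre stop codon.
mrna = "AUGGCUUAGAAAUAA"

print(translate(mrna))                         # MA   (truncated at Amber)
print(translate(mrna, amber_suppressor=True))  # MAXK (ncAA at the Amber site)
```

Without the orthogonal pair, the product is truncated at the Amber codon; with it, the full-length protein is produced and terminates at the downstream Ochre codon, which the suppressor tRNA does not read.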


The easiest way to provide a host cell with a new RS/tRNA pair is to use one from a different organism that does not cross-aminoacylate components of the target cell. Many candidate pairs have been described in the literature: the glutaminyl RS/tRNA pair from S. cerevisiae, which tolerates a wide range of ncAAs for incorporation (Liu and Schultz, 1999); the tyrosyl RS/tRNA pair from Methanococcus jannaschii (Wang et al., 2000), which required mutations in the tRNA to reduce aminoacylation by host synthetases (Wang et al., 2001); and the pyrrolysyl RS/tRNA pair from Methanosarcina barkeri or Methanosarcina mazei (Srinivasan et al., 2002; Blight et al., 2004). The pyrrolysyl RS/tRNA pair was demonstrated to be compatible with a wide range of hosts, from E. coli to mammalian cells, and is at the moment one of the most widely used (Polycarpo et al., 2006; Neumann et al., 2008; Chen et al., 2009).

So far approximately 100 ncAAs (Liu and Schultz, 2010; Li and Liu, 2014) have been described for different applications and incorporated into proteins expressed in bacteria (Wang et al., 2001), yeast (Chin, 2003), mammalian cells (Sakamoto et al., 2002; Liu et al., 2007) as well as animals (Greiss and Chin, 2011; Bianco et al., 2012; Parrish et al., 2012).

Limitations include the membrane permeability of the ncAA, or the efficiency with which it is taken up by cells, as well as its compatibility with the ribosome and with the synthetase active site (i.e. there is a limited ability to mutagenize the catalytic pocket of the synthetase).

With appropriate RS/tRNA pairs, ncAAs offer broad possibilities to vary the structural, chemical and spectroscopic properties of the proteins they tag. Applications of genetic code expansion include new tools for investigating protein function on a cellular level and also generation of proteins with enhanced functionality and/or properties.