1.1 DNA Synthesis - Biological role of DNA polymerases

1.1.3 Chemical mechanism of the catalysed nucleotidyl transfer

Although several crystal structures and numerous kinetic data have been published, the catalytic mechanism of DNA polymerases is still debated and not fully understood. It is well established that a nucleophilic attack of the primer 3´C-OH group on the α-phosphorus atom of the nucleoside triphosphate leads to formation of the phosphodiester bond and to the release of pyrophosphate. All known DNA and RNA polymerases require two divalent cations (usually Mg2+) in their active center and use a two-metal-ion mechanism for the catalysed nucleotidyl transfer. The most recent study on the mechanism of DNA polymerases (see Figure 4) proposes an extended two-metal-ion mechanism that includes two proton transfer reactions catalysed by acidic and basic amino acids: the protonation of the pyrophosphate leaving group by a proposed acid and the deprotonation of the 3´C-OH group by a proposed basic amino acid3.

Figure 4 Extended two-metal-ion mechanism of nucleotidyl transfer, which includes general acid catalysis3. Shown is the active center of a polymerase with a nucleoside triphosphate (red) and two divalent metal cations (Mg2+). One metal ion is coordinated by the phosphates of the triphosphate, an aspartate residue located in motif A of all polymerases (blue), and probably water molecules. The other metal ion is coordinated by the 3´C-OH of the primer terminus (green), the α-phosphate of the nucleoside triphosphate and widely conserved aspartate residues of structural motifs A and C. The proposed acid (A), which protonates the pyrophosphate leaving group, is indicated for the four different model polymerases that were used in the study of Castro et al. to prove their concept of general acid catalysis. The proposed basic amino acid (B), which deprotonates the 3´C-OH group, has not yet been identified. The figure was designed according to reference3.

Both metal ions stabilise the structural adjustment. Metal ion A lowers the pKa of the 3´C-OH group and thus supports its deprotonation by an unidentified base B and the subsequent nucleophilic attack at physiological pH. It is believed that the α-phosphate adopts an SN2-type trigonal bipyramidal transition state during this attack3.

1.2 Biotechnological role of DNA polymerases

DNA polymerases are used in a plethora of biotechnological applications16. Nowadays, they are the workhorses in numerous applications like DNA sequencing and microarrayed nucleic acid diagnostic tools for the direct diagnosis of single-nucleotide variations within genes, forensic DNA testing, pathogen detection, et cetera.

1.2.1 Polymerase chain reaction (PCR)

A key technology for the use of DNA polymerases in biotechnological applications is the polymerase chain reaction (PCR), which was developed by Mullis and coworkers in 198717. In theory, PCR allows the exponential enrichment of a particular DNA sequence by amplification of a single or a few copies of template strands. For PCR applications, a DNA polymerase needs specific primers (short DNA fragments) that contain sequences complementary to a target DNA region. During repeated cycles of heating and cooling, new DNA is generated and is itself used as a template for replication. Through this enzymatic replication, which consumes the primers and deoxynucleotide triphosphates (dNTPs), the selected DNA sequence flanked by the primers is exponentially amplified. In almost every PCR application, heat-stable DNA polymerases are employed that withstand the thermal cycling steps necessary to physically separate the two strands of the DNA double helix (usually at high temperatures of ~95°C). Nowadays, PCR methods in the presence of fluorescent dyes (e.g. SYBRGreenI) or fluorescence resonance probes (e.g. TaqMan)18-21 report the amount of amplified DNA in real time. Both fluorescent dyes and modified DNA polymerases have significantly shortened conventional PCR methods. Consequently, real-time PCR is the method of choice for the detection and quantification of DNA and RNA targets such as retroviruses and viral pathogens21.
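The exponential enrichment described above can be illustrated with a short calculation. The following is a minimal sketch of an idealised model and is not taken from the cited literature: it assumes a constant per-cycle efficiency (the parameter name `efficiency` is chosen here for illustration) and ignores the depletion of primers and dNTPs, so real reactions reach a plateau much earlier.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealised number of template copies after a given number of PCR cycles.

    efficiency = 1.0 means that every template is duplicated in every cycle;
    lower values model incomplete extension (an assumption for illustration).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare a perfect reaction with two less efficient ones.
    for eff in (1.0, 0.9, 0.8):
        copies = pcr_copies(1, 30, eff)
        print(f"efficiency {eff:.1f}: {copies:.3e} copies after 30 cycles")
```

Even under these simplified assumptions, a small loss in per-cycle efficiency changes the final yield by orders of magnitude, which is one reason why robust polymerases matter for quantitative applications.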

Today the principle of PCR has been extended to numerous biotechnological applications: allele-specific PCR for the detection of single nucleotide variations22-24, multiplex PCR for the simultaneous amplification of different DNA fragments in one reaction vessel25, nested PCR, which increases the specificity of the DNA amplification reaction26, quantitative PCR to quantify and compare certain DNA strands27, reverse transcription PCR for the detection of RNA targets28, et cetera.

1.2.2 Modified and mutated DNA polymerases

Highly processive and accurate DNA polymerases are desired for cloning procedures because they allow shorter extension times as well as a more robust and high-yield amplification. The processivity of a DNA polymerase was recently enhanced significantly by protein fusion technology29. DNA polymerase processivity is defined as the average number of nucleotides added by a DNA polymerase per association/dissociation with the DNA template. It is known that the Taq polymerase (see Section 1.1.1) consists of two distinct structural and functional domains, the 5´-3´ nuclease domain and the polymerase domain. The N-terminally shortened form of Taq (KlenTaq), which lacks the nuclease domain, is significantly less processive than the full-length Taq29, suggesting that the nuclease domain interacts with the DNA template

29

specific, double-stranded DNA binding domain resulting in enzymes with increased processivities without compromising catalytic activity and enzyme stability. By monitoring single primer extension products on sequencing gels, it was demonstrated in detail that wild-type Taq added up to 35 nucleotides, whereas fused Taq yielded products with up to 200 added nucleotides.

A higher DNA polymerase fidelity may increase the reliability of diagnostic application systems30. Marx and coworkers31 demonstrated that the selectivity of Taq DNA polymerase can be increased by nonpolar substitution mutations of three amino acids QVH (Gln, Val, His) of motif C, which lies directly adjacent to the catalytic center. Furthermore, they showed that the resulting mutants can be applied as a useful tool in genotyping assays such as allele-specific real-time PCR31,32.

To enhance the efficiency of forensic DNA testing, DNA polymerases resistant to inhibitors from blood and soil would enable PCR without prior DNA purification. Barnes and coworkers33 have recently evolved Taq DNA polymerase mutants with enhanced resistance to various known PCR inhibitors, including whole blood, plasma, hemoglobin, lactoferrin, serum IgG, soil extracts and humic acid, as well as high concentrations of DNA binding dyes. The mutated position of the Taq polymerase (Glu 708) is located in an alpha-helix region on the surface of the enzyme, known as the “P-domain” (residues 704–717). This domain is situated about 40 residues away from the “finger” domain, which binds the incoming dNTPs and interacts with the single-stranded DNA template. Because the mutation site is at the hinge region, one might speculate that it affects the movement of the finger domain during incorporation. The mutation described in this example is not directly involved in the interaction between substrate and enzyme, thus indicating a so-called remote effect34.

The recovery of ancient DNA samples, which can be more than 40 000 years old, requires DNA polymerases with an increased substrate spectrum to efficiently amplify DNA and overcome typical DNA lesions35. In 2007, Marx and coworkers36 and Holliger and coworkers35 published successfully evolved DNA polymerases that are able to amplify highly damaged DNA templates and bypass lesions found in ancient DNA, such as abasic sites.

Further improvements of DNA polymerases are required, for example, to meet the requirements of next generation DNA sequencing technologies, which rely on the ability of DNA polymerases to efficiently process modified nucleotides37. For example, the sequencing technology from Illumina Inc. uses fluorescent reversible terminator deoxyribonucleotides38. The triphosphates carry a 3´-O-azidomethyl group, which stops the polymerase after incorporation of one nucleotide. Each of the four 2´-deoxynucleoside triphosphates (A, C, G and T) is additionally labelled with a different removable fluorophore, so that the sequence can be determined by fluorescence readout after each incorporation step.

Figure 5 Highly modified deoxyribonucleotides are used in the sequencing technology from Illumina Inc. The triphosphate (A) has to be efficiently processed by the DNA polymerase. After chemical removal of the fluorescent dye and the 3´-OH protecting group (B), the next sequencing cycle can begin. To improve the incorporation efficiency of these unnatural nucleotides, the active site of 9°N DNA polymerase had to be engineered. The figure was designed according to reference 38.

After readout, the 3´-O-azidomethyl group and the fluorescent dye are chemically removed with tris(2-carboxyethyl)phosphine (TCEP), so that the next single incorporation cycle can begin. However, to incorporate these unnatural nucleotides efficiently, the active site of 9°N DNA polymerase had to be engineered in order to obtain a workable sequencing setup.
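The cycle of single incorporation, readout and deblocking can be sketched as a small loop. This is a toy model under strong assumptions, not a description of the actual instrument chemistry or software: incorporation is treated as error-free, the channel names in `DYE_OF_BASE` are hypothetical placeholders for the four fluorophores, and the template is given in the order in which the polymerase encounters its bases.

```python
# Watson-Crick partner of each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical channel names: one removable fluorophore per base.
DYE_OF_BASE = {"A": "channel_1", "C": "channel_2", "G": "channel_3", "T": "channel_4"}
BASE_OF_DYE = {dye: base for base, dye in DYE_OF_BASE.items()}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Return the bases read out over the given number of cycles.

    `template` lists the template bases in the order the polymerase meets them.
    """
    read = []
    for template_base in template[:cycles]:
        # 1. Exactly one labelled, 3'-blocked nucleotide is incorporated;
        #    the blocking group prevents any further extension in this cycle.
        incorporated = COMPLEMENT[template_base]
        # 2. The fluorescence readout identifies which of the four bases was added.
        signal = DYE_OF_BASE[incorporated]
        read.append(BASE_OF_DYE[signal])
        # 3. Dye and 3' block are cleaved (TCEP in the text), which frees the
        #    3'-OH so that the next loop iteration can incorporate again.
    return "".join(read)


if __name__ == "__main__":
    print(sequence_by_synthesis("ACGT", cycles=4))
```

The sketch makes the dependence on the polymerase explicit: every base call requires one successful incorporation of a heavily modified nucleotide, so an enzyme that handles these substrates poorly limits the whole read.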

Taken together, customized and artificially engineered DNA polymerases that lead to more robust and specific reaction systems are urgently needed.

1.2.3 Directed evolution of DNA polymerases

Native proteins and enzymes are the natural products of several million years of evolution. The ensemble of enzymes in a living organism determines how well it is adapted to particular environmental conditions. Natural selection acts in such a way that the organisms best adapted to the given conditions prevail.

This process can be artificially enhanced by modern biochemical methods in order to obtain enzymes, e.g. DNA polymerases, with new features. Alterations are mainly achieved by directed molecular evolution using genetic complementation and/or screening28,31,32,36,39-43, phage display44-47, or in vitro compartmentalization48-50. In general, three steps are required for a successful directed molecular evolution of DNA polymerases: introduction of mutations by certain methods of mutagenesis, expression of different enzyme variants and screening or selection of the best enzyme variants. These steps can be repeated until the desired feature is obtained (see Figure 6).

Figure 6 Scheme of directed evolution of DNA polymerases. 1. Introduction of arbitrary mutations, indicated as red dots. 2. Separation of different mutants. 3. Screening/selection of mutants by appropriate assays. These steps can be repeated until the desired feature is obtained, e.g. higher mismatch discrimination.
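The iterative scheme of mutagenesis, separation and screening can be sketched as a simple loop. The following is a toy simulation under stated assumptions and not a model of any published selection system: sequences are short amino acid strings, the "assay" is a hypothetical `toy_fitness` function that counts matches to an arbitrary target, and the best variant of each round becomes the parent of the next.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def mutate(sequence: str, rate: float, rng: random.Random) -> str:
    """Step 1: introduce arbitrary substitutions with a given per-position rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else residue
        for residue in sequence
    )


def toy_fitness(sequence: str, target: str) -> int:
    """Stand-in for a screening assay: number of positions matching the target."""
    return sum(a == b for a, b in zip(sequence, target))


def directed_evolution(parent, target, rounds=20, library_size=200, rate=0.05, seed=0):
    rng = random.Random(seed)
    history = [toy_fitness(parent, target)]
    for _ in range(rounds):
        # Step 1: build a mutant library from the current parent.
        library = [mutate(parent, rate, rng) for _ in range(library_size)]
        library.append(parent)  # keep the parent so the score never decreases
        # Steps 2 and 3: each variant is scored individually, which mimics the
        # genotype-phenotype linkage, and the best one seeds the next round.
        parent = max(library, key=lambda variant: toy_fitness(variant, target))
        history.append(toy_fitness(parent, target))
    return parent, history


if __name__ == "__main__":
    best, history = directed_evolution("A" * 10, "MKVLAAGIVG")
    print(best, history)
```

In a real experiment the fitness function is replaced by a biochemical assay, and library size and mutation rate are limited by what can be expressed and screened.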

After creation of a mutant library, a mutant separation process is needed which also ensures a linkage between a specific enzyme genotype and the respective phenotype.

Selection or screening approaches are described in the literature for the directed evolution of DNA polymerases. Common selection methods are phage display, compartmentalised self replication35 (CSR) or reporter plasmid assays43.

For example, Vichier-Guerre et al.47 employed phage display to select DNA polymerase mutants with about two orders of magnitude higher catalytic efficiency for reverse transcription compared with the natural enzyme. In phage display, the DNA polymerases are expressed and displayed together with a substrate on the surface of a phage. The polymerase mutant displayed on the phage particle has to convert the linked substrate into the desired product, which is then selected, for example, by affinity chromatography. One disadvantage might be the high degree of cross-reactions between a polymerase on one phage and a substrate attached to another phage.

Holliger and coworkers49 employed CSR to evolve polymerases that can extend mismatches and bypass common lesions found in ancient DNA. They demonstrated that these engineered polymerases could expand the recovery of genetic information from Pleistocene specimens35. In the CSR method, each polymerase gene is encapsulated in a compartment formed by a heat-stable water-in-oil emulsion. Each polymerase has to replicate its own encoding gene, which imposes a very high adaptive burden depending on the specific selection system.

Loeb and coworkers43 employed a reporter plasmid assay for the selection of DNA polymerase I mutants from E. coli with increased fidelity. In a reporter plasmid assay, a plasmid is used that carries a reporter gene, for example an antibiotic resistance gene that contains an opal codon. Mutants can be selected by comparing their reversion frequencies with that of the wild type.

In standard screening methods, the mutants must be expressed separately in multi-well plates so that the phenotype is directly connected to the corresponding mutant in each well. Subsequent screening reactions can be performed either as primer extension reactions or as PCR. Real-time PCR screening yields a very high sensitivity due to the exponential enrichment of the product32. Barnes and coworkers, for example, employed an incorporation assay with radioactively labelled nucleotides to screen successfully for DNA polymerase mutants that are more resistant to common inhibitors present in blood and soil samples33.

screen in a multiplexed and parallel manner for several different reactions and new functions. This approach was followed in Section 2.3.

1.2.4 Methods of mutagenesis

Numerous strategies and methods of mutagenesis can be found in the literature:

Site-directed mutagenesis is a good strategy when sufficient information about the structure and the substrate-enzyme interactions is available. The mutation sites are rationally designed and introduced by site-directed mutagenesis protocols using mutagenesis primers that carry the nucleic acid sequence for the desired mutation51. Specific enzyme features may change simply through the introduction of these point mutations. This strategy can be a good starting point for further directed evolution.

Saturation mutagenesis offers another option: testing one amino acid position with all possible natural amino acid substitutions. The amino acid position can be randomised using degenerate mutagenesis primers. When a single codon is randomised, the library can be small (300-400 mutants) and still includes all possible mutations with a probability of 99.9%52.
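The order of magnitude of such a library can be checked with a simple sampling estimate. The sketch below is an illustration under stated assumptions and not the calculation used in the cited reference: it assumes that all codons of the degenerate primer occur with equal frequency and asks how many clones must be picked so that one given codon is present with the desired probability. The function names are chosen here for illustration.

```python
import math


def coverage_probability(n_clones: int, n_codons: int) -> float:
    """Probability that one given codon occurs at least once among n_clones,
    assuming all n_codons are equally likely in every clone."""
    return 1.0 - (1.0 - 1.0 / n_codons) ** n_clones


def clones_needed(n_codons: int, probability: float) -> int:
    """Smallest number of clones for which coverage_probability reaches `probability`."""
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - 1.0 / n_codons))


if __name__ == "__main__":
    # 64 codons for a fully randomised NNN codon, 32 for an NNK codon.
    for n_codons in (64, 32):
        n = clones_needed(n_codons, 0.999)
        print(f"{n_codons} codons: {n} clones, "
              f"coverage {coverage_probability(n, n_codons):.4f}")
```

Under these assumptions, a few hundred clones are sufficient for a single randomised codon, which is consistent with the library size stated above. Randomising several positions at once multiplies the number of variants, and the required library size grows accordingly.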

Arbitrary gene mutations can be introduced by error-prone PCR28,53 (epPCR). In epPCR, high magnesium or manganese concentrations and/or imbalanced mixtures of deoxynucleotide triphosphates are used during the PCR, causing the DNA polymerase to produce incorporation errors. Unfortunately, this approach has a few disadvantages. Due to the nature of template amplification, a mutation that occurs in the first cycles of PCR might be enriched during amplification and thus be overrepresented in the resulting protein library54. Additionally, DNA polymerases used in epPCR preferentially produce transitions rather than transversions due to the steric demands of similar bases (purines A, G and pyrimidines T, C). Finally, certain amino acid exchanges occur very infrequently, because the exchange of one nucleotide is not sufficient to convert one amino acid codon into the other, and mutation of two neighbouring nucleotides is rare55. Nevertheless, error-prone PCR creates a good initial library for directed evolution methods, especially when little information on the enzyme structure and important amino acid residues exists.
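The codon bias mentioned above can be made concrete by enumerating which amino acids are reachable from a codon through a single nucleotide exchange. The following sketch uses the standard genetic code. The helper names are chosen here for illustration, and restricting the search to transitions is a deliberate simplification of the real, polymerase-dependent mutational spectrum.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def is_transition(old: str, new: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine exchanges (old != new)."""
    return (old in PURINES) == (new in PURINES)


def single_substitution_neighbours(codon: str):
    """Yield (mutant codon, is_transition) for all nine single-base exchanges."""
    for position, old in enumerate(codon):
        for new in BASES:
            if new != old:
                yield codon[:position] + new + codon[position + 1:], is_transition(old, new)


def reachable_amino_acids(codon: str, transitions_only: bool = False) -> set:
    """Amino acids reachable by one exchange, excluding the original one and stops."""
    reachable = {
        CODON_TABLE[mutant]
        for mutant, transition in single_substitution_neighbours(codon)
        if transition or not transitions_only
    }
    return reachable - {CODON_TABLE[codon], "*"}


if __name__ == "__main__":
    for codon in ("TGG", "GAA", "ATG"):
        print(codon, CODON_TABLE[codon],
              sorted(reachable_amino_acids(codon)),
              sorted(reachable_amino_acids(codon, transitions_only=True)))
```

Because each codon has only nine single-exchange neighbours, no codon can reach all 19 alternative amino acids in one step, and a bias towards transitions narrows the accessible set further.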

Furthermore, arbitrary gene mutations can be introduced by other techniques such as sequence saturation mutagenesis56 (SeSaM). This method fragments the gene by iodine cleavage of previously introduced phosphorothioate groups that are randomly distributed in the gene. After fragmentation, the single-stranded DNA fragments are used as primers for the subsequent full-length gene synthesis. During this step, artificial bases with universal binding properties are used to ensure arbitrary randomisation.

A disadvantage of this method is that DNA fragments smaller than ~70 nt are not mutagenized due to the employed DNA extraction and purification procedures56. In addition, the independence from the mutational bias of the DNA polymerases used in epPCR is traded for the different base-pairing preferences of universal or degenerate bases.

Homologous recombination methods enable the shuffling of different mutations and additionally introduce new mutations. In the DNA shuffling method, homologous genes are fragmented by DNase digestion and afterwards reassembled to the full-length gene by PCR57. An alternative method is the staggered extension process (StEP), which also allows the combination of different homologous genes58. It uses highly abbreviated annealing and extension steps during PCR to generate staggered DNA fragments. This procedure promotes crossover events along the full length of the template sequences, resulting in a library of chimeric polynucleotide sequences.

1.3 Accuracy of enzymatic DNA synthesis

Due to its complementary structure, DNA can be copied using the respective DNA strand as a template. The energy differences based on hydrogen bonding between a correct Watson-Crick base pair and an incorrect one are not very high (~1-3 kcal/mol) and would theoretically result in a high error rate of about one per 100 incorporated nucleotides59,60. In living cells, DNA polymerases have to copy the whole genome during replication; in human cells this amounts to more than 6 billion nucleotides61. At such an error rate, the genomic sequence would rapidly undergo changes with each cell division. Fortunately, nature has evolved polymerases with error rates much lower than one would expect from thermodynamic considerations62,63. In bacteria, for example, the overall accuracy of DNA synthesis reaches one error per 10^8-10^10 incorporated nucleotides62,64. Even in eukaryotic cells, accuracies of less than one error per 10^10 incorporated nucleotides are reached8. The impact of incorrectly incorporated nucleotides can vary: On the one hand, mutations in the genetic sequence may be a

Misincorporation rates by DNA polymerases in E. coli are determined by 10 – 10, depending of course on the misincorporated base, indicating enzymatic processes that increase the accuracy beyond thermodynamic limitations65. Factors like proofreading and mismatch repair by base and nucleotide excision repair (BER and NER) improve these rates further to the overall error rates mentioned above (10^8-10^10).
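The thermodynamic estimate quoted earlier in this section (an energy difference of about 1-3 kcal/mol between a correct and an incorrect base pair) can be translated into an approximate error frequency with a Boltzmann factor. The following is a rough sketch under simplifying assumptions: selection is treated as an equilibrium between two states at 37 °C, and all kinetic contributions of the enzyme are ignored. The function name is chosen here for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def boltzmann_error_rate(ddg_kcal_per_mol: float, temperature_k: float = 310.15) -> float:
    """Relative population of the incorrect pair for a free energy gap ddG,
    assuming a simple two-state equilibrium (no kinetic selection)."""
    return math.exp(-ddg_kcal_per_mol / (R_KCAL * temperature_k))


if __name__ == "__main__":
    for ddg in (1.0, 2.0, 3.0):
        rate = boltzmann_error_rate(ddg)
        print(f"ddG = {ddg:.1f} kcal/mol: about one error per {1.0 / rate:.0f} nucleotides")
```

At the upper end of the quoted range, the estimate is on the order of one error per hundred nucleotides, many orders of magnitude above the overall error rates observed in cells. This gap is what the enzymatic mechanisms discussed in this section have to account for.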

Kool and coworkers showed that efficient and selective enzymatic DNA synthesis does not depend on hydrogen bonding between Watson-Crick base pairs alone66. In detail, they studied base analogues that lack the ability to form hydrogen bonds but still have a size and shape similar to the natural bases66 (see Figure 7, A). DNA polymerases were still able to use these analogues as substrates in a selective manner. The base analogues were selectively incorporated opposite the respective correct template base and, when used as a template base, selectively directed the incorporation of the correct triphosphate.

Figure 7 (A) Representative isosteric base analogue N in comparison with the natural base cytosine C. (B) Pyrene base opposite a stable abasic site analogue.

Along these lines, they constructed a triphosphate that bears a pyrene instead of a natural base; the pyrene has a size and shape similar to a full base pair. This artificial triphosphate was preferentially incorporated opposite an abasic site instead of a nucleobase67 (see Figure 7, B). All these findings led to the assumption that the size and shape of the incoming dNTP play an important role in DNA polymerase selectivity.

Further structural studies68,69 have led to the common understanding that the geometry of the DNA base pair is controlled by a close fit in the polymerase active site70.

1.3.1 Chemical approach for increasing the selectivity of DNA polymerases

Incorporation of an incorrect nucleotide results in the formation of a mismatch and leads to an altered geometry within the DNA duplex. DNA polymerases are able to sense these mismatches and display significantly reduced extension rates compared to a matched base pair situation22,71,72 (see Figure 8).

Figure 8 Matched and mismatched primer template situation. DNA polymerases are able to differentiate between these cases resulting in reduced mismatch extension rates compared to matched base pair situations. Black bars represent primer/template duplexes.

One approach to enhance the ability of DNA polymerases to discriminate between matched and mismatched situations is to use primer probes that are chemically modified at the 3´ end. Latorra et al. introduced a locked nucleic acid (LNA) modification at the 3´ end of primer probes and demonstrated that this modification leads to increased single nucleotide discrimination in allele-specific PCRs73. Along this line, Gaster et al.24,74 introduced several 4´-C-modifications at the 3´ primer end and showed that these modifications, in combination with Vent (exo-), result in strongly increased single nucleotide discrimination. In particular, a polar 4´-C-methoxymethylene group showed superior discrimination between matched and mismatched primer template situations. One might hold thermodynamic reasons responsible for this effect, but thermal denaturation studies and CD spectra revealed that the differences between the stabilities of canonical and non-canonical duplexes at the 3´-terminal primer end are negligible22,24. Thus, it is unlikely that this effect derives from differential duplex stabilities of 4´-C-modified versus unmodified duplexes. It is more likely that increased steric constraints and slightly disturbed geometries in the
