
PRDM9 - optimization of protein expression, purification and DNA binding studies / submitted by Theresa Zacherl, BSc


Academic year: 2021


Submitted by

Theresa Zacherl, BSc

Submitted at

Institute of Biophysics

Supervisor

Assoc. Prof. Dr. Irene Tiemann-Boege

December 2020

PRDM9 – optimization

of protein expression,

purification and

DNA-binding studies

Master Thesis

to obtain the academic degree of

Master of Science

in the Master’s Program

Molecular Biology

JOHANNES KEPLER UNIVERSITY LINZ

Altenberger Str. 69 4040 Linz, Austria www.jku.at DVR 0093696


December 15, 2020 k1120217 2/109

STATUTORY DECLARATION

I hereby declare that the thesis submitted is my own unaided work, that I have not used other than the sources indicated, and that all direct and indirect sources are acknowledged as references. This printed thesis is identical with the electronic version submitted.


Content

1. Abstract

2. Introduction

2.1 Meiotic Recombination

2.2 PRDM9

2.2.1 Structure and Function

2.2.2 Molecular Mechanism

2.2.3 PRDM9 variants

2.3 Hotspots

2.3.1 Binding motifs and specificity

2.3.2 Hotspot evolution

2.3.3 The hotspot sequences investigated in this thesis

3. Objectives

4. Materials and Methods

4.1 Materials

4.1.1 Chemicals, Buffers, Media

4.1.2 Biological samples

4.1.3 Cloning and expression

4.2 Methods

4.2.1 PCR

4.2.2 DNA purification

4.2.3 Size selection

4.2.4 Cloning

4.2.5 Sequencing

4.2.7 Production of specific clones

4.2.8 Bacterial protein expression

4.2.9 Lysate preparation

4.2.10 Protein screening

4.2.11 Protein purification

4.2.12 DNA-protein binding studies

5. Results

5.1 Cloning

5.2 Bacterial expression

5.2.1 Influence of culture size and media change, IPTG concentration variation

5.2.2 Impact of cell type: BL21-AI vs Rosetta2(DE3)pLacI

5.2.3 Temperature variation

5.2.4 Expression of hFcIgG1-mPRDM9cst-∆ZnF0

5.2.5 Expression of eYFP

5.3 Lysate preparation

5.3.1 Protocol 1: standard protocol

5.3.2 Protocol 2: Buffer test, Freeze-Thaw cycles and sonication

5.3.3 Protocol 3: additional Freeze-Thaw cycles, Sarcosyl titration

5.3.4 Protocol 4: modification with fewer Freeze-Thaw cycles before SN wash

5.4 Protein purification

5.4.1 Purification 1

5.4.2 Purification 2

5.4.3 Purification 3

5.5 Concentration measurements

5.5.2 DeNovix Measurements

5.5.3 Comparison of Quantitative Western Blots and DeNovix measurements

5.6 DNA-protein binding studies

5.6.1 DNA amplification

5.6.2 EMSA

6. Discussion

7. Outlook


1. Abstract

Sexual reproduction requires the generation of haploid gametes from diploid precursors in a specialized cell cycle called meiosis. During meiosis, maternal and paternal chromosomes undergo homologous recombination through the induction of DNA double strand breaks that lead to the exchange of genetic material. This step is essential for fertility and markedly increases genetic diversity. The location and activity of these sites of recombination, called hotspots, are specified by the meiosis-specific protein PRDM9 through DNA binding and methyltransferase activity on adjacent nucleosomes, as well as through formation of a pioneer complex with HELLS and interaction with other proteins of the recombination machinery. PRDM9's long C-terminal zinc finger domain recognizes and binds DNA target sequences in a highly specific manner. Different DNA motifs have been found to be enriched in hotspots, amongst them the canonical 'Myers motif', a 13-mer that is present in roughly 40% of human recombination hotspots and is bound by zinc fingers 8-12 of the most abundant European Prdm9 allele A. Nevertheless, binding motifs are neither sufficient nor necessary to activate recombination in hotspots, as they are more abundant outside of hotspots than inside. Moreover, many hotspots lack any motif. Many factors determining PRDM9 interaction with DNA remain unclear. Thus, in this thesis, an attempt was made to characterize the binding of human PRDM9A to human hotspot DNA in vitro in order to shed light on the interactions of the protein with its target sequence.

For this purpose, expression of various cloning constructs of the human PRDM9A zinc finger domain was optimized by adjusting several parameters. Lysate preparation was also optimized to obtain a crude protein extract containing soluble, active protein. An attempt to purify the protein was abandoned because it did not yield native PRDM9. Finally, electrophoretic mobility shift assays (EMSA) were optimized in order to characterize binding of the crude extract to DNA. To develop a working protocol, mouse PRDM9cst and human PRDM9A were both used in binding assays together with murine Hlx1 and Baudat hotspot DNA of different lengths, as both contain the Myers motif and are known to bind human PRDM9A. After a stable binding assay had been established, protein titrations were performed with human PRDM9A and DNA sequences of a human hotspot (HS II), which was divided into five fragments of equal size. Some of the fragments contained a Myers motif while others did not. The aim was to investigate which of the fragments the protein binds and thereby to find out where in the hotspot PRDM9 binds.

[…] provides optimized protocols with which future investigations can be enabled to understand the mechanisms of this protein in the long term.

2. Introduction

2.1 Meiotic Recombination

Meiosis is a specialized type of cell division essential for sexual reproduction in eukaryotes, in which haploid gametes are generated from diploid precursors. It consists of one round of DNA replication followed by two rounds of cell division, called Meiosis I and II, and results in potentially four daughter cells, each containing one complete haploid set of chromosomes. During Meiosis I, homologous paternal and maternal chromosomes segregate, reducing the total chromosome number by half. To ensure proper segregation of chromosomes, connections between the homologues have to be formed; these are established in meiotic prophase I by recombination events called crossing over (CO). Moreover, this recombination increases genome diversity by reshuffling genetic material (Petronczki et al. 2003).

Meiotic recombination events are initiated by the programmed formation of double strand breaks (DSBs), catalyzed by the meiotic topoisomerase-like protein SPO11 together with accessory proteins. Exonucleolytic degradation of the 5' ends resulting from the cleavage produces extended 3' single-stranded overhangs on both sides of the DSB, which subsequently form nucleoprotein filaments with several proteins and search for the homologous chromosome as a template to repair the break. The repair can result either in a crossover, which involves a reciprocal exchange of DNA, or in a unidirectional gene conversion without exchange (non-crossover). Accordingly, recombination intermediates are processed either to double Holliday junctions (crossover intermediates) or to single-end strand invasions (non-crossover intermediates) (Baudat et al. 2013). The process is completed with extension by DNA synthesis and mismatch repair (Massy 2003).

Crossover events are not randomly distributed across the genome but clustered in narrow genomic regions of 1-2 kb termed hotspots (HS), where the recombination frequency is significantly higher than in the adjacent regions, called cold areas (Baudat et al. 2013). Their position and activity are mainly determined by PR domain containing protein 9 (PRDM9), a DNA-binding histone methyltransferase (Baudat et al. 2010). In more detail, PRDM9 binds to specific DNA motifs, where it trimethylates lysine 4 (H3K4me3) and lysine 36 (H3K36me3) of histone 3 on surrounding nucleosomes, a chromatin modification enriched at mammalian hotspots (Billings et al. 2013; Powers et al. 2016). This epigenetic mark is subsequently recognized by proteins of the DSB machinery, leading to SPO11 recruitment and activity. Recent studies reveal a role of PRDM9 in the formation of open chromatin at hotspots together with the chromatin remodeling protein HELLS, which subsequently allows other proteins to interact with the newly opened sites (Spruce et al. 2020; Imai et al. 2020). Furthermore, PRDM9 is implicated in recruiting the hotspots, which originate in DNA loops, to the chromosomal axis where the DSB machinery is located (Imai et al. 2017; Parvanov et al. 2017). Paiano et al. 2020 suggested that PRDM9 together with ATM “[...] orchestrate mammalian SPO11 processing in a manner that influences meiotic DSB repair.” In the absence of PRDM9, DSB events are directed to other H3K4me3-enriched regions in the genome, such as promoter regions. However, no proper DSB repair occurs at these breaks, which might lead to meiotic arrest and thereby sterility, indicating the importance of PRDM9 also in DSB repair (Brick et al. 2012; Baudat et al. 2013).

2.2 PRDM9

PRDM9 is one of the 17 members of the PRDM protein family in humans and is specifically expressed in germ cells entering meiotic prophase I (Wu et al. 2013). It plays an essential role in controlling recombination hotspots. It determines the locations at which recombination occurs, redirecting DSBs away from functional elements such as gene promoters (Brick et al. 2012). Only recently, PRDM9 has been found to form a pioneer complex together with the chromatin remodeling protein HELLS, which has the ability to open tightly packed, inaccessible chromatin sites (Spruce et al. 2020; Imai et al. 2020). Loss of Hells results in phenotypes similar to loss of Prdm9 (Spruce et al. 2020). In the absence of PRDM9, the efficacy of DSB repair seems to be impaired, making completion of full chromosomal synapsis in the pachytene stage more difficult for cells (Hayashi et al. 2005; Sun et al. 2015). This results in a broad range of fertility phenotypes (Wells et al. 2020). PRDM9 knockout mice of both sexes show gametogenesis arrested at the pachytene stage, leading to sterility, as well as altered meiotic gene expression (Hayashi et al. 2005; Segurel et al. 2011; Irie et al. 2009; Sun et al. 2015). The fact that point mutations in the Prdm9 gene have been linked to azoospermia in human patients underlines its importance in fertility. Prdm9 is so far the only speciation gene identified in mammals; for example, it is responsible for hybrid sterility in male hybrids of mouse subspecies (Davies et al. 2016; Steiner and Ryder 2013; Irie et al. 2009).

2.2.1 Structure and Function

PRDM9 consists of three main regions: an amino-terminal region containing a Krüppel-associated box (KRAB) and an SSX repression domain (SSXRD), which serve as a protein-protein interaction domain (Imai et al. 2017; Parvanov et al. 2017); a PR/SET domain surrounded by a zinc knuckle and a single zinc finger, which has a H3K4 and H3K36 trimethylation activity; and a long carboxy-terminal C2H2 zinc finger array for specific DNA recognition and binding (Baudat et al. 2013; Parvanov et al. 2017).

Figure 1: structure of PRDM9. Taken from Baudat et al. 2013

Amino-terminal region

Consisting of a KRAB domain and an SSXRD motif, the PRDM9 N-terminal region is structurally similar to the family of SSX proteins, which are specifically expressed in the testis and serve as transcriptional regulators. In general, KRAB domains are known to mediate protein-protein interactions (Baudat et al. 2013; Imai et al. 2017). Imai et al. 2017 and Parvanov et al. 2017 suggest a role of the KRAB domain in recruiting DSB hotspots to DSB proteins on the chromosomal axis via direct interaction with several proteins, namely CXXC1, EWSR1, EHMT2 and CDYL, as well as associations with the meiosis-specific cohesin pREC8 and the chromosomal axis/synaptonemal complex proteins SYCP3 and SYCP1. Recently, PRDM9 and the chromatin remodeling protein HELLS have been found to form a pioneer complex that opens chromatin at hotspots (Spruce et al. 2020; Imai et al. 2020).

In mice, truncations of the KRAB domain lead to a loss of PRDM9 function and to altered meiotic prophase and gametogenesis, showing that this region is "[…] essential for meiosis progression, meiotic DSB repair, and synapsis in both female and male mice" (Imai et al. 2017). A recent study has shown that deletion of the KRAB domain in mice leads to meiotic arrest in vivo and leaves only residual PRDM9 methyltransferase activity, meaning that this domain is indispensable for meiosis (Thibault-Sennett et al. 2018).


PR/SET domain

The PR/SET domain has a methyltransferase activity which catalyzes specific trimethylation of lysine 4 on histone 3 (H3K4me3) as well as H3K36 trimethylation (Wu et al. 2013; Powers et al. 2016). These modifications serve as an epigenetic mark at recombination hotspots, which activates recombination by recruiting the DSB machinery and probably also plays a role in DSB activity and repair (Mahgoub et al. 2020; Wells et al. 2020). Additionally, mono- and dimethylation of H3K4 by PRDM9 has been shown (Wu et al. 2013), and acetylated H3K9 (Spruce et al. 2020) and 5-hydroxymethylcytosine (Imai et al. 2020) are enriched at hotspots. Increased levels of H3K4me3 and H3K36me3 double-positive nucleosomes coincide only at recombination regions (Powers et al. 2016). In somatic cells, the trimethylation of H3K36 has been shown to be necessary for the homologous recombination repair of DSBs, suggesting a similar role in meiosis (Pfister et al. 2014). Koh-Stenta et al. showed that PRDM9 also has the ability to recognize "[…] a remarkably broad range of histone substrates […]" and also possesses automethylation properties (Koh-Stenta et al. 2017).

The domain's structure contains a highly conserved central SET domain flanked by a pre-SET zinc knuckle and a post-SET zinc finger (ZnF) (Wu et al. 2013). Like other classes of protein methyltransferases, the SET domain uses S-adenosyl-L-methionine (SAM) as the methyl donor and releases S-adenosyl-L-homocysteine (SAH) as the reaction product. According to Qian and Zhou 2006, the pre-SET domain contributes to structural stability by interacting with the surface of the core SET domain, while the post-SET domain is part of the active site and forms a hydrophobic channel with the core SET domain. The crystal structure of the PR/SET domain indicates an autoregulation of the methyltransferase activity of PRDM9, with the single ZnF in the post-SET region being involved in this regulation (Wu et al. 2013).

Zinc finger domain

The zinc finger region is capable of recognizing and binding DNA highly specifically thereby defining recombination hotspots. Schwarz et al. suggest the formation of a PRDM9 trimer via the zinc finger array (Schwarz et al. 2019).

The domain consists of a long C-terminal Cys2His2 (C2H2) ZnF array. C2H2 zinc fingers have a conserved structure of two β-strands and one α-helix per finger, from which a tetrahedral C2H2 structural unit is formed around a central zinc ion. One cysteine in each of the two antiparallel β-strands, together with two histidines of the helix, contacts the zinc ion and thereby provides rigidity to the structure (Wolfe et al. 2000; Baudat et al. 2013). During DNA binding, the α-helix of each ZnF lies in the major groove of the DNA and contacts the DNA, while the β-strands lie on the outside together with the C2-Zn-H2 complex (Persikov and Singh 2014; Patel et al. 2016). DNA contact is made by the side chains of the amino acid residues at positions -1, 2, 3 and 6 relative to the helix, with the amino acids at positions -1, 3 and 6 contacting the primary strand while the amino acid at position 2 contacts the complementary strand (Wolfe et al. 2000). A model of the site-specific protein-DNA interaction can be seen in Figure 2.

Figure 2: model of DNA-ZnF interaction. (A) crystal structure of three zinc fingers of Zif268 bound to DNA. The amino acids involved in the site-specific DNA recognition are color-coded: –1 – green, +2 – blue, +3 – red, and +6 – purple. (B) A model of the site-specific DNA recognition by α-helical amino acids. (Taken from Fedotova et al. 2017)
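The helix numbering described above can be illustrated with a minimal Python sketch. It assumes the commonly used C2H2 spacing C-X(2-4)-C-X12-H-X(3-5)-H and the convention that the first zinc-coordinating histidine sits at helix position +7 (with no position 0), so that positions -1, 2, 3 and 6 lie 7, 5, 4 and 1 residues before it. The function name `contact_residues` and the example sequence (modelled on finger 2 of Zif268) are illustrative choices and are not taken from this thesis.

```python
import re

# Assumed C2H2 spacing: C-X(2-4)-C-X12-H-X(3-5)-H. The captured H is the
# first zinc-coordinating histidine, taken here as helix position +7.
C2H2 = re.compile(r"C.{2,4}C.{12}(H).{3,5}H")

# Offsets relative to that histidine. There is no helix position 0, so
# position -1 lies seven residues before position +7.
CONTACT_OFFSETS = {-1: -7, 2: -5, 3: -4, 6: -1}


def contact_residues(protein):
    """Return one dict {helix position: residue} per C2H2 finger found."""
    protein = protein.upper()
    fingers = []
    for match in C2H2.finditer(protein):
        his = match.start(1)  # index of the first coordinating histidine
        fingers.append(
            {pos: protein[his + off] for pos, off in CONTACT_OFFSETS.items()}
        )
    return fingers


# Illustrative single finger (modelled on finger 2 of Zif268).
example = "FQCRICMRNFSRSDHLTTHIRTHTGEKP"
print(contact_residues(example))
```

Applied to a full zinc finger array, the same function would return one dictionary per finger, which makes it easy to compare the DNA-contacting residues of different alleles side by side.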

The identities of the DNA contacting amino acids are strongly involved in the determination of DNA binding specificity. Interestingly, in PRDM9 these amino acids are highly variable and account for many PRDM9 variants, while the remaining zinc finger array has a highly repetitive, microsatellite-like genomic structure with a 28 amino acid unit tandem repeat, almost perfectly homologous both on DNA and protein levels (Baudat et al. 2010; Patel et al. 2016). Nevertheless, it has been shown that not only the DNA contacting residues are responsible for binding specificity but also the remaining amino acids seem to impact complex formation. Interactions with the DNA-phosphate backbone are beneficial for binding energy (Wolfe et al. 2000; Patel et al. 2016; Billings et al. 2013). As there are many different possible combinations through which different ZnF of the array can contact DNA, multiple sites in the genome are expected to be contacted by PRDM9 (Baudat et al. 2010).

Neighboring ZnFs are connected by a linker region consisting of the amino acid sequence threonine, glycine, glutamate, lysine and proline, which is therefore known as the TGEKP linker. It is found in most C2H2-type ZnF proteins and is highly conserved in PRDM9 between different species, acting as a spacer between the zinc fingers (Wolfe et al. 2000). These linkers also influence the DNA binding affinity; various mutations have been shown to reduce or completely abolish the binding affinity in mutagenesis experiments (Chou et al. 2010; Smith et al. 1991; Choo and Klug 1993). In the ZnF protein TFIIIA, the TGEKP linker is capable of undergoing a conformational change, switching from a non-specific binding mode to a sequence-specific binding mode. A similar mechanism could occur in PRDM9, where the linker could act like a "snap lock" that brings the ZnF into place in the major groove of the DNA and stabilizes the ZnF-DNA complex (Laity et al. 2000; Striedner et al. 2017).

2.2.2 Molecular Mechanism

To date, the molecular function of PRDM9 is still not fully understood. In normal meiosis, the ZnF domain recognizes and binds to hotspot-specific DNA sequences as multimers (Baker et al. 2015b; Altemose et al. 2017; Schwarz et al. 2019). Most likely before binding, it forms a pioneer complex together with the chromatin remodeler HELLS, which then binds to hotspots, where open chromatin is generated (Spruce et al. 2020; Imai et al. 2020). The chromatin remodeling is required to enable stable PRDM9 binding at hotspots (Spruce et al. 2020). HELLS has also been found to be responsible for the enrichment of 5-hydroxymethylcytosine (5hmC) at PRDM9 sites. This modification could prevent the binding of factors with affinity for 5mC, which could otherwise interfere with meiotic recombination (Imai et al. 2020). The PR/SET domain of PRDM9 trimethylates the surrounding nucleosomes at H3K4 and H3K36, directing the subsequent DSBs away from H3K4me3-enriched regions like promoters. A characteristic symmetrical nucleosomal pattern is formed, consisting of a central nucleosome-depleted region and surrounding nucleosomes enriched in H3K4me3 and H3K36me3 (see also Figure 5). This pattern is unique to hotspots and can be recognized by special reader proteins (Wells et al. 2020). The mechanism is shown in Figure 3.

Figure 3: model of action of the pioneer complex PRDM9 and HELLS. PRDM9 and HELLS bind before the ZnF array of PRDM9 recognizes its binding motif (Spruce et al. 2020). (a) PRDM9 specific DNA motif in chromatin without specific feature. (b) recognition of the binding motif by PRDM9. (c) HELLS promotes chromosomal restructuring, which enables a more stable binding of PRDM9 to its motif. (d) methylation of histone 3 on lysine 4 and 36 of neighboring nucleosomes by the PR/SET domain of PRDM9. (e) conversion of 5mC to 5hmC near the PRDM9 binding site. (f) formation of a DSB within or near the PRDM9 binding site. Taken and modified from Imai et al. 2020.

PRDM9's KRAB domain interacts with several proteins that translocate the hotspot DNA from chromosomal loops to the axis, where PRDM9 remains bound to the hotspot at least until DSBs are initiated by the topoisomerase-like enzyme SPO11 (Parvanov et al. 2017; Bhattacharyya et al. 2019). One of these proteins is EWSR1, which has recently been found to bind PRDM9 and to link it to the chromosome axis, even though it is not the only protein to do so (Tian et al. 2020; Parvanov et al. 2017; Bhattacharyya et al. 2019). Additionally, this protein enhances PRDM9 methyltransferase activity, possibly by preventing other proteins from inhibiting the PR/SET domain and/or by stabilizing the complex, allowing effective H3K4 and H3K36 trimethylation (Tian et al. 2020). The protein ZCWPW2 is known to recognize H3K4me3 and might help to position the DSBs which are subsequently introduced by SPO11 (Wells et al. 2020; Parvanov et al. 2017). The histone methylation reader ZCWPW1 recognizes the H3K4me3 histone mark which PRDM9 provides, probably marking the sites for homologue pairing and thereby DSB repair (Wells et al. 2020). PRDM9 not only binds to the broken chromatid but also seems to act at the unbroken homologous donor, which seems to promote strand exchange and recombination by providing the more accessible chromatin structure needed for repair (Li et al. 2019; Yamada et al. 2020). This may explain differences between symmetrically and asymmetrically bound hotspots (Davies et al. 2016; Li et al. 2019).

2.2.3 PRDM9 variants

Most PRDM9 allelic variants differ both in the number and identity of their ZnFs, or more precisely in the identity of their DNA-contacting amino acid residues, which mainly define the binding specificity (see 2.2.1: Zinc finger domain). The minisatellite structure of the ZnF array results in a high mutation rate due to repeat instability, explaining the protein's rapid evolution and high variability. A change in one of the DNA-contacting amino acids could alter the DNA binding specificity and thereby create a totally new family of hotspots. More than 40 different PRDM9 variants are known in humans, all with different DNA binding specificities, which means that different sets of hotspots are used genome-wide (Baudat et al. 2010; Myers et al. 2010). Figure 4 below shows the three main human PRDM9 alleles A, B and C and their allele frequencies in European and African populations.

Figure 4: Identities of the DNA contacting amino acid residues of the ZnFs at -1, 3 and 6 respective to the helix of the three PRDM9 alleles A, B and C and their population allele frequencies. (Taken from Pratto et al. 2014)

Notably, ZnF 0-6 are almost identical between the different alleles displayed and also show only little variability amongst other alleles (D, E and K; not displayed), while the C-terminal ZnFs are highly polymorphic, indicating that those might be mainly responsible for the differences in hotspot recognition (Baudat et al. 2010). Indeed, according to several studies, only four to six fingers seem to contribute substantially to the motif, while the remaining fingers mainly stabilize the binding in a non-sequence-specific manner (Patel et al. 2016; Patel et al. 2017, 2017; Baker et al. 2015b). Recently, Altemose et al. have shown that alternative spacing in hotspot motifs could have led to an underestimated sequence specificity of ZnF 1-6 (Altemose et al. 2017; see chapter 2.3.1).

[…] to directly affect DNA binding can alter the ability to bind hotspots (Berg et al. 2010; Berg et al. 2011).

2.3 Hotspots

Hotspots are genomic regions of 1-2 kb in size with significantly higher recombination frequencies than their surroundings. In humans, more than 30,000 hotspots have so far been identified, whose recombination activity varies over four orders of magnitude (Myers et al. 2010; Baudat et al. 2013; Pratto et al. 2014). As mentioned before, their location is mainly determined by PRDM9, which marks the hotspots with allele-specific H3K4me3 and H3K36me3 chromatin modifications (see Figure 5). Recently, Spruce et al. found a specific chromatin state representing recombination hotspots, which is characterized by H3K4me3, H3K36me3 and additionally H3K4me1 and H3K9ac (Spruce et al. 2020). Hotspots tend to be located outside of genes (Baudat et al. 2013). In the absence of PRDM9, hotspots are relocated to other H3K4me3-modified functional genomic elements such as gene promoters, enhancers or CpG islands, where DSBs cannot be properly repaired; this leads to meiotic arrest and results in infertility (Brick et al. 2012; Hayashi et al. 2005).

Figure 5: PRDM9 binding, nucleosome modification and DSB initiation at a hotspot. Molecular tags describing the hotspot landscape. PRDM9-binding sites determined in vitro (Affinity-Seq, red), in vivo (PRDM9 ChIP-seq, blue), histone trimethylation (H3K4me3, orange) and DSB initiation (DMC1 mark, green). Above: nucleosomal repositioning after PRDM9 trimethylation activity. Taken and modified from Paigen and Petkov 2018

PRDM9 binding sites are not necessarily positioned in the hotspot centers though (Billings et al. 2013).

2.3.1 Binding motifs and specificity

Different DNA motifs have been found to be enriched in hotspots. A degenerate 13-mer sequence motif, CCNCCNTNNCCNC (with N as any base), also known as the 'Myers motif', is present in roughly 40% of active hotspots. It has been found to be a PRDM9 binding motif contacting ZnF 8-12 of the most abundant allele, Prdm9A (Myers et al. 2010; Baudat et al. 2010; Patel et al. 2016). Interestingly, this canonical binding motif is found more often outside than inside hotspots, and not all hotspots recognized by PRDM9A contain this sequence (Segurel et al. 2011; Berg et al. 2010; Berg et al. 2011). Since then, several other binding motifs have been discovered, but they are "[…] neither sufficient nor necessary to predict genome-wide PRDM9-binding, DSB formation, or recombination events […]" (Altemose et al. 2017). Computational prediction of binding has not always been successful. ZnF 8-12 of human PRDM9A have a fivefold higher affinity for their actual hotspot than for the computationally predicted binding sequence and a 70-fold higher affinity than for an unspecific sequence (Patel et al. 2016). Moreover, despite the knowledge about several binding motifs, there are hotspots without any motif known so far; in mice, for example, binding sequences for a single Prdm9 allele have been identified that share only a few nucleotides and have no obvious consensus sequence (Berg et al. 2010; Berg et al. 2011; Pratto et al. 2014; Segurel et al. 2011; Davies et al. 2016).
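As an illustration of how the degenerate motif CCNCCNTNNCCNC can be located in a DNA sequence, the following minimal Python sketch translates the motif into a regular expression and scans both strands. The function names and the example sequence are hypothetical and chosen for illustration only; this is not the procedure used in the cited studies. A match only indicates the presence of the motif, which, as stated above, is neither sufficient nor necessary for PRDM9 binding.

```python
import re

# Degenerate motif from the text; N stands for any base.
MYERS_MOTIF = "CCNCCNTNNCCNC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Turn each N into [ACGT]; the lookahead also reports overlapping hits."""
    body = motif.replace("N", "[ACGT]")
    return re.compile(f"(?=({body}))")


def find_motif(seq, motif=MYERS_MOTIF):
    """Return sorted (start, strand, match) tuples for both strands.

    start is the 0-based forward-strand index of the leftmost base
    covered by the match, regardless of strand.
    """
    seq = seq.upper()
    pattern = motif_to_regex(motif)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        start = len(seq) - m.start() - len(motif)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical test sequence containing one motif instance.
example = "AAAA" + "CCTCCCTAACCAC" + "TTTT"
print(find_motif(example))
```

Scanning the reverse complement as well matters because the motif is not palindromic, so an occurrence on the opposite strand would otherwise be missed.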

Altemose et al. have identified seven distinct DNA motifs for PRDM9B, all with a close internal match to the Myers motif, and revealed a high sequence specificity of ZnF 1-6 which were previously underestimated (Altemose et al. 2017). Differences in the motifs mainly consist of variable internal spacings not being able to be described in a single motif which possibly explains the previously shown weak sequence specificity of ZnF 1-6 in other hotspots (Altemose et al. 2017; Myers et al. 2008; Myers et al. 2010; Pratto et al. 2014). In Figure 6, the motifs are aligned to each other and an in silico predicted sequence. The DNA contacting amino acid residues of each ZnF are shown below together with their classification concerning charge, polarity and presence of aromatic side chains. The variable region is spanned by ZnF 5 and 6, both ZnF without a charge and a bulky aromatic side group in PRDM9B. In PRDM9A, the same characteristics of amino acids are given for these ZnF; the only difference is serine instead of threonine as DNA-contacting residue in ZnF 5, a difference of only one methyl group in the side chain. It is suggested that these two ZnF might act as a linker between the stronger bound ZnF upstream and downstream (Altemose et al. 2017).

Figure 6: DNA motifs with variable spacing, bound by PRDM9 B. Alignment of the motifs to each other and an in silico predicted motif with the identity of the DNA-contacting amino acid residues (positions -1, 3 and 6) and their properties. The variably spaced motif region is indicated by vertical dotted lines. The Myers motif is gray shaded in the in silico motif. Taken from (Altemose et al. 2017).

Even though PRDM9 variants A and B differ by only one single DNA-contacting amino acid, they show a different affinity to motif 7, which shows that even slight differences between highly similar alleles can impact binding preferences (Altemose et al. 2017). Furthermore, identical ZnFs of the murine PRDM9Cst variant have been shown to be context dependent and do not bind similar trinucleotide sequences (Billings et al. 2013).

Only recently, five previously unobserved binding motifs for PRDM9 were identified by a Bayesian convolutional network. Three of them are versions of the canonical 13-mer Myers motif, while the other two are 22 nt long and were not identified before due to their complexity, as only 8 bases contain substantial information (Brown and Lunter 2019). Notably, they strongly resemble the motifs found by Altemose et al. 2017.

Striedner et al. replaced parts of PRDM9 target DNA by different unspecific nucleotides and found a similar binding strength for different subsets, suggesting that ZnFs can interact either specifically or unspecifically, depending on the target sequence. They found the binding strength to be increased when the full ZnF array contacted DNA specifically, and increased even further if flanking DNA regions were added to the target DNA (Striedner et al. 2017). The explanation for this behavior could be the binding plasticity of the PRDM9 ZnFs (Tiemann-Boege et al. 2017). Different PRDM9 variants bind to their corresponding hotspots with different affinities, as has been shown by Pratto et al. 2014 and Patel et al. 2016. In Prdm9A/C heterozygous individuals, hotspots specific for allele C are more frequent and more active than A-specific hotspots (Pratto et al. 2014). The affinity of variant C ZnF 8-13 to its consensus sequence is 10-fold higher than the affinity of PRDM9A ZnF 8-12 to the THE1B hotspot (Patel et al. 2016). In general, the differences in affinity of different PRDM9 variants to their genomic targets, together with the number of these targets, can explain the proportion of DSBs determined by each variant (Diagouraga et al. 2018, see Figure 7).

Figure 7: heterozygous PRDM9-Driven Hotspot Control. In heterozygous mice, each PRDM9 variant binds to a fraction of its own target sites independently from the second variant. The number of generated DSB for each variant depends on the number of targets and the affinity of the protein to its targets. Taken and modified from (Kang et al. 2018).

A recent study shows that, in heterozygous mice, different PRDM9 variants bind their target sequences and trimethylate H3K4 independently, without affecting each other (Diagouraga et al. 2018). This is in contrast to another publication, which reported a reduction of the H3K4me3 level upon co-expression and proposed hetero-multimerization of PRDM9 variants as a possible origin of this suppression (Baker et al. 2015b). According to Diagouraga et al., these interactions may arise from PRDM9 overexpression in cultured cells or be specific to human PRDM9 variants (Diagouraga et al. 2018).

According to Chen et al., the longer and stronger the binding motif is, the earlier PRDM9 binds, which leads to stronger and more extended H3K4me3 marks. These may provide a more stable and permissive chromatin environment and thereby lead to earlier DSB formation; DSBs formed early are more likely to result in a CO (Chen et al. 2020).

Besides the affinity, the accessibility of the DNA target site imposed by prior chromatin modifications plays a major role in whether a binding site can be used in vivo (Walker et al. 2015; Pratto et al. 2014). Short tandem repeats (STRs), in particular short-to-intermediate-sized poly-A repeats (<12bp), have been shown to be enriched at hotspots, possibly because poly-A tracts affect nucleosome occupancy and thereby increase the accessibility of the DNA to PRDM9 (Heissl et al. 2019).

Additionally, DNA methylation at CpG sites could slightly impact PRDM9 recognition and binding locally, while acting to a higher degree on a larger scale (Tiemann-Boege et al. 2017).

2.3.2 Hotspot evolution

DSBs at hotspots in meiotic recombination are repaired using the homologous partner chromatid as a template, thereby replacing the initial sequence. If one of the two homologues has a higher affinity to PRDM9, it will preferentially be targeted for a DSB, and the high-affinity site will be replaced by the homologous sequence with lower or no PRDM9 affinity. This results in the loss of PRDM9 allele-specific binding sequences, which makes hotspots self-destructive, as seen for the canonical PRDM9A binding motif during human evolution (Myers et al. 2010). This so-called hotspot paradox raises the question of how recombination still persists if hotspots drive themselves to extinction (Baker et al. 2015a). The hotspot paradox can be resolved by considering PRDM9 as the hotspot-defining factor. The protein's remarkable diversity and exceptionally rapid evolution have been postulated to result from the evolutionary pressure to rescue sterility by generating a completely new set of hotspots (Baudat et al. 2010; Myers et al. 2010). The minisatellite-like structure of the ZnF array is prone to mutations, which can immediately generate new Prdm9 alleles that relocate binding sites and thereby compensate for hotspot erosion. The flexibility of the multiple ZnFs might facilitate hotspot finding, as binding is possible in different configurations, probably due to binding plasticity (see 2.3.1).

2.3.3 The hotspot sequences investigated in this thesis

In this work, the focus is on a hotspot on chromosome 16 within the RBFox 1 intron (HSII), with its mean CO center at position 6,360,770 ± 9 bp and its mean NCO center at position 6,360,860 ± 15 bp (Heissl et al. 2019). Figure 8 depicts the crossover distribution for this hotspot derived from several donors, with a Gaussian fit of the crossover distribution and the CO and NCO centers marked as vertical dashed lines. Additionally, PRDM9A binding motifs are marked as rhomboids on the x-axis. One PRDM9 binding motif without a mismatch occurs in close vicinity to the hotspot center (yellow rhomboid; spanning chr16:6,361,057-6,361,088) and is most likely the most active motif (Altemose et al. 2017), while three further motifs are located peripherally, each containing one mismatch to the consensus sequence (orange rhomboids).

Figure 8: Crossover distribution on the investigated hotspot with PRDM9A binding motifs shown as rhomboids on the x-axis. Yellow rhomboid: without mismatch; orange rhomboids: one mismatch. Crossover center at chr16:6,360,770 ± 9 bp; non-crossover center at chr16:6,360,860 ± 15 bp. The grey shaded region most likely contains the DSB region. Taken and modified from Heissl et al. 2019.

To optimize binding conditions, binding reactions of human PRDM9A ZnF with the mouse Hlx1 hotspot DNA of mouse strain C57BL/6J (B6) were used, as this hotspot contains a Myers motif and is therefore bound by human PRDM9A.

3. Objectives

The several binding motifs of the hotspot investigated (HSII in Arbeithuber et al. 2015) give rise to the question where exactly and with which affinity PRDM9A binds. To provide an answer to this question, the aim was to characterize the binding of human PRDM9A to this recombination hotspot.

For this purpose, the hotspot was divided into five fragments of approximately equal size, and the binding affinity of the protein to each of the fragments was to be investigated in Electrophoretic Mobility Shift Assay (EMSA) experiments (see Figure 9).

Figure 9: Schematic division of the hotspot into five fragments. While fragment 4 contains a Myers motif without any mismatch, fragments 1 and 5 contain a motif with one mismatch. Fragments 2 and 3 do not contain an obvious binding motif, with fragment 3 containing the HS center. Modified from Heissl et al. 2019.

As groundwork for the final binding experiments, different constructs containing the PRDM9 ZnF region were first cloned into a pOPIN vector by restriction enzyme-based cloning. Protein expression in E.coli was optimized, as was the lysate preparation used to obtain a crude protein extract in different buffers. Protein purification was attempted in batch using protein A agarose beads. DNA-binding studies using the crude protein lysate and Hlx1 hotspot DNA were optimized with different buffers and salt concentrations. Finally, an attempt was made to measure the DNA-protein binding affinity of the PRDM9 protein lysate to the five fragments of the human hotspot.

4. Materials and Methods

4.1 Materials

4.1.1 Chemicals, Buffers, Media

Buffers

All buffers were prepared using ultrapure water.

50x TAE buffer
2M Tris-Base, 1M Acetic Acid, 0.1M EDTA in ddH2O

5x TBE buffer, pH 8.3

10x SDS Running buffer, pH 8.4
250mM Tris-Base, 1920mM Glycine, 1% SDS in ddH2O

1x blotting buffer (friendly transfer buffer)
25mM Tris-Base, 191.8mM Glycine, 0.02% SDS, 10% Isopropanol in ddH2O

10x TBS buffer, pH 7.4
250mM Tris (4.08g Tris-Base, 33.05g Tris-HCl), 1370mM NaCl, 27mM KCl in ddH2O

1x TBS-T
1ml Tween-20 in 1L 1x TBS (=0.1%)

10x TKZN, pH 7.5
100mM Tris, 500mM KCl, 500µM ZnCl2, 0.5% NP-40 in ddH2O

1x Patel lysis buffer
16mM Tris-HCl, 4mM Tris-Base, 700mM NaCl, 5% Glycerol, 25µM ZnCl2, 0.5mM TCEP in ddH2O

1x Patel EMSA binding buffer (Patel 300)
16mM Tris-HCl, 4mM Tris-Base, 300mM NaCl, 5% Glycerol, 25µM ZnCl2, 0.5mM TCEP in ddH2O

4x EMSA washing buffer
676mM Tris-HCl, 525mM Tris-Base, 800mM NaCl, 2% SDS in ddH2O

4x EMSA Equilibration buffer
676mM Tris-HCl, 525mM Tris-Base in ddH2O

N-Lauroylsarcosine

loading dyes

6x EMSA loading dye
15% Glycerol, 0.25% Bromophenol blue, 0.25% Xylene cyanol FF, in 0.5x TBE

3x Laemmli sample buffer + ß-Mercapto-Ethanol, pH 6.8
30% Glycerol, 6% SDS, 240mM Tris-HCl, 16% ß-mercapto-ethanol, 0.01% Bromophenol blue

6x DNA loading dye (Fermentas)
10 mM Tris-HCl (pH 7.6), 0.03% Bromophenol Blue, 0.03% Xylene Cyanol FF, 60% Glycerol, 60 mM EDTA


protein staining

Coomassie G-250 (1L)

60-80 mg CBB G-250, 3 ml conc. HCl in ddH2O

Ponceau S staining solution
0.5% (1.25g) Ponceau S, 1% (2.5 ml) Acetic Acid, ad 250 ml with ddH2O

Markers

DNA fragment size

GeneRuler DNA Ladder Mix (100bp - 10000bp), Thermo Scientific

GeneRuler Ultra Low Range DNA ladder (10bp - 300bp), Thermo Scientific

Protein size

Precision Plus Protein™ Standards All Blue (10 kDa - 250 kDa), Biorad

Antibiotics

Filter sterilized through a 0.22µm syringe filter.

Ampicillin 100mg/ml stock
269mM Ampicillin sodium salt in ddH2O

Chloramphenicol 25mg/ml stock
77mM in 100% EtOH
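The molar concentrations given for the two antibiotic stocks follow from the mass concentrations and the molecular weights. The sketch below reproduces that conversion; the molecular weights (371.39 g/mol for ampicillin sodium salt, 323.13 g/mol for chloramphenicol) are commonly cited values assumed here, not values taken from this thesis.

```python
# Molecular weights in g/mol. These are assumed literature values,
# not numbers stated in the thesis.
MOLECULAR_WEIGHT = {
    "ampicillin sodium salt": 371.39,
    "chloramphenicol": 323.13,
}


def mg_per_ml_to_mM(conc_mg_per_ml, molecular_weight):
    """Convert a mass concentration (mg/ml, i.e. g/L) into mM."""
    # g/L divided by g/mol gives mol/L; times 1000 gives mmol/L.
    return conc_mg_per_ml / molecular_weight * 1000.0


amp_mM = mg_per_ml_to_mM(100, MOLECULAR_WEIGHT["ampicillin sodium salt"])
cam_mM = mg_per_ml_to_mM(25, MOLECULAR_WEIGHT["chloramphenicol"])
print(f"Ampicillin stock: {amp_mM:.0f} mM")
print(f"Chloramphenicol stock: {cam_mM:.0f} mM")
```

Under these assumptions the results round to the 269mM and 77mM listed above.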

Media

Media were autoclaved before use and antibiotics were added afterwards.

Lysogeny Broth (LB) Medium (1L)
20 g LB powder in ddH2O
for LB-Amp: addition of 100 μg/ml Ampicillin
for LB-CAM: addition of 34 μg/ml Chloramphenicol

LB plates (1L)

Production of competent cells

Buffers were filter sterilized through a 0.22 micron filter.

RF 1 (500mL)
6g RbCl, 4.95g MnCl2*4H2O, 15mL 1M Potassium acetate, 0.75g CaCl2*2H2O, 75mL glycerol; adjust pH to 5.8 using 0.2M acetic acid

RF 2 (200mL)
4mL 500mM MOPS pH 6.8, 0.24g RbCl, 2.2g CaCl2*2H2O, 30ml glycerol; adjust pH to 6.8 using NaOH

Chemicals for cloning and expression

Filter sterilized with a 0.22µm syringe filter and stored in aliquots at -20°C.

1M IPTG: 2.38g of IPTG in 10mL ddH2O
20% L-arabinose: 4g in 20ml ddH2O
20% glucose: 4g in 20ml ddH2O
50mM ZnCl2: 0.34g in 50ml ddH2O
0.5M glucose (=> 9% glucose solution): 4.5g in 50ml ddH2O
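The weighed amounts for the molar stocks can be cross-checked with a short calculation. This is a minimal sketch; the molar masses (IPTG 238.30, ZnCl2 136.29, glucose 180.16 g/mol) are assumed standard values and do not come from this thesis.

```python
# Molar masses in g/mol (assumed standard values, not from the thesis).
MOLAR_MASS = {"IPTG": 238.30, "ZnCl2": 136.29, "glucose": 180.16}


def grams_needed(molarity_M, volume_ml, molar_mass):
    """Mass of solid needed for a stock of given molarity and volume."""
    return molarity_M * (volume_ml / 1000.0) * molar_mass


def percent_w_v(grams, volume_ml):
    """Weight/volume percentage (g per 100 ml)."""
    return grams / volume_ml * 100.0


iptg_g = grams_needed(1.0, 10, MOLAR_MASS["IPTG"])
zncl2_g = grams_needed(0.050, 50, MOLAR_MASS["ZnCl2"])
glucose_g = grams_needed(0.5, 50, MOLAR_MASS["glucose"])

print(f"1 M IPTG, 10 ml:      {iptg_g:.2f} g")
print(f"50 mM ZnCl2, 50 ml:   {zncl2_g:.2f} g")
print(f"0.5 M glucose, 50 ml: {glucose_g:.1f} g "
      f"(= {percent_w_v(glucose_g, 50):.0f}% w/v)")
```

With these molar masses the calculation agrees with the listed 2.38g, 0.34g and 4.5g, and with the note that 0.5M glucose corresponds to roughly a 9% solution.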

4.1.2 Biological samples

Genomic DNA

Genomic DNA (gDNA) was used as template for several PCR reactions. Either the sample "blood 1042" from the Kepler Universitätsklinikum Med Campus IV (former Landes- Frauen- und Kinderklinik) or genomic DNA extracted from the saliva of Theresa Schwarz was used.


Primers and DNA fragments for binding

Primers were synthesized either at Eurofins MWG Operon or Integrated DNA Technologies (IDT). They were dissolved to a concentration of 100µM with nuclease-free H2O from Sigma-Aldrich. Dilutions of 25µM or 5µM were prepared to be used in the PCR reactions. Dilutions and stocks were stored at -20°C. Primers for sequencing were synthesized at LGC Genomics and samples with different concentrations stored at -20°C.
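The primer entries below list length and %GC, and the cloning primers carry restriction sites in their 5' overhangs. A minimal sketch of how such values can be cross-checked is given here. The helper names are hypothetical, the recognition sequences are the standard ones for KpnI, HindIII, BamHI and EcoRI, and Tm is deliberately not computed because the method used for the listed Tm values is not stated.

```python
# Standard recognition sequences of the enzymes used for cloning.
RESTRICTION_SITES = {
    "KpnI": "GGTACC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
}


def clean(seq):
    """Remove whitespace and convert to upper case."""
    return "".join(seq.split()).upper()


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    s = clean(seq)
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def sites_in(seq):
    """Names of the listed restriction sites present in the sequence."""
    s = clean(seq)
    return [name for name, site in RESTRICTION_SITES.items() if site in s]


# Hlx_75bp_F from the list below: 20 nt, listed with %GC=65
hlx_fwd = "GTGGGAGGAGATGGTGGGTG"
print(len(clean(hlx_fwd)), gc_percent(hlx_fwd))

# mP9_fwd_KpnI: 6 nt spacer, KpnI site and 28 nt binding region
mp9_fwd = "acgacg GGTACCGAATGGAATCATCGCACTGAAATCTTCC"
print(len(clean(mp9_fwd)), sites_in(mp9_fwd))
```

Since all four recognition sequences are palindromic, searching one strand is sufficient.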

Cloning

mPrdm9cst delta ZnF0
Fwd: mP9_fwd_KpnI 5'-acgacg GGTACCGAATGGAATCATCGCACTGAAATCTTCC-3' 40bp, Tm=70°C, %GC=50 (28bp, Tm=60°C, %GC=43)
Rvs: mP9_rvs_HindIII 5'-cgtcgt AAGCTTTTACTTCTCTCTTGTATGTGGCCTCTG-3' 40bp, Tm=70°C, %GC=50 (27bp, Tm=60°C, %GC=44)

hPrdm9A delta ZnF0
Fwd: hP9_KpnI_F 5'-TATCTAGGTACCGAAC GCAATCACTCCTCTCAGAAC-3' 24bp, Tm=59°C, %GC=50%
Rvs: ZnF-hP9_HindIII_R 5'-CGTCGTAAGCTTGT GTGTGGTGACCACATTTGTcttt-3' 25bp, Tm=60°C, %GC=44%

hFcIgG1-tag
Fwd: BamHI-hFcIgG_F 5'-TATCTAGGATCCG AGCCCAAATCTTGTGACAAAACTCAC-3' 27bp, Tm=60°C, %GC=44%
Rvs: hFcIgG_EcoRI_R 5'-CGTCGTGAATT CTTTACCCGGGGACAGGGAG-3' 19bp, Tm=59°C, %GC=63%

Sequencing

pOPIN standard & Fc-tag
Fwd: T7_prom (LGC standard) 5'-TTAATACGACTCACTATAGGG-3' 20bp, Tm=48°C, %GC=40%
[...] %GC=50%
Fwd: FcTag_position452_fw 5'-CCTGCCTGGTCAAAGGCTTC-3' 20bp, Tm=59°C, %GC=60%

EMSA fragments

mouse Hotspots

Hlx1 75bp
Fwd: Hlx_75bp_F 5'-GTGGGAGGAGATGGTGGGTG-3' 20bp, Tm=60°C, %GC=65%
Rvs: Hlx_75bp_R 5'-CCCATGGTTAGTGGAATGCGTAAAG-3' 25bp, Tm=60°C, %GC=48%

Biotin-labeled Hlx1 75bp
Fwd: Bio-Hlx_75bp_F 5'-Bio-GTGGGAGGAGATGGTGGGTG-3' 20bp, Tm=60°C, %GC=65%
Rvs: Bio-Hlx_75bp_R 5'-Bio-CCCATGGTTAGTGGAATGCGTAAAG-3' 25bp, Tm=60°C, %GC=48%

Biotin-labeled Pbx1 336bp
Fwd: Bio-Pbx1_336bp_F 5'-Bio-ATGCACATAGGCTTGCTTGG-3' 20bp, Tm=57°C, %GC=50%
Rvs: Bio-Pbx1_75bp_R 5'-Bio-CCTCTAGTACGGTGGTGTAAACA-3' 23bp, Tm=57°C, %GC=48%

human Hotspots

Hotspot Chromosome 16 (HSII)

All primers are available with a 5'-biotin tag

fragment 1: Prdm9 motif 6,360,129 containing 1 SNP

Fwd: F_HSII_reg1 5'-CGAGGAGCTGGGAATATAGG-3' 20bp, Tm=59°C, %GC=55%

fragment 2: neg control; no Prdm9 motif
Fwd: F_HSII_reg2 5'-GGCTTATGATACTAGGCTATG-3' 21bp, Tm=56°C, %GC=43%
Rvs: R_HSII_reg2 5'-CAGATTTGGTTTCAAAGTCAG-3' 21bp, Tm=54°C, %GC=38%

fragment 3: mean HS center; no Prdm9 motif
Fwd: F_HSII_reg3 5'-AAACCTGTAGGATGTCAAAC-3' 20bp, Tm=53°C, %GC=40%
Rvs: R_HSII_reg3 5'-TCTGATAACACTGTGCCTG-3' 19bp, Tm=55°C, %GC=47%

fragment 4: Prdm9 motif 6,361,060
Fwd: F_HSII_reg4 5'-GTTTAGAACTATTGGCTTGC-3' 20bp, Tm=53°C, %GC=40%
Rvs: R_HSII_reg4 5'-AGACAGAGTCTCGTACTG-3' 18bp, Tm=54°C, %GC=50%

fragment 5: Prdm9 motif 6,361,550 containing 1 SNP
Fwd: F_HSII_reg5 5'-GTTCTTGTTAGGCTCACTC-3' 19bp, Tm=55°C, %GC=47%
Rvs: R_HSII_reg5 5'-GATCTCTGCACCCTGAA-3' 17bp, Tm=53°C, %GC=53%

All other human hotspots have a universal tag with which they can be biotinylated using the following primers:
Fwd: Bio-Hlx_39bp_F 5'-Bio-TGAATAGTGTGCAGACTTGGACC-3' 23bp, Tm=58°C, %GC=48%
Rvs: Bio-Hlx_75bp_R 5'-Bio-CCCATGGTTAGTGGAATGCGTAAAG-3' 25bp, Tm=60°C, %GC=48%

Baudat fragment (HS fragment acc. to Baudat et al. 2013)
Fwd: HS-Baudat_F 5'-tgaatagtgtgcagacttggacc AGTTGCTTTGCCAGCTTTCTTCTTAAGGCCTCCC-3' 34bp, Tm=69°C, %GC=50%
Rvs: HS-Baudat_R 5'-cccatggttagtggaatgcgtaaag AGAAAGCAGAGAGAGGGGTGGGCAGGGAGGCC-3' 32bp, Tm=74°C, %GC=66%

Hotspot Chromosome 5
Fwd: HS-Chr5(Prdm9)_F 5'-tgaatagtgtgcagacttggacc CAATCTCACCTAAGTCATCAAGAGTG-3' 26bp, Tm=57°C, %GC=42%
Rvs: HS-Chr5(Prdm9)_R 5'-cccatggttagtggaatgcgtaaag GTCATTCCTTACATAGATGGAACTGC-3' 26bp, Tm=57°C, %GC=42%

Hotspot Chromosome 10
Fwd: HS-Chr10_F 5'-tgaatagtgtgcagacttggacc GCAGTGGCCCAAGCTGTTG-3' 19bp, Tm=60°C, %GC=63%
Rvs: HS-Chr10_R 5'-cccatggttagtggaatgcgtaaag CAGTTAGTGCTAAGAACCTGCAGG-3' 24bp, Tm=59°C, %GC=50%

Hotspot Chromosome 12
Fwd: HS-Chr12_F 5'-tgaatagtgtgcagacttggacc CTAGGGTGATGATTTTAGTGGTGGG-3' 25bp, Tm=59°C, %GC=48%
Rvs: HS-Chr12_R 5'-cccatggttagtggaatgcgtaaag TACTGACCACCTCGGCATCTC-3' 21bp, Tm=59°C, %GC=57%

Hotspot Chromosome 21 (Arbeithuber et al. 2015)
Fwd: HS-Chr21(Babs)_F 5'-tgaatagtgtgcagacttggacc CCCTCTTCCTCACCCTTTCTC-3' 21bp, Tm=58°C, %GC=57%
Rvs: HS-Chr21(Babs)_R 5'-cccatggttagtggaatgcgtaaag CACCAAGGTGTATAAGCTTTCTCTG-3' 25bp, Tm=57°C, %GC=44%

Hotspot Chromosome 8 (Berg et al. 2010)
Fwd: HS-Chr8CF-Berg_F 5'-tgaatagtgtgcagacttggacc GTGGGCTCCTGTGTGATTTCC-3' 21bp, Tm=59°C, %GC=57%
Rvs: HS-Chr8CF-Berg_R 5'-cccatggttagtggaatgcgtaaag GTGGTTACATAGTCGTATGCTAGTGTC-3' 27bp, Tm=58°C, %GC=44%

Hotspot human Prdm9A predicted
Fwd: hP9A-predicted_F 5'-tgaatagtgtgcagacttggacc GAGGACTAGACGACGAAGTAGTACCGCC-3'
Rvs: hP9A-predicted_R 5'-cccatggttagtggaatgcgtaaag ATCGGTGGTCACGGCGGTAC-3' 28bp, Tm=63°C, %GC=67%

Protein extracts

Expression

Table 1: overview of bacterial expressions using different conditions

expression | construct | expression cells | inoculation | aim | growth and expression
1 | eYFP-mousePrdm9cst-∆ZnF0 | | fresh trafo, o/n pre-culture | culture size 10ml vs. 40ml; media change vs. no change; IPTG concentration | OD600 = 1 after 5h; expression o/n RT
2 a | eYFP-humanPrdm9A-∆ZnF0 | Rosetta | fresh trafo, no pre culture, culture size 100ml, media change | expression cells | OD600 = 1.8-4.4 (22h); expression 7h (2h 37°C, 5h RT)
2 b | YFP-mPrdm9cst-Exon10 | Rosetta | fresh trafo, no pre culture, culture size 100ml, media change | expression cells | OD600 = 1.8-4.4 (22h); expression 7h (2h 37°C, 5h RT)
2 c | hFcIgG-eYFP-humanPrdm9A-∆ZnF0 | Rosetta | fresh trafo, no pre culture, culture size 100ml, media change | expression cells | OD600 = 1.8-4.4 (22h); expression 7h (2h 37°C, 5h RT)
2 d | eYFP-humanPrdm9A-∆ZnF0 | BL21-AI | fresh trafo, no pre culture, culture size 100ml, media change | expression cells | OD600 = 1.8-4.4 (22h); expression 7h (2h 37°C, 5h RT)
2 e | hFcIgG-eYFP-humanPrdm9A-∆ZnF0 | BL21-AI | fresh trafo, no pre culture, culture size 100ml, media change | expression cells | OD600 = 1.8-4.4 (22h); expression 7h (2h 37°C, 5h RT)
2 f | YFP-mPrdm9cst-Exon10 | BL21-AI | fresh trafo, no pre culture, culture size 100ml, media change | expression cells | OD600 = 1.8-4.4 (22h); expression 7h (2h 37°C, 5h RT)
3 a/b/c (16°C/RT/37°C) | hFcIgG-eYFP-humanPrdm9A-∆ZnF0 | BL21-AI | direct inoculation from fresh trafo mixture, culture size 100ml, media change | expression temperature | growth 5h at 37°C, then o/n RT, then 37°C for 5h; OD=0.5-0.7; wait on RT until evening, then expression 16°C/RT/37°C
4 | hFcIgG1-mPrdm9cst-∆ZnF0 | | fresh clone, culture size 100ml, media change | | growth 37°C 8h until OD=0.9; expression o/n RT
5 | eYFP | | glycerol stock, culture size 100ml, media change | | growth 37°C o/n, OD600 = 1.6; expression 6h RT

Lysate preparation

Table 2: samples lysate preparation

expression | construct | treatment
6 | eYFP-mousePrdm9cst-∆ZnF0 | standard lysate preparation
7 | eYFP-humanPrdm9A-∆ZnF0 | standard lysate preparation
8 - I | hFcIgG1-eYFP-hPrdm9A-∆ZnF0 | different buffers, protocol 2 (sonication)
8 - II | hFcIgG1-eYFP-hPrdm9A-∆ZnF0 | Sarcosyl (S) titration in lysate buffer, different buffers, protocol 3
9 | hFcIgG1-eYFP-humanPrdm9A-∆ZnF0 | standard lysate preparation
2 a | eYFP-humanPrdm9A-∆ZnF0 | lysate preparation protocol 2, TKZN buffer with 0.3% sarcosyl (S) or 0.5% S
2 b | YFP-mPrdm9cst-Exon10 | lysate prep protocol 2, TKZN 0.3% S or 0.5% S
3 a & b - I | hFcIgG-eYFP-humanPrdm9A-∆ZnF0 | expression temperatures 16°C, RT, 37°C, lysate prep. protocol 3
3 a - II | hFcIgG-eYFP-humanPrdm9A-∆ZnF0 | protocol 4
4 | hFcIgG1-mPrdm9cst-∆ZnF0 | protocol 4
5 | eYFP | protocol 4
11 | eYFP-human PRDM9A ∆ZnF0 | similar to expression 2a

Purification

Table 3: samples protein purification

expression | construct | lysate solution | lysate preparation
8 | hFcIgG1-eYFP-hPrdm9A-∆ZnF0 | SN* 1x TBS + 0.3% sarc | Sarcosyl titration, different buffers, protocol 2
4 | hFcIgG1-mPrdm9cst-∆ZnF0 | SN* TKZN 0.3% sarc | protocol 4

EMSA

Table 4: samples for EMSA binding reactions – concentration measured with DeNovix (see 5.5.2, Table 18)

expression | sample (SN* in TKZN + 0.3% Sarc.) | conc. [µM]
9 | hFcIgG1-eYFP-hPRDM9A ΔZnF | 2.553
2 | eYFP-hPRDM9A ΔZnF, lysate prep. protocol 2 | 1.006
2 | YFP-mPRDM9cstZnF, lysate prep. protocol 2 | 0.961
3 16°C | hFcIgG1-eYFP-hPRDM9A ΔZnF, lysate prep. protocol 3 | 0.429
3 16°C | hFcIgG1-eYFP-hPRDM9A ΔZnF, lysate prep. protocol 4 | 0.292
3 RT | hFcIgG1-eYFP-hPRDM9A ΔZnF, lysate prep. protocol 3 | 0.274
10 | hFcIgG1-eYFP-mPRDM9cstΔZnF0, lysate prep. protocol 4 | very low

4.1.3 Cloning and expression

Vectors

For cloning and bacterial expression, the pOPIN-Y vector (MCS.selfmade) was used. It is based on the pOPIN-M vector from Addgene, in which the maltose binding protein (MBP) tag was exchanged for the enhanced yellow fluorescent protein (eYFP) and the multiple cloning site (MCS) was modified for our purposes.


Plasmids used as templates for inserts

The exact usage of plasmids as templates for different inserts and the restriction sites targeted for cloning are described in section 4.2.7.

PRDM9 inserts were introduced in between KpnI and HindIII restriction sites. The hFcIgG1-tag was inserted in between the BamHI and EcoRI restriction sites.

YFP could be cut out from the vector using XhoI restriction digest as it is located in between two XhoI restriction sites.

Competent E.coli cells

For plasmid propagation, competent XL1-blue cells from Stratagene/Agilent were used.

For protein expression, BL21-AI competent cells from Invitrogen and Rosetta™2(DE3) pLacI competent cells from Novagen were used.

Competent cells were stored at -80°C.

4.2 Methods

4.2.1 PCR

Polymerase chain reactions (PCR) were performed using a Biorad C1000 Touch Thermal Cycler. The PCR reactions were prepared in a MSC advantage hood from Thermo Scientific by mixing a DNA template, appropriate forward and reverse primers, nucleotides (dNTPs), a DNA polymerase, a DNA polymerase reaction buffer and nuclease free water (Sigma) to the desired reaction volume. For each experiment, an additional reaction containing nuclease free H2O instead of DNA template was prepared as no template control (NTC). Depending on the DNA polymerase used, an appropriate PCR protocol was used for denaturation, primer annealing and extension. The annealing temperature depends on the melting temperature of the primers, whereas the elongation time depends on polymerase speed and template length. The DNA concentration was estimated from the specific absorption of DNA at 260nm with a Nanodrop instrument or by fluorometric quantification using a Qubit device. All samples were stored at -20°C until usage. The PCR results were analyzed by gel electrophoresis. Depending on the size of the DNA fragments, the gel electrophoresis was run on a 10% polyacrylamide (PAA) gel (small fragments) in 1x TBE at 150V for 45min or on a 0.5-1% agarose gel (fragments longer than ~1000bp) in 1x TAE at 100V for 60min. In both cases, the intercalator Gel Red® (Biotium Inc.) stained double stranded DNA and thereby made the amplicons visible. Gel Red was used either as a precast gel stain (agarose) or as a post gel stain (PAA). The gel was visualized using the ChemiDoc™ MP imager from Biorad.

4.2.2 DNA purification

To remove unwanted DNA fragments, the Wizard SV Gel and PCR clean-up system Kit from Promega was used according to the manufacturer's instructions. PCR products showing a single clean band were purified directly from the PCR tube; otherwise, the desired band was cut out of an agarose gel.

4.2.3 Size selection

As an alternative to excising bands from an agarose gel, size selection with AMPure beads was used for purification. In this thesis, a double size selection retaining fragments of 300-700bp was performed. The polyethylene glycol concentration, which is set by the bead:DNA ratio, is the relevant parameter for the size cut-off. First, high molecular weight products were removed. For this, the beads were brought to room temperature and added in a bead:DNA ratio of 0.65, mixing thoroughly by pipetting up and down. After a 15 min incubation at room temperature, the samples were placed on a magnetic rack for 5 min and the supernatant was transferred to new tubes. The high molecular weight products were now bound to the beads, which were discarded. Next, fresh undiluted beads were added to reach a bead:DNA ratio of 1.0 in order to remove fragments smaller than 300bp. After thorough mixing, the samples were incubated for 15 min at room temperature and again placed on the magnet for 5 min to separate the beads from the solution. This time, the supernatant was removed carefully, and the beads carrying the DNA fragments of the desired size (300-700bp) were washed with 80% ethanol for 30 s. The ethanol was discarded and the wash was repeated for a total of two washes, making sure to remove all traces of ethanol after the last wash. The beads were dried for 5 min at room temperature, nuclease free H2O was added and, after a 5 min incubation, the supernatant containing the DNA was transferred to a new tube.
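The bead volumes for the two steps of such a double size selection can be sketched as follows. This assumes the common convention that both ratios refer to the original sample volume, so that the second addition only tops up the difference between the two ratios; the thesis does not state which convention was used, and the 50 µL sample volume is a hypothetical example.

```python
def double_size_selection_volumes(sample_ul, first_ratio=0.65, second_ratio=1.0):
    """Bead volumes (µL) for a two-step size selection.

    Assumption: both ratios are relative to the original sample volume,
    so the second addition is the difference between the two ratios.
    """
    if second_ratio <= first_ratio:
        raise ValueError("second ratio must be larger than first ratio")
    first_addition = sample_ul * first_ratio
    second_addition = sample_ul * (second_ratio - first_ratio)
    return first_addition, second_addition


# Hypothetical example: 50 µL PCR product
first_ul, second_ul = double_size_selection_volumes(50)
print(f"step 1 (remove large fragments): add {first_ul:.1f} µL beads")
print(f"step 2 (bind 300-700bp fragments): add {second_ul:.1f} µL beads")
```

If the second ratio were instead meant relative to the transferred supernatant volume, the second addition would have to be calculated differently.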

4.2.4 Cloning

Cloning reactions

The insert of interest was produced either by PCR (PCR program see Table 5) catalyzed by the Phusion U DNA polymerase, using primers with overhangs that could be digested by restriction enzymes, or, if already available, by restriction enzyme digest of a plasmid containing the insert. The target vector was linearized using the same restriction enzymes. The digest was performed with high fidelity enzymes from NEB together with the corresponding CutSmart buffer according to the manufacturer's instructions, with incubation at 37°C for 90 min or overnight. Afterwards, the target vector was de-phosphorylated using Antarctic Phosphatase and the corresponding phosphatase buffer from NEB by incubation at 37°C for 15 min followed by 80°C for 20 min. Both the insert and the target vector were separated by gel electrophoresis (0.8% agarose, 100V, 90min), cut out of the gel and purified using the Wizard SV Gel and PCR clean-up system Kit from Promega according to the manufacturer's instructions. 50ng of vector was ligated with a threefold molar excess of insert using T4 Ligase in an appropriate reaction mix at 16°C overnight.
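The insert mass corresponding to a threefold molar excess follows from the length ratio of insert and vector. The sketch below shows the arithmetic; the vector length of 6000 bp is a hypothetical placeholder, since the size of the linearized vector is not stated here, while the 1251bp insert length is taken from section 4.2.7.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Insert mass (ng) giving the requested insert:vector molar ratio.

    For double stranded DNA, mass per mole is proportional to length,
    so equal molar amounts scale with insert_bp / vector_bp.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


VECTOR_BP = 6000  # hypothetical vector length, for illustration only
needed_ng = insert_mass_ng(vector_ng=50, vector_bp=VECTOR_BP, insert_bp=1251)
print(f"{needed_ng:.1f} ng insert for 50 ng vector at 3:1")
```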

Table 5: cloning - PCR program of Phusion U for primers with overhangs

94°C 30 sec
10x: 94°C 5 sec, 60°C 5 sec, 72°C 20 sec (Ramp 1.5°C/sec)
20x: 94°C 5 sec, 72°C 15-30s/kb
72°C 5 min
12°C hold

Transformation

Fresh XL-1 blue competent cells were thawed on ice before 10 µL of ligation were incubated with 50 µL of competent cells on ice for 30 minutes. The transformation mixture was heat-shocked at 42°C for 45 s before the cells were allowed to recover by adding 250 µL pre-warmed LB-medium and agitation at 37°C for at least 1 h. 200 µL of the transformation mixture were plated on pre-warmed LB-Amp plates which were then incubated at 37°C overnight.

Plasmid preparation

Single colonies were picked from the Amp plates after transformation and inoculated in 5 mL LB-Amp medium. After overnight growth at 37°C with agitation, the Promega PureYield™ Plasmid Miniprep System was used for plasmid preparation. An alternative protocol was used in which 2x 2 mL of bacterial culture were harvested by centrifugation in a microcentrifuge for 1 min at maximum speed (~13.3k rpm) before the pellet was resuspended in 600 µL nuclease free water (Sigma Aldrich). 100 µL of Cell Lysis Buffer were added and the solution was mixed by inverting the tube six times. 350 µL of cold (4°C) Neutralization Solution were added and mixed thoroughly by inverting. The tube was centrifuged at maximum speed in a microcentrifuge for 3 min before ~900 µL of supernatant were transferred to a PureYield™ Minicolumn without disturbing the pellet. The column was centrifuged at maximum speed in a microcentrifuge for 1 min. After 200 µL of Endotoxin Removal Wash were added, the column was centrifuged again at maximum speed for 1 min. The minicolumn was washed with 400 µL of Column Wash Solution and centrifuged at maximum speed for 1 min. The collection tube was emptied and the column centrifuged again at maximum speed for 1 min to remove any residual wash solution. The minicolumn was transferred to a clean 1.5 mL siliconized tube before 30 µL nuclease free H2O (Sigma Aldrich) were added and incubated for 15 min at room temperature. The tube was centrifuged at maximum speed in a microcentrifuge for 15 s to elute the plasmid DNA. The DNA concentration and quality were checked using A260 and A260/280 measurements before the sample was stored at -20°C for further use.

Plasmid Screening

The success of the cloning was determined using a colony PCR and a Control Digest.

Colony PCR

For the colony PCR, the colony picked from the LB-Amp plates after transformation was briefly dipped into a prepared PCR master mix before it was used to inoculate a new cell culture. The primer pair in the master mix was chosen to amplify a very small region of the insert to make sure it was present in the clone. The reactions were catalyzed by OneTaq Hot Start DNA polymerase from NEB using the PCR temperature program in Table 6. To check the PCR product, it was afterwards run on a 10% polyacrylamide gel (150V, 50 min).

Table 6: PCR program Colony PCR

94°C 5 min
~60°C 20 s (30x)
68°C 1 min/kb
68°C 7 min
8°C hold

Control Digest

300ng of the purified plasmid was digested with restriction enzymes from NEB according to manufacturer's instructions (90 min at 37°C) before it was loaded on a 1% agarose gel and a gel electrophoresis was performed (120V, 1h) resulting in a specific band pattern. The expected fragment size could be predicted using the NEBcutter V2.0, a NEB tool, depending on the plasmid and restriction enzymes used. The resulting band pattern was compared with the expected fragments to check the plasmid.
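The expected band pattern of such a control digest follows from the cut positions on the circular plasmid. The sketch below illustrates the arithmetic that a tool like NEBcutter performs after locating the sites; the plasmid length and cut positions are hypothetical and only chosen so that a 1251bp insert is released.

```python
def circular_fragments(plasmid_len, cut_positions):
    """Fragment sizes (bp, largest first) after cutting a circular plasmid."""
    cuts = sorted(set(p % plasmid_len for p in cut_positions))
    if not cuts:
        # Uncut plasmid: no linear fragment is produced.
        return []
    # Distances between neighbouring cut sites ...
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    # ... plus the fragment that spans the origin of the numbering.
    fragments.append(plasmid_len - cuts[-1] + cuts[0])
    return sorted(fragments, reverse=True)


# Hypothetical 7000 bp plasmid cut at positions 1000 and 2251
print(circular_fragments(7000, [1000, 2251]))
# A single cut only linearizes the plasmid
print(circular_fragments(7000, [1000]))
```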

4.2.5 Sequencing

The purified plasmids of the most promising clones were sent to LGC Genomics for sequencing. The primers used are listed in section 4.1.2. The sequencing results were compared with the desired insert sequence using the Sequencher DNA Analysis Software from Gene Codes.

4.2.6 Storage of cell lines as glycerol stock

To store the cell lines, a glycerol stock was produced from each cell line. For this, 800 µL of cell culture grown in LB-Amp medium (XL-1 Blue and BL21-AI cells) or LB-Amp-Cam medium (Rosetta cells) were mixed with 800 µL of sterile filtered and autoclaved 80% glycerol in a cryo vial, mixed by inverting and stored at -80°C. To inoculate a new culture, a pipette tip was scraped over the frozen surface and then transferred into medium.

4.2.7 Production of specific clones

YFP-mPrdm9cst-∆ZnF0-pOPIN-Y
Insert: 1251bp
template DNA: mP9cst-FL_pT7-IRES-MycN
primers: mP9_fwd_KpnI, mP9_rvs_HindIII
target Vector: pOPIN-Y
Restriction Enzymes: KpnI-HF, HindIII-HF

YFP-hPrdm9A-∆ZnF0-pOPIN-Y
Insert: 1600bp
template DNA: hP9A-pOPIN
primers: hP9_KpnI_F, ZnF_hP9_HindIII_R
target Vector: pOPIN-Y
Restriction Enzymes: KpnI-HF, HindIII-HF

hPrdm9A-∆ZnF0-pOPIN-Y

YFP, which is located between two XhoI restriction sites, was cut out of the clone YFP-hPrdm9A-∆ZnF0-pOPIN-Y using the restriction digest protocol described in section 4.2.4.

mPrdm9cst-∆ZnF0-pOPIN-Y

YFP, which is located between two XhoI restriction sites, was cut out of the clone YFP-mPrdm9cst-∆ZnF0-pOPIN-Y using the restriction digest protocol described in section 4.2.4.

hFcIgG1-mPrdm9cst-∆ZnF0-pOPIN-Y
Insert: 696bp
template DNA: mCDH2-FcIgG_pEGFP-N1 vector
primers: BamHI-hFcIgG_F, hFcIgG_EcoRI_R
target Vector: mPrdm9cst-∆ZnF0-pOPIN-Y
Restriction Enzymes: BamHI-HF, EcoRI-HF

hFcIgG1-YFP-mPrdm9cst-∆ZnF0-pOPIN-Y
Insert: 696bp (= mPrdm9cst-∆ZnF0) from vector hFcIgG-mPrdm9cst-∆ZnF0-pOPIN-Y
target Vector: hFcIgG1-YFP-pOPIN (empty vector)
Restriction Enzymes: KpnI-HF, HindIII-HF

4.2.8 Bacterial protein expression

In this Thesis, the bacterial expression of several PRDM9 constructs was optimized, so several steps were changed.

Competent expression cells (BL21-AI from Invitrogen or Rosetta™2(DE3) pLacI from Novagen) were thawed on ice, and 50 µL of them were incubated with 25 ng of plasmid on ice for 30 min. The transformation mixture was heat-shocked at 42°C for 45 s before 250 µL pre-warmed LB-medium were added to allow the cells to recover at 37°C for at least 1 hour with agitation. 100 µL of the mixture were plated on LB-Amp plates for BL21-AI cells / LB-Amp-Cam plates for Rosetta cells and incubated at 37°C overnight. A single clone from the plate (or a pipette tip of a glycerol stock) was used to inoculate 50-500 mL of LB-Amp medium with 0.1% glucose for BL21-AI cells / LB-Amp-Cam for Rosetta cells. In some cases, an overnight pre-culture of 10 mL was grown first in order to inoculate the large culture flask with a defined cell number the next day. Direct inoculation was also tried: here, the full 300 µL transformation mixture was added directly to the growth medium without plating it out and picking a single clone.

The culture was grown at 37°C with shaking at 160rpm until the density reached approximately OD600 = 0.5-1. In some cases, the culture grew up to OD600 = 3 due to growth difficulties in the beginning followed by strong overnight growth. After a 2 mL aliquot was taken as t0 sample, either the culture was pelleted by centrifugation at 5000 rpm for 5 min at room temperature and the growth medium decanted, or IPTG was added directly to the growth medium. In most cases, the growth medium was replaced by induction medium containing LB with 0.2% L-arabinose, 1mM IPTG and 50 µM ZnCl2 for BL21-AI cells or LB with 0.2mM / 0.5mM / 1mM IPTG and 50 µM ZnCl2 for Rosetta cells. Expression took place at 16°C / room temperature / 37°C for 7 hours or overnight with shaking at 160rpm. The cells were harvested by centrifugation at 5000 rpm for 10 min at 4°C, and the pellets were frozen at -80°C at least overnight.
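The volumes of the stock solutions from section 4.1.1 needed for the induction medium can be sketched with the dilution relation C1·V1 = C2·V2. The 100 mL culture volume is an example value taken from Table 1; the helper name is hypothetical.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock (mL) so that final_conc is reached in final_volume_ml.

    final_conc and stock_conc must be given in the same unit.
    """
    return final_conc / stock_conc * final_volume_ml


CULTURE_ML = 100  # example culture volume

# Induction medium for BL21-AI cells, using the stocks from section 4.1.1
additions_ml = {
    "20% L-arabinose -> 0.2%": stock_volume_ml(0.2, 20.0, CULTURE_ML),
    "1 M IPTG -> 1 mM": stock_volume_ml(1.0, 1000.0, CULTURE_ML),
    "50 mM ZnCl2 -> 50 µM": stock_volume_ml(0.05, 50.0, CULTURE_ML),
}
for name, volume in additions_ml.items():
    print(f"{name}: {volume * 1000:.0f} µL")
```

The small volume added is neglected in the final volume here, which is a reasonable approximation for 1:100 and 1:1000 dilutions.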

4.2.9 Lysate preparation

Several approaches for an optimal lysate preparation were tried. In general, three fractions were produced for each lysate preparation: a supernatant wash (SN wash), obtained by washing the pellet with buffer without Sarcosyl; a supernatant (SN*), obtained using a buffer containing various amounts of Sarcosyl; and the whole cell (WC*) fraction, obtained by dissolving the pellet remaining after centrifugation again in buffer containing Sarcosyl. The buffer volumes used to dissolve the pellets depended on the bacterial pellet weight and were the same for all protocols (see Table 7). For a quick analysis, a quick WC* was prepared by washing the pellet with 2.5 mL 1x TKZN and dissolving it in 1 mL 1x TKZN + 0.3% Sarcosyl per 0.1g pellet.

Table 7: buffer volumes according to pellet weight

fraction    volume per 0.1 g pellet [mL]
SN wash     2.5
SN*         1
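Because the volumes in Table 7 are given per 0.1 g of pellet, the volume for a given pellet scales linearly with its weight. The following is a minimal sketch of that scaling. It encodes only the two rows shown in Table 7; the function and variable names and the example pellet weight are hypothetical.

```python
# Sketch: scale the Table 7 buffer volumes to an actual pellet weight.
# Only the SN wash and SN* rows shown in Table 7 are encoded here.

ML_PER_0_1_G = {
    "SN wash": 2.5,  # mL buffer without Sarcosyl per 0.1 g pellet
    "SN*": 1.0,      # mL buffer with Sarcosyl per 0.1 g pellet
}


def buffer_volumes_ml(pellet_weight_g):
    """Return the buffer volume (mL) per fraction for a pellet of the given weight (g)."""
    if pellet_weight_g <= 0:
        raise ValueError("pellet weight must be positive")
    units = pellet_weight_g / 0.1  # number of 0.1 g units in the pellet
    return {fraction: round(units * ml, 2) for fraction, ml in ML_PER_0_1_G.items()}


if __name__ == "__main__":
    # 0.35 g is a made-up example weight.
    print(buffer_volumes_ml(0.35))
```

Rounding to two decimals keeps floating-point noise out of the reported volumes.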


Several buffers were used, differing in the type of salt, the salt concentration, the pH and the additives included to improve protein solubility.

Table 8: lysis buffers

1x TKZN               1x TBS                1x Patel
10 mM Tris, pH 7.5    25 mM Tris, pH 7.4    20 mM Tris, pH 7.5
50 mM KCl             137 mM NaCl           700 mM NaCl
50 µM ZnCl2           2.7 mM KCl            25 µM ZnCl2
0.5% NP-40            -                     5% Glycerol
-                     -                     0.5 mM TCEP

The TKZN buffer has a low salt concentration (50 mM KCl) and contains zinc, which is thought to stabilize PRDM9's zinc fingers, and NP-40 as a solubilizing agent. NP-40 is a nonionic and therefore mild detergent that helps maintain protein function.

TBS is a medium-salt buffer (139.7 mM in total: 137 mM NaCl plus 2.7 mM KCl) without any additional detergent.

The Patel buffer is a high-salt buffer (700 mM NaCl) containing zinc for zinc finger stabilization; glycerol, to help reduce hydrophilic interactions on protein surfaces that may interfere with proper protein folding or cause aggregates; and TCEP, a reducing agent used in crystallization trials.

Sarcosyl, an anionic detergent, was added to all buffers at different concentrations during lysate preparation to solubilize inclusion bodies that may form during expression.
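The low, medium and high salt classification of the three buffers can be checked by summing the NaCl and KCl entries of Table 8. The following is a minimal sketch of that sum. The dictionary layout and function name are hypothetical, and the percentage components (NP-40, glycerol) are left out because they are not given in molar units.

```python
# Sketch: sum the monovalent salts (NaCl + KCl) of each buffer in Table 8.
# Concentrations are in mM; percentage components are intentionally omitted.

BUFFERS_MM = {
    "1x TKZN": {"Tris": 10.0, "KCl": 50.0, "ZnCl2": 0.05},
    "1x TBS": {"Tris": 25.0, "NaCl": 137.0, "KCl": 2.7},
    "1x Patel": {"Tris": 20.0, "NaCl": 700.0, "ZnCl2": 0.025, "TCEP": 0.5},
}


def monovalent_salt_mm(components):
    """Return the summed NaCl and KCl concentration (mM), rounded to 1 decimal."""
    return round(sum(components.get(salt, 0.0) for salt in ("NaCl", "KCl")), 1)


if __name__ == "__main__":
    for name, components in BUFFERS_MM.items():
        print(f"{name}: {monovalent_salt_mm(components)} mM NaCl + KCl")
```

ZnCl2 is excluded from the sum because it is present only at micromolar concentrations and is added for zinc finger stabilization rather than to set the salt level.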

Protocol 1: standard protocol

The frozen bacterial pellet was thawed and resuspended in 1x TKZN buffer. After centrifugation at 5000 rpm and 4°C for 10 min, the supernatant (SN wash) was removed. The pellet was dissolved in 1x TKZN buffer + 0.3% Sarcosyl and centrifuged again at 5000 rpm and 4°C for 10 min, which yielded the supernatant SN*. The remaining pellet was resuspended in an appropriate volume of 1x TKZN + 0.3% Sarcosyl to obtain the WC* fraction.
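The centrifugation steps are specified in rpm, which corresponds to a different relative centrifugal force (RCF) depending on the rotor. The following is a minimal sketch of the standard conversion RCF = 1.118 × 10^-5 × r × rpm², with the radius r in cm. The rotor radius is not stated in the text, so the 10 cm used here is purely an assumption, as are the function and parameter names.

```python
# Sketch: convert rotor speed (rpm) to relative centrifugal force (x g).
# The rotor radius is NOT given in the thesis; 10 cm is an assumed example value.

def rcf_from_rpm(rpm, rotor_radius_cm):
    """Return the RCF (multiples of g) for a speed in rpm and a rotor radius in cm."""
    if rpm < 0 or rotor_radius_cm <= 0:
        raise ValueError("rpm must be non-negative and the radius positive")
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


if __name__ == "__main__":
    assumed_radius_cm = 10.0  # hypothetical rotor radius
    print(f"5000 rpm is about {rcf_from_rpm(5000, assumed_radius_cm):.0f} x g "
          f"at r = {assumed_radius_cm} cm")
```

With the assumed 10 cm radius, 5000 rpm corresponds to roughly 2800 × g; the value for the rotor actually used would have to be recalculated from its real radius.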
