
The data and methods used in this thesis are based on laboratory experiments. I cannot cover all important principles here and instead list a few basic experimental methods for the identification of transcription factor binding sites as well as for the determination of long-range chromatin interactions such as promoter-enhancer interactions (PEIs). For more details, please consult a textbook such as [16].

2.2.1. Determination of TFBSs

The determination of transcription factor binding sites (TFBSs) is important for identifying the preferred binding site profile of a given factor and, in turn, for the computational prediction of binding sites in sequences of interest. In the following, I present nuclease protection footprinting as an example of a method for determining protein-bound DNA sequences.

Nuclease protection footprinting Nucleases are enzymes that cut nucleic acids. A commonly used nuclease in biotechnology is DNase I, which cuts one strand of double-stranded DNA. If the DNA is bound by proteins, the bound regions are protected from nuclease cleavage. This property is exploited in nuclease protection footprinting. One end of the DNA strand is marked (e.g. radioactively labelled), and afterwards the DNA is exposed to a nuclease (e.g. DNase I). The DNA strands are cut at random positions by the nuclease, and the labelled strands are separated by size by electrophoresis (see Figure 2.9). The regions bound by the protein cannot be accessed by the nuclease, resulting in a lack of DNA strands of particular sizes (the footprint) in the electrophoresis ([16], page 777).
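The footprint idea can be illustrated with a minimal, idealized sketch in Python. It assumes that, across many DNA molecules, every unprotected position is cut at least once, so that a cut after position p yields an end-labelled fragment of length p. The function names and the example coordinates are hypothetical and serve only as illustration; real gels are noisy and cleavage is not uniform.

```python
def observable_fragment_lengths(dna_length, protected=None):
    """Return the set of end-labelled fragment lengths that can appear on the gel.

    Idealized assumption: a cut after position p (1-based, counted from the
    labelled end) produces a labelled fragment of length p, and every
    unprotected position is cut in at least one molecule.
    """
    lengths = set(range(1, dna_length))
    if protected is not None:
        start, end = protected  # inclusive range of positions shielded by the protein
        lengths -= set(range(start, end + 1))
    return lengths


def footprint(dna_length, protected):
    """Fragment lengths present in the protein-free lane but missing in the bound lane."""
    free_lane = observable_fragment_lengths(dna_length)
    bound_lane = observable_fragment_lengths(dna_length, protected)
    return sorted(free_lane - bound_lane)


# Hypothetical example: a 30 bp fragment with a protein covering positions 12 to 18.
missing = footprint(dna_length=30, protected=(12, 18))
print(missing)
```

The missing fragment lengths correspond directly to the protected positions, which is why the gap in the band pattern localizes the binding site.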

2.2.2. Determination of promoter-enhancer interactions

In the following, I give a brief overview of how long-range chromatin interactions such as promoter-enhancer interactions (PEIs) are determined. According to the present state of the art, such long-range interactions are determined by chromosome conformation capture.

Chromosome conformation capture One of the most popular techniques to determine the topological structure of chromatin is the chromosome conformation capture (3C) method. The method identifies DNA regions that are far apart in the linear sequence but close to each other in interphase chromatin, such as enhancer-promoter interactions. The general idea of the method is rather simple: In the first step, the chromatin is fixed using e.g. formaldehyde. This chemical introduces covalent bonds (crosslinks) between DNA and the bound proteins. In the next step, the DNA is digested, either by endonucleases like HindIII or BamHI or in a chemical way, followed by the ligation of the free DNA ends. Afterwards, the number of newly created junctions is quantified and statistically evaluated in order to differentiate noise from real signal. Based on the original 3C method, several further methods have been developed which differ in their coverage and general detection aim. In the original 3C method, one can only determine whether two DNA regions of interest are interacting with each other. In 4C, the contacts between the region of interest and genome-wide DNA fragments are determined (one-vs-all), whereas in 5C genome-wide interactions are determined (all-vs-all) [28]. Two newer all-vs-all extensions of the 3C method are Hi-C and ChIA-PET, which are explained in the following.

[Figure 2.9 labels: footprint, fragment length, radioactive labels, DNase cuts, transcription factor]

Figure 2.9.: Nuclease protection footprinting. Two sets of the same DNA fragments with radioactive labels are cut with DNase I. One of the sets contains the transcription factor of interest, while the other set is not bound by proteins. After DNase cleavage, the DNA fragments are separated according to their length by gel electrophoresis, and the lack of bands (footprint) in the protein-containing DNA set indicates the transcription factor binding site. (Figure based on [16], page 777.)

Hi-C Hi-C is one of the latest extensions of the 3C method. The first steps are, as in the original 3C, the fixation of the DNA and its cleavage using restriction enzymes. However, before the religation takes place, the ends are filled in with biotin-labelled nucleotides. After ligation, the DNA is purified and sheared, and a pull-down is performed using a biotin antibody. Thereby, only the ligated DNA fragments are considered in the following analysis steps. The pull-down is required because, in contrast to the original 3C method, no primers that could be used for PCR are specified. Afterwards, the reads are mapped back to genomic regions, the number of ligations between long-distance DNA regions is counted, and a matrix of fragments is created in which each entry holds the number of ligations observed between the respective pair of fragments. Applying a statistical analysis to this matrix yields the significant genome-wide long-distance interacting DNA regions [28].
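The counting step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in [28]: it assumes a single chromosome, read pairs that are already mapped to positions, and fixed-size bins in place of restriction fragments. The function name `contact_matrix` and the example coordinates are hypothetical.

```python
def contact_matrix(read_pairs, genome_length, bin_size):
    """Count ligation events between genomic bins.

    read_pairs    -- iterable of (pos_a, pos_b) mapped positions (0-based)
    genome_length -- length of the (single, hypothetical) chromosome
    bin_size      -- width of each bin; assumption: bins stand in for fragments
    """
    n_bins = (genome_length + bin_size - 1) // bin_size  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in read_pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            # A contact has no direction, so the matrix is kept symmetric.
            matrix[j][i] += 1
    return matrix


# Hypothetical mapped read pairs on a 6 kb region, binned at 1 kb.
pairs = [(100, 5200), (150, 5900), (2500, 2600), (5100, 300)]
matrix = contact_matrix(pairs, genome_length=6000, bin_size=1000)
for row in matrix:
    print(row)
```

In this toy example the first and last bins share three ligation events, which is the kind of entry a subsequent statistical analysis would test against the background expected for that distance.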

ChIA-PET A new generation of 3C experiments combines the Hi-C methodology with chromatin immunoprecipitation sequencing (ChIP-Seq). In this method, all potential connections between DNA fragments that are bound by a given DNA-interacting protein are determined in a genome-wide manner (all-vs-all). The overall workflow follows the 3C methodology: fixation of the DNA, cleavage and religation. Afterwards, the ligated DNA fragments are pulled down using an antibody against the protein of interest. However, it cannot be determined whether the protein of interest is responsible for the chromatin interaction or is just bound to one of the corresponding sequences. The method is restricted in that only those DNA fragment connections are determined that are associated with the protein used [28].