
1.1 Clinical Cancer Research

1.1.4 Microarray Technology

miRNAs as well as mRNAs can be measured genome-wide, that is, all known miRNAs or mRNAs can be measured simultaneously. Over the past twenty years, microarrays (Schena et al., 1995) have become the de facto standard for large-scale biomarker measurements. Besides genome-wide microarrays, there are also specialized custom microarrays designed to measure a well-defined set of markers.

The basic working principle is rather simple. Genomic probes (approximately 30 to 150 nucleotides long) are attached to a solid slide at high density. Every probe has a specific sequence and is used to detect a specific mRNA or DNA segment.

Since the probes can be designed to match any given sequence, microarrays can cover almost all types of genomic biomarkers. SNP and tiling arrays cover DNA-based markers: they are used to measure SNPs and genomic aberrations (insertions, deletions, and amplifications of specific chromosomal regions).

However, by far the most frequently used microarrays are those for RNA quantification, especially gene expression microarrays. Two basic types of gene expression microarrays can be distinguished.

cDNA (complementary DNA(5)) or two-color arrays (Duggan et al., 1999; Schena, 1999) were mostly used at the beginning of the microarray era. The probes (cDNA, hence the name) were spotted onto a solid glass slide. The mRNA of two distinct samples was labeled with two different dyes and then hybridized to the array in a competitive manner. Afterwards, the fluorescence intensities were scanned in two channels, one for each dye. Based on these intensities, conclusions could be drawn as to which sample contained more or less of a specific mRNA.
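The comparison of the two channels can be illustrated with a small sketch. It assumes that the two dyes are read out as a "red" and a "green" intensity per spot and that the comparison is expressed as a log2 ratio; the function name `log_ratios`, the channel names, and the pseudocount are illustrative assumptions and not taken from the text.

```python
import numpy as np

def log_ratios(red, green, pseudocount=1.0):
    """Return log2(red / green) per spot.

    Positive values indicate more signal in the red channel,
    negative values more signal in the green channel.
    The pseudocount (an assumption) avoids division by zero.
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    return np.log2((red + pseudocount) / (green + pseudocount))

# Hypothetical intensities for three spots on one slide
red = [1023.0, 255.0, 63.0]
green = [255.0, 255.0, 255.0]
m = log_ratios(red, green)
print(m)
```

In this toy example the first spot carries more of the red-labeled sample, the second is balanced, and the third carries more of the green-labeled sample.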

(5)Complementary DNA, or cDNA for short, denotes single-stranded DNA that is obtained from mRNA via a process called reverse transcription. As the name suggests, it is simply the reverse of transcription: the complementary DNA is constructed from the mRNA. This is catalyzed by an enzyme called reverse transcriptase, which can be found in various RNA viruses.


Nowadays this kind of microarray is hardly used anymore. The more precise one-color arrays have become established, allowing a higher density of probes (and hence a larger number of mRNAs measurable at once) and more stable measurements. To allow this density, the probes are not spotted; instead, shorter oligos are synthesized directly on the slide (Lipshutz et al., 1999) or are attached to silica beads assembled in microwells (Gunderson et al., 2004; Walt, 2000). While for two-color arrays it was necessary to hybridize the control to the same slide to eliminate slide effects, the high reproducibility of modern microarrays makes it possible to hybridize each sample (including possible controls) to an independent slide.

The principle of a one-color microarray experiment is illustrated in figure 1.5. Starting with several tissue samples, usually from a condition of interest and a reference (a typical example is a comparison of tumor against normal tissue), the mRNA of these samples is extracted and purified (and in most cases amplified to obtain more starting material). In a first step, this mRNA is reverse transcribed to cDNA (complementary DNA) and at the same time labeled with biotin.

The biotin-labeled cDNA is then hybridized to the array. The probes attached to the array bind the cDNA matching their sequence. One spot on the array contains several probes with identical sequence. The more highly a gene is expressed, the more mRNA and eventually the more cDNA is contained in the sample and, consequently, the more of the corresponding probes are occupied by cDNA molecules.

After scanning the array, the accumulation of biotin-labeled molecules causes a bright spot in the image where the cDNA has bound to the array. The signal intensity is then a measure of the gene expression: the higher the intensity of the spot, the higher the expression of the corresponding gene(6).

(6)Of course, other factors such as the RNA sequence and the hybridization efficiency can also influence the intensity of the spot.

FIGURE 1.5. The figure shows the basic workflow of a microarray experiment. Shown is a one-color mRNA (gene expression) microarray.

After scanning the array and transforming the image to signal intensity values, several pre-processing steps follow (cf. Stekel, 2003; Wit and McClure, 2004 for an overview of microarray analysis). Modern microarrays are designed with a certain degree of redundancy. Since the probes are rather small compared to an mRNA, it is possible to design several different probes targeting the same mRNA. Combining the signal intensities of all these probes into a so-called expression value of the gene is one of these pre-processing steps. Other steps include background correction and normalization.
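The combination of several probes into one expression value can be sketched as follows. The text does not name a specific summarization method, so the median of the log2 probe intensities is used here purely as an illustrative assumption; the function `summarize_probes` and the gene labels are hypothetical.

```python
import numpy as np
from collections import defaultdict

def summarize_probes(intensities, probe_genes):
    """Combine probe-level intensities into one value per gene.

    Assumption: the summary is the median of the log2 intensities
    of all probes targeting the same gene.
    """
    by_gene = defaultdict(list)
    for value, gene in zip(intensities, probe_genes):
        by_gene[gene].append(np.log2(value))
    return {gene: float(np.median(values)) for gene, values in by_gene.items()}

# Hypothetical data: three probes target GENE_A, two target GENE_B
probe_intensities = [256.0, 512.0, 1024.0, 64.0, 64.0]
probe_genes = ["GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B"]
expression = summarize_probes(probe_intensities, probe_genes)
print(expression)
```

A median is chosen in this sketch because it is less sensitive to a single outlying probe than a mean; other summaries would fit the same interface.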

Background correction procedures are used to eliminate possible unspecific background signals caused, e.g., by reflections on the slide. Normalization comprises in-array and between-array normalization. In-array normalization should remove spatial effects on the array, e.g. those caused by an uneven dispersion of the sample on the slide. Between-array normalization is used to eliminate technical variance between the samples (e.g. slight differences in the purification or labeling process) and biological variance (e.g. a generally higher mRNA level in one sample).
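A minimal sketch of background correction followed by between-array normalization is given below. It assumes a single background estimate per array and uses median centering of the log2 values as the normalization; both choices, as well as the function names `background_correct` and `median_center`, are simplifying assumptions for illustration, and practical pipelines often use more elaborate methods.

```python
import numpy as np

def background_correct(raw, background, floor=1.0):
    """Subtract a per-array background estimate and clip at a positive floor
    so that the log2 transform stays defined."""
    corrected = np.asarray(raw, dtype=float) - np.asarray(background, dtype=float)
    return np.maximum(corrected, floor)

def median_center(log_values):
    """Between-array step: shift each array (column) so that all
    column medians agree with the average column median."""
    log_values = np.asarray(log_values, dtype=float)
    col_medians = np.median(log_values, axis=0)
    return log_values - col_medians + col_medians.mean()

# Hypothetical raw intensities: rows are genes, columns are two arrays.
# The second array is globally about twice as bright as the first.
raw = np.array([[132.0, 264.0],
                [68.0, 136.0],
                [36.0, 72.0]])
background = np.array([4.0, 8.0])  # assumed background level per array

log_corrected = np.log2(background_correct(raw, background))
normalized = median_center(log_corrected)
print(normalized)
```

After median centering, the global brightness difference between the two hypothetical arrays is removed, so the remaining differences between columns reflect gene-specific effects only.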

After pre-processing, the normalized expression values can be displayed in a so-called gene expression matrix, which is the starting point of the actual analysis and statistical inference. The rows of the gene expression matrix correspond to the genes, the columns to the samples(7). In line with statistical notation, the number of genes is denoted by p and the number of samples by n. The expression matrix is therefore a p×n matrix. It is common to use log2-transformed expression values for further analysis because of the variance-stabilizing properties of this transformation and the improved visualization of the transformed expression values.
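The layout of the gene expression matrix can be made concrete with a small sketch. The gene and sample labels and all values are invented for illustration; only the p×n orientation and the log2 transformation follow the text.

```python
import numpy as np

# Hypothetical labels: p = 3 genes (rows), n = 4 samples (columns)
genes = ["GENE_A", "GENE_B", "GENE_C"]
samples = ["tumor_1", "tumor_2", "normal_1", "normal_2"]

# Invented normalized expression values on the original intensity scale
expression = np.array([[512.0, 1024.0, 128.0, 128.0],
                       [64.0, 64.0, 64.0, 32.0],
                       [2048.0, 1024.0, 2048.0, 4096.0]])

# log2 transformation used for further analysis
log_expression = np.log2(expression)
p, n = log_expression.shape

# Transposing yields the layout common in statistics:
# observations (samples) in rows, variables (genes) in columns
samples_by_genes = log_expression.T
print(p, n, samples_by_genes.shape)
```

Keeping the labels alongside the matrix makes it explicit which axis holds the genes and which the samples, which matters because the orientation differs from the usual statistical convention.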

The experimental workflow was described here using gene expression microarrays as the example. However, the same principle holds for miRNA microarrays and SNP arrays.

(7)Since in statistical terms the genes are the variables (the expression value of a gene would be the value of that variable) and the samples are the observations, this is contrary to the traditional statistical notation, where the variables are usually the columns and the observations the rows.

 Introduction