
In this last chapter the basic principles of gene expression were outlined and the widely used method of microarray technology was introduced. We then discussed various analysis methods, based on two different underlying models, with regard to their biological relevance. In most cases the outcome of these methods is a list of genes of interest. To further interpret these gene lists, a common approach is to create graphs where the nodes represent genes and the edges between the nodes represent distinct relationships between two genes. These relationships can be defined in various ways. For instance, one can use gene annotation as provided by the Gene Ontology [Ashburner et al., 2000] or the MeSH database [Nelson et al., 2004]. Another reasonable approach is to use text mining tools as provided by Genomatix® [Genomatix, 2009] or protein-protein interaction networks. The advantages of representing genes in a network are diverse.

On the one hand, a network representation allows for a more intuitive interpretation of results, where genes related to a specific pathway or involved in a common biological process become easily visible. On the other hand, graphs allow for additional analysis methods based on graph theory. These methods can be used to identify interesting network structures or over-represented motifs. In the following, network representations will be used repeatedly to illustrate and interpret the results.
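The graph construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the gene names (`geneA` to `geneD`) and annotation terms (`term1` to `term3`) are hypothetical placeholders rather than real Gene Ontology or MeSH entries, and "sharing at least one annotation term" is just one possible way to define an edge.

```python
from itertools import combinations


def build_gene_graph(annotations):
    """Return an adjacency dict linking genes that share at least one term."""
    graph = {gene: set() for gene in annotations}
    for g1, g2 in combinations(annotations, 2):
        # An edge is drawn when the two annotation sets overlap.
        if annotations[g1] & annotations[g2]:
            graph[g1].add(g2)
            graph[g2].add(g1)
    return graph


# Hypothetical toy annotation table (not taken from any real database).
toy_annotations = {
    "geneA": {"term1", "term2"},
    "geneB": {"term2"},
    "geneC": {"term3"},
    "geneD": {"term1", "term3"},
}

graph = build_gene_graph(toy_annotations)
```

Other edge definitions mentioned in the text, such as co-citation from text mining or protein-protein interactions, would only change the test inside the loop; the adjacency structure stays the same.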

2 Analyzing M-CSF dependent monocyte/macrophage differentiation: expression modes and meta-modes derived from an independent component analysis

2.1 Background

Since microarray technology has become one of the most popular approaches in the field of gene expression analysis, numerous statistical methods have been used to provide insights into the biological mechanisms of gene expression regulation. The high dimension of expression data and the complexity of the regulatory mechanisms leading to transcriptional networks still force statisticians and bioinformaticians to examine available methods and to develop new, sophisticated approaches. However, there are already appropriate methods using different approaches to examine the underlying biological mechanisms determining the gene expression signatures and profiles measured by microarray experiments. Supervised methods using prior knowledge, like Gene Set Enrichment Analysis (GSEA), deliver useful results under certain conditions. But there is still a lack of reliable data needed for non-classical analysis. Widely used unsupervised approaches, like hierarchical clustering and k-means clustering, use correlations or other distance or similarity measures to identify genes with similar behavior under similar conditions. But these methods are not able to represent more complex structures and interdependencies in the regulatory machinery.

In contrast to the algorithms mentioned above, independent component analysis (ICA) explores higher-order statistics to decompose observed gene expression signatures (GES), which form the rows of the input data matrix, into statistically independent gene expression modes (GEM), which form the rows of matrix $S$ according to the data model $X^T = AS$. ICA solves blind source separation (BSS) problems, where it is known that the observed data set represents a linear superposition of underlying independent source signals. But it can more generally be considered a matrix decomposition technique which extracts informative features from multivariate data sets such as biomedical signals, for example EEG (electroencephalography) [Habl et al., 2000], MEG (magnetoencephalography) [Vigario et al., 1997] and fMRI (functional magnetic resonance imaging) [Yang and Rajapakse, 2004; Keck et al., 2004; Theis et al., 2004b] recordings. ICA can also be considered a projective subspace technique appropriate for noise reduction [Tomé et al., 2004; Gruber et al., 2006], or artifact removal [Stadlthanner et al., 2003a, 2005] if generated from independent sources.
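To make the idea of a linear superposition of independent sources concrete, the following toy sketch mixes two synthetic signals and separates them again using a fourth-order statistic. It is a hedged illustration, not the procedure used in this work: the Laplace-distributed sources, the 2 x 2 mixing coefficients, the sample size and the brute-force search over rotation angles are all assumptions chosen for brevity, and the kurtosis-based contrast stands in for established algorithms such as FastICA or JADE. Only the Python standard library is used.

```python
import math
import random


def center(x):
    """Subtract the mean so the signal has zero mean."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def corr(x, y):
    """Pearson correlation of two equally long sequences."""
    x, y = center(x), center(y)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))


def rotate(x1, x2, angle):
    """Apply the orthogonal matrix [[c, s], [-s, c]] to the two signals."""
    c, s = math.cos(angle), math.sin(angle)
    y1 = [c * u + s * v for u, v in zip(x1, x2)]
    y2 = [-s * u + c * v for u, v in zip(x1, x2)]
    return y1, y2


def whiten(x1, x2):
    """Decorrelate two signals and scale them to unit variance."""
    x1, x2 = center(x1), center(x2)
    n = len(x1)
    a = sum(u * u for u in x1) / n
    b = sum(u * v for u, v in zip(x1, x2)) / n
    d = sum(v * v for v in x2) / n
    # Angle that diagonalises the 2x2 covariance matrix [[a, b], [b, d]].
    p1, p2 = rotate(x1, x2, 0.5 * math.atan2(2.0 * b, a - d))
    s1 = math.sqrt(sum(u * u for u in p1) / n)
    s2 = math.sqrt(sum(v * v for v in p2) / n)
    return [u / s1 for u in p1], [v / s2 for v in p2]


def excess_kurtosis(y):
    """Fourth-order statistic of a zero-mean signal; zero for a Gaussian."""
    n = len(y)
    m2 = sum(v * v for v in y) / n
    m4 = sum(v ** 4 for v in y) / n
    return m4 / (m2 * m2) - 3.0


def ica_2d(x1, x2, steps=90):
    """Whiten, then pick the rotation that maximises squared kurtosis."""
    z1, z2 = whiten(x1, x2)
    best_score, best = -1.0, None
    for k in range(steps):
        # Angles in [0, pi/2) suffice: sign and order of sources are arbitrary.
        y1, y2 = rotate(z1, z2, 0.5 * math.pi * k / steps)
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if score > best_score:
            best_score, best = score, (y1, y2)
    return best


def demo(n=2000, seed=0):
    """Mix two hypothetical Laplace sources and report source/estimate correlations."""
    rng = random.Random(seed)

    def laplace():
        return rng.expovariate(1.0) * rng.choice((-1.0, 1.0))

    s1 = [laplace() for _ in range(n)]
    s2 = [laplace() for _ in range(n)]
    # Assumed mixing coefficients; each observation is a linear superposition.
    x1 = [1.0 * a + 0.5 * b for a, b in zip(s1, s2)]
    x2 = [0.3 * a + 1.0 * b for a, b in zip(s1, s2)]
    y1, y2 = ica_2d(x1, x2)
    return [[corr(s, y) for y in (y1, y2)] for s in (s1, s2)]


if __name__ == "__main__":
    for row in demo():
        print([round(c, 3) for c in row])
```

Each row of the printed table belongs to one true source and each column to one estimated component. Because ICA recovers sources only up to sign and ordering, a successful separation shows one entry per row with absolute value close to 1, in either column and with either sign. Second-order methods stop after the whitening step, which is why the rotation has to be fixed by a higher-order statistic such as kurtosis.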

In this work we will concentrate on the linear case, in which each single microarray GES is considered a linear superposition of unknown statistically independent GEMs. To decompose these mixtures into statistically independent components, ICA algorithms like FastICA or JADE have been used. Typically, these GEMs can be interpreted as being characteristic of ongoing, largely independent biological regulatory processes. The underlying philosophy can be expressed as: co-expression means co-regulation. But the complexity of gene regulation and the various interactions of cellular processes demand a new interpretation of our ICA-derived components. In the following we use these extracted GEMs to generate sub-modes, which may provide biological pathway information. The genes contained in these pathway-associated sub-modes can be regarded as more or less self-contained parts of larger regulatory networks, which can be represented by combining these sub-modes into meta-modes according to the functional role of the associated genes.

Here we used M-CSF dependent in vitro differentiation of human monocytes to macrophages as a model process to demonstrate that ICA is a useful tool to support and extend knowledge-based strategies and to identify complex regulatory networks or novel regulatory candidate genes.

The major known pathways associated with M-CSF receptor dependent signaling [Shi and Simon, 2006; Pixley and Stanley, 2004; Ross and Teitelbaum, 2005] include expansion of the role of the MAP-kinase pathway [Wada and Penninger, 2004; Bogoyevitch et al., 2004] and Jun/Fos, Jak/Stat and PI-3 kinase [BehreDagger et al., 1999; Fox et al., 2003; Stephens et al., 2002] dependent signal transduction. Up-regulation of immune-regulatory components involved in innate immunity response (e.g. MHC), specific (e.g. Fcγ) [Houde et al., 2003; Vieira et al., 2002; Booth et al., 2001] and nonspecific (CRP, complement, galectins) [Sobota et al., 2005; Swanson and Hoppe, 2004; Mina-Osorio and Ortega, 2004; Lau et al., 2005; Dumic et al., 2006] opsonin receptors as well as charge and motif pattern recognition receptors (e.g. SR-family, LRP, Siglecs etc.) [Fabriek et al., 2005; Minami et al., 2001; Beutler, 2004; Lock et al., 2004] is characteristic of monocyte/macrophage differentiation. Beyond this, an increase of membrane biogenesis, vesicular trafficking and metabolic pathways including amino acids, glucose, fatty acids and sterols, as well as increased activity of lysosomal hydrolases that enhance phagocytotic function [Desjardins, 2003; Martin and Parton, 2006], autophagy [Schmitz and Buechler, 2002] and recycling, is triggered through M-CSF signaling as a hallmark of innate immunity [Peiser et al., 2002]. These mechanisms are tightly coupled to changes in cytokine/chemokine response [Branton and Kopp, 1999] and red/ox signaling (NOS, e.g. NADPH-Oxidase, Glutathione, Thioredoxin, Selenoproteins) that drive chemotaxis migration, inflammation (e.g. NfκB), apoptosis (e.g. Caspases, TP53, NfκB, ceramide) and survival [Forman and Torres, 2002; Nordberg and Arner, 2001; Wang et al., 2006a,b; Cathcart, 2004; Kustermans et al., 2005; Østerud and Bjørklid, 2003].