
A major goal is to model biological processes and mechanisms at a level of detail that allows the process to be simulated accurately and the effects of perturbations to be predicted. A less ambitious goal is to interpret sets of large-scale measurements in the context of such biological processes in order to assess whether the data shed light on their workings. According to our experiments, current gene set enrichment methods and network-based analysis methods are not sufficient for either of these goals. Even if set enrichment methods were perfect in identifying the relevant sets of genes, their further use for the analysis of data and/or biological processes would be very limited, and the user is left alone with these tasks.

Literally hundreds of publications stop at this point, printing long lists of "statistically significant" GO categories. Network-based enrichment methods use more prior knowledge to improve the ranking of the relevant categories, while network search methods use a given network to provide more insight into the internal structure of the data, but without using the functional annotation. However, there is no way to incorporate previously obtained enrichment results into the network analysis, as the network search methods take only the experimental data and networks as inputs. Our experiments show that current network search methods are severely limited in their ability to interpret the data and a biological process mechanistically, as they lack detail, focus, or both.

RelExplain is a simple significant area search method that compactly assembles and represents the evidence that the measured data provide for the prior knowledge available on a given biological process bp. RelExplain can work with different kinds of networks and several sets of heterogeneous measurements and integrates them into a concise network model for bp. Via the RelExplain score and visual inspection, this model allows the available evidence to be assessed directly in the context of the available prior knowledge.

RelExplain is algorithmically simple, very fast, and can work with very large networks. It is robust in the sense that the resulting models are compact and focused on the process in question, while not excluding possible alternative or redundant paths.

Moreover, the RelExplain models serve as entry points for interactive in-depth analysis of both the underlying networks and the measured data. This is facilitated by extending the network with alternative and suboptimal paths and by exploring network neighborhoods, always in the context of the available experimental data.

Chapter 5

Modeling of the Changes during Yeast Heat Shock Response

Motivation

The following chapter describes an integrative approach that, in contrast to RelExplain, also provides quantitative predictions. It is used to model the protein abundance changes over time in response to a mild and severe heat shock in yeast.

The central dogma of molecular biology states that DNA is transcribed into mRNA, which in turn is translated into proteins that fulfill some function in the cell. In many cases, however, it is not that simple, and further mechanisms affect one or more steps of the dogma. New high-throughput methods such as ribosome profiling make it possible to measure the outcome of the individual steps. Integrating these measurements with other high-throughput datasets makes it possible to model the changes in a system and to identify additional regulatory mechanisms that might deviate from the central dogma or influence the efficiency of the individual steps.

Here, we analyze data on three different stages of the central dogma: gene expression, translation (measured by ribosome profiling), and protein levels. The gene expression and translation of many genes change in both the mild and the severe heat shock, but the corresponding protein levels remain similar to those of the unstressed cell.

To analyze whether this is nevertheless consistent with the central dogma or whether some additional regulatory mechanism is needed to explain this inconsistency, we modeled the changes downstream of gene expression both qualitatively and quantitatively. The most parsimonious fit was achieved when increased degradation was assumed for translationally upregulated proteins and decreased degradation for translationally downregulated proteins.

This would indicate that the altered protein stabilities are compensated for by the changed gene expression and subsequent translation to achieve protein homeostasis.
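The compensation argument can be illustrated with a minimal sketch. It assumes simple first-order kinetics, dP/dt = k_s - k_d * P, which is a textbook simplification and not necessarily the model fitted in this chapter. The function name `protein_level`, all rate values, and the fold change are hypothetical and chosen only for illustration.

```python
import math


def protein_level(t, p0, synthesis, degradation):
    """Closed-form solution of dP/dt = synthesis - degradation * P.

    Assumed first-order kinetics; p0 is the protein level at t = 0.
    """
    steady_state = synthesis / degradation
    return steady_state + (p0 - steady_state) * math.exp(-degradation * t)


# Hypothetical unstressed rates in arbitrary units (not taken from the data).
base_synthesis = 10.0
base_degradation = 0.1
p0 = base_synthesis / base_degradation  # unstressed steady-state level

fold = 2.0  # assumed translational upregulation after heat shock
t_end = 60.0  # arbitrary time units after the shock

# Scenario A: synthesis is upregulated, degradation stays unchanged.
uncompensated = protein_level(t_end, p0, fold * base_synthesis, base_degradation)

# Scenario B: synthesis and degradation are increased by the same factor.
compensated = protein_level(
    t_end, p0, fold * base_synthesis, fold * base_degradation
)

print(f"unstressed level:        {p0:.2f}")
print(f"synthesis up only:       {uncompensated:.2f}")
print(f"synthesis and decay up:  {compensated:.2f}")
```

Under these assumptions the protein level drifts towards a new, higher steady state when only synthesis increases, whereas it stays at the unstressed level when degradation increases by the same factor. This is the kind of behavior that would reconcile changed translation with unchanged protein abundance.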

Publication

The contents of this chapter have not been published yet. A manuscript focusing on the biological implications of the described data and results is in preparation.

Author Contributions

Christopher Stratil performed the microarray measurements and prepared the samples for the unfractionated proteomic measurements. Moritz Mühlhofer prepared the samples for the fractionated proteomics measurements and, together with Christopher Stratil, performed the ribosome profiling measurements. Nina Bach performed the proteomics measurements. Stephan Sieber supervised the proteomics measurements. Martin Haslbeck and Johannes Buchner supervised all other measurements.

Gergely Csaba preprocessed the ribosome profiling and proteomics data. Evi Berchtold analyzed all data and performed the integrated analysis and modeling. Ralf Zimmer supervised the data analysis and modeling.

5.1 Introduction

Adapting to a suddenly changed environment or stress is a crucial ability of all organisms. Metabolic processes need to be adapted to the new conditions to function optimally, and detrimental effects have to be mitigated. Unicellular organisms need an especially fast stress response as they are in direct contact with the changing environment instead of being contained in the relatively stable environment of a tissue or organ. The response to heat of Saccharomyces cerevisiae is one of the best studied stress response systems.

A number of physiological changes occur in yeast cells when the temperature is increased above the optimal growth temperature of 25–30 °C: the cell cycle is arrested in the G1 phase, the cell wall and membrane dynamics change, and proteins aggregate as they are misfolded [97].

As early as 1998 and 2000, Eisen et al. [25] and Gasch et al. [36] performed the first high-throughput microarray measurements to analyze the expression changes upon various types of stress. Most of the analyzed types of stress caused massive changes in gene expression, affecting hundreds of genes. Using hierarchical clustering, they showed that a large set of genes exhibits a similar pattern of activation in different types of stress.

These genes can be divided into 300 stress-activated and 600 stress-repressed genes and are together called the environmental stress response (ESR). The repressed genes comprise ribosomal protein genes and genes involved in various growth-related processes, such as RNA metabolism and nucleotide biosynthesis. In contrast, the activated genes were often uncharacterized or involved in carbon metabolism, detoxification of reactive oxygen species, cellular redox reactions, cell wall modification, protein folding and degradation, DNA damage repair, fatty acid metabolism, metabolite transport, vacuolar and mitochondrial functions, autophagy, and intracellular signaling. For heat shock specifically, they observed that many
