Processing of Biological Data

Prof. Dr. Volkhard Helms Winter Semester 2018-2019

Saarland University Chair for Computational Biology

Exercise Sheet 4

Due: January 10, 2019, 23:59

Submission

• You are advised to work in groups of two people. If necessary, we will suggest teammates.

• Submit your solutions on paper in Room 3.03, E2 1, or preferably send an email with a single PDF attachment to maryam.nazarieh@bioinformatik.uni-saarland.de. Late submissions will not be considered. In any case, hand in all source code via email. Please also include your output; otherwise you will lose points.

• Do not forget to mention your names/matriculation numbers.

• You are free to use any programming language to solve the problems. Using libraries that let you avoid implementing the requested algorithms yourself will not earn you points.

Exercise 4.1: Basics of deep learning methods (50 points, 10 points for each subtask)

(a) What are the benefits of utilizing neural networks in deep learning?

(b) How does the backpropagation algorithm achieve the targeted output in the context of a deep learning method?

(c) Formulate linear functions for the values of the nodes in the hidden and output layers as a function of the incoming nodes, see Figure 1.

(d) Determine the values at the hidden and output layers if x1 = 1, x2 = 2, w1 = 2, w2 = 4, w3 = 3, w4 = 5, w5 = 1, w6 = 2, w7 = 3, w8 = 3, see Figure 1.

(e) Determine the mean squared error if the observed outputs are y1 = 58 and y2 = 60.

Figure 1: A graphical representation of an example neural network.
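The computation behind subtasks (c) to (e) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: Figure 1 is not reproduced here, so the layout (two inputs, two hidden nodes, two outputs, no bias terms, no activation function) and the assignment of weights to edges are assumptions, and the numbers below are illustrative values, not the ones from the exercise. The function names `linear_layer`, `forward`, and `mean_squared_error` are hypothetical helpers.

```python
def linear_layer(inputs, weights):
    """Return one weighted sum per node; weights[j][i] links input i to node j."""
    return [sum(w * v for w, v in zip(row, inputs)) for row in weights]


def forward(x, w_hidden, w_out):
    """Propagate x through one hidden layer and one output layer (both linear)."""
    hidden = linear_layer(x, w_hidden)
    output = linear_layer(hidden, w_out)
    return hidden, output


def mean_squared_error(predicted, observed):
    """Average of the squared differences between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# Illustrative values only (NOT the weights or targets of the exercise).
x = [1.0, 2.0]
w_hidden = [[0.5, 1.0], [1.5, -1.0]]   # assumed wiring: row j feeds hidden node j
w_out = [[2.0, 1.0], [0.0, 3.0]]       # assumed wiring: row k feeds output node k
observed = [4.0, -1.0]

hidden, output = forward(x, w_hidden, w_out)
error = mean_squared_error(output, observed)
print(hidden, output, error)
```

To apply the sketch to the exercise, the two weight matrices would have to be filled in according to the edges actually drawn in Figure 1.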

Exercise 4.2: TensorFlow (50 points)

TensorFlow is a machine learning framework that was developed by Google.

It is used to design, build and train deep learning models. As the name suggests, computation in TensorFlow is expressed as a flow of multidimensional data arrays (tensors) between operations.

Install TensorFlow by following the instructions available for Ubuntu, Windows, macOS, and the Raspberry Pi at https://www.tensorflow.org/install/.

Download the code provided in the supplementary material. The code uses the Fashion MNIST dataset.

(a) Submit with your solution a plot showing the first 36 images from the training set and display the class name below each image. (10 points)

(b) Build two separate neural network models, one with 64 and one with 256 nodes in the first dense layer. Describe in your solution which parts of the code need to be changed for this and submit the modified code line(s). (20 points)

(c) Compare the performance of these two models on the test dataset. Report the accuracies in your solution. (10 points)

(d) Determine and report the confidence of the model for the 1st and 10th images in the test dataset. (10 points)
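For subtask (d), "confidence" is commonly read as the probability the model assigns to its predicted class. The sketch below shows that idea in plain Python, without TensorFlow. It assumes the model's last layer returns raw scores (logits) for the ten Fashion MNIST classes; the supplementary code is not shown here, so if its model already ends in a softmax layer, the `softmax` step is unnecessary. The helper names and the example scores are hypothetical.

```python
import math

# The ten Fashion MNIST class names, indexed by label 0..9.
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(logits, class_names=CLASS_NAMES):
    """Return the predicted class name and the probability assigned to it."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical scores for one test image (not real model output).
example_logits = [0.1, -1.2, 0.3, 0.0, -0.5, 1.0, 0.2, 2.5, -0.3, 6.0]
label, prob = confidence(example_logits)
print(f"{label}: {prob:.1%}")
```

With a trained model, the scores for the 1st and 10th test images would replace `example_logits`.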

Have fun!
