Solution to Series 1

Academic year: 2022

1. Read in the data:

> blood <-c(62,60,63,59,63,67,71,64,65,66,68,66,71,67,68,68,56,62,60,61,63,64,63,59)

> tr <- c(1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,4,4)

> b.data <- data.frame(cbind(blood,tr))

> b.data$tr <- as.factor(b.data$tr)

a) Plot the data and compute overall mean and group means.

> plot(b.data$tr,b.data$blood)

[Figure: box plots of coagulation time (blood) for treatment groups 1 to 4; y-axis ticks at 60, 65, 70.]

We see that the coagulation times vary a lot between different diets whereas the variation within a diet group is quite small.

In addition compute the overall mean and the group means. Do this by hand using a calculator.

overall mean = 64

treatment   group mean
A           61
B           66
C           68
D           61
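The hand calculation can be cross-checked in R. The following is a minimal sketch, assuming that treatment codes 1 to 4 correspond to the diets A to D (the labels and the names `overall.mean` and `group.means` are chosen here for illustration):

```r
# Coagulation times and treatment codes as read in above
blood <- c(62,60,63,59,63,67,71,64,65,66,68,66,71,67,68,68,56,62,60,61,63,64,63,59)
# Assumption: codes 1..4 map to diets A..D
tr <- factor(c(1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,4,4),
             labels = c("A", "B", "C", "D"))

overall.mean <- mean(blood)             # mean over all 24 observations
group.means  <- tapply(blood, tr, mean) # one mean per diet

print(overall.mean)
print(group.means)
```

The printed values should agree with the table above.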

b) Compute the group sample variances $s_i^2$ and the pooled estimate of variance $MS_{res}$. Do this by hand as well. For $MS_{res}$, first compute $SS_{res}$.

$SS_{res} = 112$, $MS_{res} = 112/20 = 5.6$

treatment   $s_i^2$
A           3.333
B           8
C           2.8
D           6.857
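As a possible cross-check of the hand calculation, the residual sum of squares can be built from the group variances, since each group contributes $(n_i - 1) s_i^2$. This is a sketch; the names `group.vars`, `n.i`, `SS.res` and `MS.res` are illustrative, and the A to D labels assume that codes 1 to 4 correspond to the diets in that order.

```r
blood <- c(62,60,63,59,63,67,71,64,65,66,68,66,71,67,68,68,56,62,60,61,63,64,63,59)
tr <- factor(c(1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,4,4),
             labels = c("A", "B", "C", "D"))  # assumed label mapping

group.vars <- tapply(blood, tr, var)     # sample variance s_i^2 per group
n.i        <- tapply(blood, tr, length)  # group sizes n_i

# Pooled estimate: sum of (n_i - 1) * s_i^2, divided by N - (number of groups)
SS.res <- sum((n.i - 1) * group.vars)
MS.res <- SS.res / (length(blood) - nlevels(tr))

print(round(group.vars, 3))
print(c(SS.res = SS.res, MS.res = MS.res))
```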

We see that the estimated variance between groups is substantially larger than the estimated variance within groups. This could indicate an effect of diet on blood coagulation time.
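The between-group estimate can be reproduced from the group means and the overall mean of part a). As a worked sketch, write $n_i$ for the group sizes (4, 6, 6, 8) and $\bar{y}_i$, $\bar{y}$ for the group and overall means; this notation is assumed here for illustration.

```latex
\begin{align*}
SS_{treat} &= \sum_{i=1}^{4} n_i (\bar{y}_i - \bar{y})^2 \\
           &= 4(61-64)^2 + 6(66-64)^2 + 6(68-64)^2 + 8(61-64)^2 \\
           &= 36 + 24 + 96 + 72 = 228, \\
MS_{treat} &= \frac{SS_{treat}}{4-1} = \frac{228}{3} = 76.
\end{align*}
```

With $MS_{res} = 5.6$, the between-group estimate is more than ten times the within-group estimate.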

d) Construct an analysis of variance table. Use the R function aov(...).

> fit.blood <- aov(b.data$blood ~ b.data$tr)

> summary(fit.blood)

            Df Sum Sq Mean Sq F value   Pr(>F)
b.data$tr    3    228    76.0   13.57 4.66e-05 ***
Residuals   20    112     5.6
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Compare your hand-computed $SS_{res}$, $SS_{treat}$, $MS_{res}$ and $MS_{treat}$ with the output of summary(fit.blood).

e) Does the diet have a significant effect on coagulation time?

From the output above we see that the diet has a significant effect on blood coagulation time.

F-value = 13.57

P-value = 4.65847098469477e-05
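The two numbers above follow directly from the mean squares: the F-value is their ratio, and the P-value is the upper tail of an F distribution with 3 and 20 degrees of freedom. A short sketch, with variable names chosen here for illustration:

```r
MS.treat <- 76    # mean square for treatment, from the ANOVA table
MS.res   <- 5.6   # residual mean square, from the ANOVA table

F.value <- MS.treat / MS.res
# Upper-tail probability of F(3, 20) at the observed F-value
p.value <- pf(F.value, df1 = 3, df2 = 20, lower.tail = FALSE)

print(F.value)
print(p.value)
```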

2. a) Identify the parameters in a one-way analysis of variance model.

The parameters in the one-way analysis of variance model $Y_{ij} = \mu + A_i + \epsilon_{ij}$ with $\sum_i A_i = 0$ are:

$\mu = 7.2$, $A_1 = -2.1$, $A_2 = -0.9$, $A_3 = 0.7$, $A_4 = 2.3$ and $\sigma^2 = 2.8^2$.

b) There are 25 randomly selected staff members for each group. What are $E(MS_{res})$ and $E(MS_{treat})$? What do you conclude?

$E(MS_{res}) = \sigma^2 = 7.84$

$E(MS_{treat}) = \sigma^2 + 25 \cdot \frac{\sum_{i=1}^{4} A_i^2}{3} = 7.84 + 25 \cdot 3.667 = 99.507$

Because $E(MS_{treat})$ is much larger than $E(MS_{res})$, we can conclude that the duration of employment has an effect on job satisfaction.
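The arithmetic for the expected mean squares can be checked with a few lines of R. This is only a sketch using the parameter values from part a); the variable names are illustrative.

```r
A     <- c(-2.1, -0.9, 0.7, 2.3)  # treatment effects A_1..A_4, summing to zero
sigma <- 2.8                      # error standard deviation
n     <- 25                       # staff members per group

E.MS.res   <- sigma^2
# sigma^2 + n * sum(A_i^2) / (number of groups - 1)
E.MS.treat <- sigma^2 + n * sum(A^2) / (length(A) - 1)

print(c(E.MS.res = E.MS.res, E.MS.treat = E.MS.treat))
```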

3. Read in the data:

> N2 <- c(19.4,32.6,27,32.1,33,18.2,24.6,25.5,19.4,21.7,20.8,20.7, 21,20.5,18.8,18.6,20.1,21.3)

> strain <- c(1,1,1,1,1,5,5,5,5,5,5,7,7,7,7,7,7,7)

> r.data <- data.frame(cbind(N2,strain))

> r.data$strain <- as.factor(r.data$strain)

a) Plot the data.

> plot(r.data$strain,r.data$N2)

[Figure: box plots of nitrogen content (N2) for strains 1, 5 and 7; y-axis ticks at 20, 25, 30.]

The variance between strains looks larger than the variance within strains. This could indicate a significant difference in nitrogen content between the Rhizobium strains.

b) Carry out an analysis of variance.

> fit.n2 <- aov(r.data$N2 ~ r.data$strain)

> summary(fit.n2)

              Df Sum Sq Mean Sq F value  Pr(>F)
r.data$strain  2  236.6  118.28   9.723 0.00196 **
Residuals     15  182.5   12.16
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The F-value equals 9.72. By looking at the P-value we see that there are significant differences in nitrogen contents for different strains of Rhizobium.

c) Check the model assumptions.

> par(mfrow=c(2,2))

> plot(fit.n2)

[Figure: diagnostic plots for fit.n2 (Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage with Cook's distance contours); observation 1 is labelled in every panel.]

The diagnostic plots show an outlier: observation number 1 can be clearly identified as one. After removing this observation we repeat the analysis.
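One way to back up the visual impression numerically is to look at the standardized residuals and find the observation with the largest absolute value. The sketch below refits the model on plain vectors so that it runs on its own; the names `fit`, `std.res` and `worst` are illustrative and do not come from the exercise sheet.

```r
N2 <- c(19.4,32.6,27,32.1,33,18.2,24.6,25.5,19.4,21.7,20.8,20.7,
        21,20.5,18.8,18.6,20.1,21.3)
strain <- factor(c(1,1,1,1,1,5,5,5,5,5,5,7,7,7,7,7,7,7))

fit <- aov(N2 ~ strain)

# Standardized residuals, as used in the Q-Q and leverage panels of plot(fit)
std.res <- rstandard(fit)
worst   <- which.max(abs(std.res))  # index of the most extreme observation

print(worst)
print(std.res[worst])
```

The index found this way should match the observation labelled in the diagnostic plots.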

> rr.data <- r.data[-1,]

> fit.n2mod <- aov(rr.data$N2~rr.data$strain)

> summary(fit.n2mod)

               Df Sum Sq Mean Sq F value   Pr(>F)
rr.data$strain  2  333.2  166.60    32.6 5.39e-06 ***
Residuals      14   71.5    5.11
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> par(mfrow=c(2,2))

> plot(fit.n2mod)

[Figure: diagnostic plots for fit.n2mod (Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage with Cook's distance contours); observations 2, 5 and 7 are labelled.]

We see that now the model assumptions are fulfilled.
