Cook's distance


Academic year: 2022


Solution to Series 5

1. a) From the plots below we can derive the following:

yy.a: Model assumptions are valid.

yy.b: Strongly non-constant variance.

yy.c: Slightly non-constant variance.

yy.d: Non-linear relationship (the linear model shows a systematic error).

> ## yy.a: scatter plots, residuals and Cook's Distance

> par(mfrow=c(2,3))

> plot(yy.a ~ xx, pch=20)

> abline(fit <- lm(yy.a ~ xx), col="red")

> plot(fit,1:5,pch=20)

[Figure: scatter plot of yy.a against xx with the fitted line, followed by the diagnostic panels Residuals vs Fitted, Normal Q-Q, Scale-Location, Cook's distance (against Obs. number), and Residuals vs Leverage. Labelled observations: 11, 63, 100 in the residual plots; 5, 11, 100 in the Cook's distance and leverage plots.]

yy.a: For the first model the residual plots look perfect. Only in the plot of Cook's distance are a few values slightly larger than the rest. These belong to the observations with the smallest and largest x-values. However, since all of these values are far below 0.5, there is no problem.
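
The check against 0.5 can also be done numerically rather than by eye. The following is a minimal sketch on simulated data: the vector yy.sim, the seed and the coefficients are assumptions made for illustration and are not the data of this series. It also verifies that Cook's distance combines the standardized residual and the leverage of each observation.

```r
## Hypothetical data: yy.sim is simulated here, it is NOT the series data
set.seed(1)
xx <- 1:100
yy.sim <- 2 + 1 * xx + rnorm(100)        # linear trend, constant variance
fit <- lm(yy.sim ~ xx)

cd <- cooks.distance(fit)                # one value per observation
head(sort(cd, decreasing = TRUE), 3)     # the three largest values
which(cd > 0.5)                          # observations above the 0.5 rule of thumb

## Cook's distance written in terms of residual size and leverage
h <- hatvalues(fit)                      # leverages h_ii
r <- rstandard(fit)                      # standardized residuals
p <- length(coef(fit))                   # number of coefficients (here 2)
same <- isTRUE(all.equal(cd, r^2 / p * h / (1 - h)))
same
```

Because the leverages in a simple regression on equally spaced x-values are all small, an observation would need a very large standardized residual to come close to 0.5.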

> ## yy.b: scatter plots, residuals and Cook's Distance

> par(mfrow=c(2,3))

> plot(yy.b ~ xx, pch=20)

> abline(fit <- lm(yy.b ~ xx), col="red")

> plot(fit,1:5,pch=20)

[Figure: scatter plot of yy.b against xx with the fitted line, followed by the diagnostic panels Residuals vs Fitted, Normal Q-Q, Scale-Location, Cook's distance (against Obs. number), and Residuals vs Leverage. Labelled observations: 74, 80, 100 in the residual plots; 96, 98, 100 in the Cook's distance and leverage plots.]


yy.b: For the second model, the Tukey-Anscombe plot shows that the variance increases with the magnitude of the fitted values. The Normal plot suggests a violation of the normality assumption, even though the errors follow a Normal distribution by construction. However, the points in the Normal plot only follow a straight line if the variance is also constant, which is not the case here.

So the apparent violation stems from the non-constant variance. The scale-location plot also shows the increase in variance. There are neither leverage points nor influential data points, even though the observations with large observation numbers have larger values of Cook's distance.
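
The effect described above can be reproduced by simulation. The sketch below is an illustration under stated assumptions: the errors are drawn from a Normal distribution whose standard deviation grows with xx, and the names yy.sim, err, sd.lower and sd.upper as well as all numeric constants are hypothetical. It compares the spread of the residuals for small and large fitted values.

```r
## Hypothetical data: simulated here, NOT the series data
set.seed(2)
xx <- 1:100
err <- rnorm(100, mean = 0, sd = 0.5 * xx)   # Normal errors, sd grows with xx
yy.sim <- 1 + 2 * xx + err
fit <- lm(yy.sim ~ xx)

## Split the observations at the median fitted value
lower <- fitted(fit) <= median(fitted(fit))
sd.lower <- sd(resid(fit)[lower])            # spread for small fitted values
sd.upper <- sd(resid(fit)[!lower])           # spread for large fitted values
c(lower = sd.lower, upper = sd.upper)

## Same diagnostic plots as in the solution, for visual comparison
par(mfrow = c(2, 3))
plot(yy.sim ~ xx, pch = 20)
abline(fit, col = "red")
plot(fit, 1:5, pch = 20)
```

Each error is Normal, but the residuals pooled over all observations mix small and large variances, which is why the Normal plot bends away from a straight line.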

> ## yy.c: scatter plots, residuals and Cook's Distance

> par(mfrow=c(2,3))

> plot(yy.c ~ xx, pch=20)

> abline(fit <- lm(yy.c ~ xx), col="red")

> plot(fit,1:5,pch=20)

[Figure: scatter plot of yy.c against xx with the fitted line, followed by the diagnostic panels Residuals vs Fitted, Normal Q-Q, Scale-Location, Cook's distance (against Obs. number), and Residuals vs Leverage. Labelled observations: 76, 92, 93 in the residual plots; 92, 93, 100 in the Cook's distance and leverage plots.]

yy.c: For the third model, the analysis is similar to that of the second model because the model violations are of the same kind. The violation is less pronounced than in the previous example.

> ## yy.d: scatter plots, residuals and Cook's Distance

> par(mfrow=c(2,3))

> plot(yy.d ~ xx, pch=20)

> abline(fit <- lm(yy.d ~ xx), col="red")

> plot(fit,1:5,pch=20)

[Figure: scatter plot of yy.d against xx with the fitted line, followed by the diagnostic panels Residuals vs Fitted, Normal Q-Q, Scale-Location, Cook's distance (against Obs. number), and Residuals vs Leverage. Labelled observations: 12, 37, 42 in the residual plots; 12, 14, 90 in the Cook's distance and leverage plots.]

yy.d: For the fourth model, the systematic error is easy to detect in the Tukey-Anscombe plot, which exhibits a U-shaped pattern. The Normal plot and the scale-location plot do not show any abnormalities. There are no influential data points but the smoother deviates from the horizontal in
