
4.3 Evaluation of HealthyPEP

4.3.1 Study methods

4.3.1.4 Data analysis

HealthyPEP was evaluated with a combination of qualitative and quantitative methods. As stated by Mittag (2006), an efficient analysis of intervention effects in the school setting requires a combination of qualitative and quantitative research methods. Because different kinds of data were collected during the course of this project, the analysis was not the same for all measures. The process measures were analysed with qualitative and quantitative methods, whereas the outcome measures were evaluated with quantitative methods only.

The process measures consisted of three aspects. First, the observations of the health promotion PE lessons and the regular PE lessons were analysed quantitatively: in the first step, the data from the standardised observation sheet, which was developed specifically for this intervention study, were systematically described, and in the second step, t-tests were carried out to examine differences between the two groups. Second, the data from the interview-based evaluation of the PE lessons by the teachers were analysed qualitatively. The interview results were systematically summarised and presented for each interview question. Third, the data from the questionnaire-based students’ evaluation of the health promotion PE programme were analysed using descriptive and inferential statistical procedures. These procedures were similar to the ones described below for the outcome variables. The analysis focused on within-group differences during the main intervention (T1-T2) and during the follow-up (T2-T3). Additionally, between-group differences were analysed at T2 and T3 (the detailed statistical procedure is described below).

Chapter 4: Intervention Study

The outcome variables were analysed using descriptive and inferential statistical procedures. The following section describes the statistical analyses of the outcome variables and of the students’ evaluation of the health promotion PE lessons. In a first step, gender was not taken into account. In a second step, because gender was an effect modifier, all analyses were carried out separately for girls and boys.

Analyses were performed using SPSS version 19.

1. Baseline group differences were tested using independent t-tests or chi-square tests, depending on the level of measurement, to check whether the IG and the CG differed significantly at T1. The statistical significance level was set at p = .05.

2. A lost-to-follow-up analysis was performed using a chi-square test to examine differences between the number of dropouts in the IG and the CG at T2 and T3, respectively.

Furthermore, independent t-tests were used to analyse differences in all outcome variables at T1 between the dropouts and the adherers at T2 and T3, respectively (Des Jarlais, Lyles, & Crepaz, 2004). As in the analysis of baseline group differences, the statistical significance level was set at p = .05.

3. Missing values in the KINDL-R questionnaire, which was used to assess HRQOL, were replaced by the mean of the available items of the same scale, provided that at least 50% of the items of that scale had been answered.

4. Short-term (T1-T2) and middle-term (T1-T3) within-group differences were tested using t-tests in order to examine the direction and the stability of the intervention effects. Additionally, figures were drawn that describe the development of students in these outcome variables.

Because the number of students varies across the measurements, these developments need to be interpreted with caution and are therefore drawn with dashed lines. In these figures, T1 includes the students who were measured in the first data assessment, T2 represents the students for whom data exist at both T1 and T2, and T3 represents the students who participated in both the T1 and the T3 data collection.

5. To estimate the short- and middle-term intervention effects, group differences were first examined concerning the entire study sample without separating the students by gender, and second, gender-separated analyses were carried out. The group differences were calculated by ANCOVA using the baseline values (T1) of the analysed dependent variable and baseline BMI values as the covariates (C. S. Davis, 2010; Vickers & Altman, 2001).

Concerning the differences between IG and CG at T2 and T3, the intervention was evaluated as effective when the statistical test reached a probability of error smaller than 5%. In these cases it was interpreted that a significant difference existed between the IG and the CG.

6. For the further interpretation of the within-group and between-group differences, two effect sizes were calculated: Cohen’s d and η². Cohen’s d was estimated using the standard deviation of the entire group at T1 (Kazis, Anderson, & Meenan, 1989; Leonhart, 2004). For the within-group differences, only Cohen’s d was used, whereas for the interpretation of the intervention effects, both effect sizes were calculated (Bortz & Schuster, 2010). Comparisons across different measures and studies are possible only when standardised effect sizes are provided.

7. Because students were allocated to the IG or the CG at the school level, school clustering effects might occur. A multi-level analysis to examine these school effects could not be carried out because at least 30 schools would be required (Maas & Hox, 2004). A regression analysis with dummy variables, as recommended for smaller sample sizes (Demidenko, 2004), to estimate the variance explained by the factors group and school is also not expedient because both factors are confounded to a certain extent. Therefore, ANCOVAs were calculated within each group to examine significant differences between the three IG schools and between the four CG schools on the main outcome variables for which significant intervention effects were measured. These were self-efficacy, motor performance score, and BMI.

8. Several analyses were carried out to examine whether possible moderating variables existed that influenced the relationship between the intervention programme and the outcome variables. Therefore, it was examined whether the class composition, students’ initial BMI levels, and the initial motor performance level variables had a moderating effect on the study outcomes. The class composition consisted of three groups: a) mixed-gender classes, b) only girls classes, and c) only boys classes. Concerning students’ baseline BMI levels, three categories were created. These included the “underweight” group of students with the lowest BMI levels at baseline (BMI ≤ 16.5), the “normal weight” group (16.5 < BMI ≤ 20), and finally, the “overweight” group, which had the highest baseline BMI levels (BMI > 20). Also concerning the baseline motor performance levels, three categories were built. These included students with low motor performance levels (MP score ≤ 105), students with medium levels (105 < MP score ≤ 110), and finally, the group of students with high baseline motor performance levels (MP score > 110).

Differences between IG and CG in these subcategories were calculated by ANCOVA using the baseline values (T1) of the analysed dependent variable as the covariate and the level of significance was set to p = .05. This procedure is analogous to the concept of statistical interaction, with the association AC varying across levels of the moderator B (Bauman et al., 2002).

9. Sum scores were calculated for motor performance and the KINDL questionnaire in order to estimate the intervention effects on the overall construct. The motor performance score was created by calculating the z-values of each motor performance test, summing the z-values of the seven tests (the stand-and-reach test was excluded from this calculation), and dividing the sum by seven. For the KINDL questionnaire, the average of the 24 items used to assess students’ HRQOL was used as the sum score of the scale.
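The rule for missing KINDL-R items in step 3 can be illustrated with a minimal Python sketch. The analyses themselves were run in SPSS, so this is not the code that was used. The function name `impute_scale`, the four-item example scale, the response values, and the choice to return `None` when fewer than 50% of the items were answered are assumptions made for illustration.

```python
from statistics import mean

def impute_scale(items, min_answered=0.5):
    """Fill missing items (None) of one scale with the mean of the answered items.

    Returns None when fewer than `min_answered` of the items were answered
    (assumption: the scale is then treated as missing).
    """
    answered = [x for x in items if x is not None]
    if not items or len(answered) / len(items) < min_answered:
        return None
    fill = mean(answered)
    return [fill if x is None else x for x in items]

# Hypothetical responses on a four-item scale.
print(impute_scale([4, None, 5, 3]))        # three of four items answered
print(impute_scale([4, None, None, None]))  # only one of four items answered
```

Because each missing item is replaced by the scale mean of the answered items, the resulting scale mean equals the mean of the answered items alone.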
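The standardisation described in step 6 can be sketched as follows. This is an illustration only, under two assumptions that the text does not settle: that the sample standard deviation (n - 1 in the denominator) is meant, and that the values passed as `baseline_values` are the T1 values of the group whose standard deviation serves as the reference. The function name `cohens_d` and all numbers are hypothetical.

```python
from statistics import mean, stdev

def cohens_d(mean_a, mean_b, baseline_values):
    """Standardise mean_a - mean_b by the sample SD of the baseline (T1) values."""
    return (mean_a - mean_b) / stdev(baseline_values)

# Hypothetical scores of one group at T1 and T2 (illustrative numbers only).
t1 = [10, 12, 14, 16, 18]
t2 = [12, 13, 15, 18, 19]

# Within-group effect size: change from T1 to T2 in units of the T1 SD.
d_within = cohens_d(mean(t2), mean(t1), t1)
print(round(d_within, 2))
```

Dividing by the T1 standard deviation rather than a pooled T1/T2 value keeps the yardstick independent of any effect the intervention may have had on the spread of the scores.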
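The baseline subgroups used in the moderator analyses of step 8 follow directly from the cut-offs given in the text. A minimal sketch, with hypothetical function names and category labels taken from the description:

```python
def bmi_category(bmi):
    """Map a baseline BMI value to the subgroup defined by the cut-offs in step 8."""
    if bmi <= 16.5:
        return "underweight"
    if bmi <= 20:
        return "normal weight"
    return "overweight"

def mp_category(mp_score):
    """Map a baseline motor performance (MP) score to its subgroup."""
    if mp_score <= 105:
        return "low"
    if mp_score <= 110:
        return "medium"
    return "high"

# Hypothetical students: (baseline BMI, baseline MP score).
for bmi, mp in [(16.5, 104.0), (18.2, 108.5), (21.0, 112.3)]:
    print(bmi_category(bmi), mp_category(mp))
```

The boundary values 16.5, 20, 105, and 110 fall into the lower category in each case, matching the "≤" signs in the definitions.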
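The motor performance sum score from step 9 can be sketched as the per-student mean of the test-wise z-values. Several points are assumptions here because the text does not specify them: the z-values are computed within the sample with the sample standard deviation, all tests are scored so that higher values mean better performance, and the result is left on the plain z-scale (mean 0, SD 1), which need not be the scale behind the MP cut-offs in step 8. The test names and results are hypothetical, and only two tests are shown instead of seven.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardise values to mean 0 and SD 1 within the sample."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def composite_scores(tests):
    """Average the z-values across tests for each student.

    `tests` maps a test name to a list of raw results, one per student,
    with the students in the same order in every list.
    """
    z = {name: z_scores(values) for name, values in tests.items()}
    n_students = len(next(iter(tests.values())))
    return [mean(z[name][i] for name in tests) for i in range(n_students)]

# Hypothetical raw results of three students on two tests.
tests = {"test_a": [1, 2, 3], "test_b": [10, 20, 30]}
scores = composite_scores(tests)
print(scores)
```

Averaging z-values gives every test the same weight regardless of its original unit, which is the purpose of standardising before summing.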