
6.3 An exploratory example: data-driven analyses of F0

6.3.1 FPCA

FPCA, an extension of ordinary PCA, produces a parametrisation of a set of input curves or trajectories, including F0 contours, in terms of a small set of Principal Components (PC1, PC2, etc.), each one capturing a different and independent shape variation found in the data. The output of the FPCA is both numerical and graphical: the former allows further statistical investigations (e.g. hypothesis tests), while the latter allows a graphical interpretation of results. Since the FPCA is purely data-driven and is based neither on a prosodic model nor on the perception of a human annotator, it provides a complementary perspective to the analysis based on annotation with a ToBI system. Additionally, the FPCA makes it possible to bring together the analysis of the shape of F0 contours and that of segment durations in one joint analysis (the procedure proposed by Gubian et al., 2011).

Procedure

In this thesis I follow the method proposed by Gubian et al. (2011) for the automatic analysis of F0 contours. This is an FPCA-based procedure that has been adapted to the specific needs and constraints of the study of speech prosody. This section introduces the two main concepts that form the essence of this method, namely how the F0 signal is represented and how it is analysed.

The input to the FPCA consists of F0 contours and of the temporal positions of segments (syllables or morae). In order to minimise gender effects, F0 values were expressed in normalised semitones according to a formula in which the t_i are the time positions of the F0 samples and n is the total number of samples for a specific contour. Some adjustments had to be made to accommodate the absence of an F0 signal during voiceless sounds, since the FPCA represents contours as continuous curves without missing values. For this reason, the first fricative /s/ in sumimasen was eliminated from the analysed time interval, and internal boundaries were placed as follows: | s | mi | ma | se | n | (the initial s is the cut phone). The missing F0 values for the second /s/ were filled in by linear interpolation.

For the word Entschuldigung, segmented as | Ent.schul | di | gung |, the first syllable /ent/ was eliminated because some participants did not produce it. For the same reason as above, the initial fricative was not taken into account.
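The preprocessing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact normalisation formula is not reproduced in this section, so the sketch uses one common form (semitones relative to 1 Hz, centred on the mean of each contour), and it assumes that interpolation is carried out on the Hz values before conversion. The function names and sample values are hypothetical.

```python
import math

def fill_gaps_linear(times, values):
    """Fill None entries by linear interpolation between voiced neighbours.

    Assumes the first and last samples are voiced (not None).
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            w = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = (1.0 - w) * values[left] + w * values[right]
    return filled

def to_normalised_semitones(f0_hz):
    """Convert Hz to semitones and subtract the mean of the contour.

    Assumption: semitones re 1 Hz, centred per contour; the formula used
    in the thesis may differ in detail.
    """
    semitones = [12.0 * math.log2(f) for f in f0_hz]
    mean_st = sum(semitones) / len(semitones)
    return [st - mean_st for st in semitones]

# Invented samples; None marks frames without F0 (e.g. a voiceless fricative).
times = [0.00, 0.01, 0.02, 0.03, 0.04]
f0_hz = [200.0, None, None, 140.0, 100.0]

f0_filled = fill_gaps_linear(times, f0_hz)
f0_norm = to_normalised_semitones(f0_filled)
```

Centring each contour on its own mean removes overall register differences between speakers, which is one way of reducing gender effects while leaving the contour shape untouched.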

In a first operation called smoothing, each of the extracted F0 contours is interpolated by a continuous smooth curve represented by a mathematical function (Ramsay and Silverman, 2005). A second transformation, called landmark registration, aligns the curves with the syllabic and moraic boundaries. The purpose of this operation is to obtain an analysis of F0 contours that is synchronised to those boundaries, so that it is possible to locate e.g. a pitch rise or fall within certain syllables or morae. This step is equivalent to a time normalisation carried out on each segment separately, but the operation is global and smooth. Contrary to most traditional approaches to time normalisation, where the analysis of normalised contours and that of segment durations are carried out separately, FPCA allows a joint analysis. This is because the differences between original and normalised durations are internally represented by smooth time warping curves, which, paired with their corresponding normalised F0 curves, produce a complete representation of the original F0 contours. This representation is the input to the FPCA, the actual statistical analysis tool.
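The idea behind landmark registration can be illustrated with a deliberately simplified sketch. It uses a piecewise-linear time warping, which corresponds to the segment-by-segment time normalisation mentioned above; the procedure followed in this thesis uses a smooth, global warping instead. The function name, the boundary times and the equally spaced target boundaries are all hypothetical choices made for illustration.

```python
def warp_time(t, landmarks, targets):
    """Map an original time t to registered time, piecewise linearly.

    landmarks: boundary times of one utterance (increasing, incl. start/end)
    targets:   common boundary times shared by all registered curves
    """
    segments = zip(zip(landmarks, landmarks[1:]), zip(targets, targets[1:]))
    for (a, b), (ta, tb) in segments:
        if a <= t <= b:
            return ta + (t - a) * (tb - ta) / (b - a)
    raise ValueError("t lies outside the landmark range")

# Invented boundaries (in seconds) delimiting four segments of one utterance.
landmarks = [0.00, 0.10, 0.25, 0.45, 0.60]
# Common target boundaries; equal spacing on [0, 1] is an arbitrary choice.
targets = [0.00, 0.25, 0.50, 0.75, 1.00]

registered = [warp_time(t, landmarks, targets) for t in [0.05, 0.25, 0.35, 0.60]]
```

After warping, every boundary of every utterance falls on the same target time, so a rise or fall can be located within a given segment. The mapping from landmarks to targets retains the original durations, which is what makes the joint analysis of shape and duration possible.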

Each input curve is described in terms of PC scores, which quantify how much each PC is applicable to that particular curve. For example, PC1 may capture the variation in height of a peak in the curves, while PC2 captures its shift in time. Then each curve will be parametrised in terms of a PC1 score determining the height of that peak and a PC2 score determining its position. The crucial difference between implementing this parametrisation manually and applying the FPCA is that the latter finds the principal shape traits automatically from the data itself, while in the case of ad-hoc parametrisations, e.g. in terms of peak heights, slope etc., the investigator has to identify the shape traits of interest and to implement a convenient quantitative representation. Once the FPCA is completed, the analysis concludes by using the PC scores as variables in an ordinary statistical test (or statistical model).
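How PC scores arise from a set of curves can be sketched as follows. Note the assumptions: this is ordinary PCA applied to curves sampled on a common time grid, with the first component found by power iteration, and not the functional implementation based on smooth basis functions that the FPCA procedure relies on. The function name and the toy curves are hypothetical.

```python
import math

def first_pc(curves, iterations=100):
    """Return (mean curve, first PC, PC1 scores, share of variance explained).

    curves: equally long lists of values sampled on a common time grid.
    The sign of the PC (and hence of the scores) is arbitrary.
    """
    n, m = len(curves), len(curves[0])
    mean_curve = [sum(c[j] for c in curves) / n for j in range(m)]
    centred = [[c[j] - mean_curve[j] for j in range(m)] for c in curves]

    # Power iteration on the covariance structure, without forming the matrix.
    # Assumes the starting vector is not orthogonal to the first PC.
    pc = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        proj = [sum(x * p for x, p in zip(row, pc)) for row in centred]
        new = [sum(proj[i] * centred[i][j] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in new))
        pc = [v / norm for v in new]

    scores = [sum(x * p for x, p in zip(row, pc)) for row in centred]
    total = sum(x * x for row in centred for x in row)
    explained = sum(s * s for s in scores) / total
    return mean_curve, pc, scores, explained

# Toy curves: a final fall of varying size plus a small early wiggle.
fall = [0.0, 0.0, 0.0, -1.0, -2.0]
wiggle = [1.0, -1.0, 0.0, 0.0, 0.0]
amounts = [(-1.0, 0.1), (0.0, -0.1), (1.0, -0.1), (2.0, 0.1)]
curves = [[5.0 + a * f + b * w for f, w in zip(fall, wiggle)] for a, b in amounts]

mean_curve, pc1, scores, explained = first_pc(curves)
```

In this toy example the first PC recovers the final fall without being told where to look, and each score states how strongly that fall is present in the corresponding curve.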

Results

The FPCA procedure (see details in Appendix B) was applied to the 90 F0 contours from sumimasen. I considered only the first PC, which explained 60.4% of the variance in the FPCA model. The amount of variance explained by a functional PC reflects the “importance” of a variation measured across the entire time interval under study.

Figure 6.1 shows mean PC1 scores with 95% CIs.

[Figure 6.1. Panel title: PC1 scores for sumimasen. Axes: language group (L1, L2) and PC1 scores (-1 to 1). Legend: attempt (first, second, third).]

Figure 6.1 Mean PC1 scores and 95% CI error bars for each language group and for each attempt.
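The kind of summary plotted in Figure 6.1 can be computed along the following lines. This is a sketch only: the scores below are invented and are not the data of this study, and the interval uses a normal approximation, whereas the method used for the intervals in the figure is not specified in this section. The function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_ci(scores, level=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    m = mean(scores)
    half = z * stdev(scores) / math.sqrt(len(scores))
    return m, m - half, m + half

# Invented PC1 scores keyed by (language group, attempt); not the thesis data.
pc1_scores = {
    ("L1", "first"): [0.2, -0.1, 0.4, 0.1],
    ("L2", "first"): [0.3, 0.0, -0.2, 0.5],
}

summary = {key: mean_ci(values) for key, values in pc1_scores.items()}
for (group, attempt), (m, lower, upper) in summary.items():
    print(f"{group} {attempt}: mean={m:.2f}, 95% CI=[{lower:.2f}, {upper:.2f}]")
```

With small groups, an interval based on the t distribution would be wider than this normal approximation.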

The plots show a large difference between the L1 and L2 speakers’ PC1 scores and an interaction between language group and attempt. The L2 speakers’ PC1 scores decreased over the repetitions, while those of the L1 speakers did not change across attempts. In order to relate scores to the corresponding contour shapes, Figure 6.2 shows the panel that translates the PC1 score values into effects on the shape of time-normalised F0 contours by applying Equation (1) to PC1 scores. The three lines in the plot correspond to PC1 score values of -1, 0 and 1 in Figure 6.1. Figure 6.2 shows that PC1 modulates a variation that mainly affects the rightmost part of the curves, corresponding to the morae /se/ and /n/. The comparison between Figure 6.1 and Figure 6.2 shows that the L1 speakers consistently produced F0 contours with a drastic pitch fall, while the L2 speakers produced flat F0 contours in the first attempt and more falling F0 contours in the third attempt. The L1 speakers generated curves that exhibit a steep falling accent in all attempts, while this phonetic characteristic of the Japanese pitch accent was not found in the L2 data.

Figure 6.2 Time-normalised F0 curves for the L1 speakers (left) and for the L2 speakers (right) that correspond to the PC1 scores.
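For orientation, the way a PC1 score translates into a curve shape can be written in the generic form used in FPCA. Equation (1) itself is not reproduced in this section, so the expression below is the standard reconstruction and may differ from it in notation or scaling; the symbols $\mu(t)$, $\mathrm{PC1}(t)$ and $s$ are introduced here for illustration only.

```latex
% Generic FPCA reconstruction from the first component only (assumed form):
%   \mu(t)         mean time-normalised F0 curve
%   \mathrm{PC1}(t) first principal component function
%   s              PC1 score of a given contour
f_s(t) \approx \mu(t) + s \cdot \mathrm{PC1}(t)
```

Under this form, a score of $s = 0$ yields the mean curve, and scores of opposite sign move the curve away from the mean in opposite directions along the shape of PC1.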

The outcome visually confirmed that the variations of the L1 speakers’ F0 curves were situated within the category of a falling boundary tone, while the variations of the L2 speakers might lie across phonological categories (falling, flat and rising boundary tones). This observation points to a limitation of a data-driven analysis that does not account for the perceptual categorical boundaries of the contours.