
2. Theoretical background: Word boundary markers in German speech 22

2.4. Silent intervals

Silent intervals can be found at all places of an utterance. They are predictable to a certain extent. For instance, speakers tend to take a breath, and thus insert a silent interval, at syntactic boundaries (Lieberman, 1967). However, the area we are interested in is the word level. The questions we follow up are the following: Do speakers insert a silent interval to disambiguate phonemically identical but syntactically differing sequences, for instance between gut and haben to distinguish the phrase [es] gut haben/being well off from the noun Guthaben/credit balance? And if they do, will listeners be able to use this information for disambiguation when the two sequences are heard in isolation?

An interesting situation arises when a wb and a stop closure coincide (e.g. in Koda_k[lingt]/coda sounds, where the underscore shows the wb, as opposed to Kodak/name of a company). In both sequences, there is some portion of silence after the vowel /a/. The question is whether the silent interval in Koda_k is longer because the stop closure coincides with the wb.

Trouvain (2004) also mentioned “perceived” silence. Syllable lengthening, for instance, is often interpreted as a silent interval, which might be due to the fact that both features often appear in unison at the beginning or at the end of IP boundaries.

2.4.1. Production

One characteristic of speech is that silent intervals at the word level appear in unusual places (focus, hesitation), also within words (Ulbrich, 2005), and they are very difficult to predict. Butcher (1981) analysed which word types follow a speech pause in German. He reported that more pauses are to be found before function words (especially before connectors like und/and and oder/or) than before lexical words. The reason may be that function words are often located at IP boundaries, which are also the common places for a speech pause. Within the IP, the number of silent intervals before lexical words and before function words was nearly equal.

Trouvain (2004) tackled the question of speech pause duration. His aim was to predict the length of speech pauses through the evaluation of prosodic markers like pitch accent or segment duration. He asked the participants of a production experiment to read a text in three self-chosen tempi: normal, fast and slow. The examined language was German.

While he found general tendencies like reduction as well as a decrease of the number of pauses and prosodic breaks with a growing speech tempo, no predictable pattern for short, medium and long pauses could be derived.

Another uncertainty we have to deal with is the fact that silent intervals are part of the normal articulation process. Stop closures, for instance, might reach considerable durations. A study of German stops based on the Kiel Corpus of Spontaneous Speech (IPDS, CD-ROMs 1995, 1996, 1997) reveals maximum closure durations between 211 and 984 ms.⁸ We therefore assume that deliberately inserted silent intervals and articulation-induced silent intervals share a large overlapping area of duration (cf. de Pijper and Sanderman, 1994).

Stop    Number    Median (ms)    Maximum (ms)
/t/     11702     43.13          457.31
/d/      7464     38.31          983.56
/k/      2296     51.50          403.50
/g/      2435     45.88          430.19
/p/      1311     63.75          261.50
/b/      2986     48.09          211.25

Table 2.1. Analysis of stop duration (ms) by label. Source: Kiel Corpus of Spontaneous Speech (as published on CD-ROM, IPDS 1995, 1996, 1997).

⁸ We thank Henning Reetz for contributing the data.
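A per-label summary of the kind shown in Table 2.1 (number of tokens, median and maximum closure duration) could be computed roughly as follows. This is only a minimal sketch: the function name `summarise_closures`, the input layout of (label, duration) pairs and the toy durations are assumptions made for illustration. They are not the procedure or the data behind the table.

```python
from collections import defaultdict
from statistics import median


def summarise_closures(records):
    """Group (label, duration_ms) pairs by stop label.

    Returns {label: (count, median_ms, maximum_ms)}.
    """
    by_label = defaultdict(list)
    for label, duration_ms in records:
        by_label[label].append(duration_ms)
    return {
        label: (len(durations), median(durations), max(durations))
        for label, durations in by_label.items()
    }


# Invented toy durations in ms, NOT taken from the Kiel Corpus.
toy_records = [
    ("t", 40.0), ("t", 45.0), ("t", 300.0),
    ("p", 60.0), ("p", 70.0),
]

summary = summarise_closures(toy_records)
for label, (count, med, maximum) in sorted(summary.items()):
    print(f"/{label}/  n={count}  median={med:.2f}  max={maximum:.2f}")
```

The toy /t/ entry shows why both statistics are worth reporting: a single long closure leaves the median almost unchanged but determines the maximum.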

Let us now turn to another language, English, and look at studies that are connected with our research topic. Gee and Grosjean (1983) analysed speech in order to examine whether pauses and their relative durations were predictable. They used read speech that had been delivered at different speech tempi and developed a complex algorithm involving syntactic and prosodic features that enabled the prediction of pause placement and of pause duration relative to the other pauses of the utterance. One major finding was that a pause was longer the more other wb markers co-occurred at a break. Wijk (1987) re-analysed the results reported by Gee and Grosjean (1983) and concluded that prosodic features alone contained sufficient information to predict pauses. Phonological words have to be kept as an uninterrupted unit, so that the next possible pause can be inserted between content words within a phonological phrase. Larger pauses are to be expected at phrase boundaries. Wijk mentioned that it is difficult to distinguish between phonological phrases and intonational phrases because linguists have not yet agreed on a definition of the latter, but see e.g. Nespor and Vogel (2007) and Ladd and Selkirk (1986) for discussions.

In a more recent study, Ramanarayanan et al. (2009) pointed out that grammatical pauses are significantly longer on average than ungrammatical ones. Grammatical pauses are considered planned breaks, ungrammatical pauses unplanned ones. The researchers used real-time magnetic resonance imaging and scanned the participants’ oral tract during spontaneous speech production. In their experiment, Ramanarayanan et al. (2009) asked speakers to produce either read or spontaneous speech and then monitored the movement and interaction of the participants’ speech organs. Applying a specific algorithm that used criteria like jaw angle, articulator position etc., the researchers evaluated the speed of the articulators. Ramanarayanan et al. concluded that grammatical pauses are part of the phonetic plan (cf. Levelt, 1993) while ungrammatical ones are not. Their results also show that silent intervals become more reliable as word boundary markers at certain sentence positions.

2.4.2. Perception

How do silent intervals influence speech segmentation? The research of de Pijper and Sanderman (1994) provides answers to this question. They recorded Dutch speakers who read texts. The recorded audio sequences were played to listeners who had a print-out of the texts in front of them. Listeners were asked to mark the places where they perceived an audible boundary. They were also asked to score the strength of each boundary on a 10-point scale. Afterwards, the authors measured the number and duration of all silent intervals in their production data and compared them with the data obtained in the perception experiment. A silent interval exceeding 100 ms resulted in high perceived strength scores for a prosodic boundary. The assumption that longer silent intervals indicate stronger boundaries was not confirmed: the larger duration classes (200-299 ms, > 300 ms) did not differ significantly from each other with respect to perceived strength.

While de Pijper and Sanderman (1994) showed how silent interval duration affects boundary perception, Repp et al. (1978) demonstrated that silent intervals also play an important role in speech segmentation. They carried out a perception study with American English compounds built from four words (gray, great, ship and chip), which listeners heard in all possible adjective-noun combinations (gray ship, great ship, gray chip, great chip). The insertion of a silent interval between gray and ship had no influence on the perception of the fricative in ship, but it supported the perception of a word-final /t/ after gray, so that listeners decided for great ship instead. That effect was reached when the silent interval was approximately 100 ms long. Apparently, the inserted silence was interpreted as the closure time of an unreleased stop.

The production and perception studies presented above suggest that boundary-marking silent intervals should be expected at important prosodic places like IP boundaries rather than between words within an IP. Nevertheless, the finding of de Pijper and Sanderman (1994) that Dutch listeners perceived a silent interval > 100 ms as a break indicator is an interesting result which gives a good orientation for our perception experiment. Stop closures, however, reach similar durations, which might complicate wb identification, especially when targets with a stop in the wb area are heard in isolation.
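The overlap between articulation-induced and boundary-marking silence can be made concrete by holding the values of Table 2.1 against the 100 ms figure from de Pijper and Sanderman (1994). The following is only an illustrative sketch: the names `TABLE_2_1`, `THRESHOLD_MS` and `straddles_threshold` are hypothetical, and applying a threshold obtained from Dutch listeners to German stop closures is an assumption, not a tested claim.

```python
# Median and maximum closure durations (ms) copied from Table 2.1.
TABLE_2_1 = {
    "t": (43.13, 457.31),
    "d": (38.31, 983.56),
    "k": (51.50, 403.50),
    "g": (45.88, 430.19),
    "p": (63.75, 261.50),
    "b": (48.09, 211.25),
}

# Break-indicator duration reported for Dutch listeners by
# de Pijper and Sanderman (1994). Using it for German is an assumption.
THRESHOLD_MS = 100.0


def straddles_threshold(median_ms, maximum_ms, threshold_ms=THRESHOLD_MS):
    """True if the median closure is shorter than the threshold
    while the longest observed closure is longer than it."""
    return median_ms < threshold_ms < maximum_ms


overlap = {
    stop: straddles_threshold(med, maximum)
    for stop, (med, maximum) in TABLE_2_1.items()
}

for stop, flag in overlap.items():
    print(f"/{stop}/ median below, maximum above {THRESHOLD_MS:.0f} ms: {flag}")
```

For all six stops the median lies below 100 ms while the maximum lies above it. A typical closure would therefore stay under the threshold, but the longest closures fall into the range that listeners may interpret as a break.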

After this summary of the contribution of silent intervals to speech segmentation, we come to the next wb marker: stress.