
2.4 Characteristics of Japanese

2.4.4 Intonation

I use the terms intonation and prosody roughly interchangeably. Here I outline studies on the associations between intonation and functions such as information structure. For detailed phonetic descriptions and analyses of Japanese intonation, see Beckman & Pierrehumbert (1986); Pierrehumbert & Beckman (1988); Sugito (1994b); Venditti (2000); Igarashi et al. (2006); Igarashi (2015). Also, I only discuss units smaller than the clause; I do not discuss discourse structure, although there are many interesting interactions between intonation and discourse structure in Japanese (e.g., Nakajima & Allen 1993; Venditti & Swerts 1996; Murai & Yamashita 1999; Koiso et al. 2003; Okubo et al. 2003; Koiso & Ishimoto 2012). I focus on studies on intonation units and information structure.

2.4.4.1 Definition of intonation unit

Before reviewing the previous literature, I briefly discuss how an intonation unit is defined. The definition of intonation unit makes use of a labeling system for Japanese prosodic information called X-JToBI, with which the Corpus of Spontaneous Japanese (CSJ) has already been annotated. I discuss X-JToBI in the following paragraph, and introduce intonation units afterwards.

2.4.4.1.1 X-JToBI and intonational phrases X-JToBI (Maekawa et al. 2002; Igarashi et al. 2006) is based on J-ToBI, proposed in Venditti (1997; 2000), which is itself modified from ToBI (Tones and Break Indices), a labeling system for English prosody (Silverman et al. 1992; Pitrelli et al. 1994; Beckman & Elam 1997).

Here I mainly discuss the break indices (BI) tier of X-JToBI since this is the most relevant feature for intonation units. The BI labels are determined by human annotators and represent the strength of prosodic boundaries (Maekawa et al. 2002; Igarashi et al. 2006). The basic BI labels are 1, 2, and 3.²⁸ Label 1 corresponds to a word boundary, label 2 to an accentual-phrase boundary, and label 3 to an intonational-phrase boundary. An intonational phrase consists of one or more accentual phrases. An accentual phrase consists of a pitch contour with a single F0 peak. Intonational-phrase boundaries are the places where a pitch reset occurs: if the pitch range of the current accentual phrase is smaller than that of the next accentual phrase, an intonational-phrase boundary is identified at the current accentual-phrase boundary.

Below is an example of an intonational-phrase boundary (label 3), the boundary type most relevant to our study. Figure 2.1 shows the pitch contour of the utterance in (93).

(93) aoi  yane-no  ie-ga     mieru
     blue roof-gen house-nom visible
     ‘A house with the blue roof is visible.’

The vertical lines in the figure across the pitch contour indicate the peak and the bottom of F0. A contour with a single pitch peak corresponds to a single accentual phrase. Comparing the first (aoi ‘blue’) and the second (yane-no ‘roof-gen’) accentual phrases, the pitch range of the second is smaller than that of the first; i.e., downstepping occurs in the second accentual phrase. Downstepping, a.k.a. catathesis, is “a phonological process by which the [pitch] range is compressed after a lexical accent” (Venditti 2000: 17; see also Poser 1984; Beckman & Pierrehumbert 1986; Pierrehumbert & Beckman 1988; Kubozono 1993). In Figure 2.1, the first accentual-phrase boundary is therefore not an intonational-phrase boundary. On the other hand, comparing the second (yane-no ‘roof-gen’) and the third (ie-ga ‘house-nom’) accentual phrases, the pitch range of the second is smaller than that of the third. Therefore, the second accentual-phrase boundary is an intonational-phrase boundary.

28 In addition, there are diacritics: m, -, p. There are also labels for disfluency, word fragments, fillers, and so on. See Igarashi et al. (2006) for a detailed description.

[Figure image: page 412 of Igarashi et al. (2006), containing their Figure 7.61 (BI = 3: a sequence of accented accentual phrases, for the utterance in (93)) and their Section 7.2.2.4 on the label 1+, which marks a boundary intermediate between a word boundary (BI = 1) and an accentual-phrase boundary (BI = 2), with their Figure 7.62 (BI = 1+).]

Figure 2.1: An example of annotation of BI (Igarashi et al. 2006: 412)
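The pitch-reset criterion illustrated in Figure 2.1 can be sketched in code. The following is a minimal sketch, not the X-JToBI procedure itself: in the corpus the BI labels are assigned by human annotators, whereas here I assume that each accentual phrase has already been reduced to a single pitch-range value. The function name and the pitch-range values in Hz are hypothetical and serve only to reproduce the pattern described for (93).

```python
from typing import List


def intonational_phrase_boundaries(pitch_ranges: List[float]) -> List[int]:
    """Return the indices i for which the boundary after accentual phrase i
    counts as an intonational-phrase boundary (BI = 3).

    Criterion taken from the text: the pitch range of the current accentual
    phrase is smaller than that of the next one, i.e. a pitch reset occurs.
    """
    boundaries = []
    for i in range(len(pitch_ranges) - 1):
        if pitch_ranges[i] < pitch_ranges[i + 1]:
            boundaries.append(i)
    return boundaries


# Hypothetical pitch ranges (Hz) for aoi / yane-no / ie-ga / mieru in (93):
# downstep from the first to the second phrase, reset at the third.
example_93 = [80.0, 50.0, 70.0, 40.0]
print(intonational_phrase_boundaries(example_93))  # [1]
```

Index 1 is the boundary after the second accentual phrase (yane-no), which matches the description of Figure 2.1. The sketch ignores the utterance-final boundary and the diacritics and disfluency labels of X-JToBI.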

2.4.4.1.2 Intonation unit Based on X-JToBI, Den et al. (2010) and Den et al. (2011) propose the definition of intonation unit that I employ in this study. They call it a short utterance-unit as opposed to a long utterance-unit, but I use the term “intonation unit (IU)” throughout since I do not discuss long utterance-units.

An intonation-unit boundary is identified where there is an intonational-phrase boundary (the boundary labelled 3 in CSJ, discussed above), a clause boundary,²⁹ or a pause of 0.1 seconds or longer. As discussed in Enomoto et al. (2004), it is difficult for human annotators to agree on intonation-unit boundaries under the systems proposed in Du Bois et al. (1992) and Iwasaki (2008). The definition of Den and his colleagues makes it possible to identify intonation units in spontaneous speech consistently across annotators.
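The three criteria for an intonation-unit boundary can be stated as a simple disjunction. The sketch below is only an illustration under stated assumptions: the record type `WordBoundary`, its field names, and the segmentation helper are hypothetical and are not part of CSJ or of the tools of Den et al. (2010; 2011). The BI label is compared with the plain string "3", so labels carrying diacritics would need extra handling.

```python
from dataclasses import dataclass
from typing import List

PAUSE_THRESHOLD_SEC = 0.1  # from the text: a pause of 0.1 seconds or longer


@dataclass
class WordBoundary:
    """Hypothetical record describing the boundary that follows a word."""
    break_index: str       # X-JToBI BI label, e.g. "1", "2", "3"
    clause_boundary: bool  # True if a clause ends at this boundary
    pause_sec: float       # duration of the following pause in seconds


def is_iu_boundary(b: WordBoundary) -> bool:
    """Any one of the three criteria is sufficient."""
    return (
        b.break_index == "3"
        or b.clause_boundary
        or b.pause_sec >= PAUSE_THRESHOLD_SEC
    )


def segment_into_ius(words: List[str],
                     boundaries: List[WordBoundary]) -> List[List[str]]:
    """Group words into intonation units, closing a unit at each boundary."""
    units, current = [], []
    for word, boundary in zip(words, boundaries):
        current.append(word)
        if is_iu_boundary(boundary):
            units.append(current)
            current = []
    if current:  # trailing words without a closing boundary
        units.append(current)
    return units


words = ["aoi", "yane-no", "ie-ga", "mieru"]
boundaries = [
    WordBoundary("2", False, 0.0),
    WordBoundary("3", False, 0.0),  # pitch reset after yane-no
    WordBoundary("2", False, 0.0),
    WordBoundary("1", True, 0.5),   # clause end followed by a pause
]
print(segment_into_ius(words, boundaries))
# [['aoi', 'yane-no'], ['ie-ga', 'mieru']]
```

Because every criterion is checked mechanically, two annotators given the same labels would arrive at the same segmentation, which is the practical advantage claimed for this definition.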

In the following section, however, I review studies on various kinds of intonation units, including those defined in Du Bois et al. (1992); Maekawa et al. (2002); Iwasaki (2008); Den et al. (2011). Also, although prominence marking, downstepping, and boundary pitch movements are more popular topics than intonation units, I review those studies insofar as they relate to the current study. See Venditti et al. (2008) for an overview of such studies.

29 To be more precise, this is a long utterance-unit boundary. See Den et al. (2011) for the definition of this unit.

2.4.4.2 Intonation units and related phenomena

In this section, I present a review of the literature on the association between prosodic units and related characteristics of language. Note again that the review includes various kinds of prosodic units based on slightly different definitions, although they agree in many cases.

2.4.4.2.1 Prominence and downstepping Prominence and downstepping are crucial features in determining intonation units. It is well known that a focus receives prominence (pitch peak). Pierrehumbert & Beckman (1988: 99–101) report that “sequences with focus on the noun almost always had an intermediate phrase [i.e., intonational phrase] boundary between the adjective and the noun [...] an intermediate phrase boundary blocks catathesis [i.e., downstepping]”.

The conclusion was reached through production experiments where subjects were asked to produce a sequence of an adjective and a noun with different focus positions. The target sentences and contexts used by Pierrehumbert and Beckman are like the ones in (94). The capital letters indicate that those words are in focus, and the bold-faced letters indicate that they are the target of analysis.

(94) Q: [In America,] are there sweet beans or carrots like there are in Japan?

A: amai  NINZIN-wa  ari-masu-ga      amai  MAME-wa  ari-mase-n
   sweet carrot-top exist-plt-though sweet bean-top exist-plt-neg

‘There are sweet CARROTS, but there aren’t sweet BEANS.’

(Pierrehumbert & Beckman 1988: 59)

Pierrehumbert and Beckman showed that there is an intonational-phrase (i.e., intermediate-phrase) boundary between the adjective (amai ‘sweet’ in (94-A)) and the noun (mame ‘bean’ in (94-A)) when the noun is a focus, as in (94). Although the results are complicated, they conclude that their generalization applies to both accented and unaccented words.³⁰

30 Kubozono (2007) compared two definitions of downstepping (syntagmatic and paradigmatic) and investigated whether a pitch reset occurs before the focus. He found conflicting results: from a syntagmatic perspective, the focus receives higher pitch than the preceding phrase, which indicates that downstepping is blocked. From a paradigmatic perspective, on the other hand, he had to conclude that downstepping is not blocked before the focus. The present study employs the definition of syntagmatic downstepping and assumes that the conclusions in Pierrehumbert & Beckman (1988) and Kubozono (2007) do not contradict each other. See Kubozono (2007) for detailed discussion on this issue.

2.4.4.2.2 Focus projection There has been a cross-linguistic question of how human beings distinguish broad focus and narrow focus: the issue of focus projection. This has been investigated for English, German and Dutch (Selkirk 1984; Gussenhoven 1983). Ito (2002), who investigated this question in Japanese, compared the response time and acceptability of each of the intonation types in (95-A1-A3) when it follows a broad-focus question like (95-Q). The capital letters indicate the phrases whose pitch range is expanded.

(95) Q: yokoyama-kun-wa

‘What will Mr. Yokoyama do when he gets a bonus?’

A1: kare-wa

‘He starts (scuba) diving.’

A2: kare-wa

‘He starts (scuba) diving.’

A3: kare-wa

‘He starts (scuba) diving.’ (Ito 2002: 412)

Ito found that “though dual prominence [like (95-A1)] is preferred for answers to broad focus questions, utterances with a single intonational prominence on the object [like (95-A2)] may be comprehended equally quickly as those with dual prominence” (op.cit.: 413) – where A1 is significantly more acceptable than A2. Also, she reports that the response time and acceptability of the A3-type do not significantly differ from those of A1 and A2. She concluded that “it is possible that the relation between argument structure and intonational focus marking is not universal” (ibid.).

Kori (2011) investigated the intonation of broad and narrow focus and reports that, by default, only the first word receives a pitch peak, whereas the following word is suppressed – although some speakers put prominence on the second word too. (96-a) is the target sentence that he asked participants to read aloud and (96-b-d) are the contexts. In (96-b-c), both aoi ‘blue’ and mahuraa ‘scarf’ are focused, because they contrast with ‘red’ and with ‘gloves’ or ‘sweater’, respectively. In (96-d), aoi ‘blue’ is narrowly focused because it is the only element that contrasts with ‘red’, while ‘scarf’ is not contrasted.

(96) a. aoi  mahuraa-dat-ta-n-desu
        blue scarf-cop-past-nmlz-cop.plt
        ‘(It) was a blue scarf.’
     b. I ordered red gloves, but I received a blue scarf. (Broad focus)
     c. I ordered a red sweater, but I received a blue scarf. (Broad focus)
     d. I ordered a red scarf, but I received a blue scarf. (Narrow focus)

Kori concludes that the default intonation for broad focus is to suppress the second word (mahuraa ‘scarf’ in this case) because most of the participants produced the sentences in this way, although some participants chose the sentence with prominence on both aoi ‘blue’ and mahuraa ‘scarf’ when they were asked to choose a good sentence.

2.4.4.2.3 Functional and cognitive motivations for intonation units Iwasaki (1993), applying the style of IU identification proposed in Du Bois et al. (1992) and Chafe (1994) to Japanese, argues that a Japanese intonation unit corresponds to a phrase rather than a clause, in contrast to the English IU, which corresponds to a clause according to Chafe (1987; 1994). According to Iwasaki’s survey, 42.2% of IUs in Japanese are clausal, whereas 57.8% are phrasal. In this approach, an intonation unit is a “stretch of speech uttered under a single coherent intonation contour” (Du Bois et al. 1992: 17). Iwasaki (1993: 39) states that the beginning of an IU “is often, though not always, marked by a pause, hesitation noises, and/or resetting of the baseline pitch level”, whereas the ending of an IU “is often, again though not always, marked by a lengthening of the last syllable.” Iwasaki (1993) provides (97) to exemplify how intonation units in Japanese correspond to phrases. Each line in (97) corresponds to a single intonation unit, and (97-a-e) as a whole constitute a single proposition, “I heard that broadcast at home with my family.”

(97) a. atasi-wa-ne:*
        1sg-top-fp
        ‘I, you know...’
     b. uti-de   kii-ta-no-ne?
        home-loc hear-past-nmlz-fp
        ‘heard at home, you know...’
     c. sono are-wa-ne?
        that that-top-fp
        ‘that thing, you know...’
     d. hoosoo-wa-ne?
        broadcast-top-fp
        ‘that broadcast, you know,’
     e. kazoku-de.
        family-with
        ‘with my family.’ (Iwasaki 1993: 40)

The pitch and intensity of (97) are shown in Figure 2.2 from Iwasaki (2008: 109), where the same example and figure are explained. The IU (97-a) ends with final vowel lengthening, whereas boundary pitch movements, indicated by “?”, are observed at the end of IUs (97-b-d). (97-e) ends with a final lowering, indicated by “.”.

Iwasaki distinguishes between four types of “functional components”:

(98) Four functional components

a. Lead (LD) such as fillers, which have no substantial meaning.
b. Ideation (ID), which conveys the content of speech.
c. Cohesion (CO) such as conjunctives and wa, which relate the previous and the current IUs.
d. Interaction (IT) such as ne ‘fp’ and yo ‘fp’, which are associated with communication.

Based on this, he shows similarities among different IUs. For example, (99-a) is an IU which only contains an NP followed by particles, whereas (99-b) is an IU which only contains a VP, also followed by particles. The structure of these two IUs is essentially the same in terms of functional components, although they are different in terms of grammatical structure.

(99) a. [mami-ni-dake] [-wa] [-ne]
        Mami-dat-only  -top  -fp
        ID             CO    IT
     b. [ik-ase-ta-rasii] [-no]  [-yo]
        go-caus-rep       -nmlz  -fp
        ID                CO     IT
     ‘(I heard that she) let only Mami go.’

Iwasaki analyzed his data based on this classification and found that more than 80% of the IUs consist of two or fewer functional components. He states that “this might be due to the limitation of work that the speaker can handle within one IU. [...] Japanese speakers [...] are faced with a constraint which permits them to exercise up to two functions per intonation unit” (p. 49).
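The kind of count behind this finding can be illustrated with a small sketch. The enumeration follows the four components in (98), but everything else is an assumption: the function name is hypothetical, and the toy list of IUs is invented for illustration and is not Iwasaki's data. Only the first IU reflects an analysis given in the text, namely ID-CO-IT for (99-a). I also assume that each component type is counted once per IU.

```python
from enum import Enum
from typing import List


class Component(Enum):
    """The four functional components listed in (98)."""
    LEAD = "LD"
    IDEATION = "ID"
    COHESION = "CO"
    INTERACTION = "IT"


def share_with_at_most(ius: List[List[Component]], limit: int = 2) -> float:
    """Proportion of IUs containing at most `limit` distinct components."""
    if not ius:
        return 0.0
    within_limit = sum(1 for iu in ius if len(set(iu)) <= limit)
    return within_limit / len(ius)


LD, ID, CO, IT = (Component.LEAD, Component.IDEATION,
                  Component.COHESION, Component.INTERACTION)

# Invented toy sample; the first entry mirrors (99-a), the rest are made up.
toy_ius = [
    [ID, CO, IT],
    [ID, IT],
    [LD],
    [ID, CO],
    [ID],
]
print(share_with_at_most(toy_ius))  # 0.8
```

In this representation, (99-a) and (99-b) are identical sequences even though one is built on an NP and the other on a VP, which is the point of comparing IUs by functional components rather than by grammatical structure.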

By contrast, Matsumoto (2000: 68) reports that “one clause comprises an average of 1.2 IUs” and argues that “the clause is the syntactic exponent of Japanese substantive IU”. She proposes the “one new NP per IU” constraint in Japanese, comparing it to the “one new idea at a time” constraint in Chafe (1987; 1994). However, Matsumoto (2003: §5.6) also reports that one new or given NP per IU is preferred in Japanese conversation. That is, an NP, whether new or given, tends to appear in an intonation unit without other NPs.

Figure 2.2: Example of an intonation unit (Iwasaki 2008: 109)

Nakagawa et al. (2010) focused on the difference between phrasal IUs and clausal IUs and analyzed them in terms of information structure. They measured referential distance and persistence (Givón 1983) and concluded that one of the functions of phrasal IUs is to introduce or re-introduce important topics in discourse. They compare this function of phrasal IUs to left-dislocations observed in many languages.
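To make the two measures concrete, the following sketch shows one possible way of computing them over a sequence of clauses. It is an assumption-laden illustration: the text does not state how Nakagawa et al. (2010) operationalized the measures, so the function names, the representation of a clause as a set of referent labels, the cap of 20 clauses, and the window of 10 clauses are all illustrative choices, not a description of their procedure.

```python
from typing import List, Set


def referential_distance(clauses: List[Set[str]], index: int,
                         referent: str, cap: int = 20) -> int:
    """Number of clauses between the mention at `index` and the most recent
    previous mention of `referent`; `cap` if there is none within reach.
    The cap value is an assumed default."""
    for distance in range(1, index + 1):
        if referent in clauses[index - distance]:
            return min(distance, cap)
    return cap


def persistence(clauses: List[Set[str]], index: int,
                referent: str, window: int = 10) -> int:
    """Number of clauses among the next `window` clauses that mention
    `referent`. The window size is an assumed default."""
    following = clauses[index + 1:index + 1 + window]
    return sum(1 for clause in following if referent in clause)


# Invented toy discourse: each set lists the referents mentioned in a clause.
toy_clauses = [
    {"broadcast", "speaker"},
    {"speaker"},
    {"family"},
    {"broadcast"},
    {"broadcast"},
    {"speaker"},
]
print(referential_distance(toy_clauses, 3, "broadcast"))  # 3
print(persistence(toy_clauses, 3, "broadcast"))           # 1
```

Under this operationalization, a referent that is re-introduced after a long gap has a high referential distance, and a referent that stays important in the following discourse has high persistence.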

2.4.4.2.4 Remaining issues Most studies on phonetics and phonology concentrate on foci rather than topics. Among different focus types, most of the studies (except for those on focus projection) concentrate on narrow focus rather than broad focus. Moreover, almost all of them are experimental studies rather than corpus studies. By contrast, I focus here on the differences between broad foci and topics in spontaneous speech, although I also carry out a production experiment.

Previous functional studies such as Iwasaki (1993), Matsumoto (2000; 2003), and Nakagawa et al. (2010) have methodological issues since they rely on an impressionistic definition of intonation units. This study, by contrast, is based on a strict definition of intonation unit and aims at revealing associations between intonation and information structure.

The results in Chapter 6 show that an intonation unit corresponds to a unit of information structure – e.g., topic or focus – which frequently but not always overlaps with a unit of the syntactic structure.

2.4.4.3 Pause

Sugito (1994a) showed in a perceptual experiment that pauses appear before pitch reset. She recorded trained announcers reading the news and had subjects listen to the recordings. She found that, when the pauses were eliminated so that only the pitch resets remained, subjects perceived the voice as though two people were overlapping with each other. According to her, it is in fact impossible to reset pitch without pauses, and the vocal cords are tensed 0.1 seconds before speech production. Based on this, I assume that pauses correlate with pitch reset.