Compensatory coarticulation, /u/-fronting, and sound change in Standard Southern British: an acoustic and perceptual study.*

(1)


Jonathan Harrington, Felicitas Kleber, Ulrich Reubold

* Submitted to the Journal of the Acoustical Society of America

(2)

General aim of this paper

To establish to what extent a sound change in progress, /u/-fronting in Standard Southern British (SSB), can be linked to diminished perceptual compensation for coarticulation in Ohala's (1993) model of sound change.

Background

A. Perceptual compensation for coarticulation

B. Ohala's model of sound change

C. /u/-fronting in SSB

(3)

A. What is perceptual compensation for coarticulation?

[Figure: frequency of fricative noise in /si/ and /su/, in acoustics and in perception]

ACOUSTICS

1. Anticipatory coarticulatory lip-rounding causes lowering of the spectral centre of gravity (COG) in /s/.

PERCEPTION

2. Listeners know this and reverse its effect (= compensation for coarticulation*).

*e.g. Fujisaki & Kunisaki, 1977; Mann & Repp, 1980

(4)

If you synthesise a continuum from /s/ to /ʃ/ (which can be done by lowering the spectral COG) and prepend it to a vowel, then listeners are more likely to perceive the SAME synthetic token as /s/ before /u/ than before /a/.
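As a rough illustration of the acoustic side, the spectral centre of gravity (COG) can be taken as the power-weighted mean frequency of the fricative noise spectrum. The sketch below assumes that definition; the function name `spectral_cog` and the two spectra are hypothetical, hand-made values for illustration, not measurements from this study. It only shows that moving energy towards lower frequencies lowers the COG.

```python
def spectral_cog(freqs_hz, power):
    """Power-weighted mean frequency (spectral centre of gravity) in Hz."""
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    return sum(f * p for f, p in zip(freqs_hz, power)) / total


# Hypothetical noise spectra (illustration only, not data from the study).
freqs = [2000, 3000, 4000, 5000, 6000, 7000, 8000]
s_before_i = [1, 2, 4, 8, 10, 8, 4]   # energy concentrated at high frequencies
s_before_u = [2, 4, 8, 10, 8, 4, 1]   # energy shifted downwards (lip-rounding)

cog_si = spectral_cog(freqs, s_before_i)
cog_su = spectral_cog(freqs, s_before_u)
print(round(cog_si), round(cog_su))
```

Under this assumed definition, a listener who compensates for coarticulation would in effect discount part of the difference between `cog_si` and `cog_su` when the following vowel is /u/.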

Perceptual compensation for coarticulation

[Figure: /s/ to /ʃ/ continuum; axis: frequency of noise]

(5)

[Figure: identification of the same synthetic continuum + /u/ and + /a/]

Listener compensates for coarticulation (= factors out COG lowering assumed to be attributable to /u/).

(6)

B. Central ideas in Ohala's model of sound change

Ohala: "Today's variability is tomorrow's sound change"

The origin of many sound changes is not always in the mouth of the speaker, but in the ear of the listener

Contra sociolinguists: sound change is not teleological, i.e., it is not done on purpose or for any reason. It happens by accident because of an unintended error on the part of the listener (and, for this reason, the origin of sound change is neither cognitive nor phonological).

Hypoarticulation-induced sound change: one that arises out of the natural processes of coarticulation and in which the listener fails to compensate for coarticulation…

(7)

Hypoarticulation-induced sound change in Ohala

Speaker: plans /ki/ -> produces [ci] -> acoustic signal [ci]

Listener: compensates for coarticulation -> reconstructs /ki/

Sound change: /k/ -> /c/

Listener: reconstructs /ci/ (the listener thinks: "the speaker meant to say /ci/")

Listener as speaker: plans /ci, cu/ -> produces [ci, cu]

/c/ has been phonologised because it is planned, produced and perceived, even in contexts that can't be explained by coarticulation.

(8)

C. /u/-fronting in Standard Southern British

Extensive auditory and some acoustic evidence that SSB /u/ has fronted in the last 50 years (e.g., Gimson, 1966; Wells, 1982; Henton, 1983; Deterding, 1997; Hawkins & Midgley, 2005; Roach, 1997).

([i] 'heed', [ɔ] 'hoard', [u] 'who'd')

/u/-fronting and possible chain-shifting in the Queen's Christmas broadcasts (Harrington, 2008)*

* Harrington (2008), Laboratory Phonology IX, in press

(9)

This could be just such an example of a hypoarticulation-induced sound change. Why?

1. Taking into account word frequency, /u/ frequently (p ≈ 0.7) occurs in a coarticulatory fronting context (e.g. 'you', 'too', 'lose', 'do', 'new').

[Figure: the speaker plans /ju/ and produces [jʉ]; F2 against time, showing the F2 locus-target distance between /j/ and /u/]

(10)

2. In an analysis of the Queen's Christmas broadcasts at 20-year intervals, Harrington (2008)* shows just such a reduction of the F2 locus-target distance:

[Figure: F2 against time, and F2 locus-target distance by decade]

* Harrington (2008), Laboratory Phonology 9 in press

(11)

Our extension of Ohala's model to these data and age-differences in SSB speakers is as follows:

/u/-fronting and speech perception

Acoustic input: [sʉn]

OLD listeners: compensate for coarticulation; perceived as /sun/

YOUNG listeners: perceived as /sʉn/

(12)

/u/-fronting, speech perception and production

[Diagram: PRODUCTION and PERCEPTION of 'feud' and 'food' on a Front-Back axis, for Old and Young]

Old: PRODUCTION [fjʉd] (feud), [fud] (food) -> compensate for coarticulation -> PERCEPTION /fjud/, /fud/

sound change

Young: PRODUCTION [fjʉd], [fʉd] -> PERCEPTION /fjʉd/, /fʉd/

For the Old, the allophones diverge in production, but not in perception (this is the trigger for sound change).

For the Young, the allophones are aligned in perception and production and there is NO compensation for coarticulation.

(13)

[Diagram: production and perception of 'feud' and 'food' for Old and Young on a Front-Back axis; the Old compensate for coarticulation, the Young show the sound change]

Predictions about age differences

Production

1. (trivially) /u/ vowels are fronted for the Young.

2. C-on-/u/ coarticulation is greater in the Old (their /u/ allophones show greater divergence).

3. Young and Old differ primarily on the back allophones (if sound change involves a shift of these to the front).

Perception

4. The Old but not the Young compensate perceptually for coarticulation.

(14)

Experimental analysis I:

Production

(Predictions 1-3)

(15)

Method: Speakers

30 Standard Southern British speakers recruited through the University of Cambridge and University College London.

YOUNG: 14 subjects aged 18-20 (11 F, 3 M)

OLD: 13 subjects aged 53-88 (7 F, 9 M)

Subjects were carefully checked to ensure that they were SSB speakers.

Only 1 subject took part in just the production study.

(16)

Materials

Isolated word production, with each word produced 10 times. Token counts per word:

/u/

C(C)   Word     Tokens
j      used     269
fj     feud     266
hj     hewed    266
kj     queued   270
f      food     269
s      soup     270
k      cooed    264
h      who'd    270
sw     swoop    268
Total           2412

/i/

C(C)   Word     Tokens
j      yeast    269
f      feed     267
h      heed     270
k      keyed    270
s      seep     269
sw     sweep    269
Total           1614

/ɑ/

C(C)   Word     Tokens
h      hard     270

Recordings made in the U.K. with SpeechRecorder (Draxler & Jänsch, 2004).

(17)

Acoustic parameters

Formants were calculated, and F2 was checked and manually corrected.

Each F2 trajectory was reduced to a single point in a three-dimensional space formed from the first three coefficients of the discrete cosine transformation (DCT) (Watson & Harrington, 1999; Harrington, 2006; Zahorian & Jagharghi, 1993). We did this because we wanted to assess vowel fronting in the entire F2 trajectory (from onset to offset) rather than just at the vowel's temporal midpoint (which encodes no dynamic information).

Also, with the DCT we avoid having to make an often arbitrary decision about the location of the vowel target.

(18)

Discrete cosine transformation

The DCT decomposes any signal into a set of ½-cycle cosine waves which, if summed, entirely reconstruct the original signal.

The amplitudes of these cosine waves are the DCT coefficients. Moreover, the cosine waves at the lowest frequencies encode important properties of the trajectory's shape:

DCT-coeff   …at frequency (rad/sample)…   …is proportional to the trajectory's:
DCT-0       0                              mean (average)
DCT-1       1                              linear slope
DCT-2       2                              curvature
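The relation between the first three coefficients and the trajectory's mean, slope and curvature can be checked on toy data. The sketch below is only an illustration under stated assumptions: it uses the orthonormal DCT-II (the exact DCT variant and scaling used in the study are not given here), the function name `dct_coeffs` is hypothetical, and the three F2 trajectories are invented. Note that in this convention a rising trajectory yields a negative DCT-1.

```python
import math


def dct_coeffs(x, n_coeffs=3):
    """First n_coeffs coefficients of the orthonormal DCT-II of x (assumed variant)."""
    n_samples = len(x)
    out = []
    for k in range(n_coeffs):
        scale = math.sqrt(1.0 / n_samples) if k == 0 else math.sqrt(2.0 / n_samples)
        # Coefficient k is the amplitude of a cosine with k half-cycles over the window.
        out.append(scale * sum(
            x[n] * math.cos(math.pi * k * (n + 0.5) / n_samples)
            for n in range(n_samples)))
    return out


# Hypothetical F2 trajectories in Hz (illustration only).
N = 11
flat = [1500.0] * N                                   # no slope, no curvature
rising = [1400.0 + 40.0 * n for n in range(N)]        # pure linear slope
curved = [1500.0 + 8.0 * (n - 5) ** 2 for n in range(N)]  # symmetric U-shape

c_flat = dct_coeffs(flat)
c_rising = dct_coeffs(rising)
c_curved = dct_coeffs(curved)

for name, c in (("flat", c_flat), ("rising", c_rising), ("curved", c_curved)):
    print(name, [round(v, 2) for v in c])
```

In this toy case the flat trajectory has energy only in DCT-0, the linear ramp adds a non-zero DCT-1 but no DCT-2, and the symmetric U-shape adds a non-zero DCT-2 but no DCT-1.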

(19)

So you can use this technique to smooth formants…

[Figure: raw and DCT-smoothed F2 (Hz) against time (ms). Analysis: F2 trajectory to DCT coefficients. Synthesis: DCT coefficients to smoothed F2 trajectory.]

But the important point for this paper is that each F2 trajectory is reduced to a single point in a 3D space which encodes a smoothed trajectory, like the DCT-smoothed one in the figure.
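The analysis and synthesis steps can be sketched as a forward DCT followed by an inverse DCT that keeps only the first three coefficients. As before, this is a minimal sketch assuming the orthonormal DCT-II and its inverse; the helper names `dct` and `idct` and the raw F2 values are hypothetical and not taken from the study.

```python
import math


def dct(x):
    """Orthonormal DCT-II of x (assumed variant); returns len(x) coefficients."""
    n_samples = len(x)
    return [
        (math.sqrt(1.0 / n_samples) if k == 0 else math.sqrt(2.0 / n_samples))
        * sum(x[n] * math.cos(math.pi * k * (n + 0.5) / n_samples)
              for n in range(n_samples))
        for k in range(n_samples)
    ]


def idct(coeffs, n_samples):
    """Inverse of dct(); coefficients beyond len(coeffs) are treated as zero."""
    return [
        math.sqrt(1.0 / n_samples) * coeffs[0]
        + math.sqrt(2.0 / n_samples) * sum(
            coeffs[k] * math.cos(math.pi * k * (n + 0.5) / n_samples)
            for k in range(1, len(coeffs)))
        for n in range(n_samples)
    ]


# Hypothetical raw F2 track in Hz with frame-to-frame jitter (illustration only).
raw_f2 = [1450, 1520, 1490, 1600, 1580, 1690, 1660, 1750, 1720, 1790]

coeffs = dct(raw_f2)                          # analysis
smoothed = idct(coeffs[:3], len(raw_f2))      # synthesis from DCT-0, DCT-1, DCT-2 only
exact = idct(coeffs, len(raw_f2))             # all coefficients: original signal back

print([round(v) for v in smoothed])
```

Keeping all coefficients returns the raw track, while keeping three gives a smooth curve with the same mean, which is the sense in which a single point in the 3D space encodes a smoothed trajectory.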

(20)

Quantification of /u/-fronting

For each speaker separately, we quantified the extent of /u/-fronting in this DCT space by calculating each [u] token's relative distance to the front /i/ and back /ɑ/ vowel centroids.

$d_u = \log(E_1/E_2) = \log(E_1) - \log(E_2)$ (log Euclidean distance ratio)

$d_u = 0$: [u] equidistant between /i/ and /ɑ/

$d_u < 0$: [u] nearer /ɑ/ (back)

$d_u > 0$: [u] nearer /i/ (front)

So if, following hypothesis 1, /u/ is phonetically more back in the Old, then $d_u$ should be lower for the Old compared with the Young.
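The log Euclidean distance ratio can be sketched in a few lines. The sign conventions above imply that E1 is the distance from the [u] token to the back /ɑ/ centroid and E2 the distance to the front /i/ centroid; that reading is an inference from the stated signs, not something spelled out in the text. The function names and all coordinates below are hypothetical points in a three-dimensional DCT space.

```python
import math


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def log_distance_ratio(token, i_centroid, a_centroid):
    """d_u = log(E1) - log(E2).

    Assumed reading: E1 = distance to the back centroid, E2 = distance to the
    front centroid, so that d_u > 0 means the token is nearer /i/ (front).
    A token lying exactly on a centroid would give log(0) and raise an error.
    """
    e1 = math.dist(token, a_centroid)
    e2 = math.dist(token, i_centroid)
    return math.log(e1) - math.log(e2)


# Hypothetical tokens for one speaker (illustration only).
i_tokens = [(10.0, 0.0, 0.0), (12.0, 1.0, 0.0), (11.0, -1.0, 0.0)]
a_tokens = [(2.0, 0.0, 0.0), (3.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
i_c, a_c = centroid(i_tokens), centroid(a_tokens)

d_front = log_distance_ratio((9.0, 0.0, 0.0), i_c, a_c)   # fronted [u]
d_back = log_distance_ratio((3.0, 0.0, 0.0), i_c, a_c)    # back [u]
d_mid = log_distance_ratio((6.5, 0.0, 0.0), i_c, a_c)     # equidistant

print(round(d_front, 3), round(d_back, 3), round(d_mid, 3))
```

Because the measure is a ratio computed within each speaker's own vowel space, it is comparable across speakers whose absolute formant values differ.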

(21)

2. Quantification of C-on-/u/ perseverative coarticulation

We measured, separately for each speaker, the inter-Euclidean distance in the DCT space between 'swoop' and 'used'.

[Figure: 'used' (ju) and 'swoop' (wu) tokens in the DCT-0 x DCT-1 plane; distances from swoop tokens to the used centroid and from used tokens to the swoop centroid]

If the coarticulatory influences of C-on-/u/ are greater in Older speakers (= hypothesis 2), then 'swoop' (/w/ has a backing influence) and 'used' (/j/ has a fronting influence) will be further apart, i.e., the distances between them will be greater than for the Young.
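One way to turn the two sets of distances (swoop tokens to the used centroid, used tokens to the swoop centroid) into a single per-speaker value is to average them. The averaging step is an assumption made for this sketch, since the text does not say how the distances were aggregated; the function names and token coordinates are likewise hypothetical.

```python
import math


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def inter_euclidean(tokens_a, tokens_b):
    """Mean of: A tokens to B's centroid, and B tokens to A's centroid.

    The use of the mean is an assumption for illustration.
    """
    centroid_a, centroid_b = centroid(tokens_a), centroid(tokens_b)
    distances = [math.dist(t, centroid_b) for t in tokens_a]
    distances += [math.dist(t, centroid_a) for t in tokens_b]
    return sum(distances) / len(distances)


# Hypothetical DCT-space tokens for one Old and one Young speaker.
used_old = [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
swoop_old = [(2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
used_young = [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
swoop_young = [(8.0, 0.0, 0.0), (10.0, 0.0, 0.0)]

dist_old = inter_euclidean(used_old, swoop_old)
dist_young = inter_euclidean(used_young, swoop_young)
print(dist_old, dist_young)
```

With these invented tokens the Old speaker's 'used' and 'swoop' lie further apart than the Young speaker's, which is the pattern hypothesis 2 predicts.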

(22)

Hypothesis 3: the age difference is context-specific,

i.e., if sound change involves a shift of back allophones towards the front, then Young and Old should differ more on words with a non-front allophone ('food') than on those with a front allophone ('feud').

We calculated the distance in the DCT space between Old and Young speakers, separately for each word.

Thus the prediction is that the distance between e.g. Old/Young 'food' is greater than that between Old/Young 'feud'.

(23)

Results 1: Young speakers have a fronter /u/

F1 x F2 plots at vowel midpoint

(24)

Results 1: Young speakers have a fronter /u/

[Figure: log Euclidean distance ratio, $d_u$, on a Front-Back axis. When $d_u = 0$, /u/ is equidistant between /i/ and /ɑ/.]

(25)

Results 2: a greater C-on-/u/ influence for the Old

Averaged, linearly time-normalised F2 trajectories in 'used' and 'swoop'

(26)

Euclidean distance between 'used' and 'swoop'

Results 2: a greater C-on-/u/ influence for the Old

(27)

Results 3: smaller age difference for words where /u/ is in a fronting context

[Figure: Euclidean distance between Young and Old, separately per word, grouped by whether C has a fronting effect on /u/ (No / Yes)]

(i.e., the sound change involves a shift of back allophones to the front)

(28)

Part II: Speech Perception

(29)

Method: synthetic continua

We used HLSYN to create two 13-step synthetic /i-u/ continua at equal Bark intervals by varying F2 in two sets of minimal pairs:

(a) /jist/ --- /just/ YEAST --- USED (past tense)

(b) /swip/ --- /swup/ SWEEP --- SWOOP

A separate group of listeners verified that the endpoints of the continua could be correctly identified.

Stimuli randomised and both continua presented in one session 5 times (5 x 13 x 2 = 130 randomised stimuli).

Forced-choice identification task: subjects responded with one of 'used', 'yeast', 'swoop', 'sweep' to each stimulus.
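Spacing a 13-step F2 continuum at equal Bark intervals can be sketched as follows. Everything specific here is an assumption: the Hz-to-Bark conversion is Traunmüller's (1990) formula, chosen only because it has a simple closed-form inverse (the conversion actually used for the stimuli is not stated), and the F2 endpoints of 2300 Hz and 1000 Hz are hypothetical placeholders, not the values used in the experiment.

```python
def hz_to_bark(f_hz):
    """Traunmueller (1990) approximation; an assumed choice of Bark formula."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark()."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def equal_bark_continuum(f_start_hz, f_end_hz, steps=13):
    """F2 values (Hz) spaced at equal Bark intervals from f_start_hz to f_end_hz."""
    z_start, z_end = hz_to_bark(f_start_hz), hz_to_bark(f_end_hz)
    return [
        bark_to_hz(z_start + (z_end - z_start) * i / (steps - 1))
        for i in range(steps)
    ]


# Hypothetical endpoints: step 1 = high F2 (/i/-like), step 13 = low F2 (/u/-like).
f2_steps = equal_bark_continuum(2300.0, 1000.0)
print([round(f) for f in f2_steps])

# Session size stated in the method: 5 repetitions x 13 steps x 2 continua.
n_stimuli = 5 * 13 * 2
print(n_stimuli)
```

Equal Bark steps are unequal in Hz: the steps are larger in Hz at the high-F2 end of the continuum than at the low-F2 end.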

(30)

Speech perception predictions

[Figure: predicted identification functions along the /i-u/ continuum (F2 high = /i/, F2 low = /u/), with the yeast-used and sweep-swoop boundaries for OLD and YOUNG]

YOUNG: left-shift relative to Old because they have a fronter /u/ and no (or much less) compensation for coarticulation.

(31)

So the different predicted responses are:

[Figure: predicted identification functions from F2 high (/i/) to F2 low (/u/)]

1. WORD: yeast-used boundary left-shifted relative to sweep-swoop (red vs. blue).

2. AGE: Young left-shifted relative to Old (dashed vs. solid).

3. AGE x WORD: small difference between Young vs. Old on yeast-used (red dashed vs. red solid), big difference between Young vs. Old on sweep-swoop (blue dashed vs. blue solid).
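To make "left-shifted boundary" concrete, one simple way to locate a category boundary is the continuum step at which the proportion of /u/ responses crosses 50%, found by linear interpolation. This is only a sketch: the text does not say how boundaries were estimated, the function names are hypothetical, and the response counts are invented for a single imaginary listener (5 presentations per step, step 1 = high F2).

```python
def proportion_u(u_counts, n_presentations):
    """Proportion of /u/ responses at each continuum step."""
    return [count / n_presentations for count in u_counts]


def boundary_step(props):
    """Fractional 1-based step where the /u/ proportion first crosses 0.5.

    Uses linear interpolation between adjacent steps (an assumed method);
    returns None if the responses never cross 0.5.
    """
    for i in range(1, len(props)):
        below, above = props[i - 1], props[i]
        if below < 0.5 <= above:
            # props[i - 1] belongs to step i, props[i] to step i + 1.
            return i + (0.5 - below) / (above - below)
    return None


# Invented /u/ response counts out of 5 for steps 1..13 (illustration only).
used_counts = [0, 0, 0, 1, 2, 4, 5, 5, 5, 5, 5, 5, 5]    # yeast-used continuum
swoop_counts = [0, 0, 0, 0, 0, 1, 2, 3, 5, 5, 5, 5, 5]   # sweep-swoop continuum

b_used = boundary_step(proportion_u(used_counts, 5))
b_swoop = boundary_step(proportion_u(swoop_counts, 5))
print(b_used, b_swoop)
```

A smaller boundary value means the switch to /u/ happens earlier on the continuum, i.e., the boundary is left-shifted and the overall proportion of /u/ responses is greater.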

(32)

Results 1: WORD

Significantly greater proportion of /u/ responses (across both age groups) in YEAST-USED relative to SWEEP-SWOOP (compatible with Mann & Repp, 1980).

(33)

Results (2): AGE

The /i-u/ boundary is significantly left-shifted (greater proportion of /u/ responses) in YOUNG compared with OLD speakers.

(34)

Results (3): WORD x AGE

Prediction 3. Small difference between YOUNG vs. OLD on yeast-used … big difference between YOUNG vs. OLD on sweep-swoop.

[Figure: identification functions for YOUNG and OLD]

(35)

Discussion I: /u/-fronting and sound change in terms of Ohala's model.

In the 1950s-60s, the large phonetic separation between allophones of /u/ in production required coarticulatory compensation to realign them as the same category in perception.

For whatever reason, young listeners give up on compensating for coarticulation. They do not attribute [ʉ] to context but presume that /ʉ/ (not /u/) was intended by the speaker. Thus, /ʉ/ becomes phonologised...

…leading to [ʉ] in their own productions even in contexts like 'food', 'move' where it has no coarticulatory explanation.

(36)

But which comes first?

Ohala: first give up on compensating for coarticulation, then there is a realignment in speech production (= sound change): loss of coarticulatory compensation in perception results in sound change in production.

Or perhaps: first there is sound change and then there is a loss of perceptual compensation for coarticulation? This story would be compatible with exemplar theory (Pierrehumbert, 2003) as follows:

(37)

/u/-fronting, sound change, exemplar theory

1. /u/ in Standard British English occurs most often (about 70% of the time) after consonants with a high F2 locus ('you', 'too', 'lose', 'do', 'new').

2. According to exemplar theory, this imbalance in lexical and phonological frequency is the trigger for sound change, i.e., the infrequently occurring back [u] in low F2-locus contexts (e.g., 'move', 'swoop') is perceptually unstable and so will be inclined to shift towards the more frequently occurring front allophone [ʉ].

3. As this shift occurs in production, there will be less need to compensate for coarticulation in perception, because the allophones will be in progressively closer alignment. So this story has: sound change first, then loss of perceptual compensation.

(38)

A way forward may be to investigate compensatory coarticulation in young children. Perhaps children learn to compensate for coarticulation relatively late (Ohala, 1993); perhaps they are more inclined than adults to perceive (incorrectly) an allophone as intended by the speaker.

We might then have an alternative explanation for why sound change is so often led by the young.

According to sociolinguistics, this is intentional: the young lead sound change because they want to sound different from their parents and the elder generation.
