
Research Collection

Journal Article

Songbirds are excellent auditory discriminators, irrespective of age and experience

Author(s):

Narula, Gagan; Hahnloser, Richard H.R.

Publication Date:

2021-05

Permanent Link:

https://doi.org/10.3929/ethz-b-000478133

Originally published in:

Animal Behaviour 175, http://doi.org/10.1016/j.anbehav.2021.02.018

Rights / License:

Creative Commons Attribution 4.0 International

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.

ETH Library


Songbirds are excellent auditory discriminators, irrespective of age and experience

G. Narula a,b,*, R. H. R. Hahnloser a,b

a Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
b Neuroscience Center Zurich (ZNZ), University of Zurich and ETH Zurich, Zurich, Switzerland

*Corresponding author. E-mail address: gnarula@ini.ethz.ch (G. Narula). ORCID: https://orcid.org/0000-0002-7445-3710 (G. Narula).

Article info

Article history: Received 10 March 2020; initial acceptance 3 June 2020; final acceptance 8 December 2020. MS number: 20-00165R.

Keywords: age, auditory discrimination, categorical perception, critical period, sensory experience, songbird

Human infants but not adults possess the ability to perceive differences between non-native language phoneme categories. The predominant explanation for this age-related decline in discriminative ability is the effect of statistical learning driven by sensory exposure: phoneme categories of the native language take precedence, have a higher frequency of occurrence and may encompass category distinctions in non-native languages. Alternatively, one could explain the decline through a reduction in discriminative abilities attributable to ageing. Thus, to what extent is auditory perception influenced either by experience or by age-related processes? Here, we attempted to answer this question, which cannot easily be disentangled in humans, in songbirds, which share many properties with humans: both learn the statistical distribution of sounds in their environment, both possess neural circuits to process vocalizations of their own species and plasticity in these circuits is subject to critical periods. To study the effects of experience and ageing, we trained zebra finches, Taeniopygia guttata, to discriminate short from long versions of a single zebra finch song syllable type. Birds in four groups distinguished by their age (old versus young) and level of auditory experience (with song experience versus completely isolated from song) could learn to discriminate arbitrarily fine differences between song syllables, although we found a trend that upholds the statistical learning hypothesis: birds with song experience performed better than birds with no experience. Furthermore, birds in all groups were able to generalize their learning to new stimuli of the same type, and they were able to rapidly adapt their learned discrimination boundaries. Finally, we found that songbirds could accurately discriminate randomly selected renditions of a stereotyped adult song syllable, revealing a flexible ability to discriminate conspecific vocalizations.

© 2021 The Authors. Published by Elsevier Ltd on behalf of The Association for the Study of Animal Behaviour. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

The defining feature of categorical perception of human speech is the inability to detect acoustic feature variations that occur within a phonetic category, as opposed to high discriminability between features near category boundaries (Diehl, Lotto, & Holt, 2004; Eimas, Siqueland, Jusczyk, & Vigorito, 1971; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Liberman & Mattingly, 1985). For example, Japanese speakers find it difficult to distinguish the English phonemes /r/ and /l/ (Miyawaki et al., 1975). By contrast, infants can successfully discriminate non-native phonemes from various languages (Eimas, 1975; Eimas et al., 1971; Trehub, 1976; Werker, Gilbert, Humphrey, & Tees, 1981). This ability to perceive natural categories in non-native languages is only temporary: infants older than 10 months are as poor as adults in their discrimination performance (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992; Werker & Tees, 1984).

A crucial open question is whether categorical perception is the result of experience, ageing or both. In support of experience-dependent mechanisms, many researchers hold that language exposure affects the formation of auditory category boundaries (Abramson & Lisker, 1973; Kuhl, 2004), which agrees with theoretical models that adopt a statistical learning perspective (Kuhl, 2004; Vallabha, McClelland, Pons, Werker, & Amano, 2007).

In contrast with the statistical learning hypothesis, an ageing brain can have a significant impact on behaviour through changes in neural structure during critical periods of synaptic plasticity (Gordon & Stryker, 1996; Hensch, 2005). For example, Peña, Werker, and Dehaene-Lambertz (2012) showed that infants born 3 months preterm are exposed to their native language longer than full-term infants, but this does not accelerate speech acquisition, indicating that age-related processes provide a bottleneck for language acquisition.



The age versus experience hypothesis is difficult to test in humans without isolating infants from exposure to native language. For this reason, we tried to disentangle the roles of experience and age by studying auditory discrimination in a model system, namely songbirds (Passeriformes), which are a vocal learning group of species, like humans and whales. In most songbird species, songs are used to signal individual identity for territorial defence or to signify sexual intent, and are considered vital for sexual selection (Byers & Kroodsma, 2009). Similar to humans, in songbirds, developmental vocal learning happens early in life and is triggered by exposure to adult vocalizations (Barrington, 1773; Brainard & Doupe, 2000; Doupe & Kuhl, 1999). Songbirds such as the zebra finch, Taeniopygia guttata, also show striking parallels to humans in the sensory processing of vocalizations, as exemplified by the manner in which they distinguish both human vocalizations (Ohms, Escudero, Lammers, & ten Cate, 2012, 2010; ten Cate, 2014) and conspecific vocalizations (Sturdy, Phillmore, Price, & Weisman, 1999). Some songbirds also show categorical perception of vocalizations, for example swamp sparrows, Melospiza georgiana (Nelson & Marler, 1962; Prather, Nowicki, Anderson, Peters, & Mooney, 2009).

Few comparative studies in songbirds have directly addressed the explicit roles of age versus experience in song perception. Braaten, Petzoldt, and Colbath (2006) and Sturdy, Phillmore, Sartor, and Weisman (2001) have previously compared juvenile songbirds to conspecific adults, as well as isolated to vocalization-experienced songbirds, in auditory categorization tasks. However, the acoustic stimuli in both studies were not ideal for assessing categorical perception as described in the psychology of speech perception: Braaten et al. trained zebra finches to discriminate conspecific songs from the same songs played in reverse order, while Sturdy et al. (2001) used synthetic pure-tone stimuli. Furthermore, neither study examined all four conditions (i.e. young, old, experienced and isolated) simultaneously, which is a prerequisite for weighting age versus experience.

Here, we trained male and female zebra finches following an established Go–NoGo operant conditioning paradigm (Canopoli, Herbst, & Hahnloser, 2014; Tokarev & Tchernichovski, 2014) to discriminate short from long versions of a single zebra finch song syllable. We used this arbitrary category choice to test whether zebra finches can categorize their native sounds into classes that seemingly carry no functional meaning.

We divided the birds into four groups factored by age and auditory experience: adult (A; >90 days old) or juvenile (J; 35 days old at the start of the experiment), with experience, i.e. exposed to song (+), or isolated from song exposure (-) (notation: A+, A-, J+ and J-).

Our experimental design was inspired by the infant versus adult comparison in the Werker and Tees (1981) study, where infants with English as their native language were exposed to a stream of phonemes belonging to a phonetic contrast (e.g. /ta/ versus /Ta/ from Hindi or /ba/ versus /da/ from English). When a phonetic boundary was crossed (e.g. /ta/ -> /Ta/), the infants were conditioned to respond with a head turn to indicate the change. English language infants were as good as Hindi-speaking adults in discriminating Hindi phonemes, even though the distinction between /ta/ and /Ta/ has no functional meaning in English. Given our stimulus choice as an analogy to the human infant work, we hypothesized that more experienced and older birds (A+ group) would fare the worst at the task, and younger, inexperienced birds (J-) would fare the best. We performed additional experiments to characterize the percepts learned by the birds.

GENERAL METHODS

Animals

We tested 34 zebra finches, 15 males and 19 females. Birds were partitioned into four groups, factored by age and song experience.

Here, the term 'song experience' refers to auditory exposure to normal adult zebra finch songs.

(1) The A+ group (adult with song experience; six females, three males) were raised in a cage by their parents and siblings until about 65 days posthatch, and then they were transferred to a cage with same-sex older juveniles and adults. They were introduced to the experimental set-up at an age >90 days posthatch. (2) The A- group (adult without song experience; five females, four males) were transferred to a sound isolation chamber (custom design) at 15 days posthatch with their mother and female siblings and reared until >90 days posthatch, following which they were introduced to the experimental set-up. Because female zebra finches do not sing, the absence of adult male birds during development ensures a lack of song experience (A-). (3) The J- group (juvenile without song experience; five females, three males) were reared in song isolation (as for A- birds above) from 15 to 33 days posthatch and were subsequently introduced to the set-up at 34 or 35 days posthatch. (4) The J+ group (juvenile with song experience; three females, five males) were introduced to the set-up at 34 or 35 days posthatch, after being raised by their parents (including the father) and siblings in the same cage.

Apart from the experimentally tested animals, another 34 birds (all female) were used as a social incentive for the tested birds (see Experimental Apparatus).

Ethical Note

All experiments were approved by the Veterinary Office of the Canton of Zurich, Switzerland. The experiments were performed under the specific licence ZH207/2013. The experimental birds were born and reared in our avian breeding facility, in agreement with the Swiss Animal Welfare Act and Ordinance (TSchG, TSchV, TVV). After the experiment they were returned to the aviary. The birds were handled only when they were taken from the aviary to their home cage in the experimental chamber and twice when their home cages were cleaned during the experiment. Apart from this, human disturbance occurred once a day when their food, water, grit and cuttle bone were replaced.

Our experiment used air puffs as an aversive conditioning stimulus. Birds learned to discriminate stimuli based on the predicted occurrence of an air puff after a 1 s delay. Although the air puff pressure was kept high enough to displace birds from their perch, none of the birds showed any signs of injury or panic such as incessantly flying around the cage or calling. Birds continued to voluntarily perform hundreds of trials per day and improve at the task over several days. In our experience viewing (via webcam) birds performing the task, we did not observe any noticeable increase in anxiety in the birds. We checked the false negative rate (FNR, excess staying on the perch and getting hit by the air puff) and false positive rate (FPR, escaping too often) on completion of training as a proxy for noticeable signs of anxiety. In general, FPR and FNR were low across birds (FPR = 0.31 ± 0.18; FNR = 0.14 ± 0.1; N = 34). We found no statistically significant differences between groups in either FPR or FNR (Kruskal–Wallis test of difference in FPR distribution among the four groups (A+, A-, J+, J-): χ²₃ = 0.9, P = 0.83, N = 9, 9, 8, 8; Kruskal–Wallis test of difference in FNR distribution between groups: χ²₃ = 1.04, P = 0.79, N = 9, 9, 8, 8).


Experimental Apparatus

We adapted a Go–NoGo operant conditioning paradigm including a component of social reinforcement (Narula, Herbst, Rychen, & Hahnloser, 2018; Tokarev & Tchernichovski, 2014).

During the experiment, all birds were housed with unrestricted access to food, water for drinking and bathing (separate), grit and cuttle bone, all provided in individual cages (30 x 30 cm and 40 cm high; Qualipet, 8305 Dietlikon, Switzerland) placed inside a custom sound isolation chamber. The cage bedding was made of dry wood chips. The light:dark cycle was 14:10 h. The temperature in the isolation chamber was maintained at room temperature, approximately 25 °C, and it never exceeded 30 °C. The chamber contained a speaker for playing the stimuli, a microphone for sound recordings and a webcam. The cages for the experimental bird and a female companion bird were placed adjacent to each other. Each cage contained three perches, two for unrestricted food and water access and a third (window) for viewing the other cage. This perch was equipped with a Hall sensor that measured deviations of a magnetic field which occurred when a bird sat on the perch. These deviations were used as a signature of perch occupancy. We placed a cardboard screen with a small (15 x 15 cm) peeping window between the two cages to block the view into the other cage from all vantage points except the window perch. Experimental birds and companion birds frequently visited their window perches (henceforth referred to as 'perches') to interact with each other.

Stimuli

Duration discrimination (training set)

For the training set (TRAIN), we created a set of 10 stimuli synthesized from the songs of an adult male zebra finch (o7r14) from our colony. Songs were filtered with a fourth-order Butterworth high-pass filter (600 Hz cutoff) and digitized at 32 kHz. We collected all renditions of a syllable produced during 1 day of singing in February 2015. We computed syllable durations via thresholding of sound amplitude traces. Based on the full range of durations of the selected syllable, we defined 10 stimuli of increasing duration, ranging from 144.1 to 190.8 ms. Each stimulus Si (i = 1, 2, …, 10) in this set was made of a string of six syllable renditions, wherein each rendition was longer than the six renditions in stimulus Si-1. Within a stimulus, the six renditions were arranged in order of increasing duration. In total, the stimulus set comprised 60 different syllable renditions (10 stimuli of six renditions each). To avoid sound onset artefacts, we smoothed syllable onsets and offsets by multiplying the sound waveform in the time domain with sigmoid functions of width 16 ms. Intersyllable gaps were set to 22 ms. All stimuli were produced with MATLAB (Mathworks Inc., Natick, MA, U.S.A.). Stimuli were played at 70–75 dB sound pressure level from a Harman–Kardon speaker (HKS 4 BQ 2.0, Harman Deutschland GmbH, 3098 Schliern bei Koeniz, Switzerland), preamplified with an Alesis RA150 amplifier (Alesis, Cumberland, RI, U.S.A.).
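As a rough illustration of this assembly step, the MATLAB sketch below builds one stimulus from six placeholder waveforms with 16 ms sigmoid on/off ramps and 22 ms silent gaps; the variable names, the placeholder waveforms and the exact ramp shape are our assumptions, not the authors' code.

```matlab
% Hypothetical sketch of stimulus assembly (not the authors' code): six syllable
% renditions are ramped on/off with 16 ms sigmoids and concatenated with
% 22 ms silent gaps, at a 32 kHz sampling rate.
fs    = 32000;                                        % sampling rate (Hz)
rends = arrayfun(@(d) randn(1, round(d*fs)), ...
                 0.144:0.009:0.189, 'UniformOutput', false);  % placeholder waveforms
gap   = zeros(1, round(0.022*fs));                    % 22 ms intersyllable gap
ramp  = 1 ./ (1 + exp(-linspace(-6, 6, round(0.016*fs))));    % 16 ms sigmoid ramp
n     = numel(ramp);

stim = [];
for k = 1:numel(rends)
    s = rends{k};
    s(1:n)         = s(1:n) .* ramp;                  % smooth the onset
    s(end-n+1:end) = s(end-n+1:end) .* fliplr(ramp);  % smooth the offset
    stim = [stim, s, gap];                            % append rendition plus gap
end
```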

Based on the 10 stimuli, we defined two stimulus classes: the class 'short' was formed by stimuli S1 to S5, and the class 'long' was formed by stimuli S6 to S10. The stimuli were distinguishable by duration but also by other sound features (Fig. 1b, right), allowing birds to 'overfit' their discriminative systems.

We implemented a Go–NoGo operant conditioning paradigm using an aversive air puff that followed the playback of stimuli from one of the two classes, either short or long. We counterbalanced the aversively reinforced class across birds. For all groups exposed to the same training set, there was no difference in learning time between short- or long-puffed birds (Appendix 1). We therefore use the terms Puff and NoPuff as class labels, irrespective of whether short or long stimuli were reinforced.

Permuted set

To create a permuted set (PERM) in which the duration of each of the five stimuli within a class was the same, we separately randomized the position of each of the 30 syllable renditions Sij (i: syllable number [1 -> 6], j: stimulus identity [1 -> 5 or 6 -> 10]) within each class in TRAIN (see Appendix 2 and Fig. A1).
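A minimal sketch of this within-class shuffle is given below, assuming the 60 recorded renditions are held in a 6 x 10 cell array (a placeholder here); the index conventions are ours, not the authors'.

```matlab
% Hypothetical sketch of the PERM construction: within each class of TRAIN, the
% 30 syllable renditions (6 syllables x 5 stimuli) are redistributed at random
% over the 5 stimulus slots of that class.
renditions = num2cell(rand(6, 10));                        % placeholder waveforms
permSet = renditions;
for c = 1:2                                                % class 1 = short, 2 = long
    cols  = (c-1)*5 + (1:5);                               % stimuli 1-5 or 6-10
    block = renditions(:, cols);                           % 30 renditions of this class
    permSet(:, cols) = reshape(block(randperm(30)), 6, 5); % shuffle within the class
end
```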

NOV set

To test for the generalization of learned categories, we created a novel set (NOV) of 10 stimuli {S'1, …, S'10} from renditions of the same syllable recorded on the next day and processed in the same manner. The stimulus durations in NOV tended to be slightly shorter than in TRAIN, including near the class boundary (Fig. 1c).

Moved boundary set

To move the boundary between stimulus classes, we increased the stimulus duration that defined the boundary between short and long stimuli by 8 ms towards the long category: stimuli S6–S8, originally defined as belonging to the long category, were now deemed short (i.e. mapped to S3–S5), and three new long stimuli from the original recording used to create TRAIN were added to the long category (new S8–S10; Fig. 1c). This was termed the moved boundary set (MOV).

Randomly shuffled set

We constructed a random set (RAND) of 10 stimuli by randomly assigning the 60 song syllables from TRAIN to the Puff and NoPuff classes, resulting in a uniform distribution of durations across both classes (Fig. 1c).

Training Procedure

Pretraining phase

Each bird began with a pretraining phase that lasted 3–4 days. On the first day, birds could accustom themselves to the set-up, discover the perch and use it to view the social incentive. From the second day on, sitting on the perch (which led to an upward deflection of the Hall sensor signal) for more than 3.5 s triggered the playback of a stimulus (Fig. 1a). In the pretraining phase, we played the most easily distinguishable stimuli from TRAIN, namely the longest and the shortest from each class (S1 from class short and S10 from class long). One of the two classes of stimuli was associated with the aversive reinforcer: a puff of air delivered 1 s after the stimulus offset.

We gradually increased the strength of the puff by increasing its duration. The duration ranged from 0.03 s (light) to 1 s (strong), which we accomplished by approximately doubling the duration each day. With increasing puff strength, the probability of displacing a bird from the perch also increased, and as a result avoidance behaviour such as flying away from the perch to an escape perch also increased.

The probability of presenting a stimulus from the punished class was kept at 0.25. In a pilot study not included in this paper, we found that a probability of 0.5 induced anxiety in the birds, which we inferred from their escaping from the perch on almost every trial. A probability of 0.25 led to birds staying on the perch longer and more often.


Birds voluntarily performed more than 500 trials per day, and we noticed no adverse effects of the air puff. We monitored birds' behaviour through a webcam placed in the chamber. Across all birds, significant differences between the probabilities of escape associated with the Puff and NoPuff classes appeared after about 1 week. We used the z test of independent proportions (see Performance Measures and Criterion) to test for significance of this difference in escape probability. Once differences were significant on 2 consecutive days, we switched the birds to the training phase and maintained a puff duration of 1 s (strong).

Figure 1. Song syllable discrimination task solved by young, old, experienced and inexperienced zebra finches. (a) Birds triggered the playback of auditory stimuli by perching for 3.5 s. Puff class stimuli (here long) were followed by a strong air puff after a delay of 1 s. Birds learned to escape from the perch before the arrival of the puff and to remain on the perch following stimuli from the NoPuff class. (b) Log-power spectrograms of the shortest stimulus (S1) and the longest (S10) in the training set. Each stimulus was composed of six successively longer renditions of a particular song syllable (top, highlighted in yellow). Some acoustic features were significantly correlated (*P < 0.01; Pearson correlation) with duration (last column in the list of features). (c) Stimulus durations in the four sets: training (TRAIN, N = 34 birds), novel (NOV, N = 12 birds), moved boundary (MOV, N = 7 birds) and random (RAND, N = 4 birds). Mean and SD for each stimulus (across the six syllable renditions) are shown. Horizontal dashed lines show the class boundaries (Puff versus NoPuff) for the TRAIN, NOV and MOV sets. The vertical dashed line separates the two classes by stimulus ID. (d) Probability of escape (Pesc) as a function of stimulus ID and time (blocks of 100 trials) for one J+ bird. The block at which the criterion is reached ('criterion block') is marked by the black arrow. The graph shows the average Pesc(i) vector of the last three blocks including the criterion block. Our performance measure is dPesc, the difference between the average Puff (red) and NoPuff (blue) Pesc(i) vectors. (e) The average Pesc vector for each of four groups at training completion. (f) dPesc at training completion (either at learning criterion or after 5000 trials). Bars represent the mean and error bars represent SD within each group. Each point is an individual bird's dPesc at training completion. In (e, f): A+: adults with experience; A-: adults without experience; J+: juveniles with experience; J-: juveniles without experience.


Training phase

The training phase was identical to the pretraining phase except that we presented all TRAIN stimuli in a pseudorandom order, with the probability of an air puff stimulus held constant at 0.25.

Completion of the training phase was indicated when birds' performance reached a target criterion. We chose the performance criterion to reflect both the discriminatory ability and the stability of behaviour (see Performance Measures and Criterion).

In several birds, we observed that the performance measure fluctuated from day to day. We trained these birds for a further 2–3 days after they reached the performance criterion, to obtain stable estimates of discriminatory ability. Following the training phase, birds were partitioned into groups, each performing a particular subset of the experiments.

Experimental transitions

Changes of stimulus sets were immediate. Stimulus sets were ordered based on the following rules. (1) NOV and PERM: if a bird was assigned to NOV and PERM groups, the order of NOV and PERM set presentation was chosen at random, because we wanted to average out an effect of stimulus set order (among NOV and PERM) across birds (note that in the first few birds tested with the PERM set, no reduction in performance was observed). (2) NOV/PERM and MOV: if a bird was tested on MOV as well as on NOV/PERM, the MOV set was always introduced after NOV and/or PERM because the class boundary in the MOV set was shifted, which could affect subsequent discrimination performance. Thus, MOV was never followed by NOV, because the bird would have to readjust its decision boundary twice, first after the transition TRAIN -> MOV, and then again after the transition MOV -> NOV. In our view, this readjustment would impede test performance on NOV and so would not truly test the bird's generalization ability based on TRAIN. (3) RAND group birds were not tested on any other stimulus set orders apart from TRAIN followed by RAND.

Performance Measures and Criterion

For each bird and all experimental phases, we partitioned the trials into nonoverlapping blocks of 100 trials. We chose this number to obtain sufficient trial statistics for performing z tests of independent proportions. In each 100-trial block, we computed the true positive (escape) rate (PT) as the probability of escaping on Puff trials and the false positive (escape) rate (PF) as the probability of escape during NoPuff trials. Our single measure of performance in each block was the difference in escape probabilities dPesc = PT - PF (Fig. 1d).

Within a block, to decide whether a bird escaped significantly more often on Puff trials than on NoPuff trials, we performed a z test of independent proportions with null hypothesis H0: PT = PF and alternative hypothesis Ha: PT ≠ PF. To obtain z scores for the test, we applied Yates's continuity correction. Thus, for each block we computed the z statistic according to:

\[
z_{\mathrm{stat}} = \frac{\lvert P_T - P_F \rvert \;-\; \dfrac{n_T + n_F}{2\,n_T n_F}}{\varepsilon},
\qquad
\varepsilon = \sqrt{p\,q\,\frac{n_T + n_F}{n_T n_F}},
\qquad
p = \frac{n_T P_T + n_F P_F}{n_T + n_F}, \quad q = 1 - p
\]

where nT is the number of Puff trials and nF is the number of NoPuff trials in each block. The P value Pr[z > zstat] was computed with the normcdf function in MATLAB (Mathworks Inc., Natick, MA, U.S.A.); a block was statistically significant if the P value in that block was smaller than 0.01.
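For concreteness, a minimal MATLAB sketch of this per-block test is given below; the function and variable names are illustrative assumptions rather than the authors' implementation.

```matlab
% Hypothetical sketch of the per-block z test of independent proportions with
% Yates's continuity correction, following the equations above.
function [dPesc, pval] = block_performance(escPuff, escNoPuff)
% escPuff, escNoPuff: escape (1) / stay (0) outcomes of one 100-trial block
nT = numel(escPuff);                          % number of Puff trials
nF = numel(escNoPuff);                        % number of NoPuff trials
PT = mean(escPuff);                           % true positive (escape) rate
PF = mean(escNoPuff);                         % false positive (escape) rate
dPesc = PT - PF;                              % performance measure

p     = (nT*PT + nF*PF) / (nT + nF);          % pooled escape probability
q     = 1 - p;
eps0  = sqrt(p*q*(nT + nF)/(nT*nF));          % standard error of the difference
zstat = (abs(PT - PF) - (nT + nF)/(2*nT*nF)) / eps0;   % Yates-corrected z statistic
pval  = 1 - normcdf(zstat);                   % Pr[z > zstat], as in the text
end
```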

Statistical Tests

Performance comparisons between groups were performed with either parametric (t test and ANOVA F test) or nonparametric (Wilcoxon rank sum and Kruskal–Wallis) tests. Before performing a t test or Wilcoxon rank sum test, we did a Shapiro test of normality. A t test was performed when the null hypothesis of normal distribution was not rejected at the 95% (P < 0.05) confidence level. All tests were two tailed and paired-sample tests are mentioned explicitly. Note that, when comparing performance between NOV and TRAIN groups, only a small subset of the birds performed both sets, which prevented us from using a paired-sample test. Also, one of the TRAIN birds was part of another experiment (Narula et al., 2018) where it acted as an observer bird. As a result of having been an observer, it performed the TRAIN set discrimination faster than an average bird. For this reason, we did not include its TRAIN performance in the statistical comparison with NOV; hence N = 11 for TRAIN.

Task Completion, Criterion and Trials to Criterion

If a bird did not improve its performance dPesc for at least 5000 trials, based on visual inspection of daily vectors and on statistical testing (using z tests as described above), we stopped its training. We measured dPesc at training completion to be either the dPesc value after these 5000 trials or, when the bird's performance improved gradually, the value after reaching criterion.

We established our performance criterion to include two key features: a statistically significant difference between PT and PF, and stability of this behaviour over several hundred trials. That is, we computed the fraction of 100-trial blocks that were significant in a sliding window of eight consecutive blocks. When this fraction crossed 87.5% (i.e. 7/8 blocks), we took the last block in the window as the block at which the performance criterion was reached ('criterion block'). 'Trials to criterion' is then simply the number of all trials performed by the bird up to and including the criterion block.
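The criterion rule can be sketched as follows in MATLAB; the function name and the return convention (NaN when the criterion is never met) are our assumptions.

```matlab
% Hypothetical sketch of the criterion-block rule: the criterion is reached at
% the last block of the first sliding window of 8 blocks in which at least
% 7 of 8 (87.5%) block-wise z tests were significant (P < 0.01).
function critBlock = find_criterion_block(pvals)
% pvals: vector of per-block P values from the z test described above
isSig = pvals < 0.01;
critBlock = NaN;                              % NaN if the criterion is never reached
for b = 8:numel(isSig)
    if mean(isSig(b-7:b)) >= 7/8              % fraction of significant blocks in window
        critBlock = b;
        return
    end
end
end
```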

METHODS AND RESULTS

In the duration discrimination task, several birds did not reach our learning criterion within 5000 trials, which resulted in training being aborted. To test whether this happened more often in some bird groups than in others, we performed two Fisher exact tests of independence between each factor (experience or age) and training completion (completed or aborted). First, we found that a larger proportion of birds with auditory experience (0.71) completed training than birds without (0.35), and the difference in proportions showed a trend towards significance (see Table 1).

Table 1
Number of birds with and without auditory experience and their training outcomes

              Training completed   Training aborted   Total
Experienced   12                   5                  17
Isolated      6                    11                 17
Total         18                   16                 34

Fisher exact test of an association between auditory experience and training completion: odds ratio = 4.19, N = 34, P = 0.084.


Second, we found that a similar fraction of juveniles (0.44) and adults (0.61) completed training (Table 2).
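The two tests can be reproduced from the counts in Tables 1 and 2, for example with MATLAB's fishertest (Statistics and Machine Learning Toolbox); note that fishertest reports the unconditional sample odds ratio, whereas the value in Table 1 appears to be a conditional estimate, so the odds ratios need not match exactly.

```matlab
% Hypothetical sketch of the Fisher exact tests on Tables 1 and 2
% (rows: factor level; columns: training completed, training aborted).
experienceCounts = [12 5; 6 11];      % experienced vs isolated (Table 1)
ageCounts        = [11 7; 7 9];       % adult vs juvenile (Table 2)
[~, pExp, statsExp] = fishertest(experienceCounts);   % two-sided by default
[~, pAge, statsAge] = fishertest(ageCounts);
fprintf('experience: P = %.3f, sample odds ratio = %.2f\n', pExp, statsExp.OddsRatio);
fprintf('age:        P = %.3f, sample odds ratio = %.2f\n', pAge, statsAge.OddsRatio);
```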

We compared the discriminative performance (dPesc) of birds at the end of training (training completion), either when birds reached the training criterion or after 5000 trials when training was aborted due to poor performance. The average Pesc curve for each group at training completion is depicted in Fig. 1e. On the group level, there was no significant difference between sample distributions of dPesc for birds that completed training (Fig. 1f, Table 3). We then performed pairwise comparisons between groups. Juveniles performed marginally better than adults (mean ± SD: juvenile dPesc = 0.32 ± 0.13, N = 16; adult dPesc = 0.3 ± 0.2, N = 18), but the difference was not statistically significant (two-sample t test: t29.6 = -0.46, N1 = 16, N2 = 18, P = 0.65; Fig. 2a). Birds with auditory experience (dPesc = 0.37 ± 0.12, N = 17) fared significantly better than those without (dPesc = 0.25 ± 0.18, N = 17; two-sample t test: t19.5 = 2.2, N1 = 17, N2 = 17, P = 0.04; Fig. 2a).

We also computed the number of trials birds needed to reach the performance criterion but found no effect of either age or experience on learning speed (Table 3).

To quantify the explanatory effects of age and auditory experience on discrimination performance, we fitted a general linear model to dPesc at criterion, using age and auditory experience as factors. We observed a main effect of auditory experience (Table 4).
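A sketch of such a fit with MATLAB's fitlm is shown below; the bird-level data here are synthetic placeholders (loosely shaped by the coefficients later reported in Table 4), not the measured values.

```matlab
% Hypothetical sketch of the general linear model: dPesc at training completion
% as a function of age and auditory experience (toy data, illustrative only).
rng(1);
Age        = [ones(18,1); zeros(16,1)];         % 1 = adult, 0 = juvenile
Experience = repmat([1; 0], 17, 1);             % 1 = with song experience
dPesc      = 0.22 + 0.12*Experience + 0.06*Age + 0.1*randn(34,1);  % synthetic responses
mdl = fitlm(table(dPesc, Age, Experience), 'dPesc ~ Age + Experience');
disp(mdl.Coefficients)                          % Estimate, SE, tStat, pValue per term
```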

To extend our search for possible effects of either age or experience on categorization performance, we also inspected the performance of birds at the onset of training on the 10 stimuli, right after pretraining when they had learned to categorize the two stimuli of extreme durations (see Methods, Training Procedure, Pretraining phase). We computed the average dPesc over the first 800 trials (eight blocks) of TRAIN for all four groups (A+ = 0.18 ± 0.13, N = 9; A- = 0.12 ± 0.13, N = 9; J+ = 0.11 ± 0.1, N = 8; J- = 0.08 ± 0.1, N = 8) and found no group level differences in median dPesc (Kruskal–Wallis test: χ²₃ = 1.6, P = 0.66, N = 34). At training onset, we also found no statistical difference between young and old birds (juvenile dPesc = 0.1 ± 0.1, N = 16; adult dPesc = 0.15 ± 0.2, N = 18; Wilcoxon rank sum test: W = 120, N1 = 16, N2 = 18, P = 0.42) and between birds with (+) and without (-) experience (dPesc+ = 0.15 ± 0.12, N = 17; dPesc- = 0.1 ± 0.1, N = 17; Wilcoxon rank sum test: W = 169, N1 = 17, N2 = 17, P = 0.41). Thus, none of the bird groups stood out in its ability to discriminate the stimuli based on familiarity with just extreme versions.

Following this negative result, we set out to characterize the categorical percept formed by the birds and to probe its flexibility. First, an open question was whether birds discriminated the string of six syllables in any training set stimulus as a composite pattern, or whether they extracted class-predictive information from each of the six syllables and integrated (accumulated) this information before making a decision. When we tested birds on the PERM set of stimuli (see Methods, Stimuli), we found that they integrated information from each syllable and did not use the composite pattern when making a decision (Appendix 2).

Our stimulus set consisted of actual syllable renditions, which reflected the natural within-syllable distribution of acoustic features. Apart from duration, birds could have used other features for discrimination. We used the acoustic analysis software Sound Analysis Pro (soundanalysispro.com) to compute song features such as pitch goodness, mean frequency modulation, variance of amplitude modulation and Wiener entropy (see Appendix 3, Table A1 for the full list of acoustic features). Indeed, we found that features such as mean frequency modulation and variance of amplitude modulation across a syllable's time course were significantly correlated with duration (Pearson correlation coefficient: mean frequency modulation versus duration: r = -0.37, N = 60, P = 0.004; variance of amplitude modulation versus duration: r = -0.73, N = 60, P = 3.8 × 10⁻¹⁰; Fig. 1b, right). In previous work (Narula et al., 2018), we fitted a logistic regression classifier using these exact stimuli and stimulus features and showed how an unregularized classifier can develop strong dependence on the aforementioned correlated features, which results in very high classification performance on the training set but poor generalization to novel stimuli, an effect termed 'overfitting' in the statistical learning literature (Geman, Bienenstock, & Doursat, 1992). Similarly, we hypothesized that if birds highly weighted such non-duration features, they might also show a tendency to overfit, which would be contrary to the idea of a categorical percept that generalizes well to other instances of the same category. If the overfitting hypothesis were correct, we should observe a significant drop in performance when birds were made to generalize their percept. We tested a subset of 12 birds (5 A+, 3 J+, 2 A- and 2 J-), after they reached the criterion on TRAIN, on new renditions of the same song syllable (novel set, NOV), with durations closely matching those in TRAIN (Fig. 1c). We observed large dPesc values after switching from TRAIN to NOV (Fig. 2b, red squares), revealing a performance comparable to that with the training set, which implies good generalization. The average dPesc in the first 300 trials of NOV was on a par with dPesc at criterion on TRAIN (average dPesc in first three blocks of NOV = 0.37 ± 0.14, N = 12; average dPesc in last three blocks of TRAIN = 0.42 ± 0.11, N = 11; two-sample t test: t20.5 = 1.1, N1 = 12, N2 = 11, P = 0.28; Fig. 2e). Birds achieved a dPesc after reaching criterion on the NOV set of 0.44 ± 0.15, which was significantly more accurate than performance on the training set (two-sample t test on dPesc at criterion NOV versus dPesc at criterion TRAIN: t20.7 = -2.49, N1 = 12, N2 = 11, P = 0.021). Not surprisingly, birds also reached the criterion on NOV significantly faster (trials to criterion NOV: 1.8 ± 1.3 × 10³, N = 12) than on TRAIN (5.9 ± 6.3 × 10³ trials, N = 11; Wilcoxon rank sum test: W = 8, N1 = 12, N2 = 11, P = 0.02; Fig. 2h). These findings indicate that birds tended to focus their attention on duration as a discriminating feature and less on correlated features.

Table 2
Number of juveniles and adults and their training outcomes

           Training completed   Training aborted   Total
Adult      11                   7                  18
Juvenile   7                    9                  16
Total      18                   16                 34

Fisher exact test of an association between age and training completion: odds ratio = 1.98, N = 34, P = 0.49.

Table 3
Descriptive statistics and group level Kruskal–Wallis rank sum (KW) test for discrimination performance at training completion (dPesc) and the number of trials needed to reach the learning criterion

Experimental group   dPesc at task completion (mean ± SD; N reached criterion)   Trials to criterion (mean ± SD) × 10³
A+ (N = 9)           0.37 ± 0.11 (N = 7/9)                                       5.7 ± 7.1
A- (N = 9)           0.23 ± 0.24 (N = 4/9)                                       7.2 ± 4.4
J+ (N = 8)           0.37 ± 0.14 (N = 5/8)                                       6 ± 7.2
J- (N = 8)           0.26 ± 0.1 (N = 2/8)                                        2.4 ± 5.6

Overall, 18/34 birds reached the criterion. KW test on dPesc at training completion: χ²₃ = 4.5, N = 34, P = 0.211. KW test on trials to criterion: χ²₃ = 1.13, N = 18, P = 0.769.
A+: adults with experience; A-: adults without experience; J+: juveniles with experience; J-: juveniles without experience.



Fast and accurate generalization to novel stimuli suggests that birds learned the category underlying the stimuli, in some sense the 'category' defined by duration. Categorical perception in human speech postulates that vocalizations belonging to the same category (e.g. a syllable such as 'b' in English) cannot be discriminated from each other, but are easily discriminated against syllables of another, phonetically similar category (e.g. 'p' as a comparison to 'b' in English; Eilers, Gavin, & Wilson, 1979). Essentially, perceptual categories are indivisible and clearly separated from other categories. We explored whether the arbitrary duration categories learned by birds in the training phase possessed such an indivisible nature. We tested this feature of categorical perception with the MOV set of stimuli. If birds had learned an indivisible categorical percept such as phonemes in human speech, we expected to see a significant drop in performance when tested on the MOV set.

We tested seven birds on the MOV set after they reached the criterion on TRAIN (see Methods, Experimental transitions for details on the transition to MOV). We observed behaviour similar to that on NOV: dPesc trajectories showed no decline after stimuli were switched to MOV (Fig. 2c). In the first 300 trials of MOV, the birds were as accurate as in the last 300 trials of TRAIN (average dPesc MOV set = 0.29 ± 0.18, N = 7; average dPesc TRAIN = 0.36 ± 0.08, N = 6; Wilcoxon rank sum test: W = 29, N1 = 6, N2 = 7, P = 0.62; Fig. 2f). Good early performance on the MOV set shows that the birds required very few trials to adapt their learned duration boundaries. Thus, the seemingly learned categorical percept in birds was not indivisible, because the boundaries were flexible. On the MOV set, birds achieved a dPesc at criterion of 0.39 ± 0.15 within 2.26 ± 1.78 × 10³ trials, which was faster than the time it took them on TRAIN (TRAIN trials to criterion = 5.9 ± 6.3 × 10³, N = 6; Wilcoxon rank sum test: W = 20.5, N1 = 6, N2 = 7, P = 0.11; Fig. 2i). That birds learned to rapidly categorize MOV stimuli was indeed a feat, not to be expected from naïve and inflexible responses to the moved boundary. To simulate naïve responses, for each bird in the MOV set we constructed a synthetic performance vector Pesc^s of length 10 (one component for each stimulus S1–S10) by removing from the measured Pesc on TRAIN the three components Pesc(i) (i = 1, 2, 3), shifting the Pesc curve towards the shorter stimuli, Pesc^s(i) = Pesc(i+3) (i = 1, …, 7), and by adding three naïve components Pesc^s(i) = 0.5 (i = 8, 9, 10). This construction captures the hypothesis that birds maintain the responses to the known stimuli and are ignorant of the three new stimuli. Thus, simulated inflexible birds would get four stimuli right (MOV stimulus IDs S1–2 and S6–7) and three wrong (MOV stimulus IDs S3–5), and they would be undecided on the three remaining stimuli (MOV stimulus IDs S8–10). On these synthetic birds with their hypothetical Pesc^s vector, our test very sensitively detected a drop in performance upon switching from TRAIN to MOV (Appendix 4, Fig. A2). Thus, the lack of disrupted performance of our tested birds on the MOV set emphasizes how quickly birds can adapt their decision boundary.
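This construction amounts to a one-line shift in MATLAB; the toy Pesc vector below is illustrative only.

```matlab
% Hypothetical sketch of the synthetic 'inflexible bird' vector Pesc^s: drop the
% responses to the three shortest TRAIN stimuli, shift the rest towards shorter
% stimulus IDs, and pad with naive (0.5) responses for the three new stimuli.
Pesc  = [0.1 0.1 0.2 0.3 0.4 0.6 0.7 0.8 0.8 0.9];   % toy Pesc on TRAIN (S1..S10)
PescS = [Pesc(4:10), 0.5*ones(1,3)];                 % Pesc^s on MOV (S1..S10)
```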

Our results indicate that even though auditory experience may be slightly advantageous for accurate discrimination, birds were generally successful at (1) learning an arbitrary vocal distinction (duration of a song syllable), (2) generalizing the learned distinction and (3) flexibly relearning decision boundaries. The good generalization to new stimulus sets argues for the existence of a perceptual system that can flexibly learn stimulus categories, even when these are subject to relatively fast change, unlike the rigid phonemes in human speech.

We then asked whether this perceptual acuity and flexibility in songbirds could allow them to discriminate a random shuffle among all syllable renditions in the training set, such that each class contained both short and long syllables (Fig. 1c, green curve). The aim of this experiment was to test whether birds can treat each stimulus–class relationship independently when there is no feature-based definition of a category. We hypothesized that if birds adapted well (and quickly), it would indicate that they have a flexible representation of the stimuli and did not actually learn any categorical information. A significant reduction in performance would indicate that true feature-based categories were learned. We first trained four birds on TRAIN and then switched to the randomly shuffled (RAND) set. Performance at criterion on RAND was comparable to the performance on TRAIN (dPesc RAND = 0.22 ± 0.13, N = 4; dPesc TRAIN = 0.37 ± 0.21, N = 4; statistical test not performed due to small sample size; Fig. 2g). The criterion was achieved after an extensive learning period comparable to that of TRAIN (trials to criterion RAND = 4.32 ± 3.15 × 10³; trials to criterion TRAIN = 6.95 ± 7.9 × 10³; Fig. 2j), indicating no rapid transfer of discriminative competence to the RAND set. Most importantly, the average dPesc in the first three blocks after the switch to the RAND set was similar to the average dPesc in the first three blocks of training (RAND = 0.087 ± 0.11, N = 4; TRAIN = 0.037 ± 0.15, N = 4). The longer time taken to reach criterion on the RAND set in comparison to the MOV and NOV sets reinforces the view that, to some extent, birds learned duration categories and did not memorize stimuli in terms of acoustic features that were spuriously correlated with duration (otherwise learning of RAND might have been as fast as for the MOV and NOV sets).

Because only male zebra finches can produce the chosen stimuli (song syllables), we also examined whether the sex of the birds had an influence on discrimination, but did not find any statistically significant difference in performance between males and females, whether experienced or isolated, on the TRAIN set (Appendix 5, Fig. A3).

DISCUSSION

We have shown that adult and juvenile zebra finches with or without experience of learned vocalizations can readily learn to discriminate sequences of a single song syllable type. We used duration as an arbitrary definition of 'category' and found a small but significant main effect of auditory experience on discrimination performance, which is in line with results obtained in songbirds on pure-tone pitch discrimination (Njegovan & Weisman, 1997; Sturdy et al., 2001). Sturdy et al. (2001), for example, found that zebra finches reared in isolation showed deficits in both absolute and relative pitch discrimination compared to zebra finches raised with siblings and adults of both sexes.

We observed high variability in performance, with some A- birds (adult song isolated, without experience) learning just as well as the best J+ birds (juvenile with experience). This good learning in isolated birds agrees with a study by Phillmore, Sturdy, and Weisman (2003), who trained black-capped chickadees, Poecile atricapillus, reared either in the wild or in isolation, to discriminate vocal distance cues from chickadee song and zebra finch calls. They found that both chickadees with and without experience discriminated accurately.

Unlike in discrimination studies using synthetic pure-tone stimuli presented to black-capped chickadees (Njegovan & Weisman, 1997), budgerigars, Melopsittacus undulatus, and zebra finches (Dooling, Best, & Brown, 1995), we chose natural stimuli that were generated from zebra finch song syllables without significant postprocessing. Our stimuli allowed the learning of discriminants along acoustic features other than duration.

Arguably, such natural stimuli make the task easier because they are more ethologically relevant.


Figure 2. Discrimination performance on training, novel, moved boundary and randomly shuffled stimulus sets. (a) Performance (dPesc at training completion) of adults and juveniles, with and without experience, on the training set (TRAIN) (two-sample t tests). Bar height represents mean and error bars represent SD. (b) Performance (dPesc) on the novel stimulus set (NOV) composed of different renditions of the same syllable (N = 12). Red squares indicate the switch from the training set to the novel set. Larger squares indicate performance at criterion blocks. For comparison, performance on the training set is shown by black dots and squares (N = 11). (c) Performance (dPesc) on the moved boundary set (MOV) of stimuli (N = 7); symbols as in (b). (d) Performance (dPesc) on the randomly shuffled (RAND) set; symbols as in (b). (e) Average dPesc in the first three blocks of NOV compared with dPesc at criterion on TRAIN (two-sample t test). (f) Average dPesc in the first three blocks of MOV compared with dPesc at criterion on TRAIN (Wilcoxon rank sum test). (g) Average dPesc in the first three blocks of RAND compared with dPesc (first three) on TRAIN. (h-j) Number of trials needed to reach criterion on (h) the NOV and TRAIN sets, (i) the MOV and TRAIN sets and (j) the RAND and TRAIN sets. (h-j) Wilcoxon rank sum test. (e-j) Bar height represents mean, error bars represent SE and points represent individual birds. *P < 0.05.


Our chosen natural stimulus distributions were likely to match the perceptual bias inherited from the genome. It is well known that there is a strong genetic component to song learning in species such as zebra finches (Feher, Wang, Saar, Mitra, & Tchernichovski, 2009; Mets & Brainard, 2019), white-crowned sparrows, Zonotrichia leucophrys nutalli (Marler & Tamura, 1964; Nelson & Marler, 1993), and chaffinches, Fringilla coelebs (Thorpe, 1958), demonstrating a propensity to process own-species vocalizations in songbirds.

One particular reason why songbirds' auditory system is genetically primed to recognize, filter and categorize species-typical vocalizations is that auditory feedback plays an important role in successful song learning in males (Brainard & Doupe, 2000; Konishi, 1965). Since female zebra finches do not sing and, therefore, auditory feedback plays a lesser role in self-evaluation, we might expect isolated females not to learn as well. However, we did not find such a trend but instead found roughly equally good performance in isolated males and females. We speculate that one evolutionary reason why isolated females would perform equally well is that females need to listen and pay attention to male songs to choose a suitable mate (Byers & Kroodsma, 2009).

Birds were able to accurately memorize stimulus–response relationships that were generated by a random assignment of stimuli to classes, showing that they can categorize individual vocalizations in an experimentally induced, arbitrary manner. This ability suggests that birds may be capable of recognizing not only individuals in the wild (Falls, 1982; Gentner, 2004; Gentner & Hulse, 1998) but even subsets of renditions of an individual's songs.

Zebra finches are known to find it more difficult to classify song syllables into pseudocategories than into natural categories (Sturdy et al., 1999, 2000). Given that we found relatively slow learning of randomly classified song syllables (TRAIN -> RAND) compared to fast and accurate generalization of learning from duration categories (TRAIN -> NOV), this suggests that our duration-based stimulus classes were perceived as natural categories by our birds.

The good learning performance in some A- birds may be explained by theoretical considerations of the neural underpinnings of perceptual decision making. It has been suggested that the transformation of stimulus representations between primary sensory areas such as the nucleus ovoidalis and secondary sensory areas such as the caudomedial and caudolateral mesopallium (Jeanne, Thompson, Sharpee, & Gentner, 2011) can act as a random projection of stimuli onto many dimensions (Babadi & Sompolinsky, 2014). The synaptic transformations between these areas would be naïve and random in birds without exposure to song. Provided the neural activity patterns representing the stimuli are sparse enough, these projections can increase the signal-to-noise ratio for effective stimulus discrimination. In other words, birds without experience may be good discriminators when their sensory cortices recruit sufficient neurons with sufficiently low activity.

We showed that songbirds were able to generalize their knowledge to novel instances of the same stimulus. Similar ease of generalization has been shown in adult zebra finches (Geberzahn & Deregnaucourt, 2020; Narula et al., 2018) and swamp sparrows, which can discriminate between dialects of neighbouring species (Nelson & Marler, 1962). Our work suggests that the generalization ability is not limited to a stringent critical period but instead continues into adulthood, regardless of experience. Learning in our task was flexible because birds could rapidly relearn already acquired decision boundaries, demonstrating their capability to adapt to new environmental contingencies. For example, in a territorial defence context, songbirds may use a single acoustic feature to categorize song/call types from a large set of individuals belonging to a neighbouring flock. In a mate recognition or courtship context, a male songbird might use acoustic features from the distance calls of a specific female to define one category and features of calls from other females nearby as multiple (or a single 'other') categories. This would be a scenario similar to discriminating the RAND set in our experiment. Overall, these experiments suggest that the auditory system of songbirds can flexibly assume a learned categorical state, supported by results from Jeanne et al. (2011), who showed that neurons in the mesopallium of the European starling, Sturnus vulgaris, respond to song features that encode learned categorical representations of conspecific song.

We used operant conditioning to train birds to discriminate the stimuli, which contrasts with other categorical perception experiments based on the detection of aggressive behaviour (Nelson & Marler, 1962), which have been carried out on swamp sparrows but not zebra finches. Our paradigm also contrasts with the head-turn paradigm used to elicit responses to vocalizations from human infants (Werker & Tees, 1984). Both these latter paradigms make no explicit use of reinforcement, since they are dishabituation experiments in which reinforcing feedback is either absent or not contingent. In principle, these differences in paradigms could constitute a confounding factor, suggesting that categorical learning is facilitated in the presence of reinforcers, in line with Narula et al. (2018). Indeed, our experiments bear a resemblance to experiments aimed at training adult Japanese speakers in /r/ versus /l/ phoneme discrimination (McClelland, Fiez, & McCandliss, 2002; Vallabha & McClelland, 2007). The /r/–/l/ phoneme distinction has no functional significance in the Japanese language and is difficult for Japanese adults to disambiguate. However, training with feedback improves discrimination performance remarkably within 1500–3000 trials, which corresponds to roughly half the average number of trials to criterion in our birds. Thus, our choice to use reinforcers could have been the key factor that explains the excellent categorization performance in all bird groups and that sets our experiment apart from the pure perceptual experiments used to probe categorical perception in infants.

Table 4
General linear model fit of age and auditory experience to performance (dPesc) at training completion

Coefficients   Estimate   SE      t       Pr(>|t|)
Intercept      0.22       0.047   5.4     6.79e-6***
Age            0.06       0.056   1.06    0.29
Experience     0.12       0.052   2.34    0.025*

*P < 0.05; ***P < 0.001.

The statistical learning hypothesis (Kuhl, 2004; McClelland, 2014) describes the shaping of auditory perception through repetitive exposure to the distribution of naturally occurring categories present in conspecific vocalizations. Our results, in particular the main effect of auditory experience and not age on discrimination performance, suggest that this hypothesis may hold true provided the acoustic categories are behaviourally relevant, in a similar way to a word's meaning being independent of dialect, speaker identity and gender. However, the statistical learning hypothesis may be irrelevant for discrimination when the signals exchanged between senders and receivers are deliberately rich, such as a song syllable without allocated meaning within the context of surrounding vocalizations. In this sense, it would be interesting to probe whether differences arising from age and experience are amplified when birds are trained to discriminate within an acoustic category that has behavioural relevance, for example food-begging calls produced by juvenile zebra finches and directed at their parents. Traditionally, calls and songs have been considered functionally distinct vocalizations, as songs are learned and calls are not (Catchpole & Slater, 2008; Elie & Theunissen, 2015; Zann, 1996). However, in juvenile zebra finches, the distinction between calls and songs is less clear. Lipkind et al. (2017) showed that juvenile zebra finches learning to imitate a model song often modify calls before inserting them into the song motif. This suggests that a future avenue of research would be a closer examination of the perceptual implications of discriminating call instances within a category such as the begging call.

Acknowledgments

We thank Aleksander Jovalekic and Heiko Hoerster for help with experiment planning and execution, and Joerg Rychen and Joshua Herbst for their support in data acquisition software development.

This work was supported by the Swiss National Foundation (grants 31003A_127024 and 31003A_156976) and the European Research Council under the European Community's Seventh Framework Programme (FP7/2007–2013/ERC Grant AdG 268911) to R.H.R.

References

Abramson, A. S., & Lisker, L. (1973). Voicing-time perception in Spanish word-initial stops. Journal of Phonetics, 1, 1–8.
Babadi, B., & Sompolinsky, H. (2014). Sparseness and expansion in sensory representations. Neuron, 83(5), 1213–1226. https://doi.org/10.1016/j.neuron.2014.07.035
Barrington, D. (1773). Experiments and observations on the singing of birds. Philosophical Transactions, 63, 249–291.
Braaten, R. F., Petzoldt, M., & Colbath, A. (2006). Song perception during the sensitive period of song learning in zebra finches (Taeniopygia guttata). Journal of Comparative Psychology, 120(2), 79–88. https://doi.org/10.1037/0735-7036.120.2.79
Brainard, M. S., & Doupe, A. J. (2000). Auditory feedback in vocal behavior. Nature Reviews Neuroscience, 1(October), 1–10.
Byers, B. E., & Kroodsma, D. E. (2009). Female mate choice and songbird song repertoires. Animal Behaviour, 77(1), 13–22. https://doi.org/10.1016/j.anbehav.2008.10.003
Canopoli, A., Herbst, J., & Hahnloser, R. H. R. (2014). A higher sensory brain region is involved in reversing reinforcement-induced vocal changes in a songbird. Journal of Neuroscience, 34(20), 7018–7026.
Catchpole, C. K., & Slater, P. (2008). Bird song: Biological themes and variations (2nd ed.). Cambridge, U.K.: Cambridge University Press. https://doi.org/10.1017/CBO9780511754791
Diehl, R. L., Lotto, A. J., & Holt, L. L. (2004). Speech perception. Annual Review of Psychology, 55(1), 149–179. https://doi.org/10.1146/annurev.psych.55.090902.142028
Dooling, R. J., Best, C. T., & Brown, S. D. (1995). Discrimination of synthetic full-formant and sinewave /ra–la/ continua by budgerigars (Melopsittacus undulatus) and zebra finches (Taeniopygia guttata). Journal of the Acoustical Society of America, 97(3), 1839–1846. https://doi.org/10.1121/1.412058
Doupe, A. J., & Kuhl, P. K. (1999). Birdsong and human speech: Common themes and mechanisms. Annual Review of Neuroscience, 22, 567–631. https://doi.org/10.1146/annurev.neuro.22.1.567
Eilers, R. E., Gavin, W., & Wilson, W. R. (1979). Linguistic experience and phonemic perception in infancy: A crosslinguistic study. Child Development, 50(1), 14–18. http://www.ncbi.nlm.nih.gov/pubmed/446199
Eimas, P. D. (1975). Auditory and phonetic coding of the cues for speech: Discrimination of the [r-l] distinction by young infants. Perception & Psychophysics, 18(5), 341–347. https://doi.org/10.3758/BF03211210
Eimas, P. D., Siqueland, E. R., Jusczyk, P., & Vigorito, J. (1971). Speech perception in infants. Science, 171(3968), 303–306. https://doi.org/10.1037/e509922011-001
Elie, J. E., & Theunissen, F. E. (2015). Meaning in the avian auditory cortex: Neural representation of communication calls. European Journal of Neuroscience, 41(5), 546–567. https://doi.org/10.1111/ejn.12812
Falls, J. B. (1982). Individual recognition by sounds in birds. In D. E. Kroodsma, & E. H. Miller (Eds.), Acoustic communication in birds (Vol. 2, pp. 237–278). New York, NY: Academic Press.
Feher, O., Wang, H., Saar, S., Mitra, P. P., & Tchernichovski, O. (2009). De novo establishment of wild-type song culture in the zebra finch. Nature, 459(7246), 564–568. https://doi.org/10.1038/nature07994
Geberzahn, N., & Deregnaucourt, S. (2020). Individual vocal recognition in zebra finches relies on song syllable structure rather than song syllable order. Journal of Experimental Biology, 223(9), Article jeb220087. https://doi.org/10.1242/jeb.220087
Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1–58. https://doi.org/10.1162/neco.1992.4.1.1
Gentner, T. Q. (2004). Neural systems for individual song recognition in adult birds. Annals of the New York Academy of Sciences, 1016, 282–302. https://doi.org/10.1196/annals.1298.008
Gentner, T. Q., & Hulse, S. H. (1998). Perceptual mechanisms for individual vocal recognition in European starlings, Sturnus vulgaris. Animal Behaviour, 56(3), 579–594. https://doi.org/10.1006/anbe.1998.0810
Gordon, J. A., & Stryker, M. P. (1996). Experience-dependent plasticity of binocular responses in the primary visual cortex of the mouse. Journal of Neuroscience, 16(10), 3274–3286. https://doi.org/10.1523/jneurosci.16-10-03274.1996
Hensch, T. K. (2005). Critical period plasticity in local cortical circuits. Nature Reviews Neuroscience, 6(11), 877–888. https://doi.org/10.1038/nrn1787
Jeanne, J. M., Thompson, J. V., Sharpee, T. O., & Gentner, T. Q. (2011). Emergence of learned categorical representations within an auditory forebrain circuit. Journal of Neuroscience, 31(7), 2595–2606. https://doi.org/10.1523/JNEUROSCI.3930-10.2011
Konishi, M. (1965). The role of auditory feedback in the control of vocalization in the white-crowned sparrow. Zeitschrift für Tierpsychologie, 22, 770–783. https://doi.org/10.1111/j.1439-0310.1965.tb01688.x
Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5(11), 831–843. https://doi.org/10.1038/nrn1533
Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255(5044), 606–608. https://doi.org/10.1126/science.1736364
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74(6), 431–461. https://doi.org/10.1037/h0020279
Liberman, A. M., & Mattingly, I. (1985). The motor theory of speech perception revised. Cognition, 21(1), 1–36. http://www.ncbi.nlm.nih.gov/pubmed/4075760
Lipkind, D., Zai, A. T., Hanuschkin, A., Marcus, G. F., Tchernichovski, O., & Hahnloser, R. H. R. (2017). Songbirds work around computational complexity by learning song vocabulary independently of sequence. Nature Communications, 8, 1247. https://doi.org/10.1038/s41467-017-01436-0
Marler, P., & Tamura, M. (1964). Culturally transmitted patterns of vocal behavior in sparrows. Science, 146(3650), 1483–1486. https://doi.org/10.1126/science.146.3650.1483
McClelland, J. L. (2014). Learning to discriminate English /r/ and /l/ in adulthood: Behavioral and modeling studies. Studies in Language Sciences: Journal of the Japanese Society for Language Sciences, 13, 32–52.
McClelland, J. L., Fiez, J. A., & McCandliss, B. D. (2002). Teaching the /r/-/l/ discrimination to Japanese adults: Behavioral and neural aspects. Physiology & Behavior, 77(4-5), 657–662. https://doi.org/10.1016/S0031-9384(02)00916-2
Mets, D. G., & Brainard, M. S. (2019). Learning is enhanced by tailoring instruction to individual genetic differences. eLife, 8, Article e47216. https://doi.org/10.7554/eLife.47216
Miyawaki, K., Strange, W., Verbrugge, R., Liberman, A. M., Jenkins, J. J., & Fujimura, O. (1975). An effect of linguistic experience: The discrimination of [r] and [l] by native speakers of Japanese and English. Perception & Psychophysics, 18(5), 331–340. https://doi.org/10.3758/BF03211209
Naoi, N., Watanabe, S., Maekawa, K., & Hibiya, J. (2012). Prosody discrimination by songbirds (Padda oryzivora). PLoS One, 7(10), Article e47446. https://doi.org/10.1371/journal.pone.0047446
Narula, G., Herbst, J. A., Rychen, J., & Hahnloser, R. H. R. (2018). Learning auditory discriminations from observation is efficient but less robust than learning from experience. Nature Communications, 9(1), 3218. https://doi.org/10.1038/s41467-018-05422-y
Nelson, D. A., & Marler, P. (1962). Categorical perception of natural stimulus continuum: Birdsong. Science, 244, 976–977.
Nelson, D. A., & Marler, P. (1993). Innate recognition of song in white-crowned sparrows: A role in selective vocal learning? Animal Behaviour, 46, 806–808.
Njegovan, M., & Weisman, R. G. (1997). Pitch discrimination in field- and isolation-reared black-capped chickadees (Parus atricapillus). Journal of Comparative Psychology, 111(3), 294–301. https://doi.org/10.1037/0735-7036.111.3.294
