A.1 Introduction

A.1.2 Neuroanatomy of object processing and categorization: empirical findings

A.1.2.1 Findings in animals

The field of animal research on object recognition and categorization was opened by the pioneering work of Herrnstein & Loveland (1964) in pigeons. The pigeons could not only discriminate pictures that contained a human being from pictures that did not; they were also able to transfer this categorization ability to new sets of pictures.

In the 1970s, animal research began to investigate object recognition and categorization in macaques, nonhuman primates with a highly developed temporal cortex compared to nonprimates (Rolls, 2000).

The anterior inferior temporal cortex (area TE) of the macaque brain plays an important role in visual object recognition, and research has focused primarily on recording neural activity within this region. Monkeys with a bilateral ablation of area TE showed deficits in tasks that required the visual recognition of objects (Dean, 1976).

In their pioneering work, Gross and Desimone (Desimone & Gross, 1979; Gross et al., 1972) recorded activity from single cells in the visual areas of the temporal lobe and found neurons that responded best to complex visual stimuli, especially to faces. Subsequently, one of the obstacles in studying the processing of objects at the neuronal level was to determine the fine-grained stimulus selectivity of single neurons.

Figure A-3: Schematic view of the ventral visual processing stream.

Based on previous work by Desimone et al. (1984), Tanaka and colleagues (Kobatake & Tanaka, 1994; Tanaka et al., 1991) developed a reduction method to determine the critical features for the activation of cells in area TE. In a first step, a large number of complex stimuli were presented and single cells responding best to a given object were identified. Next, the most effective image for each cell was simplified step by step while maintaining maximal activation of the monitored cell. The simplest image that still produced maximal activation was defined as the critical feature for that cell. These features of neurons in area TE are more complex than features such as orientation, size, color or simple texture, which are represented by cells in V1 (Callaway, 1998). On the other hand, these features did not show enough specificity to represent natural objects at the level of single cells (the 'grandmother cell' hypothesis), as was suggested earlier by Barlow (1972).
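The logic of the reduction method can be illustrated with a minimal sketch. It is not the procedure used in the cited studies, where images were simplified by the experimenters during recording; here a stimulus is assumed to be a set of named components, and `toy_cell`, the component names and the `tolerance` parameter are all hypothetical choices made for illustration.

```python
from typing import Callable, FrozenSet, Iterable


def reduce_to_critical_feature(
    components: Iterable[str],
    response: Callable[[FrozenSet[str]], float],
    tolerance: float = 0.9,
) -> FrozenSet[str]:
    """Greedily remove stimulus components while the cell's response
    stays close to the response evoked by the full stimulus."""
    current = frozenset(components)
    baseline = response(current)  # response to the most effective image
    simplified = True
    while simplified:
        simplified = False
        for component in sorted(current):
            candidate = current - {component}
            # Keep the simpler stimulus only if activation is preserved.
            if candidate and response(candidate) >= tolerance * baseline:
                current = candidate
                simplified = True
                break
    return current


def toy_cell(stimulus: FrozenSet[str]) -> float:
    """Hypothetical TE cell: fires strongly only when a dark disc
    and a protruding bar are both present (firing rate in spikes/s)."""
    return 50.0 if {"dark disc", "protruding bar"} <= stimulus else 5.0


critical = reduce_to_critical_feature(
    ["dark disc", "protruding bar", "stripes", "red color", "fur texture"],
    toy_cell,
)
print(sorted(critical))
```

In this toy case the search stops at a combination of two components: removing either one abolishes the response, which mirrors the idea of a moderately complex critical feature that is more than a single elementary attribute but less than a whole object.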

Further evidence for the important role of moderately complex features in activating cells in area TE comes from a different method employed by Wang et al. (1996). Wang and coworkers also first determined the critical features of cells by recording from single neurons and then used optical imaging, a method which exploits the relationship between the amount of reflected light and neuronal activity (see Gratton et al., 2003 for details). Compared to single-cell recordings, optical imaging measures average neuronal activity from a rather wide cortical region. The activity spots (with a diameter of about 0.5 mm) evoked by the presentation of a critical feature corresponded to the position of the previously recorded single cells. This topographical activity pattern suggests that cells responding to the same feature are clustered locally. Such a clustering of cells with similar feature selectivity in a columnar organization is also supported by the results of Fujita et al. (1992). When Fujita and coworkers penetrated the cortex vertically, cells responding to similar features were found throughout the whole thickness of the gray matter; in contrast, when the tissue was penetrated obliquely, cells with similar selectivity were limited to short spans of about 300 to 400 microns.

Though the majority of cells in the inferior temporal cortex (IT) respond preferentially to the presentation of moderately complex features, a portion of cells show maximal activity to facial stimuli (Desimone, 1991; Perrett et al., 1982). These 'face cells' were used to examine the view-invariance of representations in the IT. While some of these 'face cells' were found to be relatively invariant with respect to the size, contrast and spatial frequency of the stimuli (Rolls, 1992; Tovee et al., 1994), a larger number of cells with view-dependent responses have been found within the same region. For example, Perrett et al. (1985, 1992) reported neurons responding to the profile of a monkey face, but not to the frontal view of the same face. Booth and Rolls (1998) examined view-invariance using non-facial stimuli. Plastic objects were placed in the cages of monkeys, and after the animals had been able to explore the objects, the activity of neurons in response to different views of the same object was recorded. The majority of neurons responded only to some views of a particular object. However, a subset of neurons fired exclusively to a particular object independent of its view.

In the aforementioned studies, objects were presented instantaneously and in isolation. However, in real life the visual system must locate and identify objects in complex surroundings. Sheinberg & Logothetis (2001) tried to account for this with an elegant experimental design. Target objects were embedded in natural scenes, while single-cell activity as well as eye movements were recorded. This made it possible to monitor visually guided search processes while simultaneously recording the neural activity. In a separate condition, the target objects were presented in isolated views without any context. The neural activity associated with the processing of the target objects in isolated or embedded views was found to be highly similar, corroborating the validity of previous studies that employed only isolated objects. In the 'embedded' condition, activity related to the target object was found shortly before the manual response of the monkeys. Detailed inspection of the time course revealed that this identity-related activity was sometimes observable right before target fixation, indicating that the information 'also was used to guide behavior'.

Most animal researchers have concentrated on the possible mechanisms of object representation, but little is known about the process of categorization of objects. While there is evidence that monkeys show categorization abilities in the wild (e.g. Seyfarth et al., 1980) and in the laboratory (e.g. Delorme et al., 2000), only a few studies have attempted to record neuronal activity during categorization tasks in the monkey.

Category-selective neurons have been found in the IT by Vogels (1999). In this study, monkeys were trained to categorize tree and non-tree images, and a set of neurons responded almost exclusively to trees during task performance. On the other hand, this category specificity was limited to subsets of the category, i.e. no neurons were found that responded to all exemplars of trees, which makes it unlikely that categories are coded on the basis of single neurons.

Such category-selective activity in the anterior IT is shaped by the visual features of an object, as was shown by Sigala & Logothetis (2002). The authors trained monkeys to categorize line drawings of faces or fishes into two groups. The stimuli differed on four visual features (e.g. eye height, nose length), but only two of them were relevant for the categorization task. Neurons in the anterior IT showed enhanced activity for the relevant features compared to the non-relevant ones.

Tsao et al. (2003) used fMRI to detect category-related activity in the macaque brain evoked by pictures of hands, bodies, faces, fruits and man-made objects. Specialized patches of brain activity (extending from V4 to TE) were only found for faces and bodies, but not for any of the other categories. Recently, Gil-da-Costa and coworkers (2004) used PET to measure brain activity evoked by the perception of species-specific vocalizations. Interestingly, early (V2, V3, V4) as well as higher-order (TEO and TE) areas along the ventral visual processing stream showed activation for species-specific calls compared to non-biological sounds.

Freedman and coworkers (2001) used stimuli from the categories 'dogs' and 'cats' and blended these prototypes with a digital morphing system. The resulting stimuli varied in the relative amount of 'dog' or 'cat'. Monkeys performed a dog/cat categorization task while activity in the prefrontal cortex (PFC) was recorded. About one-third of the responding neurons in the lateral prefrontal cortices reflected the sample's category. Remarkably, most of these neurons were not influenced by the morphing level. For instance, neurons showed a similar response to images of dog prototypes (100% dog) and cat-like dogs (60% dog, 40% cat), indicating a sharp 'boundary' between categories.
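The contrast between a category-like response and a response that simply follows the morphing level can be made concrete with a small sketch. It rests on simplifying assumptions that are not taken from the cited study: prototypes are represented as short feature vectors, morphing is modelled as a linear blend, and the functions `morph`, `categorical_response` and `graded_response` as well as the 50% boundary are hypothetical.

```python
from typing import List


def morph(dog: List[float], cat: List[float], dog_fraction: float) -> List[float]:
    """Assumed linear blend of two prototype feature vectors."""
    return [dog_fraction * d + (1.0 - dog_fraction) * c for d, c in zip(dog, cat)]


def categorical_response(dog_fraction: float, boundary: float = 0.5) -> float:
    """Hypothetical 'dog' neuron: responds fully on one side of the
    category boundary, regardless of the exact morphing level."""
    return 1.0 if dog_fraction > boundary else 0.0


def graded_response(dog_fraction: float) -> float:
    """Hypothetical neuron whose response simply tracks the morphing level."""
    return dog_fraction


dog_prototype = [1.0, 0.0]
cat_prototype = [0.0, 1.0]

for dog_fraction in (1.0, 0.8, 0.6, 0.4, 0.2, 0.0):
    stimulus = morph(dog_prototype, cat_prototype, dog_fraction)
    print(
        f"{dog_fraction:.0%} dog:",
        "categorical =", categorical_response(dog_fraction),
        "graded =", round(graded_response(dog_fraction), 2),
    )
```

Under these assumptions the categorical unit gives the same response to the 100% and the 60% dog stimulus and a different one to the 40% dog stimulus, whereas the graded unit changes with every morphing step. The pattern reported for most category-reflecting PFC neurons resembles the former.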

However, the exact role of prefrontal structures in categorization tasks needs further exploration.

Thorpe & Fabre-Thorpe (2001) suggested that the IT and the PFC subserve different roles. Whereas the IT might deliver highly processed visual information, the PFC might be involved in decision making about category membership of an object.

In addition to the functional organization of object recognition and categorization, the time course of activity along the ventral visual pathway is another point of interest. Given that the visual processing of an object requires a sequence of several stages from the retina to the anterior temporal cortex, it is impressive that neuronal object-related activity in the anterior temporal cortex, the final purely visual processing stage, was consistently found around 100 ms after stimulus onset across studies (e.g. Oram & Perrett, 1992; Sugase et al., 1999). This leaves just a few milliseconds for each processing stage along the visual processing stream and makes it very likely that the signal is processed in a feed-forward manner without any recurrent or feedback processing stages (Thorpe & Fabre-Thorpe, 2001). Keysers et al. (2001) have additionally shown that the face selectivity of neurons in the temporal cortex is preserved even at ultra-rapid presentation rates of 14 ms/image.

In sum, animal studies have shown that most neurons in the IT respond best to stimuli below the level of objects and this activity is mostly view-variant. There is no compelling evidence for the representation of object categories on the level of single neurons.

Moreover, studies have started to disentangle how objects might be represented in the brain. The representation of objects in the IT might be built up by distributed feature columns, each responding to a specific visual feature of the object (Wallis & Rolls, 1997; Wang et al., 1996; Tsunoda et al., 2001).

However, one should be cautious in transferring findings directly from animals to humans (Crick & Jones, 1993). In particular, the capability for language in humans might be reflected in differences in the neuroanatomy of the temporal cortex as well as in the way in which objects are represented.