A discussion of The Algebraic Mind

Academic year: 2021

(1)

Jasmin Steinwender (jsteinwe@uos.de) Sebastian Bitzer (sbitzer@uos.de)

Seminar Cognitive Architecture University of Osnabrueck

Multilayer Perceptrons

A discussion of The Algebraic Mind

Chapters 1+2

(2)

The General Question

What are the processes and

representations underlying mental

activity?

(3)

Connectionism vs. Symbol manipulation

Also referred to as parallel-distributed processing (PDP) or neural network models

Hypothesis that cognition is a dynamic pattern of connections and activations in a 'neural net.'

Model of the parallel processor and the relevance to the anatomy and function of neurons.

Consists of simple neuron-like processing elements: units

• Biologically plausible?

The brain consists of neurons; there is evidence for Hebbian learning in the brain

• „classical view“

• Production rules

• Hierarchical binary trees

• computer-like application of rules and manipulation of symbols

• Mind as symbol manipulator (Marcus)

• Biologically plausible?

Brain circuits as representation of generalization and rules

(4)

BUT…

Ambiguity of the term connectionism:

in the huge variety of connectionist models

some will also include symbol-manipulation

(5)

Two types of Connectionism

1. implementational connectionism:

- a form of connectionism that would seek to understand how systems of neuron-like entities could implement symbols

2. eliminative connectionism:

- a form of connectionism that denies that the mind can be usefully understood in terms of symbol-manipulation

→ „ …eliminative connectionism cannot work(…): eliminativist models (unlike humans) provably cannot generalize abstractions to novel items that contain features that did not appear in the training set.”

Gary Marcus:

http://listserv.linguistlist.org/archives/info-childes/infochi/Connectionism/connectionist5.html and http://listserv.linguistlist.org/archives/info-childes/infochi/Connectionism/connectionism11.html

(6)

Symbol manipulation: three separable hypotheses

• Explained in detail over the course of the book; only mentioned here

1. „The mind represents abstract relationships between variables“

2. „The mind has a system of recursively structured representations“

3. „The mind distinguishes between mental representations of individuals and mental representations of kinds“

If the brain is a symbol-manipulator, then one of these hypotheses must hold.

(7)

Introduction to Multilayer Perceptrons

• simple perceptron

– local vs. distributed

– linearly separable

• hidden layers

• learning

(8)

The Simple Perceptron I

[Figure: a single unit with inputs i1-i5, connection weights w1-w5 and output o]

$$o = f_{act}\left(\sum_{n=1}^{5} w_n i_n\right)$$
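The output rule of the simple perceptron, o = f_act(Σ w_n i_n), can be sketched in a few lines. This is a minimal illustration, not code from the slides: the threshold activation `step`, the function name `perceptron_output` and the example input and weight values are all assumptions chosen for demonstration.

```python
def step(x):
    """Threshold activation (an assumed choice): 1 if the net input is positive, else 0."""
    return 1.0 if x > 0 else 0.0


def perceptron_output(inputs, weights, f_act=step):
    """Compute o = f_act(sum_n w_n * i_n) for a single unit."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = sum(w * i for w, i in zip(weights, inputs))
    return f_act(net)


# Hypothetical values for the five inputs i1..i5 and weights w1..w5.
inputs = [1.0, 0.0, 1.0, 1.0, 0.0]
weights = [0.5, -0.3, 0.2, -0.4, 0.1]
output = perceptron_output(inputs, weights)  # net input is about 0.3, so the unit fires
```

Passing a different `f_act` (for example the identity or a sigmoid) changes only the last step; the weighted sum stays the same.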

(9)

Activation functions

(10)

The Simple Perceptron II

a single-layer feed-forward mapping network

[Figure: network with input units i1-i4 and output units o1-o3]

(11)

Local vs. distributed representations

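[Figure: two networks, each with input units i1-i3 and output units o1-o2]

local: a single unit "cat" is the representation of CAT

distributed: the representation of CAT is spread over the units "furry", "four-legged" and "whiskered"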
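The difference between the two coding schemes can be made concrete with a small sketch. Only the concept "cat" and the three feature names come from the slide; the other concepts ("lizard", "fish"), the feature table and the function names are hypothetical and serve only as contrast cases.

```python
CONCEPTS = ["cat", "lizard", "fish"]          # "lizard" and "fish" are illustrative additions
FEATURES = ["furry", "four-legged", "whiskered"]

# Hypothetical feature assignments, for illustration only.
FEATURE_TABLE = {
    "cat": {"furry", "four-legged", "whiskered"},
    "lizard": {"four-legged"},
    "fish": set(),
}


def local_encoding(concept):
    """Local code: one dedicated unit per concept, exactly one unit active."""
    return [1 if c == concept else 0 for c in CONCEPTS]


def distributed_encoding(concept):
    """Distributed code: a concept is a pattern of activity over feature units."""
    return [1 if f in FEATURE_TABLE[concept] else 0 for f in FEATURES]


def shared_units(a, b):
    """Number of units that are active in both patterns."""
    return sum(x & y for x, y in zip(a, b))


cat_local = local_encoding("cat")              # [1, 0, 0]
cat_distributed = distributed_encoding("cat")  # [1, 1, 1]
```

In the local code two different concepts never share an active unit, whereas in the distributed code similar concepts overlap (here "cat" and "lizard" share the "four-legged" unit).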

(12)

Linearly (non-)separable functions I

[Trappenberg]

(13)

Linearly (non-)separable functions II

Boolean functions of n inputs:

n    linearly separable    linearly non-separable

2    14                    2

3    104                   152

4    1,882                 63,654

5    94,572                ~4.3·10^9

6    15,028,134            ~1.8·10^19
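The n = 2 row of the table can be checked by brute force. The sketch below enumerates threshold units over a small grid of weights and biases and collects the distinct truth tables they produce. The function name and the grid values are assumptions; for larger n a finite grid like this only gives a lower bound unless the weight range is chosen large enough.

```python
from itertools import product


def count_linearly_separable(n, weight_values, bias_values):
    """Count distinct Boolean functions of n inputs realised by a threshold unit
    whose weights and bias are drawn from the given (finite) value sets."""
    input_patterns = list(product([0, 1], repeat=n))
    found = set()
    for weights in product(weight_values, repeat=n):
        for bias in bias_values:
            truth_table = tuple(
                1 if sum(w * x for w, x in zip(weights, xs)) + bias > 0 else 0
                for xs in input_patterns
            )
            found.add(truth_table)
    return len(found)


# Assumed grid: weights in {-1, 0, 1}, half-integer biases; sufficient for n = 2.
separable_2 = count_linearly_separable(2, [-1, 0, 1], [-1.5, -0.5, 0.5, 1.5])
total_2 = 2 ** (2 ** 2)                    # 16 Boolean functions of two inputs
non_separable_2 = total_2 - separable_2    # the two missing functions are XOR and XNOR
```

The totals grow as 2^(2^n), which is where the approximate entries for n = 5 and n = 6 come from.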

(14)

Hidden Layers

[Figure: a two-layer feed-forward mapping network (an MLP) with input units i1-i4, hidden units h1-h2 and output units o1-o3]

(15)

Learning

(16)

Backpropagation

• compare the actual output with the correct output and change the weights

• based on this comparison, change the weights in the deeper layers, too

[Figure: network with input units i1-i4, hidden units h1-h2 and output units o1-o3]
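The two steps of backpropagation (compare output with target, then pass the error back to deeper layers) can be sketched for a 4-2-3 network like the one in the figure. The slides do not specify an error measure, activation function or learning rate; sigmoid units, squared error, a learning rate of 0.1, the random initial weights and all function names below are assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Forward pass; the last weight in each row is a bias weight."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    o = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w_out]
    return h, o


def squared_error(x, target, w_hidden, w_out):
    _, o = forward(x, w_hidden, w_out)
    return 0.5 * sum((ok - tk) ** 2 for ok, tk in zip(o, target))


def backprop_step(x, target, w_hidden, w_out, lr=0.1):
    """One gradient-descent update on a single training pattern (in place)."""
    h, o = forward(x, w_hidden, w_out)
    # Step 1: compare actual and correct output (error signal at the output units).
    delta_o = [(ok - tk) * ok * (1.0 - ok) for ok, tk in zip(o, target)]
    # Step 2: propagate that error back to the hidden units via the old output weights.
    delta_h = [
        hj * (1.0 - hj) * sum(delta_o[k] * w_out[k][j] for k in range(len(w_out)))
        for j, hj in enumerate(h)
    ]
    for k, row in enumerate(w_out):
        for j, v in enumerate(h + [1.0]):
            row[j] -= lr * delta_o[k] * v
    for j, row in enumerate(w_hidden):
        for i, v in enumerate(x + [1.0]):
            row[i] -= lr * delta_h[j] * v


rng = random.Random(0)  # fixed seed so the sketch is reproducible
w_hidden = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(2)]  # 4 inputs + bias
w_out = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]     # 2 hidden + bias

x, target = [1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0]  # hypothetical training pattern
error_before = squared_error(x, target, w_hidden, w_out)
backprop_step(x, target, w_hidden, w_out)
error_after = squared_error(x, target, w_hidden, w_out)
```

Repeating `backprop_step` over many patterns is what is usually meant by training the network; a single step is shown here only to make the direction of the weight change visible.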

(17)

Multilayer Perceptron (MLP)

A type of feedforward neural network that is an extension of the perceptron in that it has at least one hidden layer of neurons.

Layers are updated by starting at the inputs and ending with the outputs. Each neuron computes a weighted sum of the incoming signals, to yield a net input, and passes this value through its sigmoidal activation function to yield the neuron's activation value. Unlike the perceptron, an MLP can solve linearly inseparable problems.

Gary William Flake, The Computational Beauty of Nature,

MIT Press, 2000
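As an illustration of the definition above, the following sketch wires a small MLP by hand so that it computes XOR, a function that is not linearly separable. The weights are hand-chosen for this example rather than learned, and the function names are hypothetical: one hidden unit approximates OR, the other AND, and the output unit fires for "OR but not AND".

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weight_rows, biases):
    """Each unit computes a weighted sum (net input) and applies the sigmoid."""
    return [
        sigmoid(sum(w * i for w, i in zip(row, inputs)) + b)
        for row, b in zip(weight_rows, biases)
    ]


def xor_mlp(x1, x2):
    # Hidden layer: unit 1 approximates OR, unit 2 approximates AND (hand-set weights).
    hidden = layer([x1, x2], [[10.0, 10.0], [10.0, 10.0]], [-5.0, -15.0])
    # Output unit: active when OR is on and AND is off.
    (out,) = layer(hidden, [[10.0, -10.0]], [-5.0])
    return out


results = {(a, b): round(xor_mlp(a, b)) for a in (0, 1) for b in (0, 1)}
```

Layers are evaluated from the inputs to the outputs, as described in the quoted definition; removing the hidden layer would leave a single threshold-like unit, which cannot produce this mapping.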

(18)

Many other network structures

MLPs*

* a single-layer feed-forward network is actually not an MLP, but a simplified version of it (lacking hidden layers)

(19)

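The Family-Tree Model

[Figure: network diagram. Agent units (Arthur, Vicky, Andy, Penny, others) feed a distributed encoding of agent (6 nodes); relationship units (sis, mom, dad, other) feed a distributed encoding of relationship (6 nodes); both feed a hidden layer (12 nodes), followed by a distributed encoding of patient (6 nodes) and the patient units (Vicky, Andy, Penny, others).]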

(20)

The sentence prediction model

(21)

The appeal of MLPs

(preliminary considerations)

1. Biological plausibility

– independent nodes

– change of connection weights resembles synaptic plasticity

– parallel processing

⇒ brain is a network and MLPs are too

(22)

Evaluation Of The Preliminaries

1. Biological plausibility

• Biological plausibility considerations make no distinction between eliminative and implementational connectionist models

• Multilayer perceptrons are seen as „more compatible than symbolic models“, BUT nodes and their connections only loosely model neurons and synapses

• Back-propagation MLPs lack brain-like structure and require synapses that can vary between inhibitory and excitatory

• Symbol-manipulation models also consist of multiple units and operate in parallel → brain-like structure

• It is not yet clear what is biologically plausible: biological knowledge changes over time

(23)

Remarks on Marcus

It is difficult to argue against his arguments:

– sometimes he compares eliminative and implementational connectionist models

– sometimes he compares connectionism and classical symbol-manipulation

(24)

Remarks on Marcus

1. Biological plausibility

(comparison MLPs – classical symbol-manipulation)

– MLPs are just an abstraction

– no need to model the newest detailed biological knowledge

– even if not everything is biologically plausible, MLPs are still the more likely model

(25)

Preliminary considerations II

2. Universal function approximators

– “multilayer networks can approximate any function arbitrarily well” [Trappenberg]

– “information is frequently mapped between different representations” [Trappenberg]

– mapping of one representation to another can be seen as a function

(26)

Evaluation Of The Preliminaries II

2. Universal function approximators

• MLPs cannot capture all functions (e.g. partial recursive functions, which model computational properties of human language)

• No guarantee that they generalize from limited data the way humans do

• Unrealistic need of infinite resources for universal function approximation

• Symbol-manipulators could also approximate any function

(27)

Preliminary considerations III

3. Little innate structure

– children have relatively little innate structure

⇒ “simulate developmental phenomena in new and … exciting ways” [Elman et al., 1996]

e.g. the model of the balance beam problem [McClelland, 1989] fits data from children

– domain-specific representations from domain-general architectures

(28)

Evaluation Of The Preliminaries III

3. Little innate structure

• There also exist symbol-manipulating models with little innate structure

• Possibility to prespecify the connection weights of an MLP

(29)

Preliminary considerations IV

4. Graceful degradation

– tolerate noise during processing and in input

– tolerate damage (loss of nodes)

(30)

Evaluation Of The Preliminaries IV

4. Learning and graceful degradation

• Not a unique ability of MLPs, and not all MLPs have it

• There are symbol-manipulation models that can also handle degradation

• No empirical data yet that humans recover from degraded input

(31)

Preliminary considerations V

5. Parsimony

– one just has to give the architecture and examples

– more generally applicable mechanisms

(e.g. inflecting verbs)

(32)

Evaluation Of The Preliminaries V

5. Parsimony

• MLP connections interpreted as free parameters → less parsimonious

• Complexity may be more biologically plausible than parsimony

• Parsimony as criterion only if both models cover the data adequately
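To make the "connections as free parameters" point concrete, the number of adjustable weights in an MLP can be counted directly. This is a small sketch that assumes fully connected layers and one optional bias weight per non-input unit; the function name and the 4-2-3 example sizes are illustrative assumptions.

```python
def count_parameters(layer_sizes, with_bias=True):
    """Count connection weights (and optional bias weights) of a fully connected MLP."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection between adjacent layers
        if with_bias:
            total += n_out         # one bias weight per receiving unit
    return total


# A small 4-2-3 network: 4*2 + 2*3 = 14 connection weights, plus 5 bias weights.
small_net_weights = count_parameters([4, 2, 3], with_bias=False)
small_net_total = count_parameters([4, 2, 3])
```

Even this tiny network has more than a dozen free parameters, and the count grows with the product of adjacent layer sizes.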

(33)

What truly distinguishes MLPs from symbol-manipulation

• Is not clear, because…

…both can be context independent

…both can be counted as having symbols

…both can be localist or distributed

(34)

We are left with the question:

Is the mind a system that represents

• abstract relationships between variables OR*

• operations over variables OR*

• structured representations

• and distinguishes between mental

representations of individuals and of kinds

We will find out later in the book…

*inclusive

(35)

Discussion

“… I agree with Stemberger that connectionism can make a valuable contribution to cognitive science. The only place that we differ is that, first, he thinks that the contribution will be made by providing a way of *eliminating* symbols, whereas I think that connectionism will make its greatest contribution by accepting the importance of symbols, seeking ways of supplementing symbolic theories and seeking ways of explaining how symbols could be implemented in the brain. Second, Stemberger feels that symbols may play no role in cognition; I think that they do.”

Gary Marcus:

http://listserv.linguistlist.org/archives/info-childes/infochi/Connectionism/connectionist8.html

(36)

References

• Marcus, Gary F.: The Algebraic Mind, MIT Press, 2001

• Trappenberg, Thomas P.: Fundamentals of Computational Neuroscience, OUP, 2002

• Dennis, Simon & McAuley, Devin: Introduction to Neural Networks, http://www2.psy.uq.edu.au/~brainwav/Manual/WhatIs.html
