Latent Semantic Analysis
The Predication Model
Performance of the Model
Latent Semantic Analysis (LSA)
• LSA maps text onto high-dimensional vectors
• The claim: similarity of vectors indicates similarity of sense
• The miracle: it works
LSA - What Does It Do?
• LSA takes a large amount of text and builds a co-occurrence matrix:

  [Figure: a (#Word Types x #Documents) matrix of counts; rows are word types (e.g. "bla", "blub", ...), columns are documents 1, 2, 3, 4, ...]

• From this, LSA creates a smaller matrix:

  [Figure: a (#Word Types x n) matrix of real values; rows are word types, columns are n abstract dimensions]
LSA - What Does It Do? (II)
• So, an n-dimensional vector is assigned to each word:
– Direction of a vector: the “meaning”
– Length of a vector: weight (how much is known, or, how important is it)
– Cosine of the angle between two vectors: similarity of meaning
• Around n=300, LSA similarity judgements resemble human judgements
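The vector interpretation above can be sketched in a few lines. The words and values below are made up: toy 3-dimensional vectors stand in for the roughly 300-dimensional vectors LSA actually produces.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two word vectors: LSA's similarity measure."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy vectors standing in for ~300-d LSA vectors.
horse = np.array([0.8, 0.3, 0.1])
pony  = np.array([0.7, 0.4, 0.2])
bank  = np.array([0.1, 0.2, 0.9])

print(cosine(horse, pony))  # high: similar meaning
print(cosine(horse, bank))  # low: dissimilar meaning
```

The vector length, `np.linalg.norm(horse)`, would play the role of the weight described above.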
Sample Applications
• LSA has been shown to resemble human similarity judgements, e.g., in
– Choosing the most appropriate synonym for a given word
– Word/word or passage/word relations found in priming experiments
– Essay grading (!)
Interpretation of LSA
• Invented for information retrieval
• Suitable as framework for semantic theory?
– Wittgenstein: word meanings are not to be
defined, but can only be characterized by their
“family resemblance”
– That is, their cosine neighborhood?
• Relevance as cognitive architecture?
– Striking similarities to human performance
– Abstract neurological plausibility
LSA - What Does It Really Do?
– For n >= min{#Word Types, #Documents}, the result is a perfect reconstruction of the original matrix
– For smaller n, some generalizations have to be made.
– So the generalizations are the effect of a compression process.
  [Figure: the (#Word Types x #Documents) matrix is factored into a product of three matrices: a (#Word Types x n) matrix, an (n x n) diagonal matrix, and an (n x #Documents) matrix]

– Singular Value Decomposition (SVD)
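The compression argument can be illustrated with a tiny made-up count matrix and NumPy's SVD; the counts below are arbitrary.

```python
import numpy as np

# Hypothetical (5 word types x 4 documents) co-occurrence counts.
X = np.array([[1., 2., 0., 1.],
              [2., 0., 3., 0.],
              [0., 1., 1., 2.],
              [1., 0., 0., 3.],
              [0., 2., 1., 0.]])

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# n = min{#word types, #documents} = 4: perfect reconstruction.
X_full = U @ np.diag(s) @ Vt
print(np.allclose(X_full, X))  # True

# Truncated to n = 2 dimensions: a lossy, generalizing compression.
n = 2
X_hat = U[:, :n] @ np.diag(s[:n]) @ Vt[:n, :]

# Row i of U[:, :n] * s[:n] is the n-dimensional vector for word type i.
word_vectors = U[:, :n] * s[:n]
print(word_vectors.shape)  # (5, 2)
```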
Composition of LSA Vectors
• Needed: a compositional rule for LSA such that, for two vectors P and A, a sensible vector for P(A) is produced
• One idea: centroid addition
• Too simple: when a predicate is applied to an argument, the predicate meaning is typically influenced by the argument
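A minimal sketch of centroid addition, using hypothetical 2-d toy vectors; it shows why the rule is too blunt: the argument can only shift the predicate vector, it cannot select among the predicate's senses.

```python
import numpy as np

def centroid(*vectors):
    """Naive composition: the average of the constituent vectors."""
    return np.mean(vectors, axis=0)

# Hypothetical toy vectors (axis 0 ~ 'motion' sense, axis 1 ~ 'operation' sense).
ran   = np.array([0.5, 0.5])   # predicate "ran": mixes both senses
horse = np.array([0.9, 0.1])   # argument "horse": strongly about motion

print(centroid(ran, horse))  # [0.7, 0.3]: a blend, not a sense selection
```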
Latent Semantic Analysis (LSA)
The Predication Model
Performance of the Model
What is predication?
• The “predicate” carries the statement
• The predication algorithm is a solution to problems LSA cannot cope with
• Core problem: predicate meaning
  – we predicate only a subset of the properties of P (the contextually appropriate ones), not all of P
What is predication? (II)
• Alternative Compositional rule to the centroid vector approach
• Essential characteristic:
Strengthens features of the predicate that are appropriate for the argument of the predication
• LSA + C&I = Predication
How does it work?
• Input: word meanings as vectors, as in LSA
• Predication performs a Construction & Integration (C&I) process
• Output: as in LSA, a compositional vector representing the meaning of the compound
Construction
• A network is constructed
– Nodes: P, A, and all other items I
– 2 sets of links:
  • Between the argument and all other items: positive weights, according to relatedness → facilitation
  • All items I interconnected with each other: negative weights → inhibition
Integration
• The network is self-inhibiting
– Competition for activation
– Nodes/items most strongly related to A and P acquire positive activation values
• Two parameters:
– k: the number of most-activated nodes kept
– m: a computational approximation limiting the neighborhood size
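The C&I process above can be approximated in a few lines. This is a deliberately simplified stand-in for the inhibition network: instead of running spreading activation to convergence, it scores the predicate's m nearest neighbors by their relatedness to the argument and keeps the top k. All words and vectors are toy values invented for illustration.

```python
import numpy as np

def cos(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def predication(P, A, lexicon, m=20, k=5):
    """Simplified predication: among the m nearest neighbors of predicate P,
    keep the k most related to argument A (a crude stand-in for the C&I
    inhibition network), then return the centroid of P, A, and those k items."""
    # Construction: the predicate's m-item semantic neighborhood.
    neighbors = sorted(lexicon, key=lambda w: cos(lexicon[w], P), reverse=True)[:m]
    # Integration (approximated): the k neighbors most activated by the argument.
    winners = sorted(neighbors, key=lambda w: cos(lexicon[w], A), reverse=True)[:k]
    return np.mean([P, A] + [lexicon[w] for w in winners], axis=0)

# Toy 2-d lexicon: axis 0 ~ 'physical motion', axis 1 ~ 'abstract operation'.
lexicon = {
    "gallop":  np.array([0.9, 0.1]),
    "race":    np.array([0.8, 0.2]),
    "execute": np.array([0.1, 0.9]),
    "compute": np.array([0.2, 0.8]),
}
ran   = np.array([0.5, 0.5])   # predicate with two senses
horse = np.array([1.0, 0.0])   # argument

v = predication(ran, horse, lexicon, m=4, k=1)
# The compound RAN[HORSE] lands in the 'motion' sense region.
print(cos(v, lexicon["gallop"]) > cos(v, lexicon["execute"]))  # True
```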
Inhibition network
• Example: The horse ran
  – P(A) = RAN[HORSE]
  – What is the meaning of this proposition?
  – We need the right word sense of RUN
• Meaning = sum of vectors (simplification in the figure: dim = 2)
• P: predicate
• A: argument
• m: computational approximation (pruning irrelevant items in space)
• k: the most relevant items for the argument
• Output: the centroid of P, A, and the k most activated items is computed
A simple example for predication
• The meaning of “collapsed” is compared to landmark meanings
• The centroid fails where predication yields intuitively right results
• Test sentences:
  – The bridge collapsed
  – The plans collapsed
  – The runner collapsed
• Closest landmarks:
  – Breakdown
  – Failure
  – Race
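The landmark comparison can be sketched as follows. The 2-d vectors are hypothetical stand-ins for predication outputs and landmark meanings, chosen only to make the mechanism visible.

```python
import numpy as np

def cos(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical landmark vectors: axis 0 ~ 'physical/structural', axis 1 ~ 'plans/effort'.
landmarks = {
    "breakdown": np.array([0.9, 0.1]),
    "failure":   np.array([0.2, 0.9]),
    "race":      np.array([0.6, 0.7]),
}

def closest_landmark(meaning):
    """Pick the landmark whose vector is most similar to the compound meaning."""
    return max(landmarks, key=lambda w: cos(landmarks[w], meaning))

bridge_collapsed = np.array([1.0, 0.0])   # toy predication output
plans_collapsed  = np.array([0.1, 1.0])   # toy predication output

print(closest_landmark(bridge_collapsed))  # breakdown
print(closest_landmark(plans_collapsed))   # failure
```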
Summary Predication
• LSA + C&I = Predication
• By using C&I, contextual modification is introduced
  = a characteristic of human comprehension
• → Better than the centroid approach
• ? Sufficient for even more complex language-processing tasks ?
Latent Semantic Analysis (LSA)
The Predication Model
Performance of the Model

Performance of the Model in various complex NLP tasks
• Metaphor Comprehension
• Causal Inference
• Similarity Judgements
• Homonym Treatment
Metaphors
m = 500
Metaphor Comprehension
• The centroid makes no sense here
  – the centroid of lawyer & shark lands in no man’s land, more related to shark (.83) and fish (.58)
• Overall, the algorithm produced satisfactory results:
  – the cosine between a metaphor and relevant landmarks was much higher than with irrelevant landmarks
• But still short of human comprehension
  – e.g., Her marriage is an icebox.
    • What is a “cold marriage”?
Priming in Metaphor Comp.
• The time to comprehend a metaphor is increased when the literal meaning is primed
  – e.g., Sharks can swim. My lawyer is a shark.
  – and vice versa:
    • e.g., My lawyer is a shark. Sharks can swim.
• Some of the major psycholinguistic
phenomena about metaphor comprehension are readily accounted for.
Causal Inferences
m = 20, k = 5
Causal Inference Comp.
• The algorithm produced satisfactory results:
  – sentence vectors computed by predication are closer to causally related inferences than to causally unrelated but superficially similar sentences
• But it still failed in the last example, i.e.
  – the hunter shot the elk -> the elk was dead
  – possible reasons:
    • smaller k = 3 -> hunter dead 0.69 vs. elk dead 0.68
    • replace elk (LSA knows little about it) with deer (LSA knows a lot) -> deer dead 0.75 vs. hunter dead 0.69
Causal vs. Temporal Inferences
-> Causally related sentences had a higher cosine than temporally related sentences (causal average 0.58 vs. temporal average 0.42), which demonstrates the ability of the predication model to capture causal inference via semantic relatedness
Similarity Judgement (SJ)
• SJ do not directly reflect basic semantic relationships but
– are subject to task- and context-dependent influences (see example below)
• The literature on SJ is huge and complex
– it is not clear which phenomena predication can account for
– one systematic comparison with a small dataset is described (see example below)
SJ
Predicating Anatomy vs. Behaviour:

  Pair                  Anatomy   Behaviour
  hawk vs. chicken        0.61      0.35
  hawk vs. tiger          0.14      0.45
  bee vs. ant             0.81      0.48
  bee vs. hummingbird     0.40      0.81

Rated similarity for pairs of animal names as a function of two instructional conditions (anatomy and behaviour). Cosines are computed after predicating either “anatomy” or “behaviour” about each animal name.
m = 50, k = 1
Homonyms
m = 50, k = 30
Homonyms
• The LSA vectors for homonymous nouns contain all possible meanings (with biases toward the more frequent ones)
  – cos(lead, metal) = .34 vs. cos(lead, follow) = .36
• Appropriate predicates select the fitting meaning from this complex
• Result: predication handles multiple-meaning words the same way as multiple-sense words
Discussion
• LSA is not yet a complete semantic theory, but a promising alternative to traditional lexical semantics
• Although predication exceeds the centroid for simple sentences taken out of a larger context, in practice (e.g., essay grading) the results are almost the same
• Predication presupposes a syntactic analysis to identify the predicate and the argument
• Is this a sufficient model of human comprehension?