(1)

Latent Semantic Analysis

The Predication Model

Performance of the Model

(2)

Latent Semantic Analysis (LSA)

• LSA maps text onto high-dimensional vectors

• The claim: similarity of vectors indicates similarity of sense

• The miracle: it works

(3)

LSA - What Does It Do?

• LSA takes (lots of) text and creates a co-occurrence matrix (see the sketch below):

      Documents:   1    2    3    4   ...
  bla              1    2    1
  blub                  2    3
  ...
  (# Word Types rows × # Documents columns)

• From this, LSA creates a smaller matrix:

      Dimensions:  1    2   ...   n
  bla             .1   .2   .3
  blub            .1   .1   .4
  ...
  (# Word Types rows × n columns)
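Below is a minimal, purely illustrative Python sketch (not part of the original slides) of how the word-by-document count matrix above could be built; the toy corpus and the names are assumptions for demonstration only.

from collections import Counter

# Toy corpus: each string is one "document" (made up for illustration)
documents = [
    "bla blub bla",
    "blub bla bla",
    "bla",
]

# Vocabulary = all word types occurring in the corpus
vocab = sorted({word for doc in documents for word in doc.split()})

# Rows: word types, columns: documents, entries: raw co-occurrence counts
matrix = [[Counter(doc.split())[word] for doc in documents] for word in vocab]

for word, row in zip(vocab, matrix):
    print(word, row)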

(4)

LSA - What Does It Do? (II)

• So, an n-dimensional vector is assigned to each word:

– Direction of a vector: the “meaning”

– Length of a vector: weight (how much is known, or, how important is it)

– Cosine (angle between two vectors): similarity of meaning

• Around n=300, LSA similarity judgements resemble human judgements
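As an illustration (not from the slides), vector length and the cosine between two word vectors can be computed as follows; the two toy vectors and their values are made up:

import numpy as np

# Toy stand-ins for n-dimensional LSA word vectors (values are made up)
v_bla  = np.array([0.1, 0.2, 0.3])
v_blub = np.array([0.1, 0.1, 0.4])

# Length of a vector: its "weight" (how much is known / how important it is)
length = np.linalg.norm(v_bla)

# Cosine of the angle between two vectors: similarity of meaning
cosine = v_bla @ v_blub / (np.linalg.norm(v_bla) * np.linalg.norm(v_blub))

print(length, cosine)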

(5)

Sample Applications

• LSA has been shown to resemble human similarity judgements, e.g., in

– Choosing the most appropriate synonym for a given word

– Word/word or passage/word relations found in priming experiments

– Essay grading (!)

(6)

Interpretation of LSA

• Invented for information retrieval

• Suitable as framework for semantic theory?

– Wittgenstein: word meanings are not to be defined, but can only be characterized by their “family resemblance”

– That is, their cosine neighborhood?

• Relevance as cognitive architecture?

– Striking similarities to human performance
– Abstract neurological plausibility

(7)

LSA - What Does It Really Do?

– For n >= min{#Word Types, #Documents}, the result is a perfect reconstruction of the original matrix

– For smaller n, some generalizations have to be made.

– So the generalizations are the effect of a compression process.

Word-by-document co-occurrence matrix (rows: word types bla, blub, ...; columns: documents 1, 2, 3, 4, ...) ≈ product of three smaller matrices:

  [# Word Types × # Documents]  ≈  [# Word Types × n] × [n × n] × [n × # Documents]

– Singular Value Decomposition (SVD)
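A minimal sketch of the compression step via SVD (not from the slides; the matrix below is random stand-in data and numpy is just one possible tool):

import numpy as np

A = np.random.rand(1000, 200)   # stand-in for a word-by-document count matrix
n = 50                          # number of retained dimensions (around 300 is typical for LSA)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keeping only the n largest singular values forces the generalizations;
# for n = min(#word types, #documents) the reconstruction would be exact.
A_n = U[:, :n] @ np.diag(s[:n]) @ Vt[:n, :]

# n-dimensional word vectors: rows of U scaled by the singular values
word_vectors = U[:, :n] * s[:n]

print(A_n.shape, word_vectors.shape)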

(8)

Composition of LSA Vectors

• Needed: a compositional rule for LSA such that for two vectors P and A, a sensible vector for P(A) is output

• One idea: centroid addition (see the sketch below)

• Too simple: when a predicate is applied to an argument, the predicate meaning is typically influenced by the argument
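For reference, a toy sketch of the centroid rule (not from the slides; the vectors are made up):

import numpy as np

def centroid(p, a):
    # Compose predicate and argument by simple vector averaging;
    # context-independent, hence "too simple"
    return (p + a) / 2.0

p = np.array([0.3, 0.1, 0.5])   # toy predicate vector
a = np.array([0.2, 0.4, 0.1])   # toy argument vector
print(centroid(p, a))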

(9)

Latent Semantic Analysis (LSA)

The Predication Model

Performance of the Model

(10)

What is predication?

• The “predicate” contains the statement made about the argument

• The predication algorithm is a solution to problems LSA alone cannot cope with

Core problem: predicate meaning

– we predicate only a subset of the properties of P (the contextually appropriate ones), not all of P

(11)

What is predication? (II)

An alternative compositional rule to the centroid vector approach

Essential characteristic:

Strengthens features of the predicate that are appropriate for the argument of the predication

LSA + C&I = Predication

(12)

How does it work?

• Input: as in LSA, word meanings (as vectors)

• Predication performs a Construction & Integration (C&I) process

• Output: as in LSA, a compositional vector representing the meaning of the compound

(13)

Construction

• A network is constructed

– Nodes: P, A, and all other items I
– Two sets of links:

• Between the argument and all other items: positive weights, according to relatedness → facilitation

• All items I interconnected with each other: negative weights → inhibition

(14)

Integration

• The network is self-inhibiting

– Competition for activation

– Nodes/items most strongly related to A and P acquire positive activation values

• Two parameters:

k: the most activated nodes

m: a computational approximation limiting the neighborhood (see the sketch below)
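The slides give no code; the following is a simplified sketch of how construction and integration could look, under the assumptions that facilitation is modelled by cosine relatedness to the argument, inhibition by a small constant, and the neighborhood is restricted to the m items closest to the predicate. All names, vectors and parameter values are illustrative.

import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def construct_and_integrate(p_vec, a_vec, lexicon, m=20, k=5,
                            inhibition=0.01, steps=200):
    # Construction: restrict the network to the m items closest to the predicate P
    items = sorted(lexicon, key=lambda w: cosine(lexicon[w], p_vec), reverse=True)[:m]

    # Facilitatory links: argument A -> items, weighted by relatedness (cosine)
    facilitation = np.array([cosine(lexicon[w], a_vec) for w in items])

    # Integration: items compete for activation; mutual inhibition suppresses
    # items that are only weakly related to A
    activation = facilitation.copy()
    for _ in range(steps):
        inhibition_from_others = activation.sum() - activation
        activation = np.clip(facilitation - inhibition * inhibition_from_others, 0.0, None)

    # The k most strongly activated items "win"
    ranked = sorted(zip(items, activation), key=lambda pair: -pair[1])
    return [w for w, _ in ranked[:k]]

# Toy usage with made-up 3-dimensional vectors for "The horse ran"
lexicon = {"gallop":   np.array([0.9, 0.1, 0.0]),
           "stable":   np.array([0.7, 0.2, 0.1]),
           "race":     np.array([0.8, 0.1, 0.1]),
           "dissolve": np.array([0.1, 0.8, 0.1]),
           "ink":      np.array([0.1, 0.7, 0.2])}
run   = np.array([0.6, 0.4, 0.0])   # predicate P
horse = np.array([0.9, 0.1, 0.0])   # argument A
print(construct_and_integrate(run, horse, lexicon, m=4, k=2))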

(15)

Inhibition network

Ex: The horse ran

P(A) = RAN[HORSE]

What is the meaning of this proposition?

Need the right word sense of RUN

(16)

Meaning = Sum of vectors

simplification: dim=2

• P: predicate

• A: argument

• m: computational approximation

(pruning irrelevant items in space)

• k: most relevant items for the argument

Output:

The centroid of P, A and the k selected items is computed

(17)

A simple example of predication

The meaning of “collapsed” is compared to landmark meanings

The centroid fails where predication yields intuitively right results

Test:

•The bridge collapsed

•The plans collapsed

•The runner collapsed

Closest landmarks:

•Breakdown

•Failure

•Race

(18)

Summary Predication

• LSA + C&I = Predication

• By using C&I, contextual modification is introduced
= a characteristic of human comprehension

• → Better than the centroid approach

• Sufficient for even more complex language processing tasks?

(19)

Latent Semantic Analysis (LSA)

The Predication Model

Performance of the Model

(20)

Performance of the Model in various complex NLP tasks

Metaphor Comprehension
Causal Inference
Similarity Judgements
Homonym Treatment

(21)

Metaphors

m = 500

(22)

Metaphor Comprehension

• The centroid makes no sense
– the centroid of lawyer & shark lies in no man's land, more related to shark (.83) and fish (.58)

• Overall the algorithm produced satisfactory results:
– the cosine between a metaphor and relevant landmarks was much higher than with irrelevant landmarks

• But it still falls short of human comprehension
– e.g. Her marriage is an icebox.

• What is a “cold marriage”?

(23)

Priming in Metaphor Comp.

• The time to comprehend a metaphor increases when its literal meaning is primed.

– e.g. Sharks can swim. My lawyer is a shark.

– Vice Versa

• e.g. My lawyer is a shark. Sharks can swim.

• Some of the major psycholinguistic phenomena about metaphor comprehension are readily accounted for.

(24)

Causal Inferences

m = 20, k = 5

(25)

Causal Inference Comp.

• Algorithm produced satisfactory results:

– Sentence vectors, computed by predication, are closer to causally related inferences than to causally unrelated but superficially similar sentences.

• But it still failed in the last example, i.e. “the hunter shot the elk” -> “the elk was dead”
– possible reasons:

• with a smaller k = 3: “hunter dead” 0.69 vs. “elk dead” 0.68

• replacing elk (about which LSA knows little) with deer (about which LSA knows a lot): “deer dead” 0.75 vs. “hunter dead” 0.69

(26)

Causal vs. Temporal Inferences

-> Causally related sentences had a higher cosine than temporally related sentences, which demonstrates the ability of the predication model to explain causal inference (semantic relatedness)

Causal average vs. temporal average: 0.58 vs. 0.42

(27)

Similarity Judgement (SJ)

• SJs do not directly reflect basic semantic relationships but

– are subject to task- and context-dependent influences (see example below)

• The literature on SJ is huge and complex

– it is not clear which phenomena predication can account for

– one systematic comparison with a small dataset is described (see example below)

(28)

SJ

Predicating “anatomy” vs. “behaviour”:

                      Anatomy   Behaviour
hawk vs. chicken        0.61      0.35
hawk vs. tiger          0.14      0.45
bee vs. ant             0.81      0.48
bee vs. hummingbird     0.40      0.81

Rated similarity for pairs of animal names as a function of two instructional conditions (anatomy and behaviour). Cosines are computed after predicating either “anatomy” or “behaviour” about each animal name.

m = 50, k = 1

(29)

Homonyms

m = 50, k = 30

(30)

Homonyms

• The LSA vectors for homonymous nouns contain all possible meanings (with biases toward the more frequent ones)
– cos(lead, metal) = .34 vs. cos(lead, follow) = .36

• Appropriate predicates select the fitting meaning from this complex

• Result: predication handles multi-meaning words the same way as multi-sense words

(31)

Discussion

• LSA is not yet a complete semantic theory, but a promising alternative approach to lexical semantics

• Although predication exceeds the centroid approach for simple sentences taken out of a larger context, in practical applications (essay grading) the results are almost the same

• Predication presupposes a syntactic analysis to identify the predicate and the argument

• Is this a sufficient model of human comprehension?

Conclusion & Discussion
