Knowledge-Based Systems

Academic year: 2021
(1)

Wolf-Tilo Balke Christoph Lofi

Institut für Informationssysteme

Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

Knowledge-Based Systems

and Deductive Databases

(2)

9.1 Uncertain Knowledge
9.2 Probabilistic Application
9.3 Belief Networks

9. Deduction with Uncertainty

(3)

• Transform the program to relational algebra

– Easy… just use the rules in the lecture
– Eval(R) = S
– Eval(P) = (Q ∖ (Q ⋉(Q.#1 = R.#1 ⋀ Q.#2 = R.#2) R)) ⋃ (Q ∖ (Q ⋉(Q.#1 = S.#1 ⋀ Q.#2 = S.#2) S)) ⋃ π#1, #2(σ#1 = #2(Q × Q))

• Compute Fixpoint

– R= {(1,3)}

– P= {(1,2),(2,3), (1, 3)}

Exercise 1.1&1.2
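As a supplementary sketch (not part of the original exercise sheet), naive bottom-up fixpoint evaluation of a transitive-closure-style program can be written in a few lines of Python; the edge facts below are chosen so that the fixpoint matches the result P = {(1,2), (2,3), (1,3)}:

```python
def naive_fixpoint(edges):
    """Naive bottom-up evaluation: start from the base rule
    path(X,Y) :- edge(X,Y) and repeatedly apply
    path(X,Z) :- edge(X,Y), path(Y,Z) until nothing new is derived."""
    path = set(edges)
    while True:
        new = {(x, z) for (x, y) in edges for (y2, z) in path if y == y2}
        if new <= path:
            return path      # fixpoint reached
        path |= new

print(sorted(naive_fixpoint({(1, 2), (2, 3)})))  # [(1, 2), (1, 3), (2, 3)]
```

Each iteration joins the edge facts with the paths derived so far; termination is guaranteed because only finitely many tuples over the active domain exist.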

(4)

• WITH trip (from, to, price) AS (
      SELECT from, to, price FROM flight
    UNION ALL
      SELECT f.from, t.to, f.price + t.price
      FROM flight f, trip t
      WHERE f.to = t.from AND f.price + t.price < 400
  )
  SELECT from, to, price
  FROM trip
  WHERE from IN (SELECT code FROM airport WHERE city = 'Berlin')
    AND to IN (SELECT code FROM airport WHERE city = 'Stuttgart')

Exercise 2.1

(5)

WITH trip (from, to, price) AS (
    SELECT from, to, price FROM flight
  UNION ALL
    SELECT f.from, t.to, f.price + t.price
    FROM flight f, trip t
    WHERE f.to = t.from AND f.code != 'lh113' AND f.price + t.price < 400
)
SELECT from, to, price
FROM trip
WHERE from IN (SELECT code FROM airport WHERE city = 'Berlin')
  AND to IN (SELECT code FROM airport WHERE city = 'Stuttgart')

Exercise 2.2

(6)

• WITH trip (from, to, price) AS (
      SELECT from, to, price FROM flight
    UNION ALL
      SELECT f.from, t.to, f.price + t.price
      FROM flight f, trip t
      WHERE f.to = t.from AND f.price + t.price < 1000
  )
  SELECT p1 + p2 FROM
    (SELECT min(price) AS p1 FROM trip
     WHERE from = 'MUC' AND to = 'STR'),
    (SELECT min(price) AS p2 FROM trip
     WHERE from = 'STR' AND to = 'HAM')

Exercise 2.3

(7)

• WITH fourports (a, b, c, d, p) AS (
    SELECT f1.from, f2.from, f3.from, f3.to, f1.price + f2.price + f3.price
    FROM flight f1, flight f2, flight f3
    WHERE f1.to = f2.from AND f2.to = f3.from
      AND f1.from != f3.from AND f2.from != f3.to
  )
  SELECT * FROM fourports
  WHERE p IN (SELECT min(p) FROM fourports)

Exercise 2.4

(8)

• We have discussed ways of deriving new facts from other (ground) facts

– But often several rules can lead to a certain fact and we cannot be sure which one it was

A patient experiences toothaches, what is the reason?

– Sometimes a certain fact might be derived from ground facts only in certain cases

A normal bird can fly, except for penguins, ostriches,…

9.1 Uncertainty

(9)

• Typical sources of imperfect information in deductive databases are…

Incomplete information

Information is simply missing, which might clash with the closed world assumption

Imprecise information

The information needed has only been specified in a vague way, e.g., a person is young: young(Tim).

Queries about Tim's age are difficult to answer, e.g., ?age(Tim, 67) is false, but what about ?age(Tim, 25)?

Uncertain information

A deduction is not always correct, e.g., the question whether a bird can fly: fly(X) :- bird(X).

What about penguins, dead birds, or birds with clipped wings?

9.1 Uncertainty

(10)

• Consider an expert system for dentists

– All possible causes for toothaches are contained in a database and the reason should be deduced

– cavities(X) :- toothache(X).

periodontosis(X) :- toothache(X).

Not very helpful, since all possible causes are listed. Thus, all rules fire…

– cavities(X) :- toothache(X), ¬periodontosis(X).

periodontosis(X) :- toothache(X), ¬cavities(X).

Not very helpful either, because now we need to disprove all alternatives before any rule fires…

Remember the assumption of 'negation as failure'

9.1 Uncertainty

(11)

• But how do dentists deal with the problem?

– Like in our second program look for positive or negative clues

e.g., bleeding of gums,…

• Still, how does a dentist know what to look for?

– What are probable causes?

– What are possible causes?

– Knowing the patient, what is the (subjective) judgement?

9.1 Uncertainty

(12)

Basic idea: assign a measure of validity to each rule or statement and propagate this measure through the deduction process

Probabilistic truth values

Use statistics: how often are cavities the reason and how often is periodontosis?

Leads to a probability distribution over possible worlds

Possibility values

What are possible causes and to what degree do they cause toothache?

Leads to a possibility distribution over possible worlds

Belief values

Lead to belief networks with facts that may influence each other

– …

9.1 Uncertainty

(13)

• Usually, dealing with uncertainty requires an open world assumption

– Facts not stated in the database may or may not be false

• But the reasoning gets more difficult

– Remember our discussion about the existence of several minimal models in Datalog¬

– The reasoning process is not monotonic any more

Introduction of new knowledge might lead to a revision (and sometimes refutation) of previously derived facts

9.1 Uncertainty

(14)

Non-monotonic reasoning accounts for the fact that statements once considered true sometimes have to be revised in the light of new facts

– Tweety is a bird.

Can Tweety fly? Yes!

– Tweety is a bird. Tweety is 2.5 meters tall.

Can Tweety fly? No!

– The introduction of a new fact has

challenged the general rule that birds can fly

Only ostriches reach a height of 2.5 meters!

9.1 Non-Monotonic Reasoning

(15)

• There are several classical approaches of dealing with the problem

– Default logic

– Predicate circumscription
– Autoepistemic reasoning
– …

9.1 Non-Monotonic Reasoning

(16)

Default logic was proposed by Raymond Reiter (University of Toronto) in 1980

– Can express logical facts like

'by default, something is true'

– Basically a default theory consists of two parts D and W

W is a set of first order logical formulae known to be true

D is a set of default rules of the form

  prerequisite : justification1, …, justificationn
  ────────────────────────────────────────────────
  conclusion

9.1 Non-Monotonic Reasoning

(17)

– prerequisite : justification1, …, justificationn
  ────────────────────────────────────────────────
  conclusion

– If we believe the prerequisite to be true, and each of justificationi is consistent with our current beliefs, we are led to believe that conclusion is true

Example: the default rule bird(X) : fly(X) / fly(X) with W = {bird(condor), bird(penguin), fly(eagle), ¬fly(penguin)}

fly(condor) is true by default, since it is a bird and we have no justification to believe otherwise

But fly(penguin) cannot be derived here, since although bird(penguin) is true, we know that the justification is false

Neither can we deduce bird(eagle), which would be abduction

9.1 Non-Monotonic Reasoning

(18)

• A common default assumption is the closed world assumption: true : ¬F / ¬F

• The semantics of default logics is again based on fixpoints

– Use set W as initial theory T

– Add to a theory T every fact that can be deduced by using any of the default rules in D, so-called extensions to the theory T

– Repeat until nothing new can be deduced

– If T is consistent with all justifications of the default rules used to derive any extension, output T

9.1 Non-Monotonic Reasoning

(19)

• The last check in the algorithm is necessary to avoid inconsistent theories

– i.e. something has been deduced using a justification that was later proven to be false

– E.g., consider the default rule true : A(X) / ¬A(X) and W := Ø

Since A(X) is consistent with W, we may conclude ¬A(X), which however is inconsistent with the previously assumed A(X)

In this case the theory simply has no extensions

9.1 Non-Monotonic Reasoning

(20)

• Interestingly, the semantics is non-deterministic

– The deduced theory may depend on the sequence in which defaults are applied

Example: D := { bird(X) : fly(X) / fly(X), penguin(X) : ¬fly(X) / ¬fly(X) } with W = {bird(Tweety), penguin(Tweety)}

Starting with W both default rules are applicable

If we use the first rule, the extension fly(Tweety) would be added, and the second default rule is no longer applicable

In case we apply the second rule first, the extension would be ¬fly(Tweety)

9.1 Non-Monotonic Reasoning

(21)

Entailment of a formula from a default theory can be defined in two ways

Skeptical entailment

A formula is entailed by a default theory if it is entailed by all its extensions

Credulous entailment

A formula is entailed by a default theory if it is entailed by at least one of its extensions

– For example our Tweety theory has two extensions, one in which Tweety can fly and one in which he cannot fly

Neither extension is skeptically entailed

Both of them are credulously entailed

9.1 Non-Monotonic Reasoning
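The two extensions of the Tweety theory can be reproduced with a deliberately simplified, single-pass Python sketch (a real default-logic prover re-checks every justification against the final theory; the names `extension`, `neg` and the string encoding of literals are ours):

```python
from itertools import permutations

# Propositional encoding of the Tweety default theory:
# W = {bird, penguin}; defaults as (prerequisite, justification, conclusion)
W = frozenset({"bird", "penguin"})
defaults = [("bird", "fly", "fly"),               # bird : fly / fly
            ("penguin", "not fly", "not fly")]    # penguin : ¬fly / ¬fly

def neg(lit):
    return lit[4:] if lit.startswith("not ") else "not " + lit

def extension(order):
    """Apply the defaults in the given order; a default fires if its
    prerequisite has been derived and its justification is consistent
    with the current beliefs."""
    theory = set(W)
    for pre, just, concl in order:
        if pre in theory and neg(just) not in theory:
            theory.add(concl)
    return frozenset(theory)

# Different application orders yield different extensions
exts = {extension(p) for p in permutations(defaults)}
skeptical = set.intersection(*map(set, exts))   # entailed by all extensions
credulous = set.union(*map(set, exts))          # entailed by at least one
print(len(exts), "fly" in skeptical, "fly" in credulous)  # 2 False True
```

As on the slides: one extension contains fly(Tweety), the other ¬fly(Tweety); fly is therefore credulously but not skeptically entailed.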

(22)

Predicate circumscription was introduced by John McCarthy (Stanford University) in 1978

– Inventor of LISP and the 'space fountain'

– Basically, circumscription tries to formalize the common-sense assumption that things are as expected, unless specified otherwise

9.1 Non-Monotonic Reasoning

(23)

• Consider the problem whether Tweety can fly, if we assume that Tweety is a penguin…

– Sure, Tweety can fly,…

…because he takes a helicopter!

– This solution is intuitively not valid, since no helicopter was mentioned in our facts
– Of course we could exclude all possible ways to fly in our program, but…

9.1 Non-Monotonic Reasoning

(24)

• Circumscription is a rule of conjecture that can be used for jumping to certain conclusions

– The objects that can be shown to have a certain property P by reasoning from certain facts A are all the objects that satisfy P

More generally, circumscription can be used to conjecture that the substitutions that can be shown to satisfy a predicate are all the tuples satisfying this predicate

– Thus, the set of relevant tuples is circumscribed

9.1 Non-Monotonic Reasoning

(25)

Example: by circumscription a bird can be conjectured to fly unless something prevents it

– The only entities that can prevent the bird from flying are those whose existence follows from the facts

If no clipped wings, being a penguin or other circumstances preventing flight are deducible, then the bird is concluded to fly

Basically, this can be done by adding a predicate ¬abnormal(X) to all rules about flying

– The correctness of this conclusion depends on having taken into account all relevant facts when the circumscription was made

9.1 Non-Monotonic Reasoning

(26)

• Circumscription therefore tries to derive all minimal models of a set of formulae

– If we have a predicate p(X1, …, Xn), then a model tells whether the predicate is true for any possible substitution with terms for Xi

The extension of p(X1, …, Xn) in a model is the set of substitutions for which p(X1, …, Xn) evaluates to true

– The circumscription of a formula is a minimization, believing only the least possible number of predicates

The circumscription of p(X1, …, Xn) in a formula is obtained by selecting only models with a minimal extension of p(X1, …, Xn)

9.1 Non-Monotonic Reasoning

(27)

Example

– Consider a formula of the type A ⋀ (B ⋁ C) → D like fly(X) :- bird(X), eagle(X).

fly(X) :- bird(X), condor(X).

Obviously bird(X) has to be true in any model, but to be minimal only eagle(X) or condor(X) has to be true

Hence there are two circumscriptions of the formula {bird(X), eagle(X)} and {bird(X), condor(X)}, but not {bird(X), eagle(X), condor(X)}

– Note that predicates are evaluated as false only where this is possible

eagle(X) and condor(X) cannot both be false

9.1 Non-Monotonic Reasoning
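The two circumscriptions can be checked mechanically: enumerate all propositional models of bird ⋀ (eagle ⋁ condor) and keep only the subset-minimal ones (a sketch; the propositional encoding of the example is ours):

```python
from itertools import product

atoms = ("bird", "eagle", "condor")

def holds(m):
    # the rule bodies from the slide: bird ∧ (eagle ∨ condor)
    return m["bird"] and (m["eagle"] or m["condor"])

# collect all models, each represented by its set of true atoms
models = []
for bits in product([False, True], repeat=len(atoms)):
    m = dict(zip(atoms, bits))
    if holds(m):
        models.append(frozenset(a for a in atoms if m[a]))

# circumscription: keep only models whose extension is minimal w.r.t. ⊆
minimal = [m for m in models if not any(n < m for n in models)]
print(sorted(sorted(m) for m in minimal))  # [['bird', 'condor'], ['bird', 'eagle']]
```

{bird, eagle, condor} is a model but is filtered out, exactly as argued on the slide.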

(28)

• But sometimes circumscription handles disjunctive information incorrectly

– Toss a coin onto a chess board and consider the predicate lies_on(X, Y) describing where it lies

– There are several possibilities of models

Obviously {lies_on(coin, floor)} should be false, since it was not mentioned that the coin could miss the board

That leaves {lies_on(coin, white)}, {lies_on(coin, black)}, and {lies_on(coin, white), lies_on(coin, black)} for the overlapping case

– But the last model would be filtered out as not being minimal by circumscription

One possibility to remedy this case is theory curbing, where the least upper bound(s) of the minimal models are iteratively added to the set of accepted models
9.1 Non-Monotonic Reasoning

(29)

Autoepistemic Logic was introduced by Robert C. Moore (Microsoft Research) in 1985

• Autoepistemic logic can not only express facts, but also knowledge and lack of knowledge about facts

• Formalizes non-monotonicity using statements with a belief operator B

– For every well-formed formula F, the 'belief atom' B(F) means that F is believed

– ¬B(F) means that F is not believed

9.1 Non-Monotonic Reasoning

(30)

• It uses the following axioms

– All propositional tautologies are axioms

– If we believe in B(X) :- A(X), then whenever we believe in A(X), we also have to believe in B(X)
– Inconsistent conclusions are never believed, i.e., ¬B(false)

• It uses modus ponens as inference rule

– Given a conditional claim A → B and the truth of the antecedent A, it can be logically concluded that the consequent B must be true as well

9.1 Non-Monotonic Reasoning

(31)

• This can be used to derive stable sets of sentences which are then believed

– i.e. the reflection of our own state of knowledge

• If we do not believe in a fact, then we believe that we do not believe it

– B(bird(X)) ⋀ ¬B(¬fly(X)) → fly(X)

– If I believe that X is a bird and if I don't believe that X cannot fly, then I will conclude that X flies

9.1 Non-Monotonic Reasoning

(32)

• A belief theory T describes the knowledge base

– A restricted belief interpretation of T is a set of belief atoms I such that for each B(F) appearing in T either B(F) ∈ I or ¬B(F) ∈ I (but not both)

– A restricted belief model of T is a belief interpretation I such that T ⋃ I is consistent

9.1 Non-Monotonic Reasoning

(33)

• Again expansions to the theory can be derived

– Since all belief atoms have to be either true or false, the theory can be treated like propositional formulae
– In particular, checking whether T entails F can be done using the rules of the propositional calculus
– In order for an initial assumption to be an expansion, F must be entailed iff B(F) has been initially assumed true

9.1 Non-Monotonic Reasoning

(34)

Probability theory deals with expressing the belief or knowledge that a certain event will occur or has occurred

• In general, there are two major factions among probability theorists

Frequentistic view:

Probability of an event is its relative frequency of occurrence during a long-running random experiment

Major supporters: Neyman, Pearson, Wald, …

9.2 Probability

(35)

Bayesian view:

Probabilities can be assigned to any event or statement whether it is part of a random process or not

Probabilities thus express the degree of belief that a given event will happen

Major supporters: Bayes, Laplace, de Finetti, …

• During the following slides, we will encounter both views

– …but still, formal notation and theory are similar in both

9.2 Probability

(36)

• The probability of an event or statement A is given by P(A)

– P(A) ∈ [0,1]

– P(¬A):=1-P(A)

– Depending on your world view, a probability of P(A)=0.8 may mean

During a long-running random experiment, A was the outcome in 80% of all tries

You have a strong belief (quantified by 0.8 of a maximum of 1) that A can / will happen

9.2 Probability

(37)

• Given two events A and B, and assuming that they are statistically independent of each other, their probabilities may be combined

– P(A ⋀ B)= P(A) * P(B)

also written P(A, B)

– e.g.

P(isYellow(Tweety))=0.8 and P(canFly(Tweety))=0.2

⤇ P(isYellow(Tweety), canFly(Tweety)) = 0.16

9.2 Probability

(38)

• However, events are often not independent, thus we need conditional probabilities

– This is written as P(A | B)

P(A | B) is the conditional probability of A given B

P(A | B) := P(A ⋀ B) / P(B)

e.g. P(canBark(X) | dog(X)) = 0.9

Given that X is a dog, X can bark with a probability of 0.9

• Based on conditional probabilities, we can derive a simple deductive system

Probabilistic rules:

B ←P(B|A) A or B :-P(B|A) A

9.2 Probabilistic Reasoning
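A tiny worked example of the definition P(A | B) := P(A ⋀ B) / P(B); the joint distribution below uses assumed numbers, chosen so that P(canBark | dog) comes out as the 0.9 from the slide:

```python
# Assumed joint distribution over "is a dog" and "can bark"
# (illustrative numbers, not from the slides)
P = {("dog", "barks"): 0.36, ("dog", "silent"): 0.04,
     ("other", "barks"): 0.06, ("other", "silent"): 0.54}

p_dog = sum(v for (a, _), v in P.items() if a == "dog")   # marginal P(dog)
p_dog_and_barks = P[("dog", "barks")]                     # P(dog ∧ barks)
p_barks_given_dog = p_dog_and_barks / p_dog               # P(A|B) = P(A∧B)/P(B)
print(round(p_barks_given_dog, 2))  # 0.9
```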

(39)

• Of course, we can also form deductive chains

Example:

– dog(X) ←0.6 domestic_animal(X).

canBark(X) ←0.9 dog(X).

⊢ canBark(X) ←?? domestic_animal(X).

– So, assuming statistical independence between barking and domestic animals, we may conclude that the probabilities may be just multiplied, i.e.

canBark(X) ←0.54 domestic_animal(X).

9.2 Probabilistic Reasoning
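The multiplication step can be spelled out (this is the naive chaining under an assumed independence, which the next slide shows to be flawed):

```python
# Naive chaining: multiply the rule probabilities, assuming independence
p_dog_given_domestic = 0.6   # dog(X) <-0.6 domestic_animal(X)
p_barks_given_dog = 0.9      # canBark(X) <-0.9 dog(X)
print(round(p_dog_given_domestic * p_barks_given_dog, 2))  # 0.54
```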

(40)

• Unfortunately, this naïve approach breaks quickly

Example:

– dog(X) ←0.6 domestic_animal(X).

canBark(X) ←0.9 dog(X).

⊢ canBark(X) ←0.54 domestic_animal(X).

– domestic_animal(X) ←1.0 cat(X).

⊢ canBark(X) ←0.54 cat(X).

Cats can bark with 0.54 probability? Something is wrong…

– Problem:

dog(X) ←0.6 domestic_animal(X) ←1.0 cat(X). ??

9.2 Probabilistic Reasoning

(41)

• Why can't we have any confidence about barking cats?

– Not enough information!

– We don't know about

P(cat(X)|dog(X)), or P(bark(X)|cat(X)), …

9.2 Probabilistic Reasoning

dog(X) ←0.7 domestic_animal(X).

canBark(X) ←0.9 dog(X).

domestic_animal(X) ←1.0 cat(X)

canBark(X) ←?? domestic_animal(X).

canBark(X) ←?? cat(X).

[Venn diagrams over domestic animals, dogs, cats, and barking: one in which all cats bark and one in which no cat barks; both are consistent with the rules above]

(42)

• Given two events with their respective probabilities, P(A)=α and P(B)=β, how could they be related, i.e., what is P(A ⋀ B)?

a) A and B could be independent, and thus P(A ⋀ B) :=P(A) * P(B)

e.g. P(isMonday(today)), P(cat(Garfield))

b) A and B could be mutually exclusive, thus P(A ⋀ B) := 0

e.g. P(isMonday(today)), P(isTuesday(today))

c) A implies B, thus P(A ⋀ B) = P(A)

e.g. P(isCat(X)), P(isAnimal(X))

9.2 Probabilistic Reasoning

(43)

d) There could also be no quantifiable relationship between P(A) and P(B)

However, according to Boole, we can at least provide an interval which contains P(A ⋀ B)

max(0, P(A)+P(B)-1) ≤ P(A ⋀ B) ≤ min(P(A), P(B))

Those two boundaries are called T-Norms

Minimum T-Norm: min(a, b) (also known as Gödel T-Norm)
Łukasiewicz T-Norm: max(0, a+b-1)

Example: P(A) = 0.33, P(B) = 0.23 ⟹ 0 ≤ P(A ⋀ B) ≤ 0.23

9.2 Probabilistic Reasoning
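The two bounds can be sketched as a small helper (the function name `conj_bounds` is ours):

```python
def conj_bounds(pa, pb):
    """Boole/Fréchet bounds for P(A ∧ B) when nothing else is known:
    Łukasiewicz T-norm as lower bound, minimum T-norm as upper bound."""
    return max(0.0, pa + pb - 1), min(pa, pb)

print(conj_bounds(0.33, 0.23))  # (0.0, 0.23)
```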

[Diagrams of the three cases: P(A) and P(B) overlapping, disjoint, and nested]

(44)

– Obviously, there may also be many additional cases (like negative correlation, A implies B given C, etc.)
– However, if there is no further information available, upper/lower bound estimation is the only possible approach

We should try to incorporate these results into our to-be-developed chaining rule

– Thus, we can conclude

If there are no further properties known for A and B but their probabilities, their combined probability can only be described by an interval

9.2 Probabilistic Reasoning

(45)

• Confidence intervals also help to model probabilistic rules

– B ←(x1, x2) A iff 0 ≤ x1 ≤ P(B|A) ≤ x2 ≤ 1

i.e. given A, the probability for B is between x1 and x2

If x1=x2, this can still be abbreviated with B ←x1 A

e.g. canBark(X) ← (0.8, 1.0) dog(X)

9.2 Probabilistic Reasoning

(46)

• Also, rules combined with their converse can be stated that way

– A ←(x1, x2) B and its converse B ←(y1, y2) A, denoted as A (y1, y2)(x1, x2) B

– e.g. domesticAnimal(X) (0.3, 0.3) (1.0, 1.0) cat(X)

9.2 Probabilistic Reasoning

(47)

• The dominant reason for these flawed deductions is mixing causal rules with diagnostic rules

Causal Rules: Relate a known cause to its effect

A is the cause for B; A is given and B happened because of A

e.g., groundIsWet ←1.0 sprinklerWasOn

Diagnostic Rules: Try to relate an observable effect to its cause

i.e., B ←0.2 A

B is the cause for A, but just with a weaker probability / belief

e.g., sprinklerWasOn ←0.3 groundIsWet

9.2 Probabilistic Reasoning

(48)

• Rule chaining along just causal OR diagnostic rules works just fine

– groundIsWet ←1.0 sprinklerWasOn
youGetWetFeet ←0.97 groundIsWet

⊢ youGetWetFeet ←0.97 sprinklerWasOn

– groundIsWet ←0.2 youGotWetFeet

itRained ←0.9 groundIsWet

⊢ itRained ←0.18 youGotWetFeet

9.2 Probabilistic Reasoning

(49)

• But careful:

– Causal: groundIsWet ←1.0 sprinklerWasOn
– Diagnostic: itRained ←0.9 groundIsWet

– but not: itRained ←0.9 sprinklerWasOn

9.2 Probabilistic Reasoning

both are causes of wet ground, but are otherwise unrelated

(50)

• Causal and diagnostic can be treated in pairs

– Diagnostic rules are the converse of causal rules
– groundIsWet ←1.0 sprinklerWasOn
groundIsWet 0.1→ sprinklerWasOn

Written as:
groundIsWet (0.1)(1.0) sprinklerWasOn
groundIsWet (0.9)(1.0) itRained

– Now, we need a heuristic for dealing with diagnostic and causal rules together

9.2 Probabilistic Reasoning

(51)

Observation:

– Causal rules usually have a quite high probability:

B ←~1.0 A

If probability was low, A is not really the cause for B

– Diagnostic rules usually have a lower probability:

A ←≪1.0 B

i.e., B may be the effect of A, but it is usually also the effect of other causes

9.2 Probabilistic Reasoning

(52)

Observation:

– So, the main syntactic difference between those rule types is the strength of belief in the deduction

– Consider bi-directional rules:

groundIsWet (0.1)(1.0) sprinklerWasOn

– Thus, when chaining two rules with diverging probabilities, we probably mix diagnostic and causal rules

A chaining rule needs a strong dampening factor for diverging probabilities

9.2 Probabilistic Reasoning

is probably causal rule, sprinkler wets the ground for sure

is probably diagnostic rule; there may be many other reasons for wet ground

(53)

• A correct chaining rule can be given as follows:

– C (y1, y2)(x1, x2) B, B (v1, v2)(u1, u2) A ⊢ C ←(z1, z2) A

– z1 =
    (u1/v1) * max(0, v1+x1-1)    if v1 > 0
    u1                           if v1 = 0 and x1 = 1
    0                            otherwise

– z2 =
    min(1, u2+t*(1-y1), 1-u1+t*y1, t), with t = (u2*x2)/(v1*y1)    if v1 > 0 and y1 > 0
    min(1, 1-u1+(u2*x2)/v1)      if v1 > 0 and y1 = 0
    1-u1                         if v1 = 0 and x2 = 0
    1                            otherwise

9.2 Probabilistic Reasoning

Proof and derivation in:

(54)

• This chaining rule can be obtained by a lengthy proof within a deductive calculus

– …thus, it is correct

– Unfortunately, it is not really intuitively obvious what it does and how it works

But we can try to find some rationales

– The chaining rule tries to „play safe‟ by incorporating the T-Norms as a worst-case estimation

9.2 Probabilistic Reasoning

Łukasiewicz T-Norm as safe lower bound
Minimum T-Norm as safe upper bound
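The piecewise chaining rule can be implemented directly; this sketch assumes the reconstruction of the slide's formula (in particular t = u2·x2/(v1·y1)) and uses our own function name:

```python
def chain(x, y, u, v):
    """Chaining rule from the slides: C (y1,y2)(x1,x2) B and
    B (v1,v2)(u1,u2) A entail C <-(z1,z2) A.
    Each argument is an interval (lower, upper); y and v are the
    bounds of the converse rules."""
    x1, x2 = x
    y1, _ = y
    u1, u2 = u
    v1, _ = v
    # lower bound z1
    if v1 > 0:
        z1 = u1 / v1 * max(0.0, v1 + x1 - 1)
    elif x1 == 1:            # v1 = 0 and x1 = 1
        z1 = u1
    else:
        z1 = 0.0
    # upper bound z2
    if v1 > 0 and y1 > 0:
        t = u2 * x2 / (v1 * y1)
        z2 = min(1.0, u2 + t * (1 - y1), 1 - u1 + t * y1, t)
    elif v1 > 0:             # y1 = 0
        z2 = min(1.0, 1 - u1 + u2 * x2 / v1)
    elif x2 == 0:            # v1 = 0
        z2 = 1 - u1
    else:
        z2 = 1.0
    return z1, z2

# barks <-0.9 dog (converse unknown: (0,1)) chained with
# dog (1.0,1.0)(0.7,0.7) domestic:
lo, hi = chain((0.9, 0.9), (0.0, 1.0), (0.7, 0.7), (1.0, 1.0))
print(round(lo, 2), round(hi, 2))  # 0.63 0.93
```

Chaining the result further with domestic (0.3,0.3)(1.0,1.0) cat yields (0.0, 1.0), reproducing the barking-cats interval on the next slide.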

(55)

• But it works:

– dog(X) (1.0)(0.7) domestic(X).
barks(X) ←0.9 dog(X).
domestic(X) (0.3)(1.0) cat(X).

– By using the chaining rule, we get

⊢ barks(X) ←(0.63, 0.93) domestic(X).

⊢ barks(X) ←(0.0, 1.0) cat(X).

– If now additional knowledge is added, the belief intervals change

dog(X) ←1.0 barks(X). (Only dogs bark)
dog(X) ←0.0 cat(X). (Cats are no dogs)

⊢ barks(X) ←(0.0, 0.0) cat(X). (No barking cats)

9.2 Probabilistic Reasoning

(56)

• The chaining rule dampens all conclusions which seem to involve mixed causal/diagnostic chains

9.2 Probabilistic Reasoning

Rule: C (y1, y2)(x1, x2) B, B (v1, v2)(u1, u2) A ⊢ C ←(z1, z2) A

rain (1.0)(0.9) wet. (probably diagnostic)
wet (0.1)(1.0) sprinkler. (known: causal)

The rule yields a very low lower bound and a very high upper bound if the chained rules seem to be of different type

(57)

• Let's try to perform a "safer" chaining

9.2 Probabilistic Reasoning

Rule: C (y1, y2)(x1, x2) B, B (v1, v2)(u1, u2) A ⊢ C ←(z1, z2) A

wetFeet (0.2)(0.97) wetGround. (known: causal, with diagnostic converse)
wetGround (0.1)(1.0) sprinkler. (known: causal, with diagnostic converse)

Result: wetFeet ←(0.7, 1.0) sprinkler

Chaining along causal rules yields higher values

(58)

• Summary: probabilistic deduction

– Chaining rules produce new rules which are only true within a certain confidence interval

Non-monotonicity is reflected by adjusting those confidence intervals

– For computing the confidence intervals of a chain, the converse rules are considered

Thus, the problem of chaining diagnostic and causal rules is solved implicitly

9.2 Probabilistic Reasoning

(59)

Bayesian belief networks are used to represent a set of random variables and their conditional probabilities

– Introduced by Judea Pearl (UCLA) in 1985

• The networks explicitly model the independence relationships in the data

– These independence relationships can then be used to make probabilistic inferences

9.3 Bayesian Belief Networks

(60)

• Bayesian networks are directed acyclic graphs whose nodes represent random variables

Edges represent the direct (causal) influence between variables

Missing edges encode conditional independencies between the variables

– What causes toothaches?

Has flu anything to do with it?

9.3 Bayesian Belief Networks

[Example network: nodes toothache, periodontosis, cavities, flu]

(61)

Nodes are annotated with (conditional) probabilities

– Root nodes are assigned prior probability distributions

– Child nodes are assigned conditional probability tables with respect to their parents

9.3 Bayesian Belief Networks

[Example network: toothache with parents cavities and periodontosis, plus flu and gum bleeding, annotated with:
P(has_flu), P(has_cavities), P(has_periodontitis),
P(toothache | has_cavities, has_periodontitis), P(toothache | has_cavities, ¬has_periodontitis),
P(toothache | ¬has_cavities, has_periodontitis), P(toothache | ¬has_cavities, ¬has_periodontitis),
P(gum bleeding | has_periodontitis), P(gum bleeding | ¬has_periodontitis)]

(62)

• What is the full joint distribution?

– P(X1, X2, ..., Xn)

= P(X1) * P(X2, X3, ..., Xn | X1)

=P(X1) * P(X2|X1) * P(X3, X4, ..., Xn| X1, X2)

= ...

= P(X1) * P(X2|X1) * P(X3|X1, X2)* ...* P(Xn| X1,...,Xn-1)

– Note that we did not use any independence assumption here

9.3 Bayesian Belief Networks

(63)

• Now, use the semantics of Bayesian belief networks (local Markov property)

– Let X1, …, Xn be an ordering of the nodes such that only the nodes that are indexed lower than i may have a directed path to Xi

– The full joint distribution can now be defined as the product of the local conditional distributions

P(X1, …, Xn) = Π1≤ i ≤ n P(Xi | Parents(Xi))

Note that all these probabilities are available in the network

9.3 Bayesian Belief Networks

(64)

• For example, what is the joint probability that somebody has periodontitis and toothache, but no cavities?

– P(has_periodontitis, ¬has_cavities, toothache)

= P(has_periodontitis) * P(¬has_cavities) *

P(toothache | ¬has_cavities, has_periodontitis)

9.3 Bayesian Belief Networks
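The same computation in Python, with assumed illustrative CPT values (the slides give no concrete numbers):

```python
# Assumed illustrative CPT values (not from the slides)
p_perio = 0.10                      # P(has_periodontitis)
p_cav = 0.20                        # P(has_cavities)
p_tooth = {                         # P(toothache | has_cavities, has_periodontitis)
    (True, True): 0.90, (True, False): 0.70,
    (False, True): 0.60, (False, False): 0.05,
}

# Local Markov property: the joint is the product of the local distributions
p = p_perio * (1 - p_cav) * p_tooth[(False, True)]
print(round(p, 4))  # 0.048
```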

(65)

• Given a Bayesian network and its conditional probability tables, we can compute all probabilities of the form P(H | X1, X2, …, Xn)

– Where H and X1, X2, ..., Xn are assignments to nodes (i.e. random variables) in the network

– H is the hypothesis we are interested in
– X1, X2, ..., Xn are the influences

• By being conditionally dependent on their parents, beliefs are propagated through the network

9.3 Bayesian Belief Networks

(66)

Inferring causal or diagnostic information can be done using the joint probability distributions

– E.g., what is the probability that somebody has cavities given that he/she suffers from toothache?

– Can be evaluated using the conditional probability formula:

P(has_cavities | toothache)
= P(has_cavities, toothache) / P(toothache)

9.3 Bayesian Belief Networks
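Such conditional queries can be answered by enumeration: sum the joint distribution over the hidden variables. The CPT values below are assumed for illustration (flu is omitted, since it is independent of toothache in the network and cancels out of the quotient):

```python
from itertools import product

# Assumed illustrative CPTs (not given on the slides)
p_perio, p_cav = 0.10, 0.20
p_tooth = {(True, True): 0.90, (True, False): 0.70,
           (False, True): 0.60, (False, False): 0.05}

def joint(cav, perio, tooth):
    """P(cav, perio, tooth) via the network factorization."""
    p = (p_cav if cav else 1 - p_cav) * (p_perio if perio else 1 - p_perio)
    pt = p_tooth[(cav, perio)]
    return p * (pt if tooth else 1 - pt)

# P(has_cavities | toothache) = P(has_cavities, toothache) / P(toothache),
# summing the hidden variable periodontitis out
num = sum(joint(True, perio, True) for perio in (True, False))
den = sum(joint(cav, perio, True) for cav, perio in product((True, False), repeat=2))
print(round(num / den, 3))  # 0.632
```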

(67)

• A Bayesian belief network for breast cancer diagnosis

9.3 Example: Medicine

(68)

• More reasoning

– Fuzzy logic and possibilistic systems
– Case-based reasoning

• Heuristic reasoning

9 Next Lecture
