Knowledge-Based Systems and Deductive Databases

(1)

Wolf-Tilo Balke, Hermann Kroll

Institut für Informationssysteme

Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

Knowledge-Based Systems and Deductive Databases

(2)

8. Deduction with Uncertainty

8.1 Uncertain Knowledge
8.2 Probabilistic Application
8.3 Belief Networks

(3)

• We have discussed ways of deriving new facts from other (ground) facts

But often several rules can lead to a certain fact and we cannot be sure which one it was

A patient experiences a toothache; what is the reason?

Sometimes a certain fact might be derived from ground facts only in certain cases

A normal bird can fly, except for penguins, ostriches,…

8.1 Uncertainty

(4)

• Typical sources of imperfect information in deductive databases are…

Incomplete information

Information is simply missing, which might clash with the closed world assumption

Imprecise information

The information needed has only been specified in a vague way, e.g., a person is young: young(Tim).

Queries about Tim’s age are difficult to answer, e.g., ?age(Tim, 67) is false, but what about ?age(Tim, 25)?

Uncertain information

A deduction is not always correct, e.g., the question whether a bird can fly: fly(X) :- bird(X).

What about penguins, dead birds, or birds with clipped wings?

8.1 Uncertainty

(5)

• Consider an expert system for dentists

All possible causes for toothaches are contained in a database and the reason should be deduced

cavities(X) :- toothache(X).

periodontosis(X) :- toothache(X).

Not very helpful, since all possible causes are listed. Thus, all rules fire…

cavities(X) :- toothache(X), ¬periodontosis(X).

periodontosis(X) :- toothache(X), ¬cavities(X).

Not very helpful either, because now we need to disprove all alternatives before any rule fires…

Remember the assumption of ‘negation as failure’

8.1 Uncertainty

(6)

• But how do dentists deal with the problem?

Like in our second program, look for positive or negative clues

e.g., bleeding of gums,…

• Still, how does a dentist know what to look for?

What are probable causes?

What are possible causes?

Knowing the patient, what is the (subjective) judgement?

8.1 Uncertainty

(7)

Basic idea: assign a measure of validity to each rule or statement and propagate this measure through the deduction process

Probabilistic truth values

Use statistics: how often are cavities the reason, and how often is periodontosis?

Leads to a probability distribution over possible worlds

Possibility values

What are possible causes and to what degree do they cause toothache?

Leads to a possibility distribution over possible worlds

Belief values

Lead to belief networks with facts that may influence each other

– …

8.1 Uncertainty

(8)

• Usually, dealing with uncertainty requires an open world assumption

Facts not stated in the database may or may not be false

• But the reasoning gets more difficult

Remember our discussion about the existence of several minimal models in Datalog¬

The reasoning process is not monotonic any more

Introduction of new knowledge might lead to a revision (and sometimes refutation) of previously derived facts

8.1 Uncertainty

(9)

Probability theory deals with expressing the belief or knowledge that a certain event will occur or has occurred

• In general, there are two major factions among probability theorists

Frequentistic view:

Probability of an event is its relative frequency of occurrence during a long-running random experiment

Major supporters: Neyman, Pearson, Wald, …

8.2 Probability

(10)

Bayesian view:

Probabilities can be assigned to any event or statement whether it is part of a random process or not

Probabilities thus express the degree of belief that a given event will happen

Major supporters: Bayes, Laplace, de Finetti, …

• During the following slides, we will encounter both views

…but still, formal notation and theory is similar in both

8.2 Probability

(11)

• The probability of an event or statement A is given by P(A)

– P(A) ∈ [0,1]

P(¬A):=1-P(A)

Depending on your world view, a probability of P(A)=0.8 may mean

During a long-running random experiment, A was the outcome of 80% of all tries

You have a strong belief (quantified by 0.8 of a maximum of 1) that A can / will happen

8.2 Probability

(12)

• Given two events A and B and assuming that they are statistically independent of each other, probabilities may be combined

– P(A ⋀ B)= P(A) * P(B)

also written P(A, B)

e.g.

P(isYellow(Tweety))=0.8 and P(canFly(Tweety))=0.2

⤇ P(isYellow(Tweety), canFly(Tweety)) = 0.16

8.2 Probability

(13)

• However, events are often not independent, thus we need conditional probabilities

This is written as P(A | B)

P(A | B) is the conditional probability of A given B

P(A | B) := P(A ⋀ B) / P(B)

e.g. P(canBark(X) | dog(X)) = 0.9

Given that X is a dog, X can bark with a probability of 0.9

• Based on conditional probabilities, we can derive a simple deductive system

Probabilistic rules:

B ←P(B|A) A or B :- P(B|A) A

8.2 Probabilistic Reasoning
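These two combination rules are easy to verify numerically; a minimal Python sketch (the Tweety values come from the example above, while the dog prior and the joint are invented for illustration):

p_yellow, p_fly = 0.8, 0.2
# independence: P(A ⋀ B) = P(A) * P(B)
print(p_yellow * p_fly)                # 0.16, as above

# conditional probability: P(A | B) = P(A ⋀ B) / P(B)
p_dog = 0.4             # assumed prior P(dog(X)), not from the slides
p_bark_and_dog = 0.36   # assumed joint P(canBark(X) ⋀ dog(X)), not from the slides
print(p_bark_and_dog / p_dog)          # P(canBark(X) | dog(X)) = 0.9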

(14)

• Of course, we can also form deductive chains

Example:

– dog(X) ←0.6 domestic_animal(X).
– canBark(X) ←0.9 dog(X).

⊢ canBark(X) ←?? domestic_animal(X).

So, assuming statistical independence between barking and domestic animals, we may conclude that the probabilities may be just multiplied, i.e.

⊢ canBark(X) ←0.54 domestic_animal(X).

8.2 Probabilistic Reasoning

(15)

• Unfortunately, this naïve approach breaks quickly

Example:

– dog(X) ←0.6 domestic_animal(X).
– canBark(X) ←0.9 dog(X).
⊢ canBark(X) ←0.54 domestic_animal(X).

– domestic_animal(X) ←1.0 cat(X).
⊢ canBark(X) ←0.54 cat(X).

Cats can bark with 0.54 probability? Something is wrong…

Problem: dog(X) ←0.6 domestic_animal(X) ←1.0 cat(X). ??

8.2 Probabilistic Reasoning

(16)

• Why can’t we have any confidence about barking cats?

Not enough information!

We don’t know about

P(cat(x)|dog(x)), or P(bark(X)|cat(X)), …

8.2 Probabilistic Reasoning

dog(X) ←0.7 domestic_animal(X).
canBark(X) ←0.9 dog(X).
domestic_animal(X) ←1.0 cat(X).
canBark(X) ←?? domestic_animal(X).
canBark(X) ←?? cat(X).

[Figure: two Venn diagrams of domestic animals containing dogs (which bark) and cats — one extreme where all cats bark, one where no cat barks]

(17)

• Given two events with their respective probabilities, P(A)=α and P(B)=β, how could they be related, i.e. what is P(A ⋀ B)?

a) A and B could be independent, and thus

P(A ⋀ B) :=P(A) * P(B)

e.g. P(isMonday(today)), P(cat(Garfield))

b) A and B could be mutually exclusive, thus

P(A ⋀ B) := 0

e.g. P(isMonday(today)), P(isTuesday(today))

c) A implies B, thus P(A ⋀ B) = P(A)

e.g. P(isCat(X)), P(isAnimal(X))

8.2 Probabilistic Reasoning

(18)

d) There could also be no quantifiable relationship between P(A) and P(B)

However, according to Boole, we can at least provide an interval which contains P(A ⋀ B)

max(0, P(A)+P(B)-1) ≤ P(A ⋀ B) ≤ min(P(A), P(B))

Those two boundaries are called T-Norms

Minimum T-Norm: min(a, b) (also known as Gödel T-Norm)
Łukasiewicz T-Norm: max(0, a+b-1)

Example: P(A) = 0.33, P(B) = 0.23 ⇒ 0 ≤ P(A ⋀ B) ≤ 0.23

8.2 Probabilistic Reasoning

[Figure: three overlap diagrams — A and B disjoint: P(A ⋀ B) = 0 = max(0, a+b-1); partial overlap: P(A ⋀ B) = 0.053; B contained in A: P(A ⋀ B) = 0.23 = min(a, b)]
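Both bounds are straightforward to compute; a minimal Python sketch:

def lukasiewicz(a, b):
    # Łukasiewicz T-Norm: safe lower bound for P(A ⋀ B)
    return max(0.0, a + b - 1.0)

def minimum(a, b):
    # Minimum (Gödel) T-Norm: safe upper bound for P(A ⋀ B)
    return min(a, b)

# example from above: P(A) = 0.33, P(B) = 0.23
print(lukasiewicz(0.33, 0.23), minimum(0.33, 0.23))   # 0.0 0.23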

(19)

Obviously, there may also be many additional cases (like negative correlation, A implies B when C, etc...)

However, if there is no further information available, upper/lower bound estimation is the only possible approach

We should try to incorporate these results into our to-be-developed chaining rule

Thus, we can conclude

If there are no further properties known for A and B but their probabilities, their combined probability can only be described by an interval

8.2 Probabilistic Reasoning

(20)

• Confidence intervals also help to model probabilistic rules

– B ←(x1, x2) A iff 0 ≤ x1 ≤ P(B|A) ≤ x2 ≤ 1

i.e. given A, the probability for B is between x1 and x2

If x1=x2, this can still be abbreviated with B ←x1 A

e.g. canBark(X) ←(0.8, 1.0) dog(X)

8.2 Probabilistic Reasoning

(21)

• Also, rules combined with their converse can be stated that way

– A ←(x1, x2) B and its converse B ←(y1, y2) A, denoted as A ←(y1, y2)(x1, x2) B

– e.g. domesticAnimal(X) ←(0.3, 0.3)(1.0, 1.0) cat(X)

8.2 Probabilistic Reasoning

(22)

• The dominant reason for these flawed deductions is mixing causal rules with diagnostic rules

Causal Rules: Relate a known cause to its effect

A is the cause for B; A is given and B happened because of A

e.g. groundIsWet ←1.0 sprinklerWasOn

Diagnostic Rules: Try to relate an observable effect to its cause

i.e. B ←0.2 A

B is the cause for A, but just with a weaker probability / belief

e.g. sprinklerWasOn ←0.3 groundIsWet

8.2 Probabilistic Reasoning

(23)

• Rule chaining along just causal OR diagnostic rules works just fine

groundIsWet ←1.0 sprinklerWasOn and youGetWetFeet ←0.97 groundIsWet
⊢ youGetWetFeet ←0.97 sprinklerWasOn

groundIsWet ←0.2 youGotWetFeet and itRained ←0.9 groundIsWet
⊢ itRained ←0.18 youGotWetFeet

8.2 Probabilistic Reasoning

(24)

• But careful:

Causal: groundIsWet ←1.0 sprinklerWasOn

Diagnostic: itRained ←0.9 groundIsWet

but not: itRained ←0.9 sprinklerWasOn
(both are causes of wet ground, but are otherwise unrelated)

8.2 Probabilistic Reasoning

(25)

• Causal and diagnostic rules can be treated in pairs

Diagnostic rules are the converse of causal rules

groundIsWet ←1.0 sprinklerWasOn and its converse sprinklerWasOn ←0.1 groundIsWet

Written as: groundIsWet ←(0.1)(1.0) sprinklerWasOn

Similarly: groundIsWet ←(0.9)(1.0) itRained

Now, we need a heuristic for dealing with diagnostic and causal rules together

8.2 Probabilistic Reasoning

(26)

Observation:

Causal rules usually have a quite high probability: B ←~1.0 A

If the probability was low, A would not really be the cause for B

Diagnostic rules usually have a lower probability: A ←≪1.0 B

i.e., B may be the effect of A, but it is usually also the effect of other causes

8.2 Probabilistic Reasoning

(27)

Observation:

– So, the main syntactic difference between those rule types is the strength of belief in the deduction

– Consider bi-directional rules:
groundIsWet ←(0.1)(1.0) sprinklerWasOn
The 1.0 part is probably a causal rule: the sprinkler wets the ground for sure
The 0.1 part is probably a diagnostic rule: there may be many other reasons for wet ground

– Thus, when chaining two rules with diverging probabilities, we probably mix diagnostic and causal rules

– A chaining rule needs a strong dampening factor for diverging probabilities

8.2 Probabilistic Reasoning

(28)

• A correct chaining rule can be given as follows:

C ←(y1, y2)(x1, x2) B,  B ←(v1, v2)(u1, u2) A  ⊢  C ←(z1, z2) A

z1 = (u1/v1) * max(0, v1+x1-1)    if v1 > 0
z1 = u1                           if v1 = 0 and x1 = 1
z1 = 0                            otherwise

z2 = min(1, u2+t*(1-y1), 1-u1+t*y1, t)  with t = (u2*x2)/(v1*y1)    if v1 > 0 and y1 > 0
z2 = min(1, 1-u1+(u2*x2)/v1)                                        if v1 > 0 and y1 = 0
z2 = 1-u1                                                           if v1 = 0 and x2 = 0
z2 = 1                                                              otherwise

Proof and derivation in:

U. Güntzer, W. Kießling, H. Thöne. New directions for uncertainty reasoning in deductive databases. Proc. ACM SIGMOD, 1991

8.2 Probabilistic Reasoning
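Because the case analysis is easy to get wrong by hand, here is a minimal Python sketch of the chaining rule as stated above (intervals as pairs; an unknown converse defaults to the vacuous interval (0.0, 1.0); point probabilities are degenerate intervals). The printed values reproduce the examples on the following slides:

def chain(x, y, u, v):
    # C <-(y)(x)- B and B <-(v)(u)- A entail C <-(z1, z2)- A
    # x = [x1,x2] bounds P(C|B), y = [y1,y2] bounds P(B|C),
    # u = [u1,u2] bounds P(B|A), v = [v1,v2] bounds P(A|B)
    x1, x2 = x; y1, y2 = y; u1, u2 = u; v1, v2 = v
    if v1 > 0:
        z1 = (u1 / v1) * max(0.0, v1 + x1 - 1.0)
    elif x1 == 1.0:
        z1 = u1
    else:
        z1 = 0.0
    if v1 > 0 and y1 > 0:
        t = (u2 * x2) / (v1 * y1)
        z2 = min(1.0, u2 + t * (1.0 - y1), 1.0 - u1 + t * y1, t)
    elif v1 > 0:
        z2 = min(1.0, 1.0 - u1 + (u2 * x2) / v1)
    elif x2 == 0.0:
        z2 = 1.0 - u1
    else:
        z2 = 1.0
    return z1, z2

# barks <-(0,1)(0.9)- dog, dog <-(1.0)(0.7)- domestic  =>  (0.63, 0.93)
print(chain((0.9, 0.9), (0.0, 1.0), (0.7, 0.7), (1.0, 1.0)))
# barks <-(0,1)(0.63,0.93)- domestic, domestic <-(0.3)(1.0)- cat  =>  (0.0, 1.0)
print(chain((0.63, 0.93), (0.0, 1.0), (1.0, 1.0), (0.3, 0.3)))
# wetFeet <-(0.2)(0.97)- wetGround, wetGround <-(0.1)(1.0)- sprinkler  =>  (0.7, 1.0)
print(chain((0.97, 0.97), (0.2, 0.2), (1.0, 1.0), (0.1, 0.1)))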

(29)

• This chaining rule can be obtained by a lengthy proof within a deductive calculus

…thus, it is correct

Unfortunately, it is not really intuitively obvious what it does and how it works

But we can try to find some rationales

The chaining rule tries to ‘play safe’ by incorporating the T-Norms as a worst-case estimation

8.2 Probabilistic Reasoning

Łukasiewicz T-Norm as safe lower bound; Minimum T-Norm as safe upper bound

(30)

• But it works:

– dog(X) ←(1.0)(0.7) domestic(X).
– barks(X) ←0.9 dog(X).
– domestic(X) ←(0.3)(1.0) cat(X).

– By using the chaining rule, we get

⊢ barks(X) ←(0.63, 0.93) domestic(X).
⊢ barks(X) ←(0.0, 1.0) cat(X).

– If now additional knowledge is added, the belief intervals change

dog(X) ←1.0 barks(X). (Only dogs bark)
dog(X) ←0.0 cat(X). (Cats are no dogs)

⊢ barks(X) ←(0.0)(0.0) cat(X). (No barking cats)

8.2 Probabilistic Reasoning

(31)

• The chaining rule dampens all conclusions which seem to involve mixed causal/diagnostic chains

8.2 Probabilistic Reasoning

Rule: C ←(y1, y2)(x1, x2) B,  B ←(v1, v2)(u1, u2) A  ⊢  C ←(z1, z2) A

Known: rain ←(1.0)(0.9) wet.  wet ←(0.1)(1.0) sprinkler.
(the 0.9 part looks diagnostic, while the 1.0 parts look causal)

⊢ rain ←(0.0, 1.0) sprinkler.
(z1 gets a very low value and z2 a very high value if the chained rules seem of different type)

(32)

• Let’s try to perform a “safer” chaining

8.2 Probabilistic Reasoning

Rule: C ←(y1, y2)(x1, x2) B,  B ←(v1, v2)(u1, u2) A  ⊢  C ←(z1, z2) A

Known: wetFeet ←(0.2)(0.97) wetGround.  wetGround ←(0.1)(1.0) sprinkler.
(here the 0.97 and 1.0 parts are causal, the 0.2 and 0.1 parts diagnostic)

Result: wetFeet ←(0.7, 1.0) sprinkler
(chaining along the causal direction yields higher values)

(33)

• Summary: probabilistic deduction

Chaining rules produce new rules which are only true within a certain confidence interval

Non-monotonicity is reflected by adjusting those confidence intervals

For computing the confidence intervals of a chain, the converse rules are considered

Thus, the problem of chaining diagnostic and causal rules is solved implicitly

8.2 Probabilistic Reasoning

(34)

Non-monotonic reasoning considers that sometimes statements considered true have to be revised in the light of new facts

Tweety is a bird.

Can Tweety fly? Yes!

Tweety is a bird. Tweety is 2.5 meters tall.

Can Tweety fly? No!

The introduction of a new fact has challenged the general rule that birds can fly

Only ostriches reach a height of 2.5 meters!

8.1 Non-Monotonic Reasoning

(35)

• There are several classical approaches to dealing with the problem

Default logic

Predicate circumscription

Autoepistemic reasoning

8.1 Non-Monotonic Reasoning

(36)

Default logic was proposed by Raymond Reiter (University of Toronto) in 1980

Can express logical facts like

‘by default, something is true’

Basically a default theory consists of two parts D and W

W is a set of first order logical formulae known to be true

D is a set of default rules of the form

  prerequisite : justification1, …, justificationn
  ------------------------------------------------
  conclusion

8.1 Default Logic

(37)

–   prerequisite : justification1, …, justificationn
    ------------------------------------------------
    conclusion

– If we believe the prerequisite to be true, and each of the justificationi is consistent with our current beliefs, we are led to believe that conclusion is true

Example:
  bird(X) : fly(X)
  ----------------
  fly(X)
with W = {bird(condor), bird(penguin), fly(eagle), ¬fly(penguin)}

fly(condor) is true by default, since it is a bird and we have no justification to believe otherwise

But fly(penguin) cannot be derived here, since although bird(penguin) is true, we know that the justification is false

Neither can we deduce bird(eagle), which would be abduction

8.1 Default Logic

(38)

• A common default assumption is the closed world assumption:

  true : ¬F
  ---------
  ¬F

• The semantics of default logics is again based on fixpoints

– Use set W as initial theory T

– Add to a theory T every fact that can be deduced by using any of the default rules in D, so-called extensions to the theory T

– Repeat until nothing new can be deduced

– If T is consistent with all justifications of the default rules used to derive any extension, output T

8.1 Default Logic
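For ground theories, this fixpoint procedure can be sketched in a few lines of Python; a brute-force sketch that treats entailment as membership of ground literals (enough for the examples in this section, not a general first-order prover) and simply tries every application order:

from itertools import permutations

def neg(lit):
    # syntactic negation of a ground literal, written with a '~' prefix
    return lit[1:] if lit.startswith("~") else "~" + lit

def extensions(W, D):
    # W: set of ground literals; D: list of defaults (prerequisite, justifications, conclusion)
    found = set()
    for order in permutations(D):
        T, used = set(W), []
        changed = True
        while changed:
            changed = False
            for pre, justs, concl in order:
                applicable = (pre in T and concl not in T
                              and all(neg(j) not in T for j in justs))
                if applicable:
                    T.add(concl)
                    used.extend(justs)
                    changed = True
        # keep T only if every used justification is still consistent with it
        if all(neg(j) not in T for j in used):
            found.add(frozenset(T))
    return found

# the inconsistent default (true : A(X)) / ¬A(X) yields no extensions;
# 'true' is modeled here as an always-present fact
print(extensions({"true"}, [("true", ["A"], "~A")]))   # set()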

(39)

• The last check in the algorithm is necessary to avoid inconsistent theories

i.e. something has been deduced using a justification that was later proven to be false

E.g. consider a default rule
  true : A(X)
  -----------
  ¬A(X)
and W := Ø

Since A(X) is consistent with W, we may conclude ¬A(X), which however is inconsistent with the previously assumed A(X)

In this case the theory simply has no extensions

8.1 Default Logic

(40)

• Interestingly, the semantics is non-deterministic

The deduced theory may depend on the sequence in which defaults are applied

Example: D consists of the two default rules
  bird(X) : fly(X)        penguin(X) : ¬fly(X)
  ----------------        --------------------
  fly(X)                  ¬fly(X)
with W = {bird(Tweety), penguin(Tweety)}

Starting with W both default rules are applicable

If we use the first rule, the extension fly(Tweety) would be added, and the second default rule is no longer applicable

In case we apply the second rule first, the extension would be ¬fly(Tweety)

8.1 Default Logic

(41)

Entailment of a formula from a default theory can be defined in two ways

Skeptical entailment

A formula is entailed by a default theory if it is entailed by all its extensions

Credulous entailment

A formula is entailed by a default theory if it is entailed by at least one of its extensions

– For example, our Tweety theory has two extensions, one in which Tweety can fly and one in which he cannot fly

Neither extension is skeptically entailed

Both of them are credulously entailed

8.1 Default Logic
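Reusing extensions and neg from the sketch above, the Tweety theory indeed yields two extensions, and the two entailment notions can be read off directly:

W = {"bird(tweety)", "penguin(tweety)"}
D = [("bird(tweety)", ["fly(tweety)"], "fly(tweety)"),
     ("penguin(tweety)", ["~fly(tweety)"], "~fly(tweety)")]
exts = extensions(W, D)
print(len(exts))                    # 2

skeptical = set.intersection(*(set(e) for e in exts))
credulous = set.union(*(set(e) for e in exts))
print("fly(tweety)" in skeptical)   # False: not entailed by all extensions
print("fly(tweety)" in credulous)   # True: entailed by at least one extension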

(42)

Predicate circumscription was introduced by John McCarthy (Stanford University) in 1978

Inventor of LISP and the ‘space fountain’

Basically circumscription tries to formalize the common sense assumption that things are as expected, unless specified otherwise

8.1 Predicate Circumscription

(43)

• Consider the problem whether Tweety can fly, if we assume that Tweety is a penguin…

Sure, Tweety can fly, …

…because he takes a helicopter!

This solution is intuitively not valid, since no helicopter was mentioned in our facts

Of course we could exclude all possible ways to fly in our program, but…

8.1 Predicate Circumscription

(44)

• Circumscription is a rule of conjecture that can be used for jumping to certain conclusions

The objects that can be shown to have a certain property P by reasoning from certain facts A are all the objects that satisfy P

More generally, circumscription can be used to conjecture that the substitutions that can be shown to satisfy a predicate are all the tuples satisfying this predicate

Thus, the set of relevant tuples is circumscribed

8.1 Predicate Circumscription

(45)

Example: by circumscription a bird can be conjectured to fly unless something prevents it

The only entities that can prevent the bird from flying are those whose existence follows from the facts

If no clipped wings, penguin-hood, or other circumstances preventing flight are deducible, then the bird is concluded to fly

Basically, this can be done by adding a predicate ¬abnormal(X) to all rules about flying

The correctness of this conclusion depends on having taken into account all relevant facts when the circumscription was made

8.1 Predicate Circumscription

(46)

• Circumscription therefore tries to derive all minimal models of a set of formulae

– If we have a predicate p(X1, …, Xn), then a model tells whether the predicate is true for any possible substitution with terms for the Xi

The extension of p(X1, …, Xn) in a model is the set of substitutions for which p(X1, …, Xn) evaluates to true

– The circumscription of a formula is a minimization: believing only the least possible number of predicates

The circumscription of p(X1, …, Xn) in a formula is obtained by selecting only models with a minimal extension of p(X1, …, Xn)

8.1 Predicate Circumscription

(47)

Example

Consider a formula of the type A ⋀ (B ⋁ C) → D, like

fly(X) :- bird(X), eagle(X).
fly(X) :- bird(X), condor(X).

Obviously bird(X) has to be true in any model, but to be minimal only eagle(X) or condor(X) has to be true

Hence there are two circumscriptions of the formula, {bird(X), eagle(X)} and {bird(X), condor(X)}, but not {bird(X), eagle(X), condor(X)}

Note that predicates are evaluated as false whenever possible

eagle(X) and condor(X) cannot both be false

8.1 Predicate Circumscription
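For this propositional instance, the minimal models can be enumerated by brute force; a small sketch (exponential enumeration, for illustration only):

from itertools import product

def models(atoms, formula):
    # all truth assignments satisfying the formula, as sets of true atoms
    for bits in product((False, True), repeat=len(atoms)):
        m = dict(zip(atoms, bits))
        if formula(m):
            yield frozenset(a for a in atoms if m[a])

def minimal(ms):
    # keep only models whose extension is subset-minimal
    ms = list(ms)
    return [m for m in ms if not any(other < m for other in ms)]

atoms = ("bird", "eagle", "condor")
formula = lambda m: m["bird"] and (m["eagle"] or m["condor"])
print(minimal(models(atoms, formula)))
# [{bird, eagle}, {bird, condor}] — the non-minimal {bird, eagle, condor} is dropped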

(48)

• But sometimes circumscription handles disjunctive information incorrectly

– Toss a coin onto a chess board and consider the predicate lies_on(X, Y) describing where it lies

– There are several possibilities of models

Obviously {lies_on(coin, floor)} should be false, since it was not mentioned that the coin could miss the board

That leaves {lies_on(coin, white)}, {lies_on(coin, black)}, and {lies_on(coin, white), lies_on(coin, black)} for the overlapping case

– But the last model would be filtered out as not being minimal by circumscription

One possibility to remedy this case is theory curbing, where iteratively the least upper bound(s) of the minimal models are added until the set of models is closed

8.1 Predicate Circumscription

(49)

Autoepistemic Logic was introduced by Robert C. Moore (Microsoft Research) in 1985

• Autoepistemic logic can not only express facts, but also knowledge and lack of knowledge about facts

• Formalizes non-monotonicity using statements with a belief operator B

– For every well-formed formula F, the ‘belief atom’ B(F) means that F is believed

– ¬B(F) means that F is not believed

8.1 Autoepistemic Logic

(50)

• It uses the following axioms

All propositional tautologies are axioms

If we believe in B(X) :- A(X)., then whenever we believe in A(X), we also have to believe in B(X)

Inconsistent conclusions are never believed, i.e. ¬B(false)

• It uses modus ponens as inference rule

Given a conditional claim A → B and the truth of the antecedent A, it can be logically concluded that the consequent B must be true as well

8.1 Autoepistemic Logic

(51)

• This can be used to derive stable sets of sentences which are then believed

i.e. the reflection of our own state of knowledge

• If we do not believe in a fact, then we believe that we do not believe it

B(bird(X)) ⋀ ¬B(¬fly(X)) → fly(X)

If I believe that X is a bird and if I don’t believe that X cannot fly, then I will conclude that X flies

8.1 Autoepistemic Logic

(52)

• A belief theory T describes the knowledge base

A restricted belief interpretation of T is a set of belief atoms I such that for each B(F) appearing in T either B(F) ∈ I or ¬B(F) ∈ I (but not both)

A restricted belief model of T is a belief interpretation I such that T ⋃ I is consistent

8.1 Autoepistemic Logic

(53)

• Again expansions to the theory can be derived

Since all belief atoms have to be either true or false, the theory can be treated like propositional formulae

In particular, checking whether T entails F can be done using the rules of the propositional calculus

In order for an initial assumption to be an expansion, it must hold that F is entailed iff B(F) has been initially assumed true

8.1 Autoepistemic Logic
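For the flying-birds rule above, this expansion check can be carried out by brute force; a minimal sketch (ground atoms only; the theory's belief atoms are B(bird) and B(¬fly), written B(~fly) in code, and propositional entailment is decided by enumerating the four worlds over bird and fly):

from itertools import product

def entails(clauses, formula):
    # T entails F iff F holds in every world satisfying all clauses
    for bird, fly in product((False, True), repeat=2):
        w = {"bird": bird, "fly": fly}
        if all(c(w) for c in clauses) and not formula(w):
            return False
    return True

def objective_part(I):
    # reduce the theory under belief assignment I:
    # fact bird(tweety), and rule B(bird) ⋀ ¬B(¬fly) → fly
    clauses = [lambda w: w["bird"]]
    if I["B(bird)"] and not I["B(~fly)"]:
        clauses.append(lambda w: w["fly"])
    return clauses

for b1, b2 in product((False, True), repeat=2):
    I = {"B(bird)": b1, "B(~fly)": b2}
    T = objective_part(I)
    # stability: F is entailed iff B(F) was assumed true
    stable = (entails(T, lambda w: w["bird"]) == b1 and
              entails(T, lambda w: not w["fly"]) == b2)
    if stable:
        print("stable expansion:", I)   # believes bird, not ¬fly => fly is concluded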

(54)

Bayesian belief networks are used to represent a set of random variables and their conditional probabilities

Introduced by Judea Pearl (UCLA) in 1985

• The networks explicitly model the independence relationships in the data

These independence relationships can then be used to make probabilistic inferences

8.3 Bayesian Belief Networks

(55)

• Bayesian networks are directed acyclic graphs whose nodes represent random variables

Edges represent the direct (causal) influence between variables

Missing edges encode conditional independencies between the variables

What causes toothaches?

Has flu anything to do with it?

8.3 Bayesian Belief Networks

[Figure: example network with nodes flu, cavities, periodontosis, toothache, and gum bleeding]

(56)

Nodes are annotated with (conditional) probabilities

Root nodes are assigned prior probability distributions

Child nodes are assigned conditional probability tables with respect to their parents

8.3 Bayesian Belief Networks

[Figure: the toothache network annotated with priors P(has_flu), P(has_cavities), P(has_periodontitis) at the root nodes; the toothache node carries P(toothache | ±has_cavities, ±has_periodontitis) for all four parent combinations, and the gum bleeding node carries P(gum bleeding | has_periodontitis) and P(gum bleeding | ¬has_periodontitis)]

(57)

• What is the full joint distribution?

P(X1, X2, ..., Xn)
= P(X1) * P(X2, X3, ..., Xn | X1)
= P(X1) * P(X2 | X1) * P(X3, X4, ..., Xn | X1, X2)
= ...
= P(X1) * P(X2 | X1) * P(X3 | X1, X2) * ... * P(Xn | X1, ..., Xn-1)

Note that we did not use any independence assumption here

8.3 Bayesian Belief Networks

(58)

• Now, use the semantics of Bayesian belief networks (local Markov property)

Let X1, …, Xn be an ordering of the nodes such that only the nodes that are indexed lower than i may have a directed path to Xi

The full joint distribution can now be defined as the product of the local conditional distributions

P(X1, …, Xn) = Π(1 ≤ i ≤ n) P(Xi | Parents(Xi))

Note that all these probabilities are available in the network

8.3 Bayesian Belief Networks

(59)

• For example, what is the joint probability that somebody has periodontitis and toothache, but no cavities?

P(has_periodontitis, ¬has_cavities, toothache)
= P(has_periodontitis) * P(¬has_cavities) * P(toothache | ¬has_cavities, has_periodontitis)

8.3 Bayesian Belief Networks
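By the local Markov property this joint is just a three-factor product; a quick numeric sketch (the probability values are invented for illustration, as the slides give none):

P_perio, P_cav = 0.10, 0.15   # assumed priors
P_tooth = {                   # assumed P(toothache | cavities, periodontitis)
    (True, True): 0.9, (True, False): 0.7,
    (False, True): 0.6, (False, False): 0.05,
}

# P(has_periodontitis, ¬has_cavities, toothache)
print(P_perio * (1 - P_cav) * P_tooth[(False, True)])   # 0.10 * 0.85 * 0.6 = 0.051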

(60)

• Given a Bayesian network and its conditional probability tables, we can compute all probabilities of the form P(H | X1, X2, …, Xn)

Where H and X1, X2, ..., Xn are assignments to nodes (i.e. random variables) in the network

H is the hypothesis we are interested in

X1, X2, ..., Xn are the influences

• By being conditionally dependent on their parents, beliefs are propagated through the network

8.3 Bayesian Belief Networks

(61)

Inferring causal or diagnostic information can be done using the joint probability distributions

E.g., what is the probability that somebody has cavities given that he/she suffers from toothache?

Can be evaluated using the conditional probability formula:

P(has_cavities | toothache)
= P(has_cavities, toothache) / P(toothache)

8.3 Bayesian Belief Networks
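Such diagnostic queries can be answered by enumeration: sum the joint distribution over the unobserved variables and divide. A self-contained sketch with the same invented numbers as before:

from itertools import product

P_cav, P_perio = 0.15, 0.10   # assumed priors
P_tooth = {(True, True): 0.9, (True, False): 0.7,
           (False, True): 0.6, (False, False): 0.05}

def joint(cav, perio, tooth):
    # product of the local conditional distributions
    p = (P_cav if cav else 1 - P_cav) * (P_perio if perio else 1 - P_perio)
    pt = P_tooth[(cav, perio)]
    return p * (pt if tooth else 1 - pt)

# P(has_cavities | toothache) = P(has_cavities, toothache) / P(toothache),
# marginalizing periodontitis out by enumeration
num = sum(joint(True, pe, True) for pe in (False, True))
den = sum(joint(c, pe, True) for c, pe in product((False, True), repeat=2))
print(num / den)   # ≈ 0.55 with the assumed numbers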

(62)

• A Bayesian belief network for breast cancer diagnosis

8.3 Example: Medicine

by Charles Kahn, Linda M. Roberts, Kun Wang, Deb Jenks, Peter Haddawy (1995)

(63)

• More reasoning

Fuzzy logic and possibilistic systems

Case-based reasoning

• Heuristic reasoning

8 Next Lecture
