Planning and Optimization E2. Landmarks: RTG Landmarks & MHS Heuristic Malte Helmert and Gabriele Röger

(1)

E2. Landmarks: RTG Landmarks & MHS Heuristic

Malte Helmert and Gabriele Röger

Universität Basel

(2)

Content of this Course

Planning

Classical: Foundations, Logic, Heuristics, Constraints

Probabilistic: Explicit MDPs, Factored MDPs

(3)

Content of this Course: Constraints

Constraints: Landmarks, RTG Landmarks, MHS Heuristic, LM-Cut Heuristic, Cost Partitioning, Network Flows, Operator Counting

(4)

Landmarks

(5)

Landmarks

Basic Idea: Something that must happen in every solution

For example:

some operator must be applied (action landmark)
some atomic proposition must hold (fact landmark)
some formula must be true (formula landmark)

→ Derive heuristic estimate from this kind of information.

We only consider fact and disjunctive action landmarks.

(7)

Definition

Definition (Disjunctive Action Landmark)

Let s be a state of planning task Π = ⟨V, I, O, γ⟩.

A disjunctive action landmark for s is a set of operators L ⊆ O such that every label path from s to a goal state contains an operator from L.

The cost of landmark L is cost(L) = min_{o∈L} cost(o).

Definition (Fact Landmark)

Let s be a state of planning task Π = ⟨V, I, O, γ⟩.

An atomic proposition v = d for v ∈ V and d ∈ dom(v) is a fact landmark for s if every state path from s to a goal state contains a state s′ with s′(v) = d.

If we talk about landmarks for the initial state, we omit “forI”.
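On very small tasks the definition of a disjunctive action landmark can be checked by brute force: L is a landmark for s exactly when no goal state is reachable from s using only operators outside L. The following is a minimal sketch under assumptions, not a practical algorithm. It assumes a STRIPS-style encoding (operators as hypothetical `(name, pre, add, delete, cost)` tuples, states as frozensets of true variables), and the toy task and all function names are made up for illustration.

```python
from collections import deque

# Hypothetical STRIPS-style encoding (an assumption for illustration):
# an operator is (name, pre, add, delete, cost); a state is a frozenset of true variables.

def goal_reachable(state, goal, operators):
    """Breadth-first search: is a goal state reachable using only `operators`?"""
    seen, queue = {state}, deque([state])
    while queue:
        s = queue.popleft()
        if goal <= s:
            return True
        for _name, pre, add, delete, _cost in operators:
            if pre <= s:
                t = frozenset((s - delete) | add)
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
    return False

def is_disjunctive_action_landmark(landmark, state, goal, operators):
    """L is a landmark for `state` iff the goal is unreachable without the operators in L."""
    remaining = [op for op in operators if op[0] not in landmark]
    return not goal_reachable(state, goal, remaining)

def landmark_cost(landmark, operators):
    """cost(L) = minimum of cost(o) over all o in L."""
    return min(cost for name, _p, _a, _d, cost in operators if name in landmark)

# Made-up task: reach g from p either directly (x) or via q (y, then z).
ops = [
    ("x", frozenset({"p"}), frozenset({"g"}), frozenset(), 5),
    ("y", frozenset({"p"}), frozenset({"q"}), frozenset(), 1),
    ("z", frozenset({"q"}), frozenset({"g"}), frozenset(), 2),
]
init, goal = frozenset({"p"}), frozenset({"g"})

print(is_disjunctive_action_landmark({"x", "z"}, init, goal, ops))  # True
print(is_disjunctive_action_landmark({"x"}, init, goal, ops))       # False
print(landmark_cost({"x", "z"}, ops))                               # 2
```

Note that the search explores the full reachable state space, so this only serves to make the definition concrete.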

(8)

Landmarks: Example

Example

Consider an FDR planning task ⟨V, I, O, γ⟩ with V = {robot-at, dishes-at} with

dom(robot-at) = {A1, ..., C3, B4, A5, ..., B6}

dom(dishes-at) = {Table, Robot, Dishwasher}

I = {robot-at ↦ C1, dishes-at ↦ Table}

operators

move-x-y to move from cell x to adjacent cell y,
pickup dishes, and
load dishes into the dishwasher.

γ = (robot-at = B6) ∧ (dishes-at = Dishwasher)

(9)


Fact Landmarks: Example

Figure: grid map with columns 1 to 6 and rows A to C (images from Wikimedia; not reproduced).

Each fact in gray is a fact landmark:

robot-at = x for x ∈ {A1, A6, B3, B4, B5, B6, C1}

dishes-at = x for x ∈ {Dishwasher, Robot, Table}

(10)

Disjunctive Action Landmarks: Example

Figure: grid map with columns 1 to 6 and rows A to C, with move actions drawn in color (not reproduced).

Actions of the same color form a disjunctive action landmark:

{pickup}

{load}

{move-B3-B4}

{move-B4-B5}

{move-A6-B6,move-B5-B6}

{move-A3-B3,move-B2-B3,move-C3-B3}

{move-B1-A1,move-A2-A1}

. . .

(11)

Remarks

Not every landmark is informative. Some examples:

The set of all operators is a disjunctive action landmark unless the initial state is already a goal state.

Every variable that is initially true is a fact landmark.

Deciding whether a given variable is a fact landmark is as hard as the plan existence problem.

Deciding whether a given operator set is a disjunctive action landmark is as hard as the plan existence problem.

Every fact landmark v that is initially false induces a disjunctive action landmark consisting of all operators that possibly make v true.
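The last remark can be sketched for STRIPS, where "possibly makes v true" simply means that v is an add effect. The snippet below is an illustrative sketch only: the `(name, pre, add, delete)` tuple encoding and the function name are assumptions, and the function does not verify that the given fact really is a fact landmark. The operators are those of the two-operator STRIPS example used later in these slides.

```python
# Hypothetical STRIPS-style encoding (assumption): operator = (name, pre, add, delete).

def induced_action_landmark(fact, init, operators):
    """Names of all operators that add `fact`, or None if `fact` already holds initially.

    The caller is assumed to know that `fact` is a fact landmark; this is not checked.
    """
    if fact in init:
        return None  # the remark only covers facts that are initially false
    return {name for name, _pre, add, _delete in operators if fact in add}

ops = [
    ("o1", {"a"}, {"c", "d", "e"}, {"a", "b"}),
    ("o2", {"d", "e"}, {"f"}, {"a", "d"}),
]
init = {"a", "b", "e"}

print(induced_action_landmark("f", init, ops))  # {'o2'}
print(induced_action_landmark("d", init, ops))  # {'o1'}
print(induced_action_landmark("a", init, ops))  # None
```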

(12)

Landmarks from RTGs

(14)

Computing Landmarks

How can we come up with landmarks?

Most landmarks are derived from the relaxed task graph:

RHW landmarks: Richter, Helmert & Westphal. Landmarks Revisited. (AAAI 2008)

LM-Cut: Helmert & Domshlak. Landmarks, Critical Paths and Abstractions: What’s the Difference Anyway? (ICAPS 2009)

h^m landmarks: Keyder, Richter & Helmert. Sound and Complete Landmarks for And/Or Graphs. (ECAI 2010)

We discuss h^m landmarks restricted to m = 1 and to STRIPS planning tasks.

(15)

Incidental Landmarks: Example

Example (Incidental Landmarks)

Consider a STRIPS planning task ⟨V, I, {o₁, o₂}, γ⟩ with V = {a, b, c, d, e, f},

I = {a ↦ T, b ↦ T, c ↦ F, d ↦ F, e ↦ T, f ↦ F},

o₁ = ⟨{a}, {c, d, e}, {a, b}⟩,

o₂ = ⟨{d, e}, {f}, {a, d}⟩, and γ = {e, f}.

Single solution: ⟨o₁, o₂⟩

All variables are fact landmarks.

Variable b is initially true but irrelevant for the plan.

Variable c becomes true as a “side effect” of o₁, but it is not necessary for the goal or to make an operator applicable.
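The claim that all six variables are fact landmarks can be checked by brute force on this tiny task: a variable is a fact landmark for the initial state exactly when no state path to a goal avoids every state in which it is true. The sketch below assumes that operator triples are read as (precondition, add effects, delete effects), which is consistent with ⟨o₁, o₂⟩ being a solution; the Python names are illustrative.

```python
from collections import deque

# Example task from the slide; operators are read as (pre, add, delete) triples (assumption).
OPS = {
    "o1": ({"a"}, {"c", "d", "e"}, {"a", "b"}),
    "o2": ({"d", "e"}, {"f"}, {"a", "d"}),
}
INIT = frozenset({"a", "b", "e"})
GOAL = {"e", "f"}

def goal_reachable_avoiding(fact):
    """Is there a state path from INIT to a goal state on which `fact` never holds?"""
    if fact in INIT:
        return False  # every state path starts in INIT, where the fact holds
    seen, queue = {INIT}, deque([INIT])
    while queue:
        s = queue.popleft()
        if GOAL <= s:
            return True
        for pre, add, delete in OPS.values():
            if pre <= s:
                t = frozenset((s - delete) | add)
                # Successor states in which the fact holds are not allowed on the path.
                if fact not in t and t not in seen:
                    seen.add(t)
                    queue.append(t)
    return False

fact_landmarks = sorted(v for v in "abcdef" if not goal_reachable_avoiding(v))
print(fact_landmarks)  # ['a', 'b', 'c', 'd', 'e', 'f']
```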

(16)

Causal Landmarks

Definition (Causal Fact Landmark)

Let Π = ⟨V, I, O, γ⟩ be a STRIPS planning task.

An atomic proposition v = T for v ∈ V is a causal fact landmark if v ∈ γ

or if for all goal paths π = ⟨o₁, ..., o_n⟩ there is an o_i with v ∈ pre(o_i).

(17)

Causal Landmarks: Example

Example (Causal Landmarks)

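Consider again the STRIPS planning task ⟨V, I, {o₁, o₂}, γ⟩ with V = {a, b, c, d, e, f},

I = {a ↦ T, b ↦ T, c ↦ F, d ↦ F, e ↦ T, f ↦ F},

o₁ = ⟨{a}, {c, d, e}, {a, b}⟩,

o₂ = ⟨{d, e}, {f}, {a, d}⟩, and γ = {e, f}.

Single solution: ⟨o₁, o₂⟩

All variables are fact landmarks for the initial state.

Only a, d, e and f are causal landmarks.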
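The causal landmarks of this example can be confirmed with a brute-force check that follows the definition directly: v is causal if it is a goal variable, or if the goal becomes unreachable once all operators with v in their precondition are removed. This is only a sketch for this small task. It assumes the triples are read as (precondition, add effects, delete effects), and the helper names are hypothetical.

```python
from collections import deque

# Example task; operators are read as (pre, add, delete) triples (assumption).
OPS = {
    "o1": ({"a"}, {"c", "d", "e"}, {"a", "b"}),
    "o2": ({"d", "e"}, {"f"}, {"a", "d"}),
}
INIT = frozenset({"a", "b", "e"})
GOAL = {"e", "f"}

def goal_reachable(operators):
    """Breadth-first search over states using only the given (pre, add, delete) triples."""
    seen, queue = {INIT}, deque([INIT])
    while queue:
        s = queue.popleft()
        if GOAL <= s:
            return True
        for pre, add, delete in operators:
            if pre <= s:
                t = frozenset((s - delete) | add)
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
    return False

def is_causal_landmark(v):
    """v is causal if v is a goal variable or every goal path uses an operator with v in pre."""
    if v in GOAL:
        return True
    without_v = [op for op in OPS.values() if v not in op[0]]
    return not goal_reachable(without_v)

causal = sorted(v for v in "abcdef" if is_causal_landmark(v))
print(causal)  # ['a', 'd', 'e', 'f']
```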

(18)

What We Are Doing Next

Causal landmarks are the desirable landmarks.

We can use a simplified version of RTGs to compute causal landmarks for STRIPS planning tasks.

We will define landmarks of AND/OR graphs and show how they can be computed.

Afterwards we establish that these are landmarks of the planning task.

(19)

Simplified Relaxed Task Graph

Definition

For a STRIPS planning task Π = ⟨V, I, O, γ⟩, the simplified relaxed task graph sRTG(Π⁺) is the AND/OR graph ⟨N_and ∪ N_or, A, type⟩ with

N_and = {n_o | o ∈ O} ∪ {n_I, n_G} with type(n) = ∧ for all n ∈ N_and,

N_or = {n_v | v ∈ V} with type(n) = ∨ for all n ∈ N_or, and

A = {⟨n_a, n_o⟩ | o ∈ O, a ∈ add(o)} ∪
    {⟨n_o, n_p⟩ | o ∈ O, p ∈ pre(o)} ∪
    {⟨n_v, n_I⟩ | v ∈ I} ∪
    {⟨n_G, n_v⟩ | v ∈ γ}
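The definition translates almost line by line into a graph-building routine. The following sketch is one possible encoding, not a reference implementation: nodes are plain strings (with `"I"` and `"G"` for n_I and n_G, assumed not to clash with variable or operator names), arcs are stored as successor sets, and operators are hypothetical `(pre, add, delete)` triples whose delete effects are ignored.

```python
def build_srtg(variables, init, operators, goal):
    """Build the simplified RTG as (node_type, successors).

    operators: dict name -> (pre, add, delete); delete effects are ignored (relaxation).
    init and goal are the sets of initially true variables and goal variables.
    """
    node_type = {"I": "and", "G": "and"}
    succ = {"I": set(), "G": set(goal)}      # arcs <n_G, n_v> for v in the goal
    for v in variables:
        node_type[v] = "or"
        succ[v] = set()
    for v in init:
        succ[v].add("I")                     # arcs <n_v, n_I> for initially true v
    for name, (pre, add, _delete) in operators.items():
        node_type[name] = "and"
        succ[name] = set(pre)                # arcs <n_o, n_p> for p in pre(o)
        for a in add:
            succ[a].add(name)                # arcs <n_a, n_o> for a in add(o)
    return node_type, succ

# The two-operator example task from the preceding slides.
ops = {
    "o1": ({"a"}, {"c", "d", "e"}, {"a", "b"}),
    "o2": ({"d", "e"}, {"f"}, {"a", "d"}),
}
node_type, succ = build_srtg("abcdef", {"a", "b", "e"}, ops, {"e", "f"})
for n in sorted(succ):
    print(n, node_type[n], sorted(succ[n]))
```

For the example task, the OR node e has the two successors I and o1, because e is initially true and is also added by o1.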

(20)

Simplified RTG: Example

The simplified RTG for our example task is:

Figure: simplified RTG with the variable nodes a, b, c, d, e, f and the nodes I, o₁, o₂, G (graph not reproduced).

(21)

Characterizing Equation System

Theorem

Let G = ⟨N, A, type⟩ be an AND/OR graph. Consider the following system of equations:

LM(n) = {n} ∪ ⋂_{⟨n,n′⟩∈A} LM(n′)    if type(n) = ∨

LM(n) = {n} ∪ ⋃_{⟨n,n′⟩∈A} LM(n′)    if type(n) = ∧

The equation system has a unique maximal solution (maximal with regard to set inclusion), and for this solution it holds that

n′ ∈ LM(n) iff n′ is a landmark for reaching n in G.

(22)

Computation of Maximal Solution

Computation: Initialize the landmark sets as LM(n) = N_and ∪ N_or for all nodes n, then apply the equations of the characterizing equation system as update rules until a fixpoint is reached.
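The fixpoint computation can be sketched in a few lines. This is an illustrative implementation under assumptions: the AND/OR graph is given as a hypothetical pair of dictionaries (node type and successor sets), nodes are visited in an arbitrary fixed order, and the graph hard-coded below is the simplified RTG of the two-operator example task.

```python
def compute_landmarks(node_type, succ):
    """Greatest fixpoint of the landmark equations, starting from LM(n) = all nodes."""
    nodes = set(node_type)
    lm = {n: set(nodes) for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes):
            child_sets = [lm[m] for m in succ[n]]
            if node_type[n] == "or":
                # Intersection over successors; an OR node without successors keeps all nodes.
                combined = set.intersection(*child_sets) if child_sets else set(nodes)
            else:
                # Union over successors; an AND node without successors contributes nothing.
                combined = set().union(*child_sets)
            new = {n} | combined
            if new != lm[n]:
                lm[n] = new
                changed = True
    return lm

# Simplified RTG of the example task: arcs point from a node to the nodes it depends on.
node_type = {v: "or" for v in "abcdef"}
node_type.update({"I": "and", "G": "and", "o1": "and", "o2": "and"})
succ = {
    "a": {"I"}, "b": {"I"}, "c": {"o1"}, "d": {"o1"}, "e": {"I", "o1"}, "f": {"o2"},
    "I": set(), "o1": {"a"}, "o2": {"d", "e"}, "G": {"e", "f"},
}

lm = compute_landmarks(node_type, succ)
print(sorted(lm["G"]))  # ['G', 'I', 'a', 'd', 'e', 'f', 'o1', 'o2']
```

Because every update can only shrink a set, the loop terminates, and the visiting order affects only the number of passes, not the result.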

(23)

Computation: Example

Figure: simplified RTG of the example task, annotated with the current landmark set of each node (graph not reproduced). The update rules are applied in the following order:

Initialize with all nodes: LM(n) = {a, b, c, d, e, f, I, G, o₁, o₂} for every node n

LM(I) = {I}

LM(a) = {a} ∪ LM(I) = {a, I}

LM(b) = {b} ∪ LM(I) = {b, I}

LM(e) = {e} ∪ (LM(I) ∩ LM(o₁)) = {e, I}

LM(o₁) = {o₁} ∪ LM(a) = {a, I, o₁}

LM(c) = {c} ∪ LM(o₁) = {a, c, I, o₁}

LM(d) = {d} ∪ LM(o₁) = {a, d, I, o₁}

LM(o₂) = {o₂} ∪ LM(d) ∪ LM(e) = {a, d, e, I, o₁, o₂}

LM(f) = {f} ∪ LM(o₂) = {a, d, e, f, I, o₁, o₂}

LM(G) = {G} ∪ LM(e) ∪ LM(f) = {a, d, e, f, I, G, o₁, o₂}

(35)

Relation to Planning Task Landmarks

Theorem

Let Π = ⟨V, I, O, γ⟩ be a STRIPS planning task and

let ℒ be the set of landmarks for reaching n_G in sRTG(Π⁺).

The set {v = T | v ∈ V and n_v ∈ ℒ} is exactly the set of causal fact landmarks in Π⁺.

For operators o ∈ O, if n_o ∈ ℒ then {o} is a disjunctive action landmark in Π⁺.

There are no other disjunctive action landmarks of size 1.

(Proofs omitted.)

(36)

Computed RTG Landmarks: Example

Example (Computed RTG Landmarks)

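Consider again the STRIPS planning task ⟨V, I, {o₁, o₂}, γ⟩ with V = {a, b, c, d, e, f},

I = {a ↦ T, b ↦ T, c ↦ F, d ↦ F, e ↦ T, f ↦ F},

o₁ = ⟨{a}, {c, d, e}, {a, b}⟩,

o₂ = ⟨{d, e}, {f}, {a, d}⟩, and γ = {e, f}.

LM(n_G) = {a, d, e, f, I, G, o₁, o₂}

a, d, e, and f are causal fact landmarks of Π⁺.

{o₁} and {o₂} are disjunctive action landmarks of Π⁺.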

(37)

Landmarks of Π⁺ Are Landmarks of Π

Theorem

Let Π be a STRIPS planning task.

All fact landmarks of Π⁺ are fact landmarks of Π and all disjunctive action landmarks of Π⁺ are disjunctive action landmarks of Π.

Proof.

Let L be a disjunctive action landmark of Π⁺ and π be a plan for Π. Then π is also a plan for Π⁺ and, thus, π contains an operator from L.

Let f be a fact landmark of Π⁺. If f is already true in the initial state, then it is also a landmark of Π. Otherwise, every plan for Π⁺ contains an operator that adds f and the set of all these operators is a disjunctive action landmark of Π⁺. Therefore, each plan of Π also contains such an operator, making f a fact landmark of Π.

(38)

Minimum Hitting Set Heuristic

(40)

Exploiting Disjunctive Action Landmarks

The cost cost(L) of a disjunctive action landmark L is an admissible heuristic, but it is usually not very informative.

Landmark heuristics typically aim to combine multiple disjunctive action landmarks.

How can we exploit a given set ℒ of disjunctive action landmarks?

Sum of costs Σ_{L∈ℒ} cost(L)? → not admissible!

Maximize costs max_{L∈ℒ} cost(L)? → usually very weak heuristic

Better: hitting sets

(41)

Hitting Sets

Definition (Hitting Set)

Let X be a set, ℱ = {F₁, ..., F_n} ⊆ 2^X be a family of subsets of X and c : X → ℝ⁺₀ be a cost function for X.

A hitting set is a subset H ⊆ X that “hits” all subsets in ℱ, i.e., H ∩ F ≠ ∅ for all F ∈ ℱ. The cost of H is Σ_{x∈H} c(x).

A minimum hitting set (MHS) is a hitting set with minimal cost.

MHS is a “classical” NP-complete problem (Karp, 1972)

(42)

Example: Hitting Sets

Example

X = {o₁, o₂, o₃, o₄}

ℱ = {{o₄}, {o₁, o₂}, {o₁, o₃}, {o₂, o₃}}

c(o₁) = 3, c(o₂) = 4, c(o₃) = 5, c(o₄) = 0

What is a minimum hitting set?

Solution: {o₁, o₂, o₄} with cost 3 + 4 + 0 = 7
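For an instance this small, the solution can be verified by enumerating all subsets. The sketch below is a brute-force check with exponential running time, not a recommended algorithm; the function name and the dictionary-based encoding are assumptions made for illustration.

```python
from itertools import combinations

def minimum_hitting_set(family, cost):
    """Return (cost, set) of a cheapest subset that intersects every set in `family`."""
    # Elements outside all sets never help because costs are non-negative.
    universe = sorted(set().union(*family))
    best_cost, best_set = float("inf"), None
    for size in range(len(universe) + 1):
        for candidate in combinations(universe, size):
            chosen = set(candidate)
            if all(chosen & subset for subset in family):
                total = sum(cost[x] for x in chosen)
                if total < best_cost:
                    best_cost, best_set = total, chosen
    return best_cost, best_set

family = [{"o4"}, {"o1", "o2"}, {"o1", "o3"}, {"o2", "o3"}]
cost = {"o1": 3, "o2": 4, "o3": 5, "o4": 0}

best_cost, best_set = minimum_hitting_set(family, cost)
print(best_cost, sorted(best_set))  # 7 ['o1', 'o2', 'o4']
```

The element o4 is forced by the singleton set, and the three pairs require at least two of o1, o2, o3, of which o1 and o2 are the cheapest.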

(44)

Hitting Sets for Disjunctive Action Landmarks

Idea: disjunctive action landmarks are interpreted as an instance of minimum hitting set.

Definition (Hitting Set Heuristic)

Let ℒ be a set of disjunctive action landmarks. The hitting set heuristic h^MHS(ℒ) is defined as the cost of a minimum hitting set for ℒ with c(o) = cost(o).

Proposition (Hitting Set Heuristic is Admissible)

Let ℒ be a set of disjunctive action landmarks for state s.

Then h^MHS(ℒ) is an admissible estimate for s.
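To compare the three ways of combining landmarks, the sketch below evaluates the maximum, the hitting set value, and the sum on one landmark collection. It reuses the sets and costs of the hitting set example and treats them as disjunctive action landmarks of some state, which is purely an assumption for illustration; the brute-force `h_mhs` is exponential and only meant for tiny inputs.

```python
from itertools import combinations

# Assumed for illustration: these four operator sets are landmarks of the same state.
cost = {"o1": 3, "o2": 4, "o3": 5, "o4": 0}
landmarks = [{"o4"}, {"o1", "o2"}, {"o1", "o3"}, {"o2", "o3"}]

def landmark_cost(landmark):
    """cost(L) = cheapest operator in L."""
    return min(cost[o] for o in landmark)

def h_mhs(landmarks):
    """Cost of a minimum hitting set, by enumerating all operator subsets."""
    ops = sorted(set().union(*landmarks))
    return min(
        sum(cost[o] for o in subset)
        for size in range(len(ops) + 1)
        for subset in combinations(ops, size)
        if all(set(subset) & lm for lm in landmarks)
    )

h_max = max(landmark_cost(lm) for lm in landmarks)
h_sum = sum(landmark_cost(lm) for lm in landmarks)
print(h_max, h_mhs(landmarks), h_sum)  # 4 7 10
```

If, hypothetically, a plan consisting of exactly o4, o1 and o2 existed (cost 7), the sum of 10 would overestimate because the cost of o1 is counted for two landmarks, while the maximum of 4 ignores all but one landmark.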

(45)

Hitting Set Heuristic: Discussion

The hitting set heuristic is the best possible heuristic that only uses the given information, but it is NP-hard to compute.

Use approximations that can be efficiently computed.

⇒ LP relaxation, cost partitioning (both discussed later)

(46)

Summary

(47)

Summary

Fact landmark: atomic proposition that is true in some state on every state path to a goal

Disjunctive action landmark: set L of operators such that every plan uses some operator from L

Relaxed task graphs allow efficient computation of landmarks

Hitting sets yield the most accurate heuristic for a given set of disjunctive action landmarks

Computation of a minimum hitting set is NP-hard