E2. Landmarks: RTG Landmarks & MHS Heuristic
Malte Helmert and Gabriele Röger
Universität Basel
Content of this Course
Planning
- Classical: Foundations, Logic, Heuristics, Constraints
- Probabilistic: Explicit MDPs, Factored MDPs
Content of this Course: Constraints
Constraints: Landmarks, RTG Landmarks, MHS Heuristic, LM-Cut Heuristic, Cost Partitioning, Network Flows, Operator Counting
Landmarks
Landmarks
Basic idea: something that must happen in every solution. For example:
- some operator must be applied (action landmark)
- some atomic proposition must hold (fact landmark)
- some formula must be true (formula landmark)
→ Derive a heuristic estimate from this kind of information.
We only consider fact and disjunctive action landmarks.
Definition
Definition (Disjunctive Action Landmark)
Let s be a state of planning task Π = ⟨V, I, O, γ⟩.
A disjunctive action landmark for s is a set of operators L ⊆ O such that every label path from s to a goal state contains an operator from L.
The cost of landmark L is cost(L) = min_{o ∈ L} cost(o).
Definition (Fact Landmark)
Let s be a state of planning task Π = ⟨V, I, O, γ⟩.
An atomic proposition v = d for v ∈ V and d ∈ dom(v) is a fact landmark for s if every state path from s to a goal state contains a state s′ with s′(v) = d.
If we talk about landmarks for the initial state, we omit “for I”.
Landmarks: Example
Example
Consider an FDR planning task ⟨V, I, O, γ⟩ with
V = {robot-at, dishes-at} with
dom(robot-at) = {A1, ..., C3, B4, A5, ..., B6}
dom(dishes-at) = {Table, Robot, Dishwasher}
I = {robot-at ↦ C1, dishes-at ↦ Table}
operators
- move-x-y to move from cell x to adjacent cell y,
- pickup dishes, and
- load dishes into the dishwasher.
γ = (robot-at = B6) ∧ (dishes-at = Dishwasher)
Landmarks Landmarks from RTGs Minimum Hitting Set Heuristic Summary
Fact Landmarks: Example
[Figure: grid of cells with columns 1 to 6 and rows A to C. Images from Wikimedia.]
Each fact in gray is a fact landmark:
robot-at = x for x ∈ {A1, A6, B3, B4, B5, B6, C1}
dishes-at = x for x ∈ {Dishwasher, Robot, Table}
Disjunctive Action Landmarks: Example
[Figure: grid of cells with columns 1 to 6 and rows A to C, with actions colored by landmark]
Actions of the same color form a disjunctive action landmark:
{pickup}
{load}
{move-B3-B4}
{move-B4-B5}
{move-A6-B6,move-B5-B6}
{move-A3-B3,move-B2-B3,move-C3-B3}
{move-B1-A1,move-A2-A1}
. . .
Remarks
Not every landmark is informative. Some examples:
The set of all operators is a disjunctive action landmark unless the initial state is already a goal state.
Every variable that is initially true is a fact landmark.
Deciding whether a given variable is a fact landmark is as hard as the plan existence problem.
Deciding whether a given operator set is a disjunctive action landmark is as hard as the plan existence problem.
Every fact landmark v that is initially false induces a disjunctive action landmark consisting of all operators that possibly make v true.
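The link between landmark checking and plan existence in the remarks above can be made concrete with a small sketch. It relies on two observations: an operator set L is a disjunctive action landmark exactly if the task becomes unsolvable once the operators in L are forbidden, and a fact is a fact landmark exactly if no goal path avoids all states in which it holds. The toy STRIPS task, the operator names, and the helper functions below are hypothetical and only serve as illustration; the breadth-first search is exponential in general and is not meant as a practical method.

```python
from collections import deque

# Hypothetical toy task: operator name -> (pre, add, delete), all sets of variables.
OPS = {
    "get-a": (set(), {"a"}, set()),
    "get-b": (set(), {"b"}, set()),
    "finish-via-a": ({"a"}, {"g"}, set()),
    "finish-via-b": ({"b"}, {"g"}, set()),
}
INIT = set()    # variables that are initially true
GOAL = {"g"}


def solvable(init, goal, ops, banned_ops=frozenset(), banned_fact=None):
    """Breadth-first search over states; True iff a goal state is reachable
    without using banned operators or visiting a state with the banned fact."""
    start = frozenset(init)
    if banned_fact in start:
        return False
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if goal <= state:
            return True
        for name, (pre, add, delete) in ops.items():
            if name in banned_ops or not pre <= state:
                continue
            succ = frozenset((state - delete) | add)
            if banned_fact in succ or succ in seen:
                continue
            seen.add(succ)
            queue.append(succ)
    return False


def is_disjunctive_action_landmark(op_set, init, goal, ops):
    # Landmark iff forbidding all operators in op_set makes the task unsolvable.
    return not solvable(init, goal, ops, banned_ops=frozenset(op_set))


def is_fact_landmark(fact, init, goal, ops):
    # Landmark iff no goal path avoids every state in which the fact holds.
    return not solvable(init, goal, ops, banned_fact=fact)


print(is_disjunctive_action_landmark({"finish-via-a", "finish-via-b"}, INIT, GOAL, OPS))
print(is_disjunctive_action_landmark({"finish-via-a"}, INIT, GOAL, OPS))
print(is_fact_landmark("g", INIT, GOAL, OPS))
print(is_fact_landmark("a", INIT, GOAL, OPS))
```

In this toy task the two finishing operators together form a landmark, while neither does alone, because the goal can always be reached through the other one.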
Landmarks from RTGs
Computing Landmarks
How can we come up with landmarks?
Most landmarks are derived from the relaxed task graph:
- RHW landmarks: Richter, Helmert & Westphal. Landmarks Revisited. (AAAI 2008)
- LM-Cut: Helmert & Domshlak. Landmarks, Critical Paths and Abstractions: What’s the Difference Anyway? (ICAPS 2009)
- h^m landmarks: Keyder, Richter & Helmert. Sound and Complete Landmarks for And/Or Graphs. (ECAI 2010)
We discuss h^m landmarks restricted to m = 1 and to STRIPS planning tasks.
Incidental Landmarks: Example
Example (Incidental Landmarks)
Consider a STRIPS planning task ⟨V, I, {o1, o2}, γ⟩ with
V = {a, b, c, d, e, f},
I = {a ↦ T, b ↦ T, c ↦ F, d ↦ F, e ↦ T, f ↦ F},
o1 = ⟨{a}, {c, d, e}, {a, b}⟩,
o2 = ⟨{d, e}, {f}, {a, d}⟩, and
γ = {e, f}.
Single solution: ⟨o1, o2⟩
All variables are fact landmarks.
Variable b is initially true but irrelevant for the plan.
Variable c becomes true as a “side effect” of o1, but it is not necessary for the goal or to make an operator applicable.
Causal Landmarks
Definition (Causal Fact Landmark)
Let Π = ⟨V, I, O, γ⟩ be a STRIPS planning task.
An atomic proposition v = T for v ∈ V is a causal fact landmark if v ∈ γ
or if for all goal paths π = ⟨o1, ..., on⟩ there is an oi with v ∈ pre(oi).
Causal Landmarks: Example
Example (Causal Landmarks)
Consider a STRIPS planning task ⟨V, I, {o1, o2}, γ⟩ with
V = {a, b, c, d, e, f},
I = {a ↦ T, b ↦ T, c ↦ F, d ↦ F, e ↦ T, f ↦ F},
o1 = ⟨{a}, {c, d, e}, {a, b}⟩,
o2 = ⟨{d, e}, {f}, {a, d}⟩, and
γ = {e, f}.
Single solution: ⟨o1, o2⟩
All variables are fact landmarks for the initial state.
Only a, d, e and f are causal landmarks.
What We Are Doing Next
Causal landmarks are the desirable landmarks.
We can use a simplified version of RTGs to compute causal landmarks for STRIPS planning tasks.
We will define landmarks of AND/OR graphs and show how they can be computed.
Afterwards we establish that these are landmarks of the planning task.
Simplified Relaxed Task Graph
Definition
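For a STRIPS planning task Π = ⟨V, I, O, γ⟩, the simplified relaxed task graph sRTG(Π⁺) is the AND/OR graph ⟨N_and ∪ N_or, A, type⟩ with
\[
\begin{aligned}
N_{\text{and}} &= \{n_o \mid o \in O\} \cup \{n_I, n_G\} && \text{with } \mathit{type}(n) = \land \text{ for all } n \in N_{\text{and}},\\
N_{\text{or}} &= \{n_v \mid v \in V\} && \text{with } \mathit{type}(n) = \lor \text{ for all } n \in N_{\text{or}}, \text{ and}\\
A &= \{\langle n_a, n_o\rangle \mid o \in O,\ a \in \mathit{add}(o)\}\\
&\quad\cup \{\langle n_o, n_p\rangle \mid o \in O,\ p \in \mathit{pre}(o)\}\\
&\quad\cup \{\langle n_v, n_I\rangle \mid v \in I\}\\
&\quad\cup \{\langle n_G, n_v\rangle \mid v \in \gamma\}.
\end{aligned}
\]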
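The definition can be turned into a short construction routine. The following is a minimal sketch, assuming that operators are given as a hypothetical dictionary from names to (pre, add) pairs, that delete effects are simply dropped (the graph describes the relaxed task), and that variable names, operator names, "I" and "G" are pairwise distinct strings. The function name `build_srtg` is made up for this illustration.

```python
def build_srtg(variables, init_true, ops, goal):
    """Return (node_type, arcs) of the simplified relaxed task graph.

    node_type maps each node to "and" or "or"; arcs is a set of pairs (n, n')
    standing for the arc <n, n'> in the definition.
    """
    node_type = {"I": "and", "G": "and"}
    node_type.update({name: "and" for name in ops})
    node_type.update({v: "or" for v in variables})

    arcs = set()
    for name, (pre, add) in ops.items():
        arcs |= {(a, name) for a in add}   # <n_a, n_o> for a in add(o)
        arcs |= {(name, p) for p in pre}   # <n_o, n_p> for p in pre(o)
    arcs |= {(v, "I") for v in init_true}  # <n_v, n_I> for v true in I
    arcs |= {("G", v) for v in goal}       # <n_G, n_v> for v in the goal
    return node_type, arcs


# The STRIPS task from the incidental landmarks example, without delete effects.
VARIABLES = set("abcdef")
INIT_TRUE = {"a", "b", "e"}
OPS = {
    "o1": ({"a"}, {"c", "d", "e"}),
    "o2": ({"d", "e"}, {"f"}),
}
GOAL = {"e", "f"}

node_type, arcs = build_srtg(VARIABLES, INIT_TRUE, OPS, GOAL)
for arc in sorted(arcs):
    print(arc)
```

Note that the OR node for e has two outgoing arcs, one to I and one to o1, because e is both initially true and added by o1.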
Simplified RTG: Example
The simplified RTG for our example task is:
[Figure: simplified RTG with variable nodes a, b, c, d, e, f, operator nodes o1 and o2, and the nodes I and G]
Characterizing Equation System
Theorem
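Let G = ⟨N, A, type⟩ be an AND/OR graph. Consider the following system of equations:
\[
\begin{aligned}
\mathit{LM}(n) &= \{n\} \cup \bigcap_{\langle n, n'\rangle \in A} \mathit{LM}(n') && \text{if } \mathit{type}(n) = \lor\\
\mathit{LM}(n) &= \{n\} \cup \bigcup_{\langle n, n'\rangle \in A} \mathit{LM}(n') && \text{if } \mathit{type}(n) = \land
\end{aligned}
\]
The equation system has a unique maximal solution (maximal with regard to set inclusion), and for this solution it holds that
n′ ∈ LM(n) iff n′ is a landmark for reaching n in G.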
Computation of Maximal Solution
The equation system above has a unique maximal solution (maximal with regard to set inclusion).
Computation: Initialize the landmark sets as LM(n) = N_and ∪ N_or for every node n and apply the equations as update rules until a fixpoint is reached.
Computation: Example
[Figure: simplified RTG of the example task, each node annotated with its current landmark set]
Initialize with all nodes: LM(n) = {a-f, I, G, o1, o2} for every node n.
Applying the equations as update rules then yields, step by step:
LM(I) = {I}
LM(a) = {a} ∪ LM(I) = {a, I}
LM(b) = {b} ∪ LM(I) = {b, I}
LM(e) = {e} ∪ (LM(I) ∩ LM(o1)) = {e, I}
LM(o1) = {o1} ∪ LM(a) = {a, I, o1}
LM(c) = {c} ∪ LM(o1) = {a, c, I, o1}
LM(d) = {d} ∪ LM(o1) = {a, d, I, o1}
LM(o2) = {o2} ∪ LM(d) ∪ LM(e) = {a, d, e, I, o1, o2}
LM(f) = {f} ∪ LM(o2) = {a, d, e, f, I, o1, o2}
LM(G) = {G} ∪ LM(e) ∪ LM(f) = {a, d, e, f, I, G, o1, o2}
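The update procedure can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the lecture's reference code: the graph is given as a hypothetical successor dictionary `SUCC` (hand-written here for the example task), and an OR node without successors keeps the full node set, following the usual convention that an intersection over no sets is the whole universe.

```python
def landmark_sets(node_type, succ):
    """Compute the maximal solution of the LM equation system by fixpoint iteration."""
    nodes = set(node_type)
    lm = {n: set(nodes) for n in nodes}  # initialize every LM(n) with all nodes
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes):
            child_sets = [lm[c] for c in succ.get(n, ())]
            if node_type[n] == "or":
                # Intersection over all successors; assumption: an empty
                # intersection is the full node set.
                combined = set.intersection(*child_sets) if child_sets else set(nodes)
            else:
                # Union over all successors; an empty union is the empty set.
                combined = set().union(*child_sets)
            new = {n} | combined
            if new != lm[n]:
                lm[n] = new
                changed = True
    return lm


# Simplified RTG of the example task: node -> list of successor nodes.
NODE_TYPE = {n: "or" for n in "abcdef"}
NODE_TYPE.update({"I": "and", "G": "and", "o1": "and", "o2": "and"})
SUCC = {
    "a": ["I"], "b": ["I"], "e": ["I", "o1"],
    "c": ["o1"], "d": ["o1"], "f": ["o2"],
    "o1": ["a"], "o2": ["d", "e"],
    "G": ["e", "f"], "I": [],
}

lm = landmark_sets(NODE_TYPE, SUCC)
print(sorted(lm["G"]))
```

Because every update can only shrink a landmark set, the order in which nodes are processed affects the number of passes but not the final result, which agrees with the sets listed in the example.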
Relation to Planning Task Landmarks
Theorem
Let Π = ⟨V, I, O, γ⟩ be a STRIPS planning task and
let ℒ be the set of landmarks for reaching n_G in sRTG(Π⁺).
The set {v = T | v ∈ V and n_v ∈ ℒ} is exactly the set of causal fact landmarks in Π⁺.
For operators o ∈ O, if n_o ∈ ℒ then {o} is a disjunctive action landmark in Π⁺.
There are no other disjunctive action landmarks of size 1.
(Proofs omitted.)
Computed RTG Landmarks: Example
Example (Computed RTG Landmarks)
Consider a STRIPS planning task ⟨V, I, {o1, o2}, γ⟩ with
V = {a, b, c, d, e, f},
I = {a ↦ T, b ↦ T, c ↦ F, d ↦ F, e ↦ T, f ↦ F},
o1 = ⟨{a}, {c, d, e}, {a, b}⟩,
o2 = ⟨{d, e}, {f}, {a, d}⟩, and
γ = {e, f}.
LM(n_G) = {a, d, e, f, I, G, o1, o2}
a, d, e, and f are causal fact landmarks of Π⁺.
{o1} and {o2} are disjunctive action landmarks of Π⁺.
Landmarks of Π⁺ Are Landmarks of Π
Theorem
Let Π be a STRIPS planning task.
All fact landmarks of Π⁺ are fact landmarks of Π and all disjunctive action landmarks of Π⁺ are disjunctive action landmarks of Π.
Proof.
Let L be a disjunctive action landmark of Π⁺ and π be a plan for Π. Then π is also a plan for Π⁺ and, thus, π contains an operator from L.
Let f be a fact landmark of Π⁺. If f is already true in the initial state, then it is also a landmark of Π. Otherwise, every plan for Π⁺ contains an operator that adds f, and the set of all these operators is a disjunctive action landmark of Π⁺. Therefore, each plan of Π also contains such an operator, making f a fact landmark of Π.
Minimum Hitting Set Heuristic
Exploiting Disjunctive Action Landmarks
The cost cost(L) of a disjunctive action landmark L is an admissible heuristic, but it is usually not very informative.
Landmark heuristics typically aim to combine multiple disjunctive action landmarks.
How can we exploit a given set ℒ of disjunctive action landmarks?
- Sum of costs Σ_{L ∈ ℒ} cost(L)? Not admissible!
- Maximize costs max_{L ∈ ℒ} cost(L)? Usually a very weak heuristic.
- Better: hitting sets
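A tiny numeric illustration shows why summing landmark costs can overestimate. The operator names, costs, and the assumption that ⟨o1⟩ is an optimal plan are all hypothetical: if {o1} is a disjunctive action landmark, then so is its superset {o1, o2}, and a single application of o1 satisfies both.

```python
# Hypothetical operator costs and landmarks; assume <o1> is an optimal plan.
cost = {"o1": 2, "o2": 5}
landmarks = [{"o1"}, {"o1", "o2"}]


def landmark_cost(landmark):
    """cost(L) is the cost of the cheapest operator in L."""
    return min(cost[o] for o in landmark)


sum_estimate = sum(landmark_cost(L) for L in landmarks)
max_estimate = max(landmark_cost(L) for L in landmarks)
plan_cost = cost["o1"]  # cost of the assumed optimal plan <o1>

print(sum_estimate, max_estimate, plan_cost)
```

The sum counts o1 once per landmark and therefore exceeds the assumed optimal plan cost, while the maximum stays admissible but ignores all landmarks except one.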
Hitting Sets
Definition (Hitting Set)
Let X be a set, ℱ = {F1, ..., Fn} ⊆ 2^X be a family of subsets of X and c : X → ℝ⁺₀ be a cost function for X.
A hitting set is a subset H ⊆ X that “hits” all subsets in ℱ, i.e., H ∩ F ≠ ∅ for all F ∈ ℱ. The cost of H is Σ_{x ∈ H} c(x).
A minimum hitting set (MHS) is a hitting set with minimal cost.
MHS is a “classical” NP-complete problem (Karp, 1972)
Example: Hitting Sets
Example
X = {o1, o2, o3, o4}
ℱ = {{o4}, {o1, o2}, {o1, o3}, {o2, o3}}
c(o1) = 3, c(o2) = 4, c(o3) = 5, c(o4) = 0
What is a minimum hitting set?
Solution: {o1, o2, o4} with cost 3 + 4 + 0 = 7
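The solution of the example can be checked by brute force. The sketch below simply enumerates all subsets of X, which takes exponential time and is only reasonable for tiny instances; the function name `minimum_hitting_set` is chosen for this illustration.

```python
from itertools import combinations


def minimum_hitting_set(universe, family, cost):
    """Enumerate all subsets of the universe and return a cheapest hitting set."""
    best_set, best_cost = None, float("inf")
    elements = sorted(universe)
    for size in range(len(elements) + 1):
        for candidate in combinations(elements, size):
            chosen = set(candidate)
            # A hitting set has a non-empty intersection with every subset.
            if all(chosen & subset for subset in family):
                total = sum(cost[x] for x in chosen)
                if total < best_cost:
                    best_set, best_cost = chosen, total
    return best_set, best_cost


X = {"o1", "o2", "o3", "o4"}
F = [{"o4"}, {"o1", "o2"}, {"o1", "o3"}, {"o2", "o3"}]
c = {"o1": 3, "o2": 4, "o3": 5, "o4": 0}

best_set, best_cost = minimum_hitting_set(X, F, c)
print(sorted(best_set), best_cost)  # ['o1', 'o2', 'o4'] 7
```

The element o4 is forced by the singleton {o4}, and the three pairs require at least two of o1, o2, o3, so the cheapest pair o1 and o2 completes the set.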
Hitting Sets for Disjunctive Action Landmarks
Idea: disjunctive action landmarks are interpreted as an instance of minimum hitting set.
Definition (Hitting Set Heuristic)
Let ℒ be a set of disjunctive action landmarks. The hitting set heuristic h^MHS(ℒ) is defined as the cost of a minimum hitting set for ℒ with c(o) = cost(o).
Proposition (Hitting Set Heuristic is Admissible)
Let ℒ be a set of disjunctive action landmarks for state s.
Then h^MHS(ℒ) is an admissible estimate for s.
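The slides state the proposition without proof. One way to see it, sketched here under the assumption of non-negative operator costs: let π = ⟨o_1, ..., o_n⟩ be any plan for s and let H_π = {o_1, ..., o_n} be the set of operators occurring in it (the symbol H_π is introduced only for this argument). Every landmark in ℒ contains some operator of π, so H_π is a hitting set for ℒ, and therefore

```latex
\[
h^{\mathrm{MHS}}(\mathcal{L})
  \;\le\; \sum_{o \in H_\pi} \mathit{cost}(o)
  \;\le\; \sum_{i=1}^{n} \mathit{cost}(o_i)
  \;=\; \mathit{cost}(\pi).
\]
```

The first inequality holds because a minimum hitting set is at most as expensive as the hitting set H_π, the second because repeated operators in π are counted only once in H_π. Since this holds for every plan, it holds for an optimal one.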
Hitting Set Heuristic: Discussion
The hitting set heuristic is the best possible heuristic that only uses the given information, but it is NP-hard to compute.
Use approximations that can be efficiently computed.
⇒ LP relaxation, cost partitioning (both discussed later)
Summary
Summary
Fact landmark: atomic proposition that is true at some point on every state path to a goal
Disjunctive action landmark: set L of operators such that every plan uses some operator from L
Relaxed task graphs allow efficient computation of landmarks
Hitting sets yield the most accurate heuristic for a given set of disjunctive action landmarks
Computation of a minimum hitting set is NP-hard