Planning and Optimization
C5. Delete Relaxation: $h^{\max}$ and $h^{\mathrm{add}}$
Gabriele Röger and Thomas Keller
Universität Basel
October 24, 2018
G. Röger, T. Keller (Universität Basel), Planning and Optimization, October 24, 2018
C5.1 Introduction
C5.2 $h^{\max}$ and $h^{\mathrm{add}}$
C5.3 Properties of $h^{\max}$ and $h^{\mathrm{add}}$
C5.4 Summary
Content of this Course
- Planning
  - Classical: Tasks, Progression/Regression, Complexity, Heuristics
  - Probabilistic: MDPs, Uninformed Search, Heuristic Search, Monte-Carlo Methods
Content of this Course: Heuristics
- Heuristics
  - Delete Relaxation: Relaxed Tasks, Relaxed Task Graphs, Relaxation Heuristics
  - Abstraction
  - Landmarks
  - Potential Heuristics
  - Cost Partitioning
C5.1 Introduction
Delete Relaxation Heuristics
- In this chapter, we introduce heuristics based on delete relaxation.
- Their basic idea is to propagate information in relaxed task graphs, similar to the previous chapter.
- Unlike the previous chapter, we do not just propagate information about whether a given node is reachable, but also estimates of how expensive it is to reach the node.
Reminder: Running Example
We will use the same running example as in the previous chapter:
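$\Pi = \langle V, I, \{o_1, o_2, o_3, o_4\}, \gamma \rangle$ with $V = \{a, b, c, d, e, f, g, h\}$
$I = \{a \mapsto \mathbf{T}, b \mapsto \mathbf{T}, c \mapsto \mathbf{F}, d \mapsto \mathbf{T}, e \mapsto \mathbf{F}, f \mapsto \mathbf{F}, g \mapsto \mathbf{F}, h \mapsto \mathbf{F}\}$
$o_1 = \langle c \lor (a \land b),\ c \land ((c \land d) \triangleright e),\ 1 \rangle$
$o_2 = \langle \top, f, 2 \rangle$
$o_3 = \langle f, g, 1 \rangle$
$o_4 = \langle f, h, 1 \rangle$
$\gamma = e \land (g \land h)$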
Algorithm for Reachability Analysis (Reminder)
- Reachability analysis in RTGs = computing all forced true nodes = computing the most conservative assignment.
- Here is an algorithm that achieves this:
Reachability Analysis
Associate a reachable attribute with each node.
for all nodes n:
    n.reachable := false
while no fixed point is reached:
    Choose a node n.
    if n is an AND node:
        n.reachable := $\bigwedge_{n' \in \mathit{succ}(n)} n'.\mathit{reachable}$
    if n is an OR node:
        n.reachable := $\bigvee_{n' \in \mathit{succ}(n)} n'.\mathit{reachable}$
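The fixed-point computation above can be sketched in Python as follows. This is only an illustration: the dictionary encoding of the RTG, the node names (such as "o1:c&d" for an effect node or "T" for the formula ⊤) and the function name `reachability` are assumptions made for this sketch, not notation from the course. Nodes are visited in repeated full passes, which is one of many admissible ways to "choose a node".

```python
# Hypothetical encoding of the running example's RTG: node -> (kind, successors).
# "I" is the initial node and "T" the node for the formula "true";
# both are AND nodes without successors.
RTG = {
    "I": ("AND", []), "T": ("AND", []),
    "a": ("OR", ["I"]), "b": ("OR", ["I"]), "d": ("OR", ["I"]),
    "c": ("OR", ["o1:T"]), "e": ("OR", ["o1:c&d"]), "f": ("OR", ["o2:T"]),
    "g": ("OR", ["o3:T"]), "h": ("OR", ["o4:T"]),
    "a&b": ("AND", ["a", "b"]), "c|(a&b)": ("OR", ["c", "a&b"]),
    "c&d": ("AND", ["c", "d"]), "g&h": ("AND", ["g", "h"]),
    "o1:T": ("AND", ["c|(a&b)", "T"]), "o1:c&d": ("AND", ["c|(a&b)", "c&d"]),
    "o2:T": ("AND", ["T"]), "o3:T": ("AND", ["f", "T"]),
    "o4:T": ("AND", ["f", "T"]),
    "goal": ("AND", ["e", "g&h"]),
}


def reachability(rtg):
    """Return the reachable attribute of every node at the fixed point."""
    reachable = {n: False for n in rtg}
    changed = True
    while changed:  # stop once a full pass changes nothing
        changed = False
        for n, (kind, succ) in rtg.items():
            values = [reachable[s] for s in succ]
            # all([]) is True, any([]) is False: the empty conjunction/disjunction
            new = all(values) if kind == "AND" else any(values)
            if new != reachable[n]:
                reachable[n] = new
                changed = True
    return reachable


reachable = reachability(RTG)
print(reachable["goal"])  # True: the goal node is reachable
```

In this encoding every node of the running example ends up reachable.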
Reachability Analysis: Example (Reminder)
[Figure: relaxed task graph of the running example. Each node is annotated with its reachable value before (F) and after (T) the analysis; the goal node γ ends with value T.]
C5.2 $h^{\max}$ and $h^{\mathrm{add}}$
Associating Costs with RTG Nodes
Basic intuitions for associating costs with RTG nodes:
- To apply an operator, we must pay its cost.
- To make an OR node true, it is sufficient to make one of its successors true.
  Therefore, we estimate the cost of an OR node as the minimum of the costs of its successors.
- To make an AND node true, all its successors must be made true first.
  We can be optimistic and estimate the cost as the maximum of the successor node costs.
  Or we can be pessimistic and estimate the cost as the sum of the successor node costs.
We will prove later that this is indeed optimistic/pessimistic.
$h^{\max}$ Algorithm
(The algorithm differs from reachability analysis in the node attribute, its initial value and the update rules.)
Computing $h^{\max}$ Values
Associate a cost attribute with each node.
for all nodes n:
    n.cost := $\infty$
while no fixed point is reached:
    Choose a node n.
    if n is an AND node that is not an effect node:
        n.cost := $\max_{n' \in \mathit{succ}(n)} n'.\mathit{cost}$
    if n is an effect node for operator o:
        n.cost := $\mathit{cost}(o) + \max_{n' \in \mathit{succ}(n)} n'.\mathit{cost}$
    if n is an OR node:
        n.cost := $\min_{n' \in \mathit{succ}(n)} n'.\mathit{cost}$
The overall heuristic value is the cost of the goal node, $n_\gamma.\mathit{cost}$.
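A minimal Python sketch of this computation on the running example is given below. The RTG encoding, the node names, the kind label "EFF" for effect nodes and the function name `hmax_costs` are assumptions made for illustration. The sketch also assumes the usual conventions that a maximum over no successors is 0 and a minimum over no successors is ∞. Nodes are shuffled before every pass to mimic the unspecified "choose a node" step.

```python
import math
import random

# Hypothetical encoding: node -> (kind, operator cost, successors).
# kind is "AND", "OR" or "EFF" (effect node); the cost entry is only used for "EFF".
RTG = {
    "I": ("AND", 0, []), "T": ("AND", 0, []),
    "a": ("OR", 0, ["I"]), "b": ("OR", 0, ["I"]), "d": ("OR", 0, ["I"]),
    "c": ("OR", 0, ["o1:T"]), "e": ("OR", 0, ["o1:c&d"]),
    "f": ("OR", 0, ["o2:T"]), "g": ("OR", 0, ["o3:T"]), "h": ("OR", 0, ["o4:T"]),
    "a&b": ("AND", 0, ["a", "b"]), "c|(a&b)": ("OR", 0, ["c", "a&b"]),
    "c&d": ("AND", 0, ["c", "d"]), "g&h": ("AND", 0, ["g", "h"]),
    "o1:T": ("EFF", 1, ["c|(a&b)", "T"]), "o1:c&d": ("EFF", 1, ["c|(a&b)", "c&d"]),
    "o2:T": ("EFF", 2, ["T"]), "o3:T": ("EFF", 1, ["f", "T"]),
    "o4:T": ("EFF", 1, ["f", "T"]),
    "goal": ("AND", 0, ["e", "g&h"]),
}


def hmax_costs(rtg, seed=0):
    """Iterate the update rules in a random node order until nothing changes."""
    rng = random.Random(seed)
    cost = {n: math.inf for n in rtg}
    nodes = list(rtg)
    changed = True
    while changed:
        changed = False
        rng.shuffle(nodes)  # any selection order is allowed
        for n in nodes:
            kind, op_cost, succ = rtg[n]
            values = [cost[s] for s in succ]
            if kind == "OR":
                new = min(values, default=math.inf)
            else:
                new = max(values, default=0)
                if kind == "EFF":
                    new += op_cost  # pay for applying the operator
            if new != cost[n]:
                cost[n] = new
                changed = True
    return cost


print(hmax_costs(RTG)["goal"])  # 3
```

Changing `seed` changes the order in which nodes are visited but not the resulting values.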
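$h^{\max}$: Example
[Figure: RTG of the running example annotated with cost values, each initialized to $\infty$ and updated to its final $h^{\max}$ value. Variable nodes: a = 0, b = 0, c = 1, d = 0, e = 2, f = 2, g = 3, h = 3. Effect nodes: $(o_1, \top)$ = 1, $(o_1, c \land d)$ = 2, $(o_2, \top)$ = 2, $(o_3, \top)$ = 3, $(o_4, \top)$ = 3. The initial node has value 0, the node for $g \land h$ has value 3, and the goal node $\gamma$ has value 3.]
$h^{\max}(I) = 3$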
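$h^{\mathrm{add}}$ Algorithm
(The algorithm differs from the $h^{\max}$ algorithm only in using sums instead of maxima at AND nodes and effect nodes.)
Computing $h^{\mathrm{add}}$ Values
Associate a cost attribute with each node.
for all nodes n:
    n.cost := $\infty$
while no fixed point is reached:
    Choose a node n.
    if n is an AND node that is not an effect node:
        n.cost := $\sum_{n' \in \mathit{succ}(n)} n'.\mathit{cost}$
    if n is an effect node for operator o:
        n.cost := $\mathit{cost}(o) + \sum_{n' \in \mathit{succ}(n)} n'.\mathit{cost}$
    if n is an OR node:
        n.cost := $\min_{n' \in \mathit{succ}(n)} n'.\mathit{cost}$
The overall heuristic value is the cost of the goal node, $n_\gamma.\mathit{cost}$.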
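Since the two algorithms differ only in how successor costs are combined at AND and effect nodes, one way to sketch them is a single function that takes the combination rule as a parameter. The names `rtg_costs`, `aggregate` and `max_or_zero`, as well as the RTG encoding, are hypothetical and chosen for this illustration only.

```python
import math

# Hypothetical encoding: node -> (kind, operator cost, successors).
# Non-effect nodes carry cost 0, so "op_cost + aggregate(...)" covers both
# plain AND nodes and effect nodes ("EFF").
RTG = {
    "I": ("AND", 0, []), "T": ("AND", 0, []),
    "a": ("OR", 0, ["I"]), "b": ("OR", 0, ["I"]), "d": ("OR", 0, ["I"]),
    "c": ("OR", 0, ["o1:T"]), "e": ("OR", 0, ["o1:c&d"]),
    "f": ("OR", 0, ["o2:T"]), "g": ("OR", 0, ["o3:T"]), "h": ("OR", 0, ["o4:T"]),
    "a&b": ("AND", 0, ["a", "b"]), "c|(a&b)": ("OR", 0, ["c", "a&b"]),
    "c&d": ("AND", 0, ["c", "d"]), "g&h": ("AND", 0, ["g", "h"]),
    "o1:T": ("EFF", 1, ["c|(a&b)", "T"]), "o1:c&d": ("EFF", 1, ["c|(a&b)", "c&d"]),
    "o2:T": ("EFF", 2, ["T"]), "o3:T": ("EFF", 1, ["f", "T"]),
    "o4:T": ("EFF", 1, ["f", "T"]),
    "goal": ("AND", 0, ["e", "g&h"]),
}


def rtg_costs(rtg, aggregate):
    """Fixed-point cost computation; aggregate combines successor costs
    at AND and effect nodes (maximum for h^max, sum for h^add)."""
    cost = {n: math.inf for n in rtg}
    changed = True
    while changed:
        changed = False
        for n, (kind, op_cost, succ) in rtg.items():
            values = [cost[s] for s in succ]
            if kind == "OR":
                new = min(values, default=math.inf)
            else:
                new = op_cost + aggregate(values)
            if new != cost[n]:
                cost[n] = new
                changed = True
    return cost


def max_or_zero(values):
    """Maximum with the convention that the maximum of no values is 0."""
    return max(values, default=0)


hadd = rtg_costs(RTG, sum)
hmax = rtg_costs(RTG, max_or_zero)
print(hadd["goal"], hmax["goal"])  # 8 3
```

With `sum`, the node for g ∧ h gets cost 3 + 3 = 6 and the goal node 2 + 6 = 8, so the cost of o2 is counted once for g and once for h.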
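$h^{\mathrm{add}}$: Example
[Figure: RTG of the running example annotated with cost values, each initialized to $\infty$ and updated to its final $h^{\mathrm{add}}$ value. Variable nodes: a = 0, b = 0, c = 1, d = 0, e = 2, f = 2, g = 3, h = 3. Effect nodes: $(o_1, \top)$ = 1, $(o_1, c \land d)$ = 2, $(o_2, \top)$ = 2, $(o_3, \top)$ = 3, $(o_4, \top)$ = 3. The initial node has value 0, the node for $g \land h$ has value 6, and the goal node $\gamma$ has value 8.]
$h^{\mathrm{add}}(I) = 8$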
$h^{\max}$ and $h^{\mathrm{add}}$: Definition
We can now define our first non-trivial heuristics for planning:
$h^{\max}$ and $h^{\mathrm{add}}$ Heuristics
Let $\Pi = \langle V, I, O, \gamma \rangle$ be a propositional planning task in positive normal form.
The $h^{\max}$ heuristic value of a state $s$, written $h^{\max}(s)$, is obtained by constructing the RTG for $\Pi^+_s = \langle V, s, O^+, \gamma \rangle$ and then computing $n_\gamma.\mathit{cost}$ using the $h^{\max}$ value algorithm for RTGs.
The $h^{\mathrm{add}}$ heuristic value of a state $s$, written $h^{\mathrm{add}}(s)$, is computed in the same way using the $h^{\mathrm{add}}$ value algorithm for RTGs.
Notation: we will use the same notation $h^{\max}(n)$ and $h^{\mathrm{add}}(n)$ for the $h^{\max}$/$h^{\mathrm{add}}$ values of RTG nodes.
C5.3 Properties of $h^{\max}$ and $h^{\mathrm{add}}$
Understanding $h^{\max}$ and $h^{\mathrm{add}}$
We want to understand $h^{\max}$ and $h^{\mathrm{add}}$ better:
- Are they well-defined?
- How can they be efficiently computed?
- Are they safe?
- Are they admissible?
- How do they compare to the optimal solution cost for a delete-relaxed task?
Well-Definedness of $h^{\max}$ and $h^{\mathrm{add}}$ (1)
Are $h^{\max}$ and $h^{\mathrm{add}}$ well-defined?
- The algorithms for computing $h^{\max}$ and $h^{\mathrm{add}}$ values do not specify in which order the RTG nodes should be selected.
- It turns out that the order does not affect the final result.
  The $h^{\max}$ and $h^{\mathrm{add}}$ values are well-defined.
- To show this, we must show
  - that their computation always terminates, and
  - that all executions terminate with the same result.
- For time reasons, we only provide a proof sketch.
Well-Definedness of $h^{\max}$ and $h^{\mathrm{add}}$ (2)
Theorem
The fixed point algorithms for computing $h^{\max}$ and $h^{\mathrm{add}}$ values produce a well-defined result.
Proof Sketch.
Let $V_0, V_1, V_2, \dots$ be the vectors of cost values during a given execution of the algorithm.
Termination: Note that $V_i \ge V_{i+1}$ for all $i$.
It is not hard to prove that each node value can only decrease a finite number of times: first from $\infty$ to some finite value, and then a finite number of additional times. ...
Well-Definedness of $h^{\max}$ and $h^{\mathrm{add}}$ (3)
Proof Sketch (continued).
Uniqueness of result: Let $V_0 \ge V_1 \ge V_2 \ge \dots \ge V_n$ be the finite sequence of cost value vectors until termination during a given execution of the algorithm.
- View the consistency conditions of all nodes (e.g., $n.\mathit{cost} = \min_{n' \in \mathit{succ}(n)} n'.\mathit{cost}$ for all OR nodes $n$) as a system of equations $E$.
- $V_n$ must be a solution to $E$ (otherwise no fixed point is reached with $V_n$).
- For all $i \in \{0, \dots, n\}$, show by induction over $i$ that $V_i \ge S$ for all solutions $S$ to $E$.
- It follows that $V_n$ is the unique maximum solution to $E$ and hence well-defined.
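For concreteness, the system of equations $E$ for $h^{\max}$ can be written out as below. This only restates the three update rules of the algorithm as equations; the conventions $\min \emptyset = \infty$ and $\max \emptyset = 0$ are assumed here for nodes without successors. For $h^{\mathrm{add}}$, each $\max$ is replaced by a sum.

```latex
\begin{align*}
n.\mathit{cost} &= \min_{n' \in \mathit{succ}(n)} n'.\mathit{cost}
  && \text{for all OR nodes } n \\
n.\mathit{cost} &= \max_{n' \in \mathit{succ}(n)} n'.\mathit{cost}
  && \text{for all AND nodes } n \text{ that are not effect nodes} \\
n.\mathit{cost} &= \mathit{cost}(o) + \max_{n' \in \mathit{succ}(n)} n'.\mathit{cost}
  && \text{for all effect nodes } n \text{ for operator } o
\end{align*}
```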
Efficient Computation of $h^{\max}$ and $h^{\mathrm{add}}$
- If nodes are poorly chosen, the $h^{\max}$/$h^{\mathrm{add}}$ algorithm can update the same node many times until it reaches its final value.
- However, there is a simple strategy that prevents this: in every iteration, pick a node with minimum new value among all nodes that can be updated to a new value.
- With this strategy, no node is updated more than once. (We omit the proof, which is not complicated.)
- Using a suitable priority queue data structure, this allows computing the $h^{\max}$/$h^{\mathrm{add}}$ values of an RTG with nodes $N$ and arcs $A$ in time $O(|N| \log |N| + |A|)$.
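One possible realization of this strategy is a Dijkstra-style loop, sketched below with Python's `heapq`. The function name `rtg_costs_ordered`, the RTG encoding and the use of a per-node counter of unfinished successors are assumptions of this sketch, which also assumes duplicate-free successor lists. Because `heapq` is a binary heap and stale queue entries are skipped rather than decreased, the sketch does not attain the stated bound; it needs roughly $O((|N| + |A|) \log |A|)$ time.

```python
import heapq
import math

# Hypothetical encoding: node -> (kind, operator cost, successors).
RTG = {
    "I": ("AND", 0, []), "T": ("AND", 0, []),
    "a": ("OR", 0, ["I"]), "b": ("OR", 0, ["I"]), "d": ("OR", 0, ["I"]),
    "c": ("OR", 0, ["o1:T"]), "e": ("OR", 0, ["o1:c&d"]),
    "f": ("OR", 0, ["o2:T"]), "g": ("OR", 0, ["o3:T"]), "h": ("OR", 0, ["o4:T"]),
    "a&b": ("AND", 0, ["a", "b"]), "c|(a&b)": ("OR", 0, ["c", "a&b"]),
    "c&d": ("AND", 0, ["c", "d"]), "g&h": ("AND", 0, ["g", "h"]),
    "o1:T": ("EFF", 1, ["c|(a&b)", "T"]), "o1:c&d": ("EFF", 1, ["c|(a&b)", "c&d"]),
    "o2:T": ("EFF", 2, ["T"]), "o3:T": ("EFF", 1, ["f", "T"]),
    "o4:T": ("EFF", 1, ["f", "T"]),
    "goal": ("AND", 0, ["e", "g&h"]),
}


def rtg_costs_ordered(rtg, aggregate):
    """Assign each reachable node its final cost exactly once, cheapest first.
    aggregate is max for h^max and sum for h^add."""
    preds = {n: [] for n in rtg}
    for n, (_, _, succ) in rtg.items():
        for s in succ:
            preds[s].append(n)
    cost = {n: math.inf for n in rtg}
    # Number of successors still without a final cost (used for AND/EFF nodes).
    missing = {n: len(succ) for n, (_, _, succ) in rtg.items()}
    order = []
    # AND/EFF nodes without successors can be finalized immediately.
    queue = [(op_cost, n) for n, (kind, op_cost, succ) in rtg.items()
             if kind != "OR" and not succ]
    heapq.heapify(queue)
    while queue:
        value, n = heapq.heappop(queue)
        if cost[n] < math.inf:
            continue  # stale entry: n already has its final cost
        cost[n] = value
        order.append(n)
        for p in preds[n]:
            kind, op_cost, succ = rtg[p]
            if kind == "OR":
                heapq.heappush(queue, (value, p))  # candidate minimum for p
            else:
                missing[p] -= 1
                if missing[p] == 0:  # all successors final: value is known
                    new = op_cost + aggregate(cost[s] for s in succ)
                    heapq.heappush(queue, (new, p))
    return cost, order


cost, order = rtg_costs_ordered(RTG, max)
print(cost["goal"], len(order))  # 3 20
```

Nodes that never enter the queue keep the value ∞, which corresponds to unreachable nodes not being processed at all.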
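$h^{\max}$: Example of Efficient Computation
[Figure: RTG of the running example with $h^{\max}$ values, annotated with the order in which nodes are processed: (1) initial node, (2) a, (3) b, (4) d, (5) to (7) the three formula nodes with value 0, (8) effect node $(o_1, \top)$, (9) c, (10) the formula node with value 1, (11) effect node $(o_1, c \land d)$, (12) e, (13) effect node $(o_2, \top)$, (14) f, (15) effect node $(o_3, \top)$, (16) g, (17) effect node $(o_4, \top)$, (18) h, (19) the formula node with value 3, (20) goal node $\gamma$.]
$h^{\max}(I) = 3$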
Efficient Computation of $h^{\max}$ and $h^{\mathrm{add}}$: Remarks
- In the following chapters, we will always assume that we are using this efficient version of the $h^{\max}$ and $h^{\mathrm{add}}$ algorithm.
- In particular, we will assume that all reachable nodes of the relaxed task graph are processed exactly once (and all unreachable nodes not at all), so that it makes sense to speak of certain nodes being processed after others etc.
Heuristic Quality of $h^{\max}$ and $h^{\mathrm{add}}$
This leaves us with the questions about the heuristic quality of $h^{\max}$ and $h^{\mathrm{add}}$:
- Are they safe?
- Are they admissible?
- How do they compare to the optimal solution cost for a delete-relaxed task?
It is easy to see that $h^{\max}$ and $h^{\mathrm{add}}$ are safe: they assign $\infty$ iff a node is unreachable in the delete relaxation.
In our running example, it seems that $h^{\max}$ is prone to underestimation and $h^{\mathrm{add}}$ is prone to overestimation.
We will study this further in the next chapter.
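To make this impression concrete, here is a hand calculation on the running example; it is not taken from the slides. It assumes the usual semantics in which the conditional effect $(c \land d) \triangleright e$ of $o_1$ is evaluated in the state where $o_1$ is applied, so $o_1$ must be applied twice before $e$ holds. Under that assumption, a relaxed plan $\pi^+$ and its cost are:

```latex
\begin{align*}
\pi^+ &= \langle o_1, o_1, o_2, o_3, o_4 \rangle \\
\mathit{cost}(\pi^+) &= 1 + 1 + 2 + 1 + 1 = 6 \\
h^{\max}(I) = 3 \;&\le\; 6 \;\le\; 8 = h^{\mathrm{add}}(I)
\end{align*}
```

No relaxed plan appears to be cheaper: $e$ needs two applications of $o_1$, $g$ needs $o_2$ and $o_3$, and $h$ needs $o_4$. In this example, $h^{\max}$ only accounts for the most expensive goal conjunct, while $h^{\mathrm{add}}$ pays for $o_2$ twice, once for $g$ and once for $h$.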
C5.4 Summary
Summary
- $h^{\max}$ and $h^{\mathrm{add}}$ values estimate how expensive it is to reach a state variable, operator effect or formula (e.g., the goal).
- They are computed by propagating cost information in relaxed task graphs:
  - At OR nodes, choose the cheapest alternative.
  - At AND nodes, maximize or sum the successor costs.
  - At effect nodes, also add the operator cost.
- $h^{\max}$ and $h^{\mathrm{add}}$ values can serve as heuristics.
- They are well-defined and can be computed efficiently by computing them in order of increasing cost along the RTG.