Planning and Optimization

C5. Delete Relaxation: h^max and h^add

Gabriele Röger and Thomas Keller

Universität Basel

October 24, 2018

C5.1 Introduction

C5.2 h^max and h^add

C5.3 Properties of h^max and h^add

C5.4 Summary

Content of this Course

[Diagram: Planning, divided into Classical (Tasks, Progression/Regression, Complexity, Heuristics) and Probabilistic (MDPs, Uninformed Search, Heuristic Search, Monte-Carlo Methods).]

Content of this Course: Heuristics

[Diagram: Heuristics, comprising Delete Relaxation (Relaxed Tasks, Relaxed Task Graphs, Relaxation Heuristics), Abstraction, Landmarks, Potential Heuristics, Cost Partitioning.]

C5.1 Introduction

Delete Relaxation Heuristics

- In this chapter, we introduce heuristics based on delete relaxation.

- Their basic idea is to propagate information in relaxed task graphs, similar to the previous chapter.

- Unlike in the previous chapter, we do not just propagate information about whether a given node is reachable, but also estimates of how expensive it is to reach the node.

Reminder: Running Example

We will use the same running example as in the previous chapter:

Π = ⟨V, I, {o₁, o₂, o₃, o₄}, γ⟩ with

V = {a, b, c, d, e, f, g, h}

I = {a ↦ T, b ↦ T, c ↦ F, d ↦ T, e ↦ F, f ↦ F, g ↦ F, h ↦ F}

o₁ = ⟨c ∨ (a ∧ b), c ∧ ((c ∧ d) ▷ e), 1⟩

o₂ = ⟨⊤, f, 2⟩

o₃ = ⟨f, g, 1⟩

o₄ = ⟨f, h, 1⟩

γ = e ∧ (g ∧ h)

Algorithm for Reachability Analysis (Reminder)

- Reachability analysis in RTGs = computing all forced true nodes = computing the most conservative assignment.

- Here is an algorithm that achieves this:

Reachability Analysis

Associate a reachable attribute with each node.

for all nodes n:
    n.reachable := false
while no fixed point is reached:
    Choose a node n.
    if n is an AND node:
        n.reachable := ⋀_{n' ∈ succ(n)} n'.reachable
    if n is an OR node:
        n.reachable := ⋁_{n' ∈ succ(n)} n'.reachable
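The reachability algorithm can be sketched in Python as follows. This is a minimal illustration, not the lecture's implementation: the dictionary `RTG`, its node names (`pre_o1`, `o1_top`, `top`, ...) and the sweep order are assumptions made for this example, and the graph is a hand-built encoding of the running example in which every effect node has its precondition and its effect condition as successors.

```python
# Hypothetical encoding of the running example's RTG:
# node name -> (kind, successors). Effect nodes are plain AND nodes here.
RTG = {
    "I": ("AND", []),                      # initial state node
    "top": ("AND", []),                    # formula node for the constant true
    "a": ("OR", ["I"]), "b": ("OR", ["I"]), "d": ("OR", ["I"]),
    "c": ("OR", ["o1_top"]), "e": ("OR", ["o1_cd"]),
    "f": ("OR", ["o2_top"]), "g": ("OR", ["o3_top"]),
    "h": ("OR", ["o4_top"]),
    "a_and_b": ("AND", ["a", "b"]),
    "pre_o1": ("OR", ["c", "a_and_b"]),    # c or (a and b)
    "c_and_d": ("AND", ["c", "d"]),
    "o1_top": ("AND", ["pre_o1", "top"]),  # effect c of o1
    "o1_cd": ("AND", ["pre_o1", "c_and_d"]),  # conditional effect e of o1
    "o2_top": ("AND", ["top"]),
    "o3_top": ("AND", ["f", "top"]),
    "o4_top": ("AND", ["f", "top"]),
    "g_and_h": ("AND", ["g", "h"]),
    "goal": ("AND", ["e", "g_and_h"]),
}


def reachability(rtg):
    """Sweep over all nodes until no reachable attribute changes."""
    reachable = {node: False for node in rtg}
    changed = True
    while changed:
        changed = False
        for node, (kind, succ) in rtg.items():
            values = [reachable[s] for s in succ]
            # all([]) is True, so AND nodes without successors become true.
            new = all(values) if kind == "AND" else any(values)
            if new != reachable[node]:
                reachable[node] = new
                changed = True
    return reachable


result = reachability(RTG)
print(result["goal"])  # True
```

Sweeping over all nodes in a fixed order is only one of many admissible ways to "choose a node"; the loop stops once a full sweep changes nothing.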

Reachability Analysis: Example (Reminder)

[Figure: relaxed task graph of the running example. The reachable attribute of every node, including the goal node γ, changes from F to T.]

C5.2 h^max and h^add

Associating Costs with RTG Nodes

Basic intuitions for associating costs with RTG nodes:

- To apply an operator, we must pay its cost.

- To make an OR node true, it is sufficient to make one of its successors true. Therefore, we estimate the cost of an OR node as the minimum of the costs of its successors.

- To make an AND node true, all its successors must be made true first. We can be optimistic and estimate the cost as the maximum of the successor node costs. Or we can be pessimistic and estimate the cost as the sum of the successor node costs.

We will prove later that this is indeed optimistic/pessimistic.

h^max Algorithm

(Differences to the reachability analysis algorithm: the attribute is a cost initialized to ∞, AND nodes use max, OR nodes use min, and effect nodes add the operator cost.)

Computing h^max Values

Associate a cost attribute with each node.

for all nodes n:
    n.cost := ∞
while no fixed point is reached:
    Choose a node n.
    if n is an AND node that is not an effect node:
        n.cost := max_{n' ∈ succ(n)} n'.cost
    if n is an effect node for operator o:
        n.cost := cost(o) + max_{n' ∈ succ(n)} n'.cost
    if n is an OR node:
        n.cost := min_{n' ∈ succ(n)} n'.cost

The overall heuristic value is the cost of the goal node, n_γ.cost.
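As a rough sketch, the h^max computation can be written in Python like this. The encoding is an assumption made for illustration: each node maps to `(kind, operator cost, successors)`, the operator cost is non-zero only at effect nodes, node names such as `pre_o1` and `o1_top` are hypothetical, and the maximum over an empty successor set is taken to be 0.

```python
import math

# Hypothetical encoding of the running example's RTG:
# node name -> (kind, operator cost added at this node, successors).
RTG = {
    "I": ("AND", 0, []), "top": ("AND", 0, []),
    "a": ("OR", 0, ["I"]), "b": ("OR", 0, ["I"]), "d": ("OR", 0, ["I"]),
    "c": ("OR", 0, ["o1_top"]), "e": ("OR", 0, ["o1_cd"]),
    "f": ("OR", 0, ["o2_top"]), "g": ("OR", 0, ["o3_top"]),
    "h": ("OR", 0, ["o4_top"]),
    "a_and_b": ("AND", 0, ["a", "b"]),
    "pre_o1": ("OR", 0, ["c", "a_and_b"]),
    "c_and_d": ("AND", 0, ["c", "d"]),
    "o1_top": ("AND", 1, ["pre_o1", "top"]),     # effect nodes carry cost(o)
    "o1_cd": ("AND", 1, ["pre_o1", "c_and_d"]),
    "o2_top": ("AND", 2, ["top"]),
    "o3_top": ("AND", 1, ["f", "top"]),
    "o4_top": ("AND", 1, ["f", "top"]),
    "g_and_h": ("AND", 0, ["g", "h"]),
    "goal": ("AND", 0, ["e", "g_and_h"]),
}


def h_max_costs(rtg):
    """Fixed-point computation of h^max values for all RTG nodes."""
    cost = {node: math.inf for node in rtg}
    changed = True
    while changed:
        changed = False
        for node, (kind, op_cost, succ) in rtg.items():
            values = [cost[s] for s in succ]
            if kind == "OR":
                new = min(values, default=math.inf)
            else:
                # AND node or effect node (op_cost is 0 for plain AND nodes).
                new = op_cost + max(values, default=0)
            if new != cost[node]:
                cost[node] = new
                changed = True
    return cost


costs = h_max_costs(RTG)
print(costs["goal"])  # 3
```

On this encoding the goal node ends up with cost 3, matching h^max(I) = 3 for the running example.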

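h^max: Example

[Figure: relaxed task graph of the running example, with each node's cost updated from ∞ to its final h^max value.
Variable nodes: a = 0, b = 0, c = 1, d = 0, e = 2, f = 2, g = 3, h = 3.
Initial state node: I = 0.
Effect nodes: (o₁, ⊤) = 1, (o₁, c ∧ d) = 2, (o₂, ⊤) = 2, (o₃, ⊤) = 3, (o₄, ⊤) = 3.
Formula nodes: three nodes with cost 0, one node with cost 1, g ∧ h = 3, goal node γ = 3.]

h^max(I) = 3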

h^add Algorithm

(Difference to the h^max algorithm: max is replaced by Σ at AND nodes and effect nodes.)

Computing h^add Values

Associate a cost attribute with each node.

for all nodes n:
    n.cost := ∞
while no fixed point is reached:
    Choose a node n.
    if n is an AND node that is not an effect node:
        n.cost := Σ_{n' ∈ succ(n)} n'.cost
    if n is an effect node for operator o:
        n.cost := cost(o) + Σ_{n' ∈ succ(n)} n'.cost
    if n is an OR node:
        n.cost := min_{n' ∈ succ(n)} n'.cost

The overall heuristic value is the cost of the goal node, n_γ.cost.
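A corresponding Python sketch for h^add follows. As before, the graph encoding `(kind, operator cost, successors)` and all node names are assumptions made for this example. The update rule for AND and effect nodes uses `sum` where the h^max variant would use a maximum.

```python
import math

# Hypothetical encoding of the running example's RTG:
# node name -> (kind, operator cost added at this node, successors).
RTG = {
    "I": ("AND", 0, []), "top": ("AND", 0, []),
    "a": ("OR", 0, ["I"]), "b": ("OR", 0, ["I"]), "d": ("OR", 0, ["I"]),
    "c": ("OR", 0, ["o1_top"]), "e": ("OR", 0, ["o1_cd"]),
    "f": ("OR", 0, ["o2_top"]), "g": ("OR", 0, ["o3_top"]),
    "h": ("OR", 0, ["o4_top"]),
    "a_and_b": ("AND", 0, ["a", "b"]),
    "pre_o1": ("OR", 0, ["c", "a_and_b"]),
    "c_and_d": ("AND", 0, ["c", "d"]),
    "o1_top": ("AND", 1, ["pre_o1", "top"]),     # effect nodes carry cost(o)
    "o1_cd": ("AND", 1, ["pre_o1", "c_and_d"]),
    "o2_top": ("AND", 2, ["top"]),
    "o3_top": ("AND", 1, ["f", "top"]),
    "o4_top": ("AND", 1, ["f", "top"]),
    "g_and_h": ("AND", 0, ["g", "h"]),
    "goal": ("AND", 0, ["e", "g_and_h"]),
}


def h_add_costs(rtg):
    """Fixed-point computation of h^add values for all RTG nodes."""
    cost = {node: math.inf for node in rtg}
    changed = True
    while changed:
        changed = False
        for node, (kind, op_cost, succ) in rtg.items():
            values = [cost[s] for s in succ]
            if kind == "OR":
                new = min(values, default=math.inf)
            else:
                # Pessimistic estimate: pay for every successor separately.
                new = op_cost + sum(values)
            if new != cost[node]:
                cost[node] = new
                changed = True
    return cost


costs = h_add_costs(RTG)
print(costs["g_and_h"], costs["goal"])  # 6 8
```

The node for g ∧ h gets cost 3 + 3 = 6 because the shared cost of reaching f is counted once for g and once for h, and the goal node gets 2 + 6 = 8.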

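h^add: Example

[Figure: relaxed task graph of the running example, with each node's cost updated from ∞ to its final h^add value.
Variable nodes: a = 0, b = 0, c = 1, d = 0, e = 2, f = 2, g = 3, h = 3.
Initial state node: I = 0.
Effect nodes: (o₁, ⊤) = 1, (o₁, c ∧ d) = 2, (o₂, ⊤) = 2, (o₃, ⊤) = 3, (o₄, ⊤) = 3.
Formula nodes: three nodes with cost 0, one node with cost 1, g ∧ h = 6, goal node γ = 8.]

h^add(I) = 8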

h^max and h^add: Definition

We can now define our first non-trivial heuristics for planning:

h^max and h^add Heuristics

Let Π = ⟨V, I, O, γ⟩ be a propositional planning task in positive normal form.

The h^max heuristic value of a state s, written h^max(s), is obtained by constructing the RTG for Π^+_s = ⟨V, s, O^+, γ⟩ and then computing n_γ.cost using the h^max value algorithm for RTGs.

The h^add heuristic value of a state s, written h^add(s), is computed in the same way using the h^add value algorithm for RTGs.

Notation: we will use the same notation h^max(n) and h^add(n) for the h^max/h^add values of RTG nodes.

C5.3 Properties of h^max and h^add

Understanding h^max and h^add

We want to understand h^max and h^add better:

- Are they well-defined?

- How can they be efficiently computed?

- Are they safe?

- Are they admissible?

- How do they compare to the optimal solution cost for a delete-relaxed task?

Well-Definedness of h^max and h^add (1)

Are h^max and h^add well-defined?

- The algorithms for computing h^max and h^add values do not specify in which order the RTG nodes should be selected.

- It turns out that the order does not affect the final result. Hence, the h^max and h^add values are well-defined.

- To show this, we must show
  - that their computation always terminates, and
  - that all executions terminate with the same result.

- For time reasons, we only provide a proof sketch.

Well-Definedness of h^max and h^add (2)

Theorem

The fixed point algorithms for computing h^max and h^add values produce a well-defined result.

Proof Sketch.

Let V₀, V₁, V₂, ... be the vectors of cost values during a given execution of the algorithm.

Termination: Note that V_i ≥ V_{i+1} for all i.

It is not hard to prove that each node value can only decrease a finite number of times: first from ∞ to some finite value, and then a finite number of additional times. ...

Well-Definedness of h^max and h^add (3)

Proof Sketch (continued).

Uniqueness of result: Let V₀ ≥ V₁ ≥ V₂ ≥ ··· ≥ V_n be the finite sequence of cost value vectors until termination during a given execution of the algorithm.

- View the consistency conditions of all nodes (e.g., n.cost = min_{n' ∈ succ(n)} n'.cost for all OR nodes n) as a system of equations E.

- V_n must be a solution to E (otherwise no fixed point is reached with V_n).

- For all i ∈ {0, ..., n}, show by induction over i that V_i ≥ S for all solutions S to E.

- It follows that V_n is the unique maximum solution to E and hence well-defined.
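The claim that the node order does not matter can be checked empirically on the running example. The following Python sketch is an illustration only, not a proof: the graph encoding, the node names and the use of seeded shuffles to model "choose a node" are all assumptions made for this example.

```python
import math
import random

# Hypothetical encoding of the running example's RTG:
# node name -> (kind, operator cost added at this node, successors).
RTG = {
    "I": ("AND", 0, []), "top": ("AND", 0, []),
    "a": ("OR", 0, ["I"]), "b": ("OR", 0, ["I"]), "d": ("OR", 0, ["I"]),
    "c": ("OR", 0, ["o1_top"]), "e": ("OR", 0, ["o1_cd"]),
    "f": ("OR", 0, ["o2_top"]), "g": ("OR", 0, ["o3_top"]),
    "h": ("OR", 0, ["o4_top"]),
    "a_and_b": ("AND", 0, ["a", "b"]),
    "pre_o1": ("OR", 0, ["c", "a_and_b"]),
    "c_and_d": ("AND", 0, ["c", "d"]),
    "o1_top": ("AND", 1, ["pre_o1", "top"]),
    "o1_cd": ("AND", 1, ["pre_o1", "c_and_d"]),
    "o2_top": ("AND", 2, ["top"]),
    "o3_top": ("AND", 1, ["f", "top"]),
    "o4_top": ("AND", 1, ["f", "top"]),
    "g_and_h": ("AND", 0, ["g", "h"]),
    "goal": ("AND", 0, ["e", "g_and_h"]),
}


def propagate(rtg, combine, rng):
    """Fixed-point computation that visits nodes in a random order."""
    cost = {node: math.inf for node in rtg}
    nodes = list(rtg)
    changed = True
    while changed:
        changed = False
        rng.shuffle(nodes)  # a different node order in every sweep
        for node in nodes:
            kind, op_cost, succ = rtg[node]
            values = [cost[s] for s in succ]
            if kind == "OR":
                new = min(values, default=math.inf)
            else:
                new = op_cost + combine(values)
            if new != cost[node]:
                cost[node] = new
                changed = True
    return cost


def maximum(values):
    return max(values, default=0)


runs_max = [propagate(RTG, maximum, random.Random(seed)) for seed in range(20)]
runs_add = [propagate(RTG, sum, random.Random(seed)) for seed in range(20)]
print(all(run == runs_max[0] for run in runs_max))  # True
print(all(run == runs_add[0] for run in runs_add))  # True
```

All twenty seeded runs end in the same cost vector for each heuristic, which is what the uniqueness argument predicts.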

Efficient Computation of h^max and h^add

- If nodes are poorly chosen, the h^max/h^add algorithm can update the same node many times until it reaches its final value.

- However, there is a simple strategy that prevents this: in every iteration, pick a node with minimum new value among all nodes that can be updated to a new value.

- With this strategy, no node is updated more than once. (We omit the proof, which is not complicated.)

- Using a suitable priority queue data structure, this allows computing the h^max/h^add values of an RTG with nodes N and arcs A in time O(|N| log |N| + |A|).
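One way to realize the minimum-value strategy is a Dijkstra-style loop over a priority queue. The Python sketch below is an assumption-laden illustration: the graph encoding and node names are hypothetical, and it uses the standard library's `heapq` (a binary heap) with lazy deletion of stale entries, so it does not attain the stated bound exactly.

```python
import heapq
import math

# Hypothetical encoding of the running example's RTG:
# node name -> (kind, operator cost added at this node, successors).
RTG = {
    "I": ("AND", 0, []), "top": ("AND", 0, []),
    "a": ("OR", 0, ["I"]), "b": ("OR", 0, ["I"]), "d": ("OR", 0, ["I"]),
    "c": ("OR", 0, ["o1_top"]), "e": ("OR", 0, ["o1_cd"]),
    "f": ("OR", 0, ["o2_top"]), "g": ("OR", 0, ["o3_top"]),
    "h": ("OR", 0, ["o4_top"]),
    "a_and_b": ("AND", 0, ["a", "b"]),
    "pre_o1": ("OR", 0, ["c", "a_and_b"]),
    "c_and_d": ("AND", 0, ["c", "d"]),
    "o1_top": ("AND", 1, ["pre_o1", "top"]),
    "o1_cd": ("AND", 1, ["pre_o1", "c_and_d"]),
    "o2_top": ("AND", 2, ["top"]),
    "o3_top": ("AND", 1, ["f", "top"]),
    "o4_top": ("AND", 1, ["f", "top"]),
    "g_and_h": ("AND", 0, ["g", "h"]),
    "goal": ("AND", 0, ["e", "g_and_h"]),
}


def efficient_costs(rtg, combine):
    """Assign each reachable node its final cost exactly once."""
    preds = {node: [] for node in rtg}
    for node, (_, _, succ) in rtg.items():
        for s in succ:
            preds[s].append(node)
    # Number of successors of each node that still lack a final cost.
    pending = {node: len(succ) for node, (_, _, succ) in rtg.items()}
    cost, order, queue = {}, [], []
    for node, (kind, op_cost, succ) in rtg.items():
        if kind == "AND" and not succ:
            heapq.heappush(queue, (op_cost, node))
    while queue:
        value, node = heapq.heappop(queue)
        if node in cost:
            continue  # stale entry: node was already finalized
        cost[node] = value
        order.append(node)
        for pred in preds[node]:
            if pred in cost:
                continue
            kind, op_cost, succ = rtg[pred]
            if kind == "OR":
                # The first finalized successor is a cheapest one.
                heapq.heappush(queue, (value, pred))
            else:
                pending[pred] -= 1
                if pending[pred] == 0:
                    total = op_cost + combine(cost[s] for s in succ)
                    heapq.heappush(queue, (total, pred))
    return {node: cost.get(node, math.inf) for node in rtg}, order


max_costs, max_order = efficient_costs(RTG, max)
add_costs, add_order = efficient_costs(RTG, sum)
print(max_costs["goal"], add_costs["goal"])  # 3 8
```

Nodes that never enter the queue are unreachable and keep the value ∞. In this example all 20 nodes are reachable, so `max_order` and `add_order` each list every node once, in order of non-decreasing cost.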

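h^max: Example of Efficient Computation

[Figure: relaxed task graph of the running example, annotated with final h^max values and the order (1) to (20) in which nodes are processed.
(1) I = 0; (2) a = 0; (3) b = 0; (4) d = 0; (5) to (7) three formula nodes with cost 0;
(8) effect node (o₁, ⊤) = 1; (9) c = 1; (10) formula node with cost 1; (11) effect node (o₁, c ∧ d) = 2; (12) e = 2;
(13) effect node (o₂, ⊤) = 2; (14) f = 2; (15) effect node (o₃, ⊤) = 3; (16) g = 3; (17) effect node (o₄, ⊤) = 3; (18) h = 3;
(19) formula node g ∧ h = 3; (20) goal node γ = 3.]

h^max(I) = 3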

Efficient Computation of h^max and h^add: Remarks

- In the following chapters, we will always assume that we are using this efficient version of the h^max and h^add algorithm.

- In particular, we will assume that all reachable nodes of the relaxed task graph are processed exactly once (and all unreachable nodes not at all), so that it makes sense to speak of certain nodes being processed after others etc.

Heuristic Quality of h^max and h^add

This leaves us with the questions about the heuristic quality of h^max and h^add:

- Are they safe?

- Are they admissible?

- How do they compare to the optimal solution cost for a delete-relaxed task?

It is easy to see that h^max and h^add are safe: they assign ∞ iff a node is unreachable in the delete relaxation.

In our running example, it seems that h^max is prone to underestimation and h^add is prone to overestimation.

We will study this further in the next chapter.
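To put the two values of the running example into perspective, one can compute the optimal solution cost of the delete-relaxed task by brute force. The Python sketch below is an assumption made for illustration, not part of the lecture: it encodes the relaxed running example by hand (operator tuples, `INIT`, `goal` are hypothetical names) and runs a uniform-cost search over sets of true variables.

```python
import heapq


def always(state):
    return True


# Hypothetical encoding of the relaxed running example:
# (name, cost, precondition, [(effect condition, added variable)]).
INIT = frozenset("abd")
OPERATORS = [
    ("o1", 1, lambda s: "c" in s or ("a" in s and "b" in s),
     [(always, "c"), (lambda s: "c" in s and "d" in s, "e")]),
    ("o2", 2, always, [(always, "f")]),
    ("o3", 1, lambda s: "f" in s, [(always, "g")]),
    ("o4", 1, lambda s: "f" in s, [(always, "h")]),
]


def goal(state):
    return {"e", "g", "h"} <= state


def optimal_relaxed_cost(init, operators, is_goal):
    """Uniform-cost search over relaxed states (sets of true variables)."""
    best = {init: 0}
    queue = [(0, 0, init)]
    tie = 0  # tie-breaker so that states themselves are never compared
    while queue:
        g, _, state = heapq.heappop(queue)
        if g > best[state]:
            continue  # outdated queue entry
        if is_goal(state):
            return g
        for _, cost, pre, effects in operators:
            if not pre(state):
                continue
            # Effect conditions are evaluated in the current state;
            # in the relaxation, variables are only ever added.
            succ = state | {var for cond, var in effects if cond(state)}
            if g + cost < best.get(succ, float("inf")):
                best[succ] = g + cost
                tie += 1
                heapq.heappush(queue, (g + cost, tie, succ))
    return float("inf")


print(optimal_relaxed_cost(INIT, OPERATORS, goal))  # 6
```

Under this encoding the cheapest relaxed plan applies o₁ twice (the first application makes c true, the second one triggers the conditional effect e) and then o₂, o₃ and o₄, for a total cost of 6. In this example, that value lies between h^max(I) = 3 and h^add(I) = 8.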

C5.4 Summary

Summary

- h^max and h^add values estimate how expensive it is to reach a state variable, operator effect or formula (e.g., the goal).

- They are computed by propagating cost information in relaxed task graphs:
  - At OR nodes, choose the cheapest alternative.
  - At AND nodes, maximize or sum the successor costs.
  - At effect nodes, also add the operator cost.

- h^max and h^add values can serve as heuristics.

- They are well-defined and can be computed efficiently by computing them in order of increasing cost along the RTG.
