
Planning and Optimization

C5. Delete Relaxation: hmax and hadd

Gabriele Röger and Thomas Keller

Universität Basel

October 24, 2018

G. Röger, T. Keller (Universität Basel) Planning and Optimization October 24, 2018

C5.1 Introduction
C5.2 hmax and hadd
C5.3 Properties of hmax and hadd
C5.4 Summary

Content of this Course

[Figure: course overview. Planning divides into Classical (Tasks, Progression/Regression, Complexity, Heuristics) and Probabilistic (MDPs, Uninformed Search, Heuristic Search, Monte-Carlo Methods).]

Content of this Course: Heuristics

[Figure: overview of the Heuristics part: Delete Relaxation (Relaxed Tasks, Relaxed Task Graphs, Relaxation Heuristics), Abstraction, Landmarks, Potential Heuristics, Cost Partitioning.]

C5.1 Introduction


Delete Relaxation Heuristics

- In this chapter, we introduce heuristics based on delete relaxation.
- Their basic idea is to propagate information in relaxed task graphs, similar to the previous chapter.
- Unlike the previous chapter, we do not just propagate information about whether a given node is reachable, but also estimates of how expensive it is to reach the node.

Reminder: Running Example

We will use the same running example as in the previous chapter:

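$\Pi = \langle V, I, \{o_1, o_2, o_3, o_4\}, \gamma \rangle$ with

$V = \{a, b, c, d, e, f, g, h\}$

$I = \{a \mapsto T, b \mapsto T, c \mapsto F, d \mapsto T, e \mapsto F, f \mapsto F, g \mapsto F, h \mapsto F\}$

$o_1 = \langle c \lor (a \land b),\ c \land ((c \land d) \rhd e),\ 1 \rangle$
$o_2 = \langle \top,\ f,\ 2 \rangle$
$o_3 = \langle f,\ g,\ 1 \rangle$
$o_4 = \langle f,\ h,\ 1 \rangle$

$\gamma = e \land (g \land h)$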

Algorithm for Reachability Analysis (Reminder)

- Reachability analysis in RTGs = computing all forced true nodes = computing the most conservative assignment.
- Here is an algorithm that achieves this:

Reachability Analysis

Associate a reachable attribute with each node.

for all nodes n:
    n.reachable := false
while no fixed point is reached:
    Choose a node n.
    if n is an AND node:
        n.reachable := $\bigwedge_{n' \in succ(n)} n'.reachable$
    if n is an OR node:
        n.reachable := $\bigvee_{n' \in succ(n)} n'.reachable$
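The reachability algorithm above can be sketched in Python as follows. This is only an illustration: the dictionary encoding of the relaxed task graph and all node names (such as "o1/c&d" for the effect node of o1 with condition c ∧ d) are assumptions made here, and nodes are updated in repeated sweeps rather than in an arbitrary order.

```python
# Hypothetical encoding of the running example's RTG: node -> (kind, successors).
# Node names are illustrative and not taken from the slides.
RTG = {
    "I": ("and", []),                 # initial node: AND node without successors
    "true": ("and", []),              # formula node for ⊤
    "a": ("or", ["I"]),               # variables that are true in I point to the initial node
    "b": ("or", ["I"]),
    "d": ("or", ["I"]),
    "c": ("or", ["o1/true"]),         # other variables point to the effect nodes that add them
    "e": ("or", ["o1/c&d"]),
    "f": ("or", ["o2/true"]),
    "g": ("or", ["o3/true"]),
    "h": ("or", ["o4/true"]),
    "a&b": ("and", ["a", "b"]),
    "c|(a&b)": ("or", ["c", "a&b"]),  # precondition of o1
    "c&d": ("and", ["c", "d"]),       # effect condition of o1
    "o1/true": ("and", ["c|(a&b)", "true"]),
    "o1/c&d": ("and", ["c|(a&b)", "c&d"]),
    "o2/true": ("and", ["true"]),
    "o3/true": ("and", ["f", "true"]),
    "o4/true": ("and", ["f", "true"]),
    "g&h": ("and", ["g", "h"]),
    "goal": ("and", ["e", "g&h"]),
}


def reachable_nodes(rtg):
    """Sweep over all nodes until no reachable attribute changes any more."""
    reachable = {n: False for n in rtg}
    changed = True
    while changed:
        changed = False
        for n, (kind, succ) in rtg.items():
            values = [reachable[s] for s in succ]
            # AND nodes need all successors, OR nodes need at least one.
            new = all(values) if kind == "and" else any(values)
            if new != reachable[n]:
                reachable[n] = new
                changed = True
    return reachable


result = reachable_nodes(RTG)
print(result["goal"])
```

Note that `all([])` is true and `any([])` is false in Python, which matches the convention that an AND node without successors is trivially true and an OR node without successors is never true.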

Reachability Analysis: Example (Reminder)

[Figure: relaxed task graph of the running example. Every node starts out marked F (not reachable); once the fixed point is reached, every node, including the goal node γ, is marked T.]

C5.2 hmax and hadd

Associating Costs with RTG Nodes

Basic intuitions for associating costs with RTG nodes:

- To apply an operator, we must pay its cost.
- To make an OR node true, it is sufficient to make one of its successors true. Therefore, we estimate the cost of an OR node as the minimum of the costs of its successors.
- To make an AND node true, all its successors must be made true first. We can be optimistic and estimate the cost as the maximum of the successor node costs. Or we can be pessimistic and estimate the cost as the sum of the successor node costs.

We will prove later that this is indeed optimistic/pessimistic.
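As a compact restatement of these intuitions, the three estimates can be written as equations over node costs. The notation $cost(n)$ and $succ(n)$ is chosen here for illustration; at effect nodes the operator cost is added on top of the AND estimate.

```latex
\begin{align*}
\text{OR node } n: \quad & cost(n) = \min_{n' \in succ(n)} cost(n') \\
\text{AND node } n \text{ (optimistic)}: \quad & cost(n) = \max_{n' \in succ(n)} cost(n') \\
\text{AND node } n \text{ (pessimistic)}: \quad & cost(n) = \sum_{n' \in succ(n)} cost(n')
\end{align*}
```

For instance, an AND node whose two successors both have cost 3 receives cost $\max\{3, 3\} = 3$ under the optimistic estimate and $3 + 3 = 6$ under the pessimistic one.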

hmax Algorithm

(Differences to the reachability analysis algorithm were highlighted on the original slide.)

Computing hmax Values

Associate a cost attribute with each node.

for all nodes n:
    n.cost := ∞
while no fixed point is reached:
    Choose a node n.
    if n is an AND node that is not an effect node:
        n.cost := $\max_{n' \in succ(n)} n'.cost$
    if n is an effect node for operator o:
        n.cost := $cost(o) + \max_{n' \in succ(n)} n'.cost$
    if n is an OR node:
        n.cost := $\min_{n' \in succ(n)} n'.cost$

The overall heuristic value is the cost of the goal node, $n_\gamma.cost$.
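A minimal Python sketch of this fixed-point computation is given below, applied to the running example. The graph encoding, the node names and the function name `hmax_costs` are assumptions made for illustration; non-effect AND nodes are given operator cost 0 so that both AND cases share one line of code.

```python
import math

# Hypothetical encoding: node -> (kind, operator cost, successors).
# Only effect nodes carry a non-zero operator cost.
RTG = {
    "I": ("and", 0, []),              # initial node
    "true": ("and", 0, []),           # formula node for ⊤
    "a": ("or", 0, ["I"]),
    "b": ("or", 0, ["I"]),
    "d": ("or", 0, ["I"]),
    "c": ("or", 0, ["o1/true"]),
    "e": ("or", 0, ["o1/c&d"]),
    "f": ("or", 0, ["o2/true"]),
    "g": ("or", 0, ["o3/true"]),
    "h": ("or", 0, ["o4/true"]),
    "a&b": ("and", 0, ["a", "b"]),
    "c|(a&b)": ("or", 0, ["c", "a&b"]),
    "c&d": ("and", 0, ["c", "d"]),
    "o1/true": ("effect", 1, ["c|(a&b)", "true"]),
    "o1/c&d": ("effect", 1, ["c|(a&b)", "c&d"]),
    "o2/true": ("effect", 2, ["true"]),
    "o3/true": ("effect", 1, ["f", "true"]),
    "o4/true": ("effect", 1, ["f", "true"]),
    "g&h": ("and", 0, ["g", "h"]),
    "goal": ("and", 0, ["e", "g&h"]),
}


def hmax_costs(rtg):
    """Repeat sweeps over all nodes until no cost value changes."""
    cost = {n: math.inf for n in rtg}
    changed = True
    while changed:
        changed = False
        for n, (kind, op_cost, succ) in rtg.items():
            values = [cost[s] for s in succ]
            if kind == "or":
                new = min(values, default=math.inf)
            else:
                # AND node or effect node: op_cost is 0 for plain AND nodes.
                new = op_cost + max(values, default=0)
            if new != cost[n]:
                cost[n] = new
                changed = True
    return cost


costs = hmax_costs(RTG)
print(costs["goal"])
```

Sweeping over all nodes is just one admissible way of choosing nodes; the pseudocode leaves the order open.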

hmax: Example

[Figure: relaxed task graph of the running example annotated with hmax cost values.]

Variable nodes: a = 0, b = 0, c = 1, d = 0, e = 2, f = 2, g = 3, h = 3
Initial node: I = 0
Formula nodes: a ∧ b = 0, c ∨ (a ∧ b) = 0, c ∧ d = 1, ⊤ = 0, g ∧ h = 3
Effect nodes (added operator cost in parentheses): (o1, ⊤) = 1 (+1), (o1, c ∧ d) = 2 (+1), (o2, ⊤) = 2 (+2), (o3, ⊤) = 3 (+1), (o4, ⊤) = 3 (+1)
Goal node: γ = 3

hmax(I) = 3

hadd Algorithm

(Differences to the hmax algorithm were highlighted on the original slide.)

Computing hadd Values

Associate a cost attribute with each node.

for all nodes n:
    n.cost := ∞
while no fixed point is reached:
    Choose a node n.
    if n is an AND node that is not an effect node:
        n.cost := $\sum_{n' \in succ(n)} n'.cost$
    if n is an effect node for operator o:
        n.cost := $cost(o) + \sum_{n' \in succ(n)} n'.cost$
    if n is an OR node:
        n.cost := $\min_{n' \in succ(n)} n'.cost$

The overall heuristic value is the cost of the goal node, $n_\gamma.cost$.
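Since the hadd algorithm differs from the hmax algorithm only in how successor costs are aggregated at AND and effect nodes, one way to sketch it in Python is a single propagation routine that takes the aggregation function as a parameter. The name `propagate`, the graph encoding and the node names are assumptions made for this illustration.

```python
import math

# Hypothetical encoding: node -> (kind, operator cost, successors).
RTG = {
    "I": ("and", 0, []),
    "true": ("and", 0, []),           # formula node for ⊤
    "a": ("or", 0, ["I"]),
    "b": ("or", 0, ["I"]),
    "d": ("or", 0, ["I"]),
    "c": ("or", 0, ["o1/true"]),
    "e": ("or", 0, ["o1/c&d"]),
    "f": ("or", 0, ["o2/true"]),
    "g": ("or", 0, ["o3/true"]),
    "h": ("or", 0, ["o4/true"]),
    "a&b": ("and", 0, ["a", "b"]),
    "c|(a&b)": ("or", 0, ["c", "a&b"]),
    "c&d": ("and", 0, ["c", "d"]),
    "o1/true": ("effect", 1, ["c|(a&b)", "true"]),
    "o1/c&d": ("effect", 1, ["c|(a&b)", "c&d"]),
    "o2/true": ("effect", 2, ["true"]),
    "o3/true": ("effect", 1, ["f", "true"]),
    "o4/true": ("effect", 1, ["f", "true"]),
    "g&h": ("and", 0, ["g", "h"]),
    "goal": ("and", 0, ["e", "g&h"]),
}


def propagate(rtg, combine):
    """Fixed-point cost propagation; combine is sum for hadd and max for hmax."""
    cost = {n: math.inf for n in rtg}
    changed = True
    while changed:
        changed = False
        for n, (kind, op_cost, succ) in rtg.items():
            values = [cost[s] for s in succ]
            if kind == "or":
                new = min(values, default=math.inf)
            else:
                # AND and effect nodes: aggregate successors, add operator cost.
                new = op_cost + (combine(values) if values else 0)
            if new != cost[n]:
                cost[n] = new
                changed = True
    return cost


hadd = propagate(RTG, sum)
hmax = propagate(RTG, max)
print(hadd["goal"], hmax["goal"])
```

In this encoding the node for g ∧ h gets cost 6 under `sum` but 3 under `max`, because the cost of reaching f is counted once for g and once for h.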

hadd: Example

[Figure: relaxed task graph of the running example annotated with hadd cost values.]

Variable nodes: a = 0, b = 0, c = 1, d = 0, e = 2, f = 2, g = 3, h = 3
Initial node: I = 0
Formula nodes: a ∧ b = 0, c ∨ (a ∧ b) = 0, c ∧ d = 1, ⊤ = 0, g ∧ h = 6
Effect nodes (added operator cost in parentheses): (o1, ⊤) = 1 (+1), (o1, c ∧ d) = 2 (+1), (o2, ⊤) = 2 (+2), (o3, ⊤) = 3 (+1), (o4, ⊤) = 3 (+1)
Goal node: γ = 8

hadd(I) = 8

hmax and hadd: Definition

We can now define our first non-trivial heuristics for planning:

hmax and hadd Heuristics

Let $\Pi = \langle V, I, O, \gamma \rangle$ be a propositional planning task in positive normal form.

The hmax heuristic value of a state s, written hmax(s), is obtained by constructing the RTG for $\Pi^+_s = \langle V, s, O^+, \gamma \rangle$ and then computing $n_\gamma.cost$ using the hmax value algorithm for RTGs.

The hadd heuristic value of a state s, written hadd(s), is computed in the same way using the hadd value algorithm for RTGs.

Notation: we will use the same notation hmax(n) and hadd(n) for the hmax/hadd values of RTG nodes.

C5.3 Properties of hmax and hadd

Understanding hmax and hadd

We want to understand hmax and hadd better:

- Are they well-defined?
- How can they be efficiently computed?
- Are they safe?
- Are they admissible?
- How do they compare to the optimal solution cost for a delete-relaxed task?

Well-Definedness of hmax and hadd (1)

Are hmax and hadd well-defined?

- The algorithms for computing hmax and hadd values do not specify in which order the RTG nodes should be selected.
- It turns out that the order does not affect the final result. The hmax and hadd values are well-defined.
- To show this, we must show
    - that their computation always terminates, and
    - that all executions terminate with the same result.
- For time reasons, we only provide a proof sketch.

Well-Definedness of hmax and hadd (2)

Theorem

The fixed point algorithms for computing hmax and hadd values produce a well-defined result.

Proof Sketch.

Let $V_0, V_1, V_2, \dots$ be the vectors of cost values during a given execution of the algorithm.

Termination: Note that $V_i \ge V_{i+1}$ for all $i$.

It is not hard to prove that each node value can only decrease a finite number of times: first from ∞ to some finite value, and then a finite number of additional times. ...

Well-Definedness of hmax and hadd (3)

Proof Sketch (continued).

Uniqueness of result: Let $V_0 \ge V_1 \ge V_2 \ge \dots \ge V_n$ be the finite sequence of cost value vectors until termination during a given execution of the algorithm.

- View the consistency conditions of all nodes (e.g., $n.cost = \min_{n' \in succ(n)} n'.cost$ for all OR nodes n) as a system of equations E.
- $V_n$ must be a solution to E (otherwise no fixed point is reached with $V_n$).
- For all $i \in \{0, \dots, n\}$, show by induction over i that $V_i \ge S$ for all solutions S to E.
- It follows that $V_n$ is the unique maximum solution to E and hence well-defined.
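The claim that the node order does not matter can be checked empirically on the running example. The following sketch picks, in every step, a random node whose value is not yet consistent with its successors and updates it, using hmax aggregation. It is only an experiment and not a proof; the graph encoding, node names and function names are assumptions made here.

```python
import math
import random

# Hypothetical encoding: node -> (kind, operator cost, successors).
RTG = {
    "I": ("and", 0, []),
    "true": ("and", 0, []),           # formula node for ⊤
    "a": ("or", 0, ["I"]),
    "b": ("or", 0, ["I"]),
    "d": ("or", 0, ["I"]),
    "c": ("or", 0, ["o1/true"]),
    "e": ("or", 0, ["o1/c&d"]),
    "f": ("or", 0, ["o2/true"]),
    "g": ("or", 0, ["o3/true"]),
    "h": ("or", 0, ["o4/true"]),
    "a&b": ("and", 0, ["a", "b"]),
    "c|(a&b)": ("or", 0, ["c", "a&b"]),
    "c&d": ("and", 0, ["c", "d"]),
    "o1/true": ("effect", 1, ["c|(a&b)", "true"]),
    "o1/c&d": ("effect", 1, ["c|(a&b)", "c&d"]),
    "o2/true": ("effect", 2, ["true"]),
    "o3/true": ("effect", 1, ["f", "true"]),
    "o4/true": ("effect", 1, ["f", "true"]),
    "g&h": ("and", 0, ["g", "h"]),
    "goal": ("and", 0, ["e", "g&h"]),
}


def target_value(rtg, cost, n):
    """Value that node n would receive if it were updated now (hmax rules)."""
    kind, op_cost, succ = rtg[n]
    values = [cost[s] for s in succ]
    if kind == "or":
        return min(values, default=math.inf)
    return op_cost + max(values, default=0)


def run_in_random_order(rtg, seed):
    """Update randomly chosen inconsistent nodes until a fixed point is reached."""
    rng = random.Random(seed)
    cost = {n: math.inf for n in rtg}
    while True:
        pending = [n for n in rtg if target_value(rtg, cost, n) != cost[n]]
        if not pending:
            return cost  # all consistency conditions hold
        n = rng.choice(pending)
        cost[n] = target_value(rtg, cost, n)


runs = [run_in_random_order(RTG, seed) for seed in range(10)]
print(all(run == runs[0] for run in runs))
```

Every run starts from the all-∞ vector and only ever lowers values, which mirrors the sequence $V_0 \ge V_1 \ge \dots \ge V_n$ used in the proof sketch.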

Efficient Computation of hmax and hadd

- If nodes are poorly chosen, the hmax/hadd algorithm can update the same node many times until it reaches its final value.
- However, there is a simple strategy that prevents this: in every iteration, pick a node with minimum new value among all nodes that can be updated to a new value.
- With this strategy, no node is updated more than once. (We omit the proof, which is not complicated.)
- Using a suitable priority queue data structure, this allows computing the hmax/hadd values of an RTG with nodes N and arcs A in time $O(|N| \log |N| + |A|)$.
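One possible realization of this strategy is a Dijkstra-style propagation, sketched below. It is an assumption of this sketch that a binary heap (`heapq`) with lazily skipped duplicate entries is used; this does not achieve the stated bound, which would need a more refined priority queue. The graph encoding, node names and the function name `efficient_costs` are likewise chosen for illustration.

```python
import heapq
import math

# Hypothetical encoding: node -> (kind, operator cost, successors).
RTG = {
    "I": ("and", 0, []),
    "true": ("and", 0, []),           # formula node for ⊤
    "a": ("or", 0, ["I"]),
    "b": ("or", 0, ["I"]),
    "d": ("or", 0, ["I"]),
    "c": ("or", 0, ["o1/true"]),
    "e": ("or", 0, ["o1/c&d"]),
    "f": ("or", 0, ["o2/true"]),
    "g": ("or", 0, ["o3/true"]),
    "h": ("or", 0, ["o4/true"]),
    "a&b": ("and", 0, ["a", "b"]),
    "c|(a&b)": ("or", 0, ["c", "a&b"]),
    "c&d": ("and", 0, ["c", "d"]),
    "o1/true": ("effect", 1, ["c|(a&b)", "true"]),
    "o1/c&d": ("effect", 1, ["c|(a&b)", "c&d"]),
    "o2/true": ("effect", 2, ["true"]),
    "o3/true": ("effect", 1, ["f", "true"]),
    "o4/true": ("effect", 1, ["f", "true"]),
    "g&h": ("and", 0, ["g", "h"]),
    "goal": ("and", 0, ["e", "g&h"]),
}


def efficient_costs(rtg, combine=max):
    """Assign each reachable node its final cost exactly once, cheapest first."""
    preds = {n: [] for n in rtg}
    for n, (_, _, succ) in rtg.items():
        for s in succ:
            preds[s].append(n)
    cost = {n: math.inf for n in rtg}
    # Number of successors of each AND/effect node that have no final cost yet.
    missing = {n: len(succ) for n, (_, _, succ) in rtg.items()}
    queue = [(op_cost, n) for n, (kind, op_cost, succ) in rtg.items()
             if kind != "or" and not succ]
    heapq.heapify(queue)
    order = []
    while queue:
        value, n = heapq.heappop(queue)
        if cost[n] != math.inf:
            continue  # stale entry: n already has its final cost
        cost[n] = value
        order.append(n)
        for p in preds[n]:
            kind, op_cost, succ = rtg[p]
            if cost[p] != math.inf:
                continue
            if kind == "or":
                heapq.heappush(queue, (value, p))
            else:
                missing[p] -= 1
                if missing[p] == 0:
                    total = op_cost + combine(cost[s] for s in succ)
                    heapq.heappush(queue, (total, p))
    return cost, order


cost, order = efficient_costs(RTG, max)
print(cost["goal"], len(order))
```

Passing `sum` instead of `max` yields hadd values with the same routine. Ties between nodes of equal cost are broken by node name here, so the processing order may differ from other valid orders while still being nondecreasing in cost. Nodes that are unreachable in the relaxation are never processed and keep the value ∞.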

hmax: Example of Efficient Computation

[Figure: relaxed task graph of the running example annotated with hmax cost values and, in parentheses, the order in which the nodes are processed.]

Nodes in processing order, with their hmax cost:
(1) I = 0, (2) a = 0, (3) b = 0, (4) d = 0, (5) a ∧ b = 0, (6) c ∨ (a ∧ b) = 0, (7) ⊤ = 0,
(8) (o1, ⊤) = 1, (9) c = 1, (10) c ∧ d = 1,
(11) (o1, c ∧ d) = 2, (12) e = 2, (13) (o2, ⊤) = 2, (14) f = 2,
(15) (o3, ⊤) = 3, (16) g = 3, (17) (o4, ⊤) = 3, (18) h = 3, (19) g ∧ h = 3, (20) γ = 3

hmax(I) = 3

Efficient Computation of hmax and hadd: Remarks

- In the following chapters, we will always assume that we are using this efficient version of the hmax and hadd algorithm.
- In particular, we will assume that all reachable nodes of the relaxed task graph are processed exactly once (and all unreachable nodes not at all), so that it makes sense to speak of certain nodes being processed after others etc.

Heuristic Quality of hmax and hadd

This leaves us with the questions about the heuristic quality of hmax and hadd:

- Are they safe?
- Are they admissible?
- How do they compare to the optimal solution cost for a delete-relaxed task?

It is easy to see that hmax and hadd are safe: they assign ∞ iff a node is unreachable in the delete relaxation.

In our running example, it seems that hmax is prone to underestimation and hadd is prone to overestimation.

We will study this further in the next chapter.
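To make the observation about the running example concrete, both values can be compared with the cost of a relaxed plan. The plan below is worked out here for illustration and does not appear on the slides. It assumes the delete-relaxed semantics of the previous chapters, under which $o_1$ has to be applied once to make $c$ true before its conditional effect can produce $e$, and it assumes that no cheaper relaxed plan exists.

```latex
\begin{align*}
\text{relaxed plan } \langle o_1, o_1, o_2, o_3, o_4 \rangle: \quad
  & 1 + 1 + 2 + 1 + 1 = 6 \\
h^{\max}(I) &= \max\{\, h^{\max}(e),\ h^{\max}(g \land h) \,\} = \max\{2, 3\} = 3 \le 6 \\
h^{\mathrm{add}}(I) &= h^{\mathrm{add}}(e) + h^{\mathrm{add}}(g) + h^{\mathrm{add}}(h) = 2 + 3 + 3 = 8 \ge 6
\end{align*}
```

Under these assumptions, hmax only accounts for the most expensive subgoal, while hadd pays the cost 2 of $o_2$ twice, once for $g$ and once for $h$, which explains the difference $8 - 6 = 2$.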

C5.4 Summary

Summary

- hmax and hadd values estimate how expensive it is to reach a state variable, operator effect or formula (e.g., the goal).
- They are computed by propagating cost information in relaxed task graphs:
    - At OR nodes, choose the cheapest alternative.
    - At AND nodes, maximize or sum the successor costs.
    - At effect nodes, also add the operator cost.
- hmax and hadd values can serve as heuristics.
- They are well-defined and can be computed efficiently by computing them in order of increasing cost along the RTG.
