
Planning and Optimization

C6. Delete Relaxation: Best Achievers and h^FF

Gabriele Röger and Thomas Keller

Universität Basel

October 24, 2018


C6.1 Choice Functions

C6.2 Best Achievers

C6.3 The FF Heuristic

C6.4 h^max vs. h^add vs. h^FF vs. h⁺

C6.5 Summary

Content of this Course

[Course overview diagram. Planning splits into Classical (Tasks, Progression/Regression, Complexity, Heuristics) and Probabilistic (MDPs, Uninformed Search, Heuristic Search, Monte-Carlo Methods).]

Content of this Course: Heuristics

[Diagram of the Heuristics part of the course: Delete Relaxation (Relaxed Tasks, Relaxed Task Graphs, Relaxation Heuristics), Abstraction, Landmarks, Potential Heuristics, Cost Partitioning.]


C6.1 Choice Functions

Motivation

- In this chapter, we analyze the behaviour of h^max and h^add more deeply.

- Our goal is to understand their shortcomings and use this understanding to devise an improved heuristic.

- As a preparation for our analysis, we need some further definitions that concern choices in AND/OR graphs.

- The key observation is that if we want to establish the value of a certain node n, we can to some extent choose how we want to achieve the OR nodes that are relevant to achieving n.

Preview: Choice Function & Best Achievers

Idea: preserve at most one outgoing arc of each OR node, in such a way that the node values do not change.

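[Figure: relaxed task graph annotated with node costs. I: 0; a: 0, b: 0, c: 1, d: 0, e: 2, f: 2, g: 3, h: 3; effect nodes o₁,⊤: 1, o₁,c∧d: 2, o₂,⊤: 2, o₃,⊤: 3, o₄,⊤: 3 (operator costs +1, +1, +2, +1, +1); inner node: 6; goal γ: 8.]

(precondition of o₁ modified to c ∨ (a ∨ b))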

Choice Functions

Definition (Choice Function)

Let G be an AND/OR graph with nodes N and OR nodes N_OR. A choice function for G is a function f : N′ → N defined on some set N′ ⊆ N_OR such that f(n) ∈ succ(n) for all n ∈ N′.

- In words, choice functions select (at most) one successor for each OR node of G.

- Intuitively, f(n) selects by which disjunct n is achieved.

- If f(n) is undefined for a given n, the intuition is that n is not achieved.


Reduced Graphs

Once we have decided how to achieve an OR node, we can remove the other alternatives:

Definition (Reduced Graph)

Let G be an AND/OR graph, and let f be a choice function for G defined on nodes N′.

The reduced graph for f is the subgraph of G where all outgoing arcs of OR nodes are removed except for the chosen arcs ⟨n, f(n)⟩ with n ∈ N′.
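The two definitions above can be sketched in a few lines of Python. This is only an illustration under assumptions: the graph is stored as a successor dictionary, the names `is_choice_function`, `reduced_graph`, `SUCC` and `OR_NODES` are hypothetical, and the toy graph is not the running example of these slides.

```python
def is_choice_function(f, succ, or_nodes):
    """Check that f maps (some) OR nodes to one of their own successors."""
    return all(n in or_nodes and f[n] in succ[n] for n in f)


def reduced_graph(succ, or_nodes, f):
    """Keep only the chosen arc (n, f(n)) at each OR node; AND nodes keep all arcs."""
    if not is_choice_function(f, succ, or_nodes):
        raise ValueError("f is not a choice function for this graph")
    reduced = {}
    for n, targets in succ.items():
        if n in or_nodes:
            # OR node: one arc if f(n) is defined, otherwise no arc at all.
            reduced[n] = [f[n]] if n in f else []
        else:
            reduced[n] = list(targets)
    return reduced


# Hypothetical toy graph: "goal" is an AND node, "x" and "y" are OR nodes.
SUCC = {"goal": ["x", "y"], "x": ["op1", "op2"], "y": ["op2"], "op1": [], "op2": []}
OR_NODES = {"x", "y"}

full_choice = {"x": "op2", "y": "op2"}
partial_choice = {"x": "op1"}  # f(y) is undefined: y is treated as not achieved

print(reduced_graph(SUCC, OR_NODES, full_choice))
print(reduced_graph(SUCC, OR_NODES, partial_choice))
```

With `partial_choice`, the OR node `y` loses all outgoing arcs, which matches the intuition that an OR node without a chosen successor is not achieved.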


C6.2 Best Achievers

Choice Functions Induced by h^max and h^add

Which choices do h^max and h^add make?

- At every OR node n, we set the cost of n to the minimum of the costs of the successors of n.

- The motivation for this is to achieve n via the successor that can be achieved most cheaply according to our cost estimates.

This corresponds to defining a choice function f with f(n) ∈ arg min_{n′ ∈ N′} n′.cost for all reached OR nodes n, where N′ ⊆ succ(n) are all successors of n processed before n.

- The successors chosen by this choice function are called best achievers (according to h^max or h^add).

- Note that the best achiever function f is in general not well-defined because there can be multiple minimizers. We assume that ties are broken arbitrarily.
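As a rough sketch of how the cost rule induces a choice function, the following Python code propagates costs through a small AND/OR graph and records, for every reached OR node, the successor that produced its current minimum. Several things here are assumptions made for illustration: the simple fixed-point loop (it stands in for whatever processing order the actual algorithm uses), the node encoding, the name `propagate`, and the toy graph, which is not the running example. Ties are broken by the order of the successor lists.

```python
import math


def propagate(graph, combine):
    """graph: node -> (kind, weight, successors), with kind "and" or "or".
    combine=sum gives h^add-style costs, combine=max gives h^max-style costs.
    Returns (cost, best), where best maps reached OR nodes to a best achiever."""
    cost = {n: math.inf for n in graph}
    best = {}
    changed = True
    while changed:  # repeat until no cost can be lowered any further
        changed = False
        for n, (kind, weight, succ) in graph.items():
            if kind == "and":
                new = weight + combine([cost[s] for s in succ] or [0])
                if new < cost[n]:
                    cost[n], changed = new, True
            else:
                for s in succ:
                    if cost[s] < cost[n]:  # strict: earlier successors win ties
                        cost[n], best[n], changed = cost[s], s, True
    return cost, best


# Hypothetical toy graph: x has a cheap and an expensive achiever,
# y needs x first, and the goal is the conjunction of x and y.
GRAPH = {
    "T": ("and", 0, []),            # the formula "true"
    "op_cheap": ("and", 1, ["T"]),  # effect node, operator cost 1
    "op_dear": ("and", 4, ["T"]),   # effect node, operator cost 4
    "x": ("or", 0, ["op_dear", "op_cheap"]),
    "op_y": ("and", 3, ["x"]),      # effect node, operator cost 3
    "y": ("or", 0, ["op_y"]),
    "goal": ("and", 0, ["x", "y"]),
}

add_cost, add_best = propagate(GRAPH, sum)
max_cost, max_best = propagate(GRAPH, max)
print(add_cost["goal"], max_cost["goal"])  # 5 4
print(add_best["x"], max_best["x"])        # op_cheap op_cheap
```

Both variants pick `op_cheap` as the best achiever of `x`; they differ only in how AND nodes aggregate the costs of their successors.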

Example: Best Achievers (1)

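best achievers for h^add

[Figure: relaxed task graph with h^add costs and best achievers. I: 0; a: 0, b: 0, c: 1, d: 0, e: 2, f: 2, g: 3, h: 3; effect nodes o₁,⊤: 1, o₁,c∧d: 2, o₂,⊤: 2, o₃,⊤: 3, o₄,⊤: 3 (operator costs +1, +1, +2, +1, +1); inner node: 6; goal γ: 8.]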


Example: Best Achievers (2)

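best achievers for h^add; modified goal e ∨ (g ∧ h)

[Figure: relaxed task graph with h^add costs and best achievers for the modified goal. I: 0; a: 0, b: 0, c: 1, d: 0, e: 2, f: 2, g: 3, h: 3; effect nodes o₁,⊤: 1, o₁,c∧d: 2, o₂,⊤: 2, o₃,⊤: 3, o₄,⊤: 3 (operator costs +1, +1, +2, +1, +1); inner node: 6; goal γ: 2.]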

Best Achiever Graphs

- Observation: The h^max/h^add costs of nodes remain the same if we replace the RTG by the reduced graph for the respective best achiever function.

- The AND/OR graph that is obtained by removing all nodes with infinite cost from this reduced graph is called the best achiever graph for h^max/h^add.

- We write G^max and G^add for the best achiever graphs.

- G^max (G^add) is always acyclic: for all arcs ⟨n, n′⟩ it contains, n is processed by h^max (by h^add) after n′.

Paths in Best Achiever Graphs

Let n be a node of the best achiever graph.

Let N_eff be the set of effect nodes of the best achiever graph.

The cost of an effect node is the cost of the associated operator.

The cost of a path in the best achiever graph is the sum of costs of all effect nodes on the path.

The following properties can be shown by induction:

- h^max(n) is the maximum cost of all paths originating from n in G^max. A path achieving this maximum is called a critical path.

- h^add(n) is the sum, over all effect nodes n′, of the cost of n′ multiplied by the number of paths from n to n′ in G^add.

In particular, these properties hold for the goal node n_γ if it is reachable.
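As a worked instance of both properties, take the running example with effect-node operator costs +1 (o₁,⊤), +1 (o₁,c∧d), +2 (o₂,⊤), +1 (o₃,⊤) and +1 (o₄,⊤). The notation #paths is introduced here only for illustration. The example slides state that the effect node of o₂ is reached along two paths in G^add and that the critical path in G^max misses o₁ and o₄; the path counts of 1 for the remaining effect nodes are inferred from the goal cost of 8 and are therefore an assumption.

```latex
\begin{align*}
h^{\mathrm{add}}(n_\gamma)
  &= \sum_{n' \in N_{\mathrm{eff}}} \#\mathrm{paths}(n_\gamma, n') \cdot \mathrm{cost}(n') \\
  &= \underbrace{1 \cdot 1}_{o_1,\,\top} + \underbrace{1 \cdot 1}_{o_1,\,c \wedge d}
   + \underbrace{2 \cdot 2}_{o_2,\,\top} + \underbrace{1 \cdot 1}_{o_3,\,\top}
   + \underbrace{1 \cdot 1}_{o_4,\,\top} = 8, \\
h^{\max}(n_\gamma)
  &= \max_{\pi \text{ from } n_\gamma} \; \sum_{n' \in N_{\mathrm{eff}} \text{ on } \pi} \mathrm{cost}(n')
   = \underbrace{1}_{o_3,\,\top} + \underbrace{2}_{o_2,\,\top} = 3.
\end{align*}
```

Both results agree with the goal costs γ: 8 and γ: 3 annotated in the example graphs.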

Example: Undercounting in h^max

G^max: undercounting in h^max

[Figure: best achiever graph G^max with node costs I: 0; a: 0, b: 0, c: 1, d: 0, e: 2, f: 2, g: 3, h: 3; effect nodes o₁,⊤: 1, o₁,c∧d: 2, o₂,⊤: 2, o₃,⊤: 3, o₄,⊤: 3 (operator costs +1, +1, +2, +1, +1); inner node: 3; goal γ: 3.]

o₁ and o₄ are not counted because they are off the critical path.


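Example: Overcounting in h^add

G^add: overcounting in h^add

[Figure: best achiever graph G^add with node costs I: 0; a: 0, b: 0, c: 1, d: 0, e: 2, f: 2, g: 3, h: 3; effect nodes o₁,⊤: 1, o₁,c∧d: 2, o₂,⊤: 2, o₃,⊤: 3, o₄,⊤: 3 (operator costs +1, +1, +2, +1, +1); inner node: 6; goal γ: 8.]

o₂ is counted twice because there are two paths to n_{o₂}^⊤.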


C6.3 The FF Heuristic

Inaccuracies in h^max and h^add

- h^max is often inaccurate because it undercounts: the heuristic estimate only reflects the cost of a critical path, which is often only a small fraction of the overall plan.

- h^add is often inaccurate because it overcounts: if the same subproblem is reached in many ways, it will be counted many times although it only needs to be solved once.

The FF Heuristic

Fortunately, with the perspective of best achiever graphs, there is a simple solution: count all effect nodes that h^add would count, but only count each of them once.

Definition (FF Heuristic)

Let Π = ⟨V, I, O, γ⟩ be a propositional planning task in positive normal form. The FF heuristic for a state s of Π, written h^FF(s), is computed as follows:

- Construct the RTG for the task ⟨V, s, O⁺, γ⟩.

- Construct the best achiever graph G^add.

- Compute the set of effect nodes {n_{o₁}^{χ₁}, …, n_{o_k}^{χ_k}} reachable from n_γ in G^add.

- Return h^FF(s) = ∑_{i=1}^{k} cost(o_i).

Note: h^FF is not well-defined; different tie-breaking policies for best achievers can lead to different heuristic values.
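The four steps can be sketched in Python as follows. This is a minimal illustration, not the course's implementation. The RTG encoding below is an assumption reconstructed from the cost annotations in the example figures (o₁ with precondition c ∨ (a ∨ b) and effects c and (c ∧ d) ▷ e, o₂ producing f, o₃ and o₄ producing g and h from f, goal e ∧ (g ∧ h)). The names `AND_NODES`, `OR_NODES`, `EFFECT_COST`, `h_add_costs` and `h_ff` are hypothetical, and the fixed-point loop is a stand-in for the actual cost computation. Ties are broken by successor order.

```python
import math

# Assumed RTG encoding: node name -> list of successors.
AND_NODES = {
    "I": [], "T": [],                       # initial node and the formula "true"
    "c&d": ["c", "d"], "g&h": ["g", "h"], "goal": ["e", "g&h"],
    "o1,T": ["pre_o1", "T"], "o1,c&d": ["pre_o1", "c&d"],
    "o2,T": ["T"], "o3,T": ["f", "T"], "o4,T": ["f", "T"],
}
OR_NODES = {
    "a": ["I"], "b": ["I"], "d": ["I"],     # variables true in state s
    "c": ["o1,T"], "e": ["o1,c&d"], "f": ["o2,T"], "g": ["o3,T"], "h": ["o4,T"],
    "a|b": ["a", "b"], "pre_o1": ["c", "a|b"],
}
EFFECT_COST = {"o1,T": 1, "o1,c&d": 1, "o2,T": 2, "o3,T": 1, "o4,T": 1}


def h_add_costs(and_nodes, or_nodes, effect_cost):
    """Return h^add node costs and a best achiever for each reached OR node."""
    cost = {n: math.inf for n in list(and_nodes) + list(or_nodes)}
    best = {}
    changed = True
    while changed:
        changed = False
        for n, succ in and_nodes.items():
            new = effect_cost.get(n, 0) + sum(cost[s] for s in succ)
            if new < cost[n]:
                cost[n], changed = new, True
        for n, succ in or_nodes.items():
            for s in succ:
                if cost[s] < cost[n]:
                    cost[n], best[n], changed = cost[s], s, True
    return cost, best


def h_ff(and_nodes, or_nodes, effect_cost, goal="goal"):
    """Sum the operator costs of all effect nodes reachable from the goal in G^add."""
    cost, best = h_add_costs(and_nodes, or_nodes, effect_cost)
    if math.isinf(cost[goal]):
        return math.inf
    seen, stack = set(), [goal]
    while stack:
        n = stack.pop()
        if n in seen:
            continue  # each node, and hence each effect node, is counted once
        seen.add(n)
        # AND nodes keep all arcs; OR nodes keep only the best achiever arc.
        stack.extend(and_nodes[n] if n in and_nodes else [best[n]])
    return sum(effect_cost[n] for n in seen if n in effect_cost)


print(h_add_costs(AND_NODES, OR_NODES, EFFECT_COST)[0]["goal"])  # 8
print(h_ff(AND_NODES, OR_NODES, EFFECT_COST))                    # 6
```

Under this encoding, the difference between 8 and 6 is exactly the second count of o₂, whose effect node is reached via both g and h.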


Example: FF Heuristic (1)

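FF heuristic computation

[Figure: best achiever graph G^add with the effect nodes reachable from the goal marked. I: 0; a: 0, b: 0, c: 1, d: 0, e: 2, f: 2, g: 3, h: 3; effect nodes o₁,⊤: 1, o₁,c∧d: 2, o₂,⊤: 2, o₃,⊤: 3, o₄,⊤: 3 (operator costs +1, +1, +2, +1, +1); inner node: 6; goal γ: 8.]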

h^FF(s) = 1 + 1 + 2 + 1 + 1 = 6

Example: FF Heuristic (2)

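FF heuristic computation; modified goal e ∨ (g ∧ h)

[Figure: best achiever graph G^add for the modified goal with the effect nodes reachable from the goal marked. I: 0; a: 0, b: 0, c: 1, d: 0, e: 2, f: 2, g: 3, h: 3; effect nodes o₁,⊤: 1, o₁,c∧d: 2, o₂,⊤: 2, o₃,⊤: 3, o₄,⊤: 3 (operator costs +1, +1, +2, +1, +1); inner node: 6; goal γ: 2.]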

h^FF(s) = 1 + 1 = 2


C6.4 h^max vs. h^add vs. h^FF vs. h⁺

Optimal Delete Relaxation Heuristic

Definition (h⁺ Heuristic)

Let Π be a propositional planning task in positive normal form, and let s be a state of Π.

The optimal delete relaxation heuristic for s, written h⁺(s), is defined as the perfect heuristic h^∗(s) of state s in the delete-relaxed task Π⁺.

- Reminder: We proved that h^∗(s) in the delete-relaxed task, i.e. h⁺(s), is hard to compute. (BCPlanEx is NP-complete for delete-relaxed tasks.)

- The optimal delete relaxation heuristic is often used as a reference point for comparison.


Relationships between Delete Relaxation Heuristics (1)

Theorem

Let Π be a propositional planning task in positive normal form, and let s be a state of Π.

Then:

1. h^max(s) ≤ h⁺(s) ≤ h^FF(s) ≤ h^add(s)

2. h^max(s) = ∞ iff h⁺(s) = ∞ iff h^FF(s) = ∞ iff h^add(s) = ∞

3. h^max and h⁺ are admissible and consistent.

4. h^FF and h^add are neither admissible nor consistent.

5. All four heuristics are safe and goal-aware.

Relationships between Delete Relaxation Heuristics (2)

Proof Sketch.

for 1:

- To show h^max(s) ≤ h⁺(s), show that critical path costs can be defined for arbitrary relaxed plans and that the critical path cost of a plan is never larger than the cost of the plan. Then show that h^max(s) computes the minimal critical path cost over all delete-relaxed plans.

- To show h⁺(s) ≤ h^FF(s), prove that the operators belonging to the effect nodes counted by h^FF form a relaxed plan. No relaxed plan is cheaper than h⁺ by definition of h⁺.

- h^FF(s) ≤ h^add(s) is obvious from the description of h^FF: both heuristics count the same operators, but h^add may count some of them multiple times.

. . .

Relationships between Delete Relaxation Heuristics (3)

Proof Sketch (continued).

for 2: all heuristics are infinite iff the task has no relaxed solution

for 3: follows from h^max(s) ≤ h⁺(s) because we already know that h⁺ is admissible

for 4: construct a counterexample to admissibility for h^FF

for 5: goal-awareness is easy to show; safety follows from 2.+3.
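One possible counterexample for item 4 is sketched below. The task is hypothetical and not taken from the slides; operators are written as ⟨precondition, effect, cost⟩, which is an assumed notation. The task has no delete effects, so h^∗ and h⁺ coincide, and there are no ties among best achievers, so h^FF is unambiguous here.

```latex
\begin{align*}
&V = \{g_1, g_2\}, \quad
 I = \{g_1 \mapsto \mathbf{F},\ g_2 \mapsto \mathbf{F}\}, \quad
 \gamma = g_1 \wedge g_2, \\
&o_a = \langle \top,\ g_1,\ 2 \rangle, \quad
 o_b = \langle \top,\ g_2,\ 2 \rangle, \quad
 o_c = \langle \top,\ g_1 \wedge g_2,\ 3 \rangle. \\[4pt]
&\text{Best achievers under } h^{\mathrm{add}}\colon\quad
 g_1\colon \min(2, 3) = 2 \text{ via } o_a, \qquad
 g_2\colon \min(2, 3) = 2 \text{ via } o_b. \\
&h^{\mathrm{add}}(I) = 2 + 2 = 4, \qquad
 h^{\mathrm{FF}}(I) = \mathrm{cost}(o_a) + \mathrm{cost}(o_b) = 4, \qquad
 h^{\max}(I) = \max(2, 2) = 2. \\
&h^{*}(I) = h^{+}(I) = \mathrm{cost}(o_c) = 3 < 4 = h^{\mathrm{FF}}(I) = h^{\mathrm{add}}(I).
\end{align*}
```

The values also illustrate the chain from item 1: 2 ≤ 3 ≤ 4 ≤ 4.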


C6.5 Summary


Summary

- h^max and h^add can be used to decide how to achieve OR nodes in a relaxed task graph; the chosen successors are the best achievers.

- Best achiever graphs help identify shortcomings of h^max and h^add compared to the perfect delete relaxation heuristic h⁺.

- h^max underestimates h⁺ because it only considers the cost of a critical path for the relaxed planning task.

- h^add overestimates h⁺ because it double-counts operators occurring on multiple paths in the best achiever graph.

- The FF heuristic repairs this flaw of h^add and therefore approximates h⁺ more closely.

- In general, h^max(s) ≤ h⁺(s) ≤ h^FF(s) ≤ h^add(s).

- h^max and h⁺ are admissible; h^FF and h^add are not.

Literature Pointers

(Some) delete-relaxation heuristics in the planning literature:

- additive heuristic h^add (Bonet, Loerincs & Geffner, 1997)

- maximum heuristic h^max (Bonet & Geffner, 1999)

- (original) FF heuristic (Hoffmann & Nebel, 2001)

- cost-sharing heuristic h^cs (Mirkis & Domshlak, 2007)

- set-additive heuristic h^sa (Keyder & Geffner, 2008)

- FF/additive heuristic h^FF (Keyder & Geffner, 2008)

- local Steiner tree heuristic h^lst (Keyder & Geffner, 2008)

also hybrids such as semi-relaxed heuristics and delete-relaxation landmark heuristics
