Planning and Optimization


C2. Delete Relaxation: Properties of Relaxed Planning Tasks

Gabriele Röger and Thomas Keller

Universität Basel

October 17, 2018


C2.1 The Domination Lemma
C2.2 The Relaxation Lemma
C2.3 Further Properties
C2.4 Greedy Algorithm
C2.5 Summary

Content of this Course

[Figure: course overview. Planning divides into Classical (Tasks, Progression/Regression, Complexity, Heuristics) and Probabilistic (MDPs, Uninformed Search, Heuristic Search, Monte-Carlo Methods).]

Content of this Course: Heuristics

[Figure: heuristics overview. Heuristics: Delete Relaxation (Relaxed Tasks, Relaxed Task Graphs, Relaxation Heuristics), Abstraction, Landmarks, Potential Heuristics, Cost Partitioning.]


C2.1 The Domination Lemma

On-Set and Dominating States

Definition (On-Set)

The on-set of a valuation s is the set of propositional variables that are true in s, i.e., on(s) = s⁻¹({T}).

For states of propositional planning tasks: states can be viewed as sets of (true) state variables.

Definition (Dominate)

A valuation s′ dominates a valuation s if on(s) ⊆ on(s′).

That is, all state variables true in s are also true in s′.
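The two definitions can be illustrated with a minimal Python sketch. It assumes that a valuation is represented as a dictionary from variable names to truth values; the names `on_set` and `dominates` are chosen here for illustration and do not come from the slides.

```python
from typing import Dict, FrozenSet

Valuation = Dict[str, bool]  # assumed representation of a valuation


def on_set(s: Valuation) -> FrozenSet[str]:
    """Return on(s): the set of variables that s maps to True."""
    return frozenset(v for v, value in s.items() if value)


def dominates(s_prime: Valuation, s: Valuation) -> bool:
    """Return True iff s_prime dominates s, i.e. on(s) is a subset of on(s_prime)."""
    return on_set(s) <= on_set(s_prime)


s = {"a": True, "b": False, "c": False}
s_prime = {"a": True, "b": True, "c": False}

print(sorted(on_set(s)))       # variables true in s
print(dominates(s_prime, s))   # s_prime makes everything true that s does
print(dominates(s, s_prime))   # but not the other way round
```

Since a state is fully determined by its on-set, the later sketches represent states directly as sets of true variables.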

Domination Lemma (1)

Lemma (Domination)

Let s and s′ be valuations of a set of propositional variables V, and let χ be a propositional formula over V that does not contain negation symbols.

If s ⊨ χ and s′ dominates s, then s′ ⊨ χ.

Proof.

Proof by induction over the structure of χ.

- Base case χ = ⊤: then s′ ⊨ ⊤.

- Base case χ = ⊥: then s ⊭ ⊥, so the premise s ⊨ χ never holds and there is nothing to show.

. . .

Domination Lemma (2)

Proof (continued).

- Base case χ = v ∈ V: if s ⊨ v, then v ∈ on(s). With on(s) ⊆ on(s′), we get v ∈ on(s′) and hence s′ ⊨ v.

- Inductive case χ = χ₁ ∧ χ₂: by the induction hypothesis, our claim holds for the proper subformulas χ₁ and χ₂ of χ.

  s ⊨ χ ⟹ s ⊨ χ₁ ∧ χ₂
        ⟹ s ⊨ χ₁ and s ⊨ χ₂
        ⟹ s′ ⊨ χ₁ and s′ ⊨ χ₂   (I.H., twice)
        ⟹ s′ ⊨ χ₁ ∧ χ₂
        ⟹ s′ ⊨ χ.

- Inductive case χ = χ₁ ∨ χ₂: analogous
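The lemma can be sanity-checked by brute force on small formulas. The sketch below assumes a hypothetical encoding of negation-free formulas as nested tuples (`("top",)`, `("bot",)`, `("var", name)`, `("and", f, g)`, `("or", f, g)`) and of valuations as sets of true variables. It is an illustration, not a replacement for the proof.

```python
from itertools import product


def holds(s, chi):
    """Evaluate the negation-free formula chi in the valuation with on-set s."""
    kind = chi[0]
    if kind == "top":
        return True
    if kind == "bot":
        return False
    if kind == "var":
        return chi[1] in s
    if kind == "and":
        return holds(s, chi[1]) and holds(s, chi[2])
    if kind == "or":
        return holds(s, chi[1]) or holds(s, chi[2])
    raise ValueError(f"unknown connective: {kind}")


def lemma_holds_for(chi, variables):
    """Check for all pairs (s, s2): if s |= chi and s2 dominates s, then s2 |= chi."""
    valuations = [
        frozenset(v for v, bit in zip(variables, bits) if bit)
        for bits in product([False, True], repeat=len(variables))
    ]
    return all(
        holds(s2, chi)
        for s in valuations if holds(s, chi)
        for s2 in valuations if s <= s2
    )


# (a AND b) OR c, a formula without negation symbols
chi = ("or", ("and", ("var", "a"), ("var", "b")), ("var", "c"))
print(lemma_holds_for(chi, ["a", "b", "c"]))
```

The evaluator mirrors the case distinction of the induction: two constant cases, the variable case, and one case per binary connective.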


C2.2 The Relaxation Lemma

Add Sets and Delete Sets

Definition (Add Set and Delete Set for an Effect)

Consider a propositional planning task with state variables V. Let e be an effect over V, and let s be a state over V. The add set of e in s, written addset(e, s), and the delete set of e in s, written delset(e, s), are defined as the following sets of state variables:

addset(e, s) = {v ∈ V | s ⊨ effcond(v, e)}

delset(e, s) = {v ∈ V | s ⊨ effcond(¬v, e)}

Note: For all states s and operators o applicable in s, we have on(s⟦o⟧) = (on(s) \ delset(eff(o), s)) ∪ addset(eff(o), s).
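As a rough illustration of these sets, the following sketch assumes a simplified, flattened effect representation that is not taken from the slides: an effect is a list of triples `(condition, variable, adds)`, where `condition` is a set of variables read as a conjunction of atoms and `adds` says whether the variable is added or deleted. Under this assumption, effcond(v, e) holds in s iff some adding triple for v has its condition contained in on(s).

```python
def addset(effect, s):
    """Variables v whose (assumed) add condition is satisfied in state s."""
    return frozenset(v for cond, v, adds in effect if adds and cond <= s)


def delset(effect, s):
    """Variables v whose (assumed) delete condition is satisfied in state s."""
    return frozenset(v for cond, v, adds in effect if not adds and cond <= s)


def apply_effect(effect, s):
    """Successor on-set: (on(s) minus delset) union addset, as in the note above."""
    return (s - delset(effect, s)) | addset(effect, s)


effect = [
    (frozenset(), "b", True),        # unconditionally add b
    (frozenset({"a"}), "a", False),  # if a holds: delete a
    (frozenset({"c"}), "d", True),   # if c holds: add d
]
s = frozenset({"a"})

print(sorted(addset(effect, s)))
print(sorted(delset(effect, s)))
print(sorted(apply_effect(effect, s)))
```

Deletes are removed before adds are joined in, so a variable that is both added and deleted ends up true, which matches the set expression in the note.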

Relaxation Lemma

For this and the following chapters on delete relaxation, we assume implicitly that we are working with

propositional planning tasks in positive normal form.

Lemma (Relaxation)

Let s be a state, and let s′ be a state that dominates s.

1. If o is an operator applicable in s, then o⁺ is applicable in s′ and s′⟦o⁺⟧ dominates s⟦o⟧.

2. If π is an operator sequence applicable in s, then π⁺ is applicable in s′ and s′⟦π⁺⟧ dominates s⟦π⟧.

3. If additionally π leads to a goal state from state s, then π⁺ leads to a goal state from state s′.

Proof of Relaxation Lemma (1)

Proof.

Let V be the set of state variables.

Part 1: Because o is applicable in s, we have s ⊨ pre(o).

Because pre(o) is negation-free and s′ dominates s, we get s′ ⊨ pre(o) from the domination lemma.

Because pre(o⁺) = pre(o), this shows that o⁺ is applicable in s′.

. . .


Proof of Relaxation Lemma (2)

Proof (continued).

To prove that s′⟦o⁺⟧ dominates s⟦o⟧, we first compare the relevant add sets:

addset(eff(o), s) = {v ∈ V | s ⊨ effcond(v, eff(o))}
                  = {v ∈ V | s ⊨ effcond(v, eff(o⁺))}     (1)
                  ⊆ {v ∈ V | s′ ⊨ effcond(v, eff(o⁺))}    (2)
                  = addset(eff(o⁺), s′),

where (1) uses effcond(v, eff(o)) ≡ effcond(v, eff(o⁺)) and (2) uses the domination lemma (note that effect conditions are negation-free for operators in positive normal form).

. . .

Proof of Relaxation Lemma (3)

Proof (continued).

We then get:

on(s⟦o⟧) = (on(s) \ delset(eff(o), s)) ∪ addset(eff(o), s)
         ⊆ on(s) ∪ addset(eff(o), s)
         ⊆ on(s′) ∪ addset(eff(o⁺), s′)
         = on(s′⟦o⁺⟧),

and thus s′⟦o⁺⟧ dominates s⟦o⟧.

This concludes the proof of Part 1.

. . .

Proof of Relaxation Lemma (4)

Proof (continued).

Part 2: by induction over n = |π|

Base case: π = ⟨⟩

The empty plan is trivially applicable in s′, and s′⟦⟨⟩⁺⟧ = s′ dominates s⟦⟨⟩⟧ = s by prerequisite.

Inductive case: π = ⟨o₁, ..., oₙ₊₁⟩

By the induction hypothesis, ⟨o₁⁺, ..., oₙ⁺⟩ is applicable in s′, and t′ = s′⟦⟨o₁⁺, ..., oₙ⁺⟩⟧ dominates t = s⟦⟨o₁, ..., oₙ⟩⟧. Also, oₙ₊₁ is applicable in t.

Using Part 1, oₙ₊₁⁺ is applicable in t′ and s′⟦π⁺⟧ = t′⟦oₙ₊₁⁺⟧ dominates s⟦π⟧ = t⟦oₙ₊₁⟧.

This concludes the proof of Part 2.

. . .

Proof of Relaxation Lemma (5)

Proof (continued).

Part 3: Let γ be the goal formula.

From Part 2, we obtain that t′ = s′⟦π⁺⟧ dominates t = s⟦π⟧. By prerequisite, t is a goal state and hence t ⊨ γ.

Because the task is in positive normal form, γ is negation-free, and hence t′ ⊨ γ because of the domination lemma.

Therefore, t′ is a goal state.
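The statement of the relaxation lemma can be tried out on a toy example. The sketch below makes simplifying assumptions that go beyond the slides: operators are unconditional, with a precondition given as a set of atoms and explicit add and delete sets, and states are sets of true variables. All operator and variable names are made up for illustration.

```python
def relax(op):
    """Return the relaxed operator o+: same precondition and adds, no deletes."""
    return {"pre": op["pre"], "add": op["add"], "del": frozenset()}


def run(ops, s):
    """Apply ops in order; return the final state, or None if some op is inapplicable."""
    for op in ops:
        if not op["pre"] <= s:
            return None
        s = (s - op["del"]) | op["add"]
    return s


move = {"pre": frozenset({"at_a"}), "add": frozenset({"at_b"}),
        "del": frozenset({"at_a"})}
pick = {"pre": frozenset({"at_b"}), "add": frozenset({"holding"}),
        "del": frozenset()}

pi = [move, pick]
pi_plus = [relax(op) for op in pi]

s = frozenset({"at_a"})
s_prime = frozenset({"at_a", "extra"})   # s_prime dominates s
goal = frozenset({"holding"})

t = run(pi, s)                   # result of the original sequence from s
t_prime = run(pi_plus, s_prime)  # result of the relaxed sequence from s_prime

print(t is not None and t_prime is not None)  # both sequences are applicable
print(t <= t_prime)                           # part 2: t_prime dominates t
print(goal <= t, goal <= t_prime)             # part 3: goal reached in both
```

Note that `t_prime` still contains `at_a`, which the original operator `move` would have deleted.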


C2.3 Further Properties

Further Properties of Delete Relaxation

- The relaxation lemma is the main technical result that we will use to study delete relaxation.

- Next, we derive some further properties of delete relaxation that will be useful for us.

- Two of these are direct consequences of the relaxation lemma.

Consequences of the Relaxation Lemma (1)

Corollary (Relaxation Preserves Plans and Leads to Dominance)

Let π be an operator sequence that is applicable in state s.

Then π⁺ is applicable in s and s⟦π⁺⟧ dominates s⟦π⟧. If π is a plan for Π, then π⁺ is a plan for Π⁺.

Proof.

Apply the relaxation lemma with s′ = s.

Consequences:

- Relaxations of plans are relaxed plans.

- The delete relaxation is no harder to solve than the original task: whenever the original task is solvable, so is its relaxation.

- Optimal relaxed plans are never more expensive than optimal plans for the original task.

Consequences of the Relaxation Lemma (2)

Corollary (Relaxation Preserves Dominance)

Let s be a state, let s′ be a state that dominates s, and let π⁺ be a relaxed operator sequence applicable in s.

Then π⁺ is applicable in s′ and s′⟦π⁺⟧ dominates s⟦π⁺⟧.

Proof.

Apply the relaxation lemma with π⁺ for π, noting that (π⁺)⁺ = π⁺.

If there is a relaxed plan starting from state s, the same plan can be used starting from a dominating state s′.

Dominating states are always “better” in relaxed tasks.


Monotonicity of Relaxed Planning Tasks

Lemma (Monotonicity)

Let s be a state in which relaxed operator o⁺ is applicable.

Then s⟦o⁺⟧ dominates s.

Proof.

Since relaxed operators only have positive effects, we have on(s) ⊆ on(s) ∪ addset(eff(o⁺), s) = on(s⟦o⁺⟧).

Together with our previous results, this means that making a transition in a relaxed planning task never hurts.

Finding Relaxed Plans

Using the theory we developed, we are now ready to study the problem of finding plans for relaxed planning tasks.


C2.4 Greedy Algorithm

Greedy Algorithm for Relaxed Planning Tasks

The relaxation and monotonicity lemmas suggest the following algorithm for solving relaxed planning tasks:

Greedy Planning Algorithm for ⟨V, I, O⁺, γ⟩

s := I
π⁺ := ⟨⟩
loop forever:
    if s ⊨ γ:
        return π⁺
    else if there is an operator o⁺ ∈ O⁺ applicable in s with s⟦o⁺⟧ ≠ s:
        Append such an operator o⁺ to π⁺.
        s := s⟦o⁺⟧
    else:
        return unsolvable
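The pseudocode above translates almost line by line into Python. The following sketch assumes a restricted task format that is not prescribed by the slides: relaxed operators are triples `(name, pre, add)` with unconditional effects, preconditions and the goal are sets of atoms read as conjunctions, and states are sets of true variables. The example task is made up.

```python
def greedy_relaxed_plan(init, relaxed_ops, goal):
    """Return a relaxed plan as a list of operator names, or None if unsolvable."""
    s = frozenset(init)
    plan = []
    while True:
        if goal <= s:
            return plan
        # Look for any applicable operator that changes the state.
        for name, pre, add in relaxed_ops:
            if pre <= s and not add <= s:
                plan.append(name)
                s = s | add
                break
        else:
            # No applicable operator adds anything new.
            return None


ops = [
    ("unlock", frozenset({"key"}), frozenset({"open"})),
    ("get_key", frozenset(), frozenset({"key"})),
    ("enter", frozenset({"open"}), frozenset({"inside"})),
]

print(greedy_relaxed_plan(frozenset(), ops, frozenset({"inside"})))
print(greedy_relaxed_plan(frozenset(), ops, frozenset({"treasure"})))
```

Which operator is chosen when several qualify is left open by the pseudocode; this sketch simply takes the first one in list order.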


Correctness of the Greedy Algorithm

The algorithm is sound:

- If it returns a plan, this is indeed a correct solution.

- If it returns “unsolvable”, the task is indeed unsolvable:

  - Upon termination, there clearly is no relaxed plan from s.

  - By iterated application of the monotonicity lemma, s dominates I.

  - By the relaxation lemma, there is no solution from I: a relaxed plan from I would also lead to a goal state from the dominating state s.

What about completeness (termination) and runtime?

- Each iteration of the loop that does not return adds at least one atom to on(s).

- This guarantees termination after at most |V| such iterations.

- Thus, the algorithm can clearly be implemented to run in polynomial time.

- A good implementation runs in O(‖Π‖), i.e., in time linear in the size of the task.

Using the Greedy Algorithm as a Heuristic

We can apply the greedy algorithm within heuristic search:

- When evaluating a state s in progression search, solve the relaxation of the planning task with initial state s.

- When evaluating a subgoal ϕ in regression search, solve the relaxation of the planning task with goal ϕ.

- Set h(s) to the cost of the generated relaxed plan.

Is this an admissible heuristic?

- Yes if the relaxed plans are optimal (due to the plan preservation corollary).

- However, usually they are not, because our greedy relaxed planning algorithm is very poor.

(What about safety? Goal-awareness? Consistency?)
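A small made-up example shows how the greedy plan cost can overestimate. The sketch assumes unit operator costs, relaxed operators given as pairs `(pre, add)` of atom sets, a conjunctive goal, and an operator choice that takes the first qualifying operator in list order. None of these choices are fixed by the slides.

```python
def greedy_h(state, relaxed_ops, goal):
    """Cost of the greedy relaxed plan from state (unit costs), or infinity."""
    s = frozenset(state)
    cost = 0
    while not goal <= s:
        for pre, add in relaxed_ops:
            if pre <= s and not add <= s:
                s |= add
                cost += 1
                break
        else:
            return float("inf")  # greedy algorithm reports "unsolvable"
    return cost


empty = frozenset()
goal = frozenset({"g"})

# Two useless operators are listed before the one that reaches the goal.
ops = [(empty, frozenset({"x"})), (empty, frozenset({"y"})), (empty, goal)]

print(greedy_h(empty, ops, goal))                   # greedy applies all three
print(greedy_h(empty, list(reversed(ops)), goal))   # a better order needs one
```

The task is already delete-free, so its optimal plan cost is 1, while the first call returns 3. With this operator order the estimate exceeds the true cost.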


C2.5 Summary


Summary

- Delete relaxation is a simplification in the sense that it is never harder to solve a relaxed task than the original one.

- Delete-relaxed tasks have a domination property: it is always beneficial to make more state variables true.

- Because of their monotonicity property, delete-relaxed tasks can be solved in polynomial time by a greedy algorithm.

- However, the solution quality of this algorithm is poor.