35. Automated Planning: Delete Relaxation


Foundations of Artificial Intelligence

35. Automated Planning: Delete Relaxation

Malte Helmert

University of Basel

May 3, 2021

M. Helmert (University of Basel), Foundations of Artificial Intelligence, May 3, 2021


35.1 How to Design Heuristics?

35.2 Delete Relaxation

35.3 Examples

35.4 Summary


35.1 How to Design Heuristics?

A Simple Planning Heuristic

The STRIPS planner (Fikes & Nilsson, 1971) uses the number of goals not yet satisfied in a STRIPS planning task as a heuristic:

$h(s) := |G \setminus s|$.

intuition: the fewer goals are unsatisfied, the closer we are to a goal state

STRIPS heuristic (properties?)
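The formula above can be sketched in a few lines of Python. This is a minimal illustration, assuming states and goals are represented as sets of true variables; the function name and the example facts are hypothetical and not taken from the slides.

```python
def strips_heuristic(state: frozenset, goals: frozenset) -> int:
    """Number of goal variables that are not yet true in `state`."""
    return len(goals - state)


# Illustrative facts (hypothetical names).
goals = frozenset({"p", "q", "r"})
state = frozenset({"p", "x"})

# q and r are still unsatisfied, so two goals remain.
print(strips_heuristic(state, goals))
```

Note that the function never looks at the actions of the task, only at the state and the goal set.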


Problems of STRIPS Heuristic

Drawbacks of the STRIPS heuristic?

- rather uninformed: For a state $s$, if there is no applicable action $a$ in $s$ such that applying $a$ in $s$ satisfies strictly more (or fewer) goals, then all successor states have the same heuristic value as $s$.
- ignores almost the whole task structure: The heuristic values do not depend on the actions.

Hence we need better methods to design heuristics.

Planning Heuristics

We consider three basic ideas for general heuristics:

- delete relaxation (this and next chapter)
- abstraction (later)
- landmarks (later)

Delete Relaxation: Basic Idea

Estimate solution costs by considering a simplified planning task, where all negative action effects are ignored.

Automated Planning: Overview

Chapter overview: automated planning

- 33. Introduction
- 34. Planning Formalisms
- 35.–36. Planning Heuristics: Delete Relaxation
  - 35. Delete Relaxation
  - 36. Delete Relaxation Heuristics
- 37. Planning Heuristics: Abstraction
- 38.–39. Planning Heuristics: Landmarks


35.2 Delete Relaxation


Relaxed Planning Tasks: Idea

In STRIPS planning tasks, good and bad effects are easy to distinguish:

- Add effects are always useful.
- Delete effects are always harmful.

Why?

idea for designing heuristics: ignore all delete effects
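One way to approach the "Why?" is a monotonicity argument. The sketch below assumes the usual STRIPS semantics: an action $a$ is applicable in state $s$ iff $\mathit{pre}(a) \subseteq s$, applying it yields $(s \setminus \mathit{del}(a)) \cup \mathit{add}(a)$, and $s$ is a goal state iff $G \subseteq s$. For states $s \subseteq t$:

```latex
\begin{align*}
\mathit{pre}(a) \subseteq s \;&\Longrightarrow\; \mathit{pre}(a) \subseteq t, \\
(s \setminus \mathit{del}(a)) \cup \mathit{add}(a) \;&\subseteq\; (t \setminus \mathit{del}(a)) \cup \mathit{add}(a), \\
G \subseteq s \;&\Longrightarrow\; G \subseteq t.
\end{align*}
```

So every action sequence that is a plan from $s$ is also a plan from any superset $t$. Making more variables true (add effects) can therefore never hurt, while making variables false (delete effects) can never help.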

Relaxed Planning Tasks

Definition (relaxation of actions)

The relaxation $a^+$ of a STRIPS action $a$ is the action with $\mathit{pre}(a^+) = \mathit{pre}(a)$, $\mathit{add}(a^+) = \mathit{add}(a)$, $\mathit{cost}(a^+) = \mathit{cost}(a)$, and $\mathit{del}(a^+) = \emptyset$.

German: Relaxierung von Aktionen

Definition (relaxation of planning tasks)

The relaxation $\Pi^+$ of a STRIPS planning task $\Pi = \langle V, I, G, A \rangle$ is the task $\Pi^+ := \langle V, I, G, \{a^+ \mid a \in A\} \rangle$.

German: Relaxierung von Planungsaufgaben
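The two definitions translate almost literally into code. The following is a minimal sketch, assuming a task is stored as sets of variable names; the class names `Action` and `Task`, the field name `delete` (since `del` is a Python keyword), and the `+` suffix on relaxed action names are illustrative choices, not part of the slides.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset  # "del" is a reserved word in Python
    cost: int = 1


@dataclass(frozen=True)
class Task:
    variables: frozenset
    init: frozenset
    goals: frozenset
    actions: tuple


def relax(action: Action) -> Action:
    """Return a+: same pre, add and cost as `action`, but no delete effects."""
    return replace(action, name=action.name + "+", delete=frozenset())


def relax_task(task: Task) -> Task:
    """Return Pi+: same V, I and G, with every action replaced by its relaxation."""
    return replace(task, actions=tuple(relax(a) for a in task.actions))


# Example: an action that moves a truck from L to R and deletes its old position.
move_lr = Action("move_LR", frozenset({"at_TL"}), frozenset({"at_TR"}),
                 frozenset({"at_TL"}), cost=1)
print(relax(move_lr))
```

Only the delete set changes; preconditions, add effects and costs are copied unchanged, exactly as the definition requires.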

Relaxed Planning Tasks

Definition (relaxation of action sequences)

The relaxation of an action sequence $\pi = \langle a_1, \dots, a_n \rangle$ is the action sequence $\pi^+ := \langle a_1^+, \dots, a_n^+ \rangle$.

German: Relaxierung von Aktionsfolgen
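To see what relaxing a sequence does in practice, the sketch below executes a short sequence and its relaxation from the same state. It assumes the usual STRIPS successor semantics and borrows two actions and the initial state from the logistics example of Section 35.3; the helper names (`relax_sequence`, `execute`) are hypothetical.

```python
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def relax(a: Action) -> Action:
    """a+: like a, but with an empty delete set."""
    return a._replace(name=a.name + "+", delete=frozenset())


def relax_sequence(pi):
    """pi+: relax every action of the sequence, keeping the order."""
    return [relax(a) for a in pi]


def execute(state: frozenset, pi) -> frozenset:
    """Apply the actions in order; fail if a precondition is not satisfied."""
    for a in pi:
        if not a.pre <= state:
            raise ValueError(f"{a.name} is not applicable")
        state = (state - a.delete) | a.add
    return state


load_al = Action("load_AL", frozenset({"at_TL", "at_AL"}),
                 frozenset({"in_AT"}), frozenset({"at_AL"}))
move_lr = Action("move_LR", frozenset({"at_TL"}),
                 frozenset({"at_TR"}), frozenset({"at_TL"}))

init = frozenset({"at_AL", "at_BR", "at_TL"})
pi = [load_al, move_lr]

real = execute(init, pi)
relaxed = execute(init, relax_sequence(pi))
print(sorted(real))
print(sorted(relaxed))
print(real <= relaxed)  # the relaxed result contains everything the real one does
```

In the relaxed execution nothing ever becomes false, so the truck ends up "at" both locations and the package is both at L and in the truck.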

Relaxed Planning Tasks: Terminology

- STRIPS planning tasks without delete effects are called relaxed planning tasks or delete-free planning tasks.
- Plans for relaxed planning tasks are called relaxed plans.
- If $\Pi$ is a STRIPS planning task and $\pi^+$ is a plan for $\Pi^+$, then $\pi^+$ is called a relaxed plan for $\Pi$.
- $h^+(\Pi)$ denotes the cost of an optimal plan for $\Pi^+$, i.e., of an optimal relaxed plan.
- analogously: $h^+(s)$ is the cost of an optimal relaxed plan starting in state $s$ (instead of the initial state)
- $h^+$ is called the optimal relaxation heuristic.
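For very small tasks, $h^+(s)$ can be computed directly from its definition by a uniform-cost search in the relaxed task. The sketch below is one possible way to do this, not a practical algorithm: it assumes non-negative action costs, represents relaxed actions as `(pre, add, cost)` triples (delete effects are simply left out), and may visit exponentially many variable sets. The function name `h_plus` and the toy task are hypothetical.

```python
import heapq
from itertools import count


def h_plus(state, goals, actions):
    """Cost of an optimal relaxed plan from `state`, or None if none exists.

    actions: iterable of (pre, add, cost) triples with frozenset pre/add.
    """
    start = frozenset(state)
    tie = count()  # tie-breaker so the heap never compares states
    queue = [(0, next(tie), start)]
    best = {start: 0}
    while queue:
        g, _, s = heapq.heappop(queue)
        if g > best[s]:
            continue  # outdated queue entry
        if goals <= s:
            return g
        for pre, add, cost in actions:
            if pre <= s and not add <= s:  # applicable and adds something new
                t = s | add
                if g + cost < best.get(t, float("inf")):
                    best[t] = g + cost
                    heapq.heappush(queue, (g + cost, next(tie), t))
    return None


# Toy task (hypothetical): reach g from a either directly (cost 3) or via b (cost 1 + 1).
toy_actions = [
    (frozenset({"a"}), frozenset({"b"}), 1),
    (frozenset({"b"}), frozenset({"g"}), 1),
    (frozenset({"a"}), frozenset({"g"}), 3),
]
print(h_plus({"a"}, frozenset({"g"}), toy_actions))
```

Because relaxed states only grow, the search can skip actions whose add effects are already true.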


35.3 Examples

Example: Logistics

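Example (Logistics Task)

- variables: $V = \{\mathit{at}_{AL}, \mathit{at}_{AR}, \mathit{at}_{BL}, \mathit{at}_{BR}, \mathit{at}_{TL}, \mathit{at}_{TR}, \mathit{in}_{AT}, \mathit{in}_{BT}\}$
- initial state: $I = \{\mathit{at}_{AL}, \mathit{at}_{BR}, \mathit{at}_{TL}\}$
- goals: $G = \{\mathit{at}_{AR}, \mathit{at}_{BL}\}$
- actions: $\{\mathit{move}_{LR}, \mathit{move}_{RL}, \mathit{load}_{AL}, \mathit{load}_{AR}, \mathit{load}_{BL}, \mathit{load}_{BR}, \mathit{unload}_{AL}, \mathit{unload}_{AR}, \mathit{unload}_{BL}, \mathit{unload}_{BR}\}$
- . . .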

Example: Logistics

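Example (Logistics Task)

- $\mathit{pre}(\mathit{move}_{LR}) = \{\mathit{at}_{TL}\}$, $\mathit{add}(\mathit{move}_{LR}) = \{\mathit{at}_{TR}\}$, $\mathit{del}(\mathit{move}_{LR}) = \{\mathit{at}_{TL}\}$, $\mathit{cost}(\mathit{move}_{LR}) = 1$
- $\mathit{pre}(\mathit{load}_{AL}) = \{\mathit{at}_{TL}, \mathit{at}_{AL}\}$, $\mathit{add}(\mathit{load}_{AL}) = \{\mathit{in}_{AT}\}$, $\mathit{del}(\mathit{load}_{AL}) = \{\mathit{at}_{AL}\}$, $\mathit{cost}(\mathit{load}_{AL}) = 1$
- $\mathit{pre}(\mathit{unload}_{AL}) = \{\mathit{at}_{TL}, \mathit{in}_{AT}\}$, $\mathit{add}(\mathit{unload}_{AL}) = \{\mathit{at}_{AL}\}$, $\mathit{del}(\mathit{unload}_{AL}) = \{\mathit{in}_{AT}\}$, $\mathit{cost}(\mathit{unload}_{AL}) = 1$
- . . .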

Example: Logistics

- optimal plan:
  1. $\mathit{load}_{AL}$
  2. $\mathit{move}_{LR}$
  3. $\mathit{unload}_{AR}$
  4. $\mathit{load}_{BR}$
  5. $\mathit{move}_{RL}$
  6. $\mathit{unload}_{BL}$
- optimal relaxed plan: ?
- $h^*(I) = 6$, $h^+(I) = ?$
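The open questions on this slide can be explored experimentally. The sketch below encodes the logistics task and runs a uniform-cost search once with and once without delete effects. Two assumptions are made and should be checked against the full task definition: only $\mathit{move}_{LR}$, $\mathit{load}_{AL}$ and $\mathit{unload}_{AL}$ are spelled out in the text, so the remaining actions are assumed to follow the same pattern, and all actions are assumed to have cost 1. The function names are hypothetical.

```python
import heapq
from itertools import count


def make_actions():
    """Logistics actions as (name, pre, add, delete) tuples.

    ASSUMPTION: actions not listed explicitly follow the same schema
    as move_LR, load_AL and unload_AL.
    """
    acts = []
    for src, dst in (("L", "R"), ("R", "L")):
        acts.append((f"move_{src}{dst}", {f"at_T{src}"}, {f"at_T{dst}"}, {f"at_T{src}"}))
    for pkg in "AB":
        for loc in "LR":
            acts.append((f"load_{pkg}{loc}", {f"at_T{loc}", f"at_{pkg}{loc}"},
                         {f"in_{pkg}T"}, {f"at_{pkg}{loc}"}))
            acts.append((f"unload_{pkg}{loc}", {f"at_T{loc}", f"in_{pkg}T"},
                         {f"at_{pkg}{loc}"}, {f"in_{pkg}T"}))
    return [(n, frozenset(p), frozenset(a), frozenset(d)) for n, p, a, d in acts]


def optimal_cost(init, goals, actions, relaxed=False):
    """Uniform-cost search with unit costs; ignores delete effects if `relaxed`."""
    start = frozenset(init)
    tie = count()  # tie-breaker so the heap never compares states
    queue = [(0, next(tie), start)]
    best = {start: 0}
    while queue:
        g, _, s = heapq.heappop(queue)
        if g > best[s]:
            continue
        if goals <= s:
            return g
        for _name, pre, add, delete in actions:
            if pre <= s:
                t = (s if relaxed else s - delete) | add
                if g + 1 < best.get(t, float("inf")):
                    best[t] = g + 1
                    heapq.heappush(queue, (g + 1, next(tie), t))
    return None


init = {"at_AL", "at_BR", "at_TL"}
goals = frozenset({"at_AR", "at_BL"})
actions = make_actions()

print("h*(I) =", optimal_cost(init, goals, actions))
print("h+(I) =", optimal_cost(init, goals, actions, relaxed=True))
```

Comparing the two printed values shows how much the relaxation underestimates the true cost here; the difference comes from the truck never having to drive back once its old position is no longer deleted.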


Example: 8-Puzzle

[Figure: an 8-puzzle state $s$ with rows "1 2 3", "5 6 8", "4 7" and the goal state with rows "1 2 3", "4 5", "6 7 8"; one cell is blank in each.]

- (original) task:
  - A tile can be moved from cell A to B if A and B are adjacent and B is free.
- simplification (basis for Manhattan distance):
  - A tile can be moved from cell A to B if A and B are adjacent.
- relaxed task:
  - A tile can be moved from cell A to B if A and B are adjacent and B is free.
  - . . . where delete effects are ignored (in particular: cells that were free at an earlier time remain free)
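The simplification in which tiles may move onto occupied cells leads to the Manhattan distance, which can be computed without any search. The following sketch assumes a 3x3 board stored row by row as a tuple with `0` for the blank; the two layouts are illustrative and are not the ones from the figure.

```python
def manhattan_distance(state, goal, width=3):
    """Sum over all tiles of row distance plus column distance to the goal cell.

    The blank (0) is not counted.
    """
    goal_pos = {tile: divmod(i, width) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        row, col = divmod(i, width)
        goal_row, goal_col = goal_pos[tile]
        total += abs(row - goal_row) + abs(col - goal_col)
    return total


# Illustrative layouts (hypothetical, not the slide's figure).
goal = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)
state = (1, 2, 3,
         4, 5, 6,
         0, 7, 8)

# Tiles 7 and 8 are each one cell away from their goal cells.
print(manhattan_distance(state, goal))
```

Each tile is treated independently, which is exactly what dropping the "B is free" condition allows.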

Example: 8-Puzzle

- actual goal distance: $h^*(s) = 8$
- Manhattan distance: $h^{MD}(s) = 6$
- optimal delete relaxation: $h^+(s) = 7$

relationship:

$h^+$ dominates the Manhattan distance in the sliding tile puzzle (i.e., $h^{MD}(s) \le h^+(s) \le h^*(s)$ for all states $s$)

Relaxed Solutions: Suboptimal or Optimal?

- For general STRIPS planning tasks, $h^+$ is an admissible and consistent heuristic.
- Can $h^+$ be computed efficiently?
  - It is easy to solve delete-free planning tasks suboptimally. (How?)
  - Optimal solution (and hence the computation of $h^+$) is NP-hard (reduction from Set Cover).
- In practice, heuristics approximate $h^+$ from below or above.
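One possible answer to the "(How?)" above is a greedy forward pass: since nothing is ever deleted, applying any applicable action that adds a new variable can never block the goal. The sketch below is one such procedure under that assumption, not necessarily the one intended in the lecture; relaxed actions are `(name, pre, add, cost)` tuples and all names are hypothetical.

```python
def greedy_relaxed_plan(state, goals, actions):
    """Return some relaxed plan (list of action names), or None if none exists.

    Every action is applied at most once, so the number of action
    applications is bounded by the number of actions.
    """
    reached = set(state)
    plan = []
    progress = True
    while not goals <= reached and progress:
        progress = False
        for name, pre, add, _cost in actions:
            if pre <= reached and not add <= reached:
                reached |= add  # no deletes: the reached set only grows
                plan.append(name)
                progress = True
    return plan if goals <= reached else None


# Toy task (hypothetical): the greedy pass takes a detour although
# the single action "direct" would reach the goal g.
toy_actions = [
    ("detour", {"a"}, {"b"}, 1),
    ("finish_via_b", {"b"}, {"g"}, 1),
    ("direct", {"a"}, {"g"}, 1),
]
print(greedy_relaxed_plan({"a"}, {"g"}, toy_actions))
```

The resulting plan is valid for the relaxed task but may contain unnecessary actions, so its cost only bounds $h^+$ from above; finding the cheapest relaxed plan is the NP-hard part.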


35.4 Summary


Summary

delete relaxation:

- ignore negative effects (delete effects) of actions
- use solution costs of the relaxed planning task as a heuristic for solution costs of the original planning task
- computation of optimal relaxed solution costs $h^+$ is NP-hard, hence usually approximated from below or above
