Foundations of Artificial Intelligence 35. Automated Planning: Delete Relaxation Malte Helmert

(1)

35. Automated Planning: Delete Relaxation

Malte Helmert

University of Basel

May 3, 2021

(2)

How to Design Heuristics?

(3)

A Simple Planning Heuristic

The STRIPS planner (Fikes & Nilsson, 1971) uses the number of goals not yet satisfied in a STRIPS planning task as heuristic:

h(s) := |G \ s|.

intuition: fewer unsatisfied goals ⇝ closer to goal state

⇝ STRIPS heuristic (properties?)
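
As a minimal sketch of the formula above, assume that states and goal sets are represented as Python sets of the propositions that are true. The function name and the example propositions are hypothetical and chosen only for illustration.

```python
def strips_goal_count(state: frozenset, goals: frozenset) -> int:
    """STRIPS heuristic h(s) = |G \\ s|: number of goals not yet true in s."""
    return len(goals - state)


# Illustrative propositions (hypothetical, not taken from a specific task).
goals = frozenset({"p", "q", "r"})
state = frozenset({"p", "x"})
print(strips_goal_count(state, goals))  # q and r are still unsatisfied
```

Note that the function never looks at the actions of the task, which is exactly the weakness discussed on the next slide.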

(4)

Problems of STRIPS Heuristic

drawback of STRIPS heuristic?

rather uninformed:

For state s, if there is no applicable action a in s such that applying a in s satisfies strictly more (or fewer) goals, then all successor states have the same heuristic value as s.

ignores almost the whole task structure:

The heuristic values do not depend on the actions.

⇝ we need better methods to design heuristics

(5)

Planning Heuristics

We consider three basic ideas for general heuristics:

delete relaxation ⇝ this and next chapter

abstraction ⇝ later

landmarks ⇝ later

Delete Relaxation: Basic Idea

Estimate solution costs by considering a simplified planning task, where all negative action effects are ignored.

(7)

Automated Planning: Overview

Chapter overview: automated planning

33. Introduction

34. Planning Formalisms

35.–36. Planning Heuristics: Delete Relaxation

35. Delete Relaxation

36. Delete Relaxation Heuristics

37. Planning Heuristics: Abstraction

38.–39. Planning Heuristics: Landmarks

(8)

Delete Relaxation

(9)

Relaxed Planning Tasks: Idea

In STRIPS planning tasks,

good and bad effects are easy to distinguish:

Add effects are always useful.

Delete effects are always harmful.

Why?

idea for designing heuristics: ignore all delete effects

(11)

Relaxed Planning Tasks

Definition (relaxation of actions)

The relaxation a+ of STRIPS action a is the action with pre(a+) = pre(a), add(a+) = add(a), cost(a+) = cost(a), and del(a+) = ∅.

German: Relaxierung von Aktionen

Definition (relaxation of planning tasks)

The relaxation Π+ of a STRIPS planning task Π = ⟨V, I, G, A⟩ is the task Π+ := ⟨V, I, G, {a+ | a ∈ A}⟩.

German: Relaxierung von Planungsaufgaben

(12)

Relaxed Planning Tasks

Definition (relaxation of action sequences)

The relaxation of action sequence π = ⟨a1, . . . , an⟩ is the action sequence π+ := ⟨a1+, . . . , an+⟩.

German: Relaxierung von Aktionsfolgen
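
The three definitions can be sketched in a few lines of Python. This is an illustrative encoding, not part of the lecture material: the `Action` class, its field names, and the helper `apply` are assumptions, and appending "+" to the action name is only a labelling convention. The action moveLR and the state used at the end are taken from the logistics example of this chapter.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset  # "del" is a Python keyword, hence "delete"
    cost: int = 1


def relax_action(a: Action) -> Action:
    """a+: same preconditions, add effects and cost, but no delete effects."""
    return replace(a, name=a.name + "+", delete=frozenset())


def relax_task(variables, init, goals, actions):
    """Pi+: the task <V, I, G, {a+ | a in A}>."""
    return variables, init, goals, [relax_action(a) for a in actions]


def relax_sequence(plan):
    """pi+: relax every action of the sequence, keeping the order."""
    return [relax_action(a) for a in plan]


def apply(state: frozenset, a: Action) -> frozenset:
    """STRIPS successor state: remove del(a), then add add(a)."""
    assert a.pre <= state, f"{a.name} is not applicable"
    return (state - a.delete) | a.add


move_lr = Action("moveLR", frozenset({"atTL"}), frozenset({"atTR"}),
                 frozenset({"atTL"}))
s = frozenset({"atAL", "atBR", "atTL"})
print(sorted(apply(s, move_lr)))                # atTL is deleted
print(sorted(apply(s, relax_action(move_lr))))  # atTL and atTR both hold
```

In the relaxed successor state the truck is at both locations at once, which is the effect that makes relaxed tasks easier than the original ones.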

(13)

Relaxed Planning Tasks: Terminology

STRIPS planning tasks without delete effects are called relaxed planning tasks

or delete-free planning tasks.

Plans for relaxed planning tasks are called relaxed plans.

If Π is a STRIPS planning task and π+ is a plan for Π+, then π+ is called a relaxed plan for Π.

h+(Π) denotes the cost of an optimal plan for Π+, i.e., of an optimal relaxed plan.

analogously: h+(s) is the cost of an optimal relaxed plan starting in state s (instead of the initial state)

h+ is called the optimal relaxation heuristic.

(15)

Examples

(16)

Example: Logistics

Example (Logistics Task)

variables: V = {atAL, atAR, atBL, atBR, atTL, atTR, inAT, inBT}

initial state: I = {atAL, atBR, atTL}

goals: G = {atAR, atBL}

actions: {moveLR, moveRL, loadAL, loadAR, loadBL, loadBR, unloadAL, unloadAR, unloadBL, unloadBR}

. . .

(17)

Example: Logistics

Example (Logistics Task)

pre(moveLR) = {atTL}, add(moveLR) = {atTR}, del(moveLR) = {atTL}, cost(moveLR) = 1

pre(loadAL) = {atTL, atAL}, add(loadAL) = {inAT}, del(loadAL) = {atAL}, cost(loadAL) = 1

pre(unloadAL) = {atTL, inAT}, add(unloadAL) = {atAL}, del(unloadAL) = {inAT}, cost(unloadAL) = 1

. . .

(18)

Example: Logistics

optimal plan:

1 loadAL

2 moveLR

3 unloadAR

4 loadBR

5 moveRL

6 unloadBL

optimal relaxed plan: ?

h*(I) = 6, h+(I) = ?
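
The open question can be explored with a small brute-force search. The sketch below is hedged in one important way: only moveLR, loadAL and unloadAL are spelled out on the slides, so the remaining seven actions are ASSUMED to follow the same pattern (same structure, other package or location, cost 1). The function names are hypothetical. Because all costs are 1, breadth-first search returns an optimal plan.

```python
from collections import deque

LOCS = {"L": "R", "R": "L"}  # each location mapped to the other one


def logistics_actions():
    """(name, pre, add, delete) tuples, all with cost 1.

    ASSUMPTION: actions not listed on the slides are built by analogy
    with moveLR, loadAL and unloadAL.
    """
    acts = []
    for src, dst in LOCS.items():
        acts.append((f"move{src}{dst}", {f"atT{src}"}, {f"atT{dst}"},
                     {f"atT{src}"}))
        for pkg in "AB":
            acts.append((f"load{pkg}{src}", {f"atT{src}", f"at{pkg}{src}"},
                         {f"in{pkg}T"}, {f"at{pkg}{src}"}))
            acts.append((f"unload{pkg}{src}", {f"atT{src}", f"in{pkg}T"},
                         {f"at{pkg}{src}"}, {f"in{pkg}T"}))
    return acts


def optimal_plan(init, goals, actions, relaxed=False):
    """Breadth-first search; with relaxed=True delete effects are ignored."""
    start = frozenset(init)
    parent = {start: None}  # state -> (predecessor state, action name)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if goals <= state:
            plan = []
            while parent[state] is not None:
                state, name = parent[state]
                plan.append(name)
            return plan[::-1]
        for name, pre, add, delete in actions:
            if pre <= state:
                succ = frozenset((state if relaxed else state - delete) | add)
                if succ not in parent:
                    parent[succ] = (state, name)
                    queue.append(succ)
    return None  # goals unreachable


I = {"atAL", "atBR", "atTL"}
G = {"atAR", "atBL"}
plan = optimal_plan(I, G, logistics_actions())
relaxed_plan = optimal_plan(I, G, logistics_actions(), relaxed=True)
print(len(plan), plan)
print(len(relaxed_plan), relaxed_plan)
```

Under the assumed action set, the first line reproduces the plan length 6 stated above; the second line gives the length of an optimal relaxed plan, which cannot be larger because every plan for the original task is also a relaxed plan.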

(19)

Example: 8-Puzzle

1 2 3

5 6 8

4 7

1 2 3

4 5

6 7 8

(original) task:

A tile can be moved from cell A to B if A and B are adjacent and B is free.

simplification (basis for Manhattan distance):

A tile can be moved from cell A to B if A and B are adjacent.

relaxed task:

A tile can be moved from cell A to B if A and B are adjacent and B is free.

. . . where delete effects are ignored

(in particular: free cells at earlier time remain free)

(20)

Example: 8-Puzzle

1 2 3

5 6 8

4 7

1 2 3

4 5

6 7 8

actual goal distance: h*(s) = 8

Manhattan distance: hMD(s) = 6

optimal delete relaxation: h+(s) = 7

relationship:

h+ dominates the Manhattan distance in the sliding tile puzzle (i.e., hMD(s) ≤ h+(s) ≤ h*(s) for all states s)

(21)

Relaxed Solutions: Suboptimal or Optimal?

For general STRIPS planning tasks, h+ is an admissible and consistent heuristic.

Can h+ be computed efficiently?

It is easy to solve delete-free planning tasks suboptimally. (How?)

optimal solution (and hence the computation of h+) is NP-hard (reduction from Set Cover)

In practice, heuristics approximate h+ from below or above.
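
One possible answer to the "How?" above is forward chaining: since nothing is ever deleted, applying every applicable action that adds something new can never hurt reachability. The sketch below illustrates this idea under stated assumptions; the function name, the (name, pre, add) triple encoding, and the toy task are hypothetical and not taken from the lecture.

```python
def greedy_relaxed_plan(init, goals, actions):
    """Forward chaining on a delete-free task.

    actions: list of (name, pre, add) triples; there are no delete effects.
    Returns a (generally suboptimal) relaxed plan, or None if the goals
    are unreachable even in the relaxation.
    """
    state = set(init)
    plan = []
    progress = True
    while not goals <= state and progress:
        progress = False
        for name, pre, add in actions:
            # Apply the action only if it is applicable and adds a new fact.
            if pre <= state and not add <= state:
                state |= add
                plan.append(name)
                progress = True
    return plan if goals <= state else None


# Hypothetical toy task: action "b" is useless for the goal {"g"}.
toy_actions = [("a", {"p"}, {"q"}), ("b", {"p"}, {"r"}), ("c", {"q"}, {"g"})]
print(greedy_relaxed_plan({"p"}, {"g"}, toy_actions))
```

Every applied action adds at least one new fact, so the number of applied actions is bounded by the number of variables and the procedure runs in polynomial time. The returned plan here contains the unnecessary action "b", which shows why such a plan only bounds h+ from above.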

(23)

Summary

(24)

Summary

delete relaxation:

ignore negative effects (delete effects) of actions

use solution costs of relaxed planning task as heuristic for solution costs of the original planning task

computation of optimal relaxed solution costs h+ is NP-hard, hence usually approximated from below or above
