Foundations of Artificial Intelligence


M. Helmert, S. Eriksson
University of Basel, Computer Science
Spring Term 2021

Exercise Sheet 12

Due: May 26, 2021

Exercise 12.1 (4 marks)

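[Figure: game tree with layers labelled MAX, MIN, MAX, MIN, MAX from the root down. The tree structure was lost in extraction; the node values that survive are 9 4 0 3 1 2 12 7 6 3 4 11 7, then 4 5, then 8.]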

Apply alpha-beta search to the game tree depicted above, considering successor nodes from left to right. Annotate all considered nodes with the returned value as well as the (last) alpha and beta values, and show which parts of the tree can be pruned (e.g., by drawing cut lines through edges which lead to subtrees that do not need to be considered).
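As a reminder of the procedure, here is a minimal sketch of alpha-beta search with left-to-right successor ordering. The nested-list tree encoding, the function name `alpha_beta` and the example tree are assumptions made for illustration; the tree is not the one from this exercise.

```python
import math


def alpha_beta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the alpha-beta value of `node`.

    A node is either a number (terminal utility) or a list of successors.
    `visited` collects the terminal values that were actually evaluated,
    so everything missing from it was pruned.
    """
    if visited is None:
        visited = []
    if not isinstance(node, list):
        visited.append(node)
        return node
    value = -math.inf if is_max else math.inf
    for child in node:  # successors are considered from left to right
        child_value = alpha_beta(child, not is_max, alpha, beta, visited)
        if is_max:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:
            break  # cutoff: the remaining successors cannot change the result
    return value


# Hypothetical tree: MAX root, three MIN successors, nine terminal nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
root_value = alpha_beta(tree, True, visited=visited)
print(root_value, visited)  # 3 [3, 12, 8, 2, 14, 5, 2]
```

In this example the terminal nodes 4 and 6 are never evaluated: once the second MIN node has seen the value 2, its result can no longer exceed the root's alpha value of 3.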

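Exercise 12.2 (3 marks)

[Figure: game tree with layers labelled MAX, MIN, MAX, MIN, MAX from the root down. The tree structure was lost in extraction; the terminal values, from left to right, are 7 3 12 6 14 2 10 16 6 8 9 12 13 15 2 14.]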

Apply the first 4 iterations of Monte Carlo Tree Search with the following policies:

• tree policy: For MAX nodes select the successor with the highest utility, for MIN nodes select the successor with the lowest utility.

• expansion: Always pick the left child first.

• default policy: Always pick the right child.
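The three policies above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the `Node` class, the nested-list tree encoding, leftmost tie-breaking, the running-average estimate and the small example tree are all hypothetical and differ from the tree in this exercise.

```python
class Node:
    """Search-tree node; `state` is a nested list or a terminal utility."""

    def __init__(self, state, is_max):
        self.state = state
        self.is_max = is_max
        self.children = []   # expanded successors, in left-to-right order
        self.utility = 0.0   # average of all backpropagated utilities
        self.visits = 0

    def is_terminal(self):
        return not isinstance(self.state, list)

    def fully_expanded(self):
        return self.is_terminal() or len(self.children) == len(self.state)


def mcts_iteration(root):
    """Run one iteration and return the utility reached by the simulation."""
    path = [root]
    node = root
    # Selection (tree policy): greedy on the current estimates.
    # max/min return the leftmost candidate on ties.
    while not node.is_terminal() and node.fully_expanded():
        pick = max if node.is_max else min
        node = pick(node.children, key=lambda child: child.utility)
        path.append(node)
    # Expansion: add the leftmost successor that is not yet in the tree.
    if not node.is_terminal():
        child = Node(node.state[len(node.children)], not node.is_max)
        node.children.append(child)
        node = child
        path.append(node)
    # Simulation (default policy): always move to the right child.
    state = node.state
    while isinstance(state, list):
        state = state[-1]
    result = state
    # Backpropagation: update the running average along the path.
    for visited_node in path:
        visited_node.visits += 1
        visited_node.utility += (result - visited_node.utility) / visited_node.visits
    return result


# Hypothetical tree: MAX root, two MIN successors, four terminal nodes.
root = Node([[3, 5], [2, 9]], is_max=True)
results = [mcts_iteration(root) for _ in range(4)]
print(results, root.visits, root.utility)  # [5, 9, 2, 9] 4 6.25
```

Because all three policies are deterministic, repeated runs on the same tree produce the same sequence of simulations, which is what makes the iterations traceable by hand.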


Estimate updates calculate the average of all utilities that were backpropagated through that node.

Given a node n with (old) utility u, (updated) visit count v, and r being the utility of the terminal node that was reached in the simulation phase, we can efficiently calculate the updated utility for n with $u + (r - u)/v$.
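The incremental formula follows from rewriting the new average $((v - 1) \cdot u + r)/v$ as $u + (r - u)/v$. The short check below compares it with the plain mean; the function name `update_estimate` and the sample utilities are made up for illustration.

```python
def update_estimate(u, v, r):
    """Return the new average, given the old average `u`, the already
    incremented visit count `v`, and the newly backpropagated utility `r`."""
    return u + (r - u) / v


# Hypothetical utilities backpropagated through one node.
sample_results = [7.0, 3.0, 12.0]

estimate, count = 0.0, 0
for r in sample_results:
    count += 1
    estimate = update_estimate(estimate, count, r)

plain_mean = sum(sample_results) / len(sample_results)
print(abs(estimate - plain_mean) < 1e-9)  # True
```

The incremental form needs only the old estimate and the visit count, so a node does not have to store the individual utilities.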

Exercise 12.3 (1+1 marks)

Assume we are in the middle of the Monte-Carlo selection phase and need to select between 3 MIN nodes n1, n2 and n3:

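[Figure: a MAX node, labelled 12 and 352, with three MIN successors n1, n2 and n3. The successors are labelled 4.5, 4.0 and 4.3 (utility estimates) and 150, 2 and 200 (visit counts), in that order.]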

(a) Compute the probabilities of selecting each child when using Softmax with τ = 5.

Hint: To compute the probabilities, first compute $x_{n_i} = e^{\hat{u}(n_i)/\tau}$ for all nodes and then divide each $x_{n_i}$ by the sum of all $x_{n_i}$.
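The two steps of the hint translate directly into code. The sketch below uses a hypothetical function name and made-up utility estimates, not the values from the figure.

```python
import math


def softmax_probabilities(utilities, tau):
    """Return selection probabilities proportional to exp(u / tau)."""
    weights = [math.exp(u / tau) for u in utilities]  # the x values of the hint
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utility estimates for three successors.
example_utilities = [1.0, 2.0, 3.0]
sharp = softmax_probabilities(example_utilities, tau=1.0)
flat = softmax_probabilities(example_utilities, tau=100.0)
print(sharp)
print(flat)
```

A larger τ flattens the distribution towards uniform, while a smaller τ concentrates the probability on the successor with the highest estimate.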

(b) Which child would Upper Confidence Bound select? Provide $\hat{u}(n_i) + B(n_i)$ for all children.
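The sheet does not restate the definition of the bonus term B. The sketch below assumes the common UCB1 form, $B(n_i) = \sqrt{2 \ln N(n) / N(n_i)}$, where N(n) is the visit count of the parent and N(n_i) that of the successor; check this against the lecture definition before relying on it. The function name and all numbers are hypothetical.

```python
import math


def ucb_scores(utilities, visits, parent_visits):
    """Return u + B for each successor, assuming the UCB1 bonus
    B = sqrt(2 * ln(parent_visits) / visits)."""
    return [
        u + math.sqrt(2 * math.log(parent_visits) / n)
        for u, n in zip(utilities, visits)
    ]


# Hypothetical successors: one well explored, one barely explored.
scores = ucb_scores(utilities=[1.0, 0.5], visits=[10, 1], parent_visits=11)
selected = scores.index(max(scores))
print(scores, selected)
```

Here the second successor is selected despite its lower estimate, because its single visit gives it a much larger bonus.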

Exercise 12.4 (1 mark)

Is the tree policy used in AlphaGo asymptotically optimal? Justify your answer (you do not need to formally prove it).

Submission rules:

Upload a single PDF file (ending .pdf). If you want to submit handwritten parts, include their scans in the single PDF. Put the names of all group members on top of the first page. Use page numbers or put your names on each page. Make sure your PDF has size A4 (fits the page size if printed on A4).
