RL – Reinforcement Learning Assignment 1 Solved
1 Introduction
The goal of this assignment is to experiment with Dynamic Programming (DP), including iterative policy evaluation, policy iteration and value iteration. You will implement these DP methods and test them in the small Gridworld described in the slides of Lecture 3.
2 Small Gridworld

Figure 1: Gridworld
As shown in Fig. 1, each cell of the Gridworld represents a state. Let s_t denote the state at cell t, so the state space is S = {s_t | t ∈ {0, ..., 35}}. s_1 and s_35 are terminal states; all other states are non-terminal, and from them the agent can move one cell north, east, south or west. Hence the action space is A = {n, e, s, w}. Note that actions leading out of the Gridworld leave the state unchanged. Each move receives a reward of -1 until a terminal state is reached.
A good policy should find the shortest path to a terminal state from any randomly chosen initial non-terminal state.
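The transition rule above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the figure is not reproduced here, so the 6 x 6 layout and the row-major numbering of cells (state t sits in row t // 6, column t % 6) are assumptions, as are the names N, TERMINALS, ACTIONS and step.

```python
N = 6                # assumed side length: 36 states laid out as a 6 x 6 grid
TERMINALS = {1, 35}  # terminal states s_1 and s_35 from the text

# Assumed convention: row 0 is the top row, so "n" decreases the row index.
ACTIONS = {"n": (-1, 0), "e": (0, 1), "s": (1, 0), "w": (0, -1)}


def step(state, action):
    """Return (next_state, reward) for one deterministic move."""
    if state in TERMINALS:
        # The episode is over: no further movement and no further reward.
        return state, 0
    row, col = divmod(state, N)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < N and 0 <= new_col < N):
        # Moving off the grid leaves the state unchanged but still costs -1.
        return state, -1
    return new_row * N + new_col, -1
```

Under these assumptions, step(0, "n") bumps into the top wall and stays in state 0, while step(0, "e") moves into the terminal state 1.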
3 Experiment Requirements
• Programming language: Python 3
• You should build the Gridworld environment and implement iterative policy evaluation, policy iteration and value iteration. Then run these methods to evaluate and improve a uniform random policy π(n|·) = π(e|·) = π(s|·) = π(w|·) = 0.25
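As a starting point, the three methods could look roughly like the sketch below. It is not a reference solution: it assumes a 6 x 6 grid numbered row by row, an undiscounted return (GAMMA = 1.0), in-place sweeps, and a stopping threshold theta, none of which are fixed by the assignment text. All function and variable names are illustrative.

```python
N = 6                # assumed 6 x 6 layout, states numbered row by row
TERMINALS = {1, 35}
ACTIONS = {"n": (-1, 0), "e": (0, 1), "s": (1, 0), "w": (0, -1)}
STATES = range(N * N)
GAMMA = 1.0          # assumption: undiscounted episodic task


def step(state, action):
    """Deterministic transition: return (next_state, reward)."""
    if state in TERMINALS:
        return state, 0
    row, col = divmod(state, N)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < N and 0 <= new_col < N):
        return state, -1  # off-grid moves leave the state unchanged
    return new_row * N + new_col, -1


def q_value(values, state, action):
    """One-step lookahead: reward plus discounted value of the successor."""
    next_state, reward = step(state, action)
    return reward + GAMMA * values[next_state]


def uniform_random_policy():
    """pi(a|s) = 0.25 for every action in every state."""
    return {s: {a: 0.25 for a in ACTIONS} for s in STATES}


def policy_evaluation(policy, theta=1e-6):
    """Iterative policy evaluation with in-place updates."""
    values = [0.0] * (N * N)
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINALS:
                continue  # terminal values stay at 0
            new_v = sum(p * q_value(values, s, a) for a, p in policy[s].items())
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < theta:
            return values


def greedy_policy(values):
    """Deterministic policy that is greedy with respect to `values`."""
    return {
        s: {max(ACTIONS, key=lambda a: q_value(values, s, a)): 1.0}
        for s in STATES
    }


def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = uniform_random_policy()
    while True:
        values = policy_evaluation(policy)
        improved = greedy_policy(values)
        if improved == policy:
            return values, policy
        policy = improved


def value_iteration(theta=1e-6):
    """Apply the Bellman optimality backup until the values stop changing."""
    values = [0.0] * (N * N)
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINALS:
                continue
            new_v = max(q_value(values, s, a) for a in ACTIONS)
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < theta:
            return values
```

With these assumptions, policy_iteration() and value_iteration() should agree on the optimal values, which equal the negative number of moves on the shortest path to the nearer terminal state. Ties between equally good actions are broken by the order of ACTIONS, which is an arbitrary choice.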
4 Report and Submission
• Your report and source code should be compressed and named after “studentID+name”.
