Consider the following gridworld MDP. The states are grid squares, identified by their row and column number (row first). The agent always starts in state (1,1), marked with the letter S. There are two terminal goal states: (2,3) with reward +5 and (1,3) with reward -5. Rewards are 0 in non-terminal states. (The reward for a state is received as the agent moves into the state.) The transition function is such that the intended agent movement (Up, Down, Left, or Right) happens with probability 0.8. With probability 0.1 each, the agent instead moves in one of the two directions perpendicular to the intended one. If the agent collides with a wall, it stays in the same state.

Grid (row 2 on top, row 1 on the bottom; columns 1 to 3 from left to right; "." marks an empty square):

.  .  +5
S  .  -5

Which of the following is the optimal policy for this grid?
A.
Right  Right  +5
Up     Left   -5

B.
Down   Left   +5
Right  Up     -5

C.
Right  Down   +5
Up     Right  -5

D.
Right  Right  +5
Right  Right  -5
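
One way to compare the four options is to evaluate each candidate policy numerically and look at the value of the start state (1,1). The following is a minimal sketch, not a definitive solution. It assumes no discounting (GAMMA = 1.0, since the question states no discount factor), that each option's first line is row 2 and its second line is row 1, and that Up increases the row number. The names GAMMA, POLICIES, step, transitions, and evaluate are hypothetical helpers introduced only for this illustration.

```python
# Sketch: iterative policy evaluation for the four candidate policies.
GAMMA = 1.0  # assumption: the question gives no discount factor
ROWS, COLS = 2, 3
TERMINAL_REWARD = {(2, 3): 5.0, (1, 3): -5.0}

# Assumption: row 1 is drawn at the bottom, so "Up" increases the row number.
MOVES = {"Up": (1, 0), "Down": (-1, 0), "Left": (0, -1), "Right": (0, 1)}
SIDES = {"Up": ("Left", "Right"), "Down": ("Left", "Right"),
         "Left": ("Up", "Down"), "Right": ("Up", "Down")}

# Candidate policies from the question, keyed by (row, column).
POLICIES = {
    "A": {(2, 1): "Right", (2, 2): "Right", (1, 1): "Up", (1, 2): "Left"},
    "B": {(2, 1): "Down", (2, 2): "Left", (1, 1): "Right", (1, 2): "Up"},
    "C": {(2, 1): "Right", (2, 2): "Down", (1, 1): "Up", (1, 2): "Right"},
    "D": {(2, 1): "Right", (2, 2): "Right", (1, 1): "Right", (1, 2): "Right"},
}


def step(state, direction):
    """Deterministic move; hitting a wall leaves the state unchanged."""
    row, col = state
    d_row, d_col = MOVES[direction]
    new_row, new_col = row + d_row, col + d_col
    if 1 <= new_row <= ROWS and 1 <= new_col <= COLS:
        return (new_row, new_col)
    return state


def transitions(state, action):
    """Return (probability, next_state) pairs: 0.8 intended, 0.1 per side."""
    side_a, side_b = SIDES[action]
    return [(0.8, step(state, action)),
            (0.1, step(state, side_a)),
            (0.1, step(state, side_b))]


def evaluate(policy, sweeps=500):
    """Approximate the state values of a fixed policy by repeated backups."""
    values = {state: 0.0 for state in policy}
    for _ in range(sweeps):
        new_values = {}
        for state, action in policy.items():
            total = 0.0
            for prob, nxt in transitions(state, action):
                if nxt in TERMINAL_REWARD:
                    # Reward is received on entering the terminal state.
                    total += prob * TERMINAL_REWARD[nxt]
                else:
                    total += prob * GAMMA * values[nxt]
            new_values[state] = total
        values = new_values
    return values


start_values = {name: evaluate(pi)[(1, 1)] for name, pi in POLICIES.items()}
best = max(start_values, key=start_values.get)
print(start_values, best)
```

Under these assumptions the sketch ranks option A highest: it is the only candidate in which no action can ever move the agent into (1,3), so the agent eventually collects +5 with probability 1. A different discount factor or a per-step living cost would change the numbers, so the comparison would need to be rerun if the course material specifies one.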

