Mathematics, 21.02.2020 17:58 deonceee4671

In a coin game, you repeatedly toss a biased coin (probability 0.4 for heads, 0.6 for tails). Each head is worth 3 points and each tail is worth 1 point. While your total number of points is no more than 7, you can either Toss or Stop. Otherwise, you must Stop. When you Stop, your utility is equal to your total points (up to 7), or 0 if you have a total of 8 points or higher. When you Toss, you receive no utility. There is no discounting (γ = 1).

(a) What are the states and the actions for this MDP? Which states are terminal?
(b) What is the transition function and the reward function for this MDP? Hint: The problem may be simpler to formulate using the general version of rewards: R(s, a, s').
(c) Run value iteration to find the optimal value function V* for the MDP. Show each V_k step (starting from V_0(s) = 0 for all states s). For a reasonable MDP formulation, this should converge in fewer than 10 steps. If you find it too tedious to do by hand, you may write a program to do this for you; however, there may be some benefit in seeing the calculation unfolding in front of you.
(d) Using the V* you found, determine the optimal policy for this MDP.
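
Part (c) allows writing a program, so here is a minimal Python sketch of value iteration for this game. It rests on one assumed formulation, which is not the only reasonable one: the states are the point totals 0 through 7, any total of 8 or more is treated as a "bust" whose value is 0 (the only action there is Stop, with utility 0), and Stop from a total s ends the game with reward s. The names `backup`, `value_iteration`, and `greedy_policy` are illustrative choices, not part of the question.

```python
# Assumed formulation: states are the point totals 0..7. A total of 8 or more
# is a "bust": you must Stop there and receive 0, so its value is always 0.
P_HEAD, P_TAIL = 0.4, 0.6
HEAD_POINTS, TAIL_POINTS = 3, 1
MAX_SAFE = 7  # highest total at which Toss is still allowed


def value_of(V, total):
    """Value of a point total; bust totals (>= 8) are worth 0."""
    return V[total] if total <= MAX_SAFE else 0.0


def backup(V, s):
    """Return (value of Stop, value of Toss) in state s, with gamma = 1."""
    stop = float(s)  # R(s, Stop, end) = s
    toss = (P_HEAD * value_of(V, s + HEAD_POINTS)
            + P_TAIL * value_of(V, s + TAIL_POINTS))  # Toss itself pays 0
    return stop, toss


def value_iteration(tol=1e-12, max_iters=100):
    """Return the list [V_0, V_1, ...] until successive iterates agree."""
    V = [0.0] * (MAX_SAFE + 1)  # V_0(s) = 0 for all s
    history = [V]
    for _ in range(max_iters):
        new_V = [max(backup(V, s)) for s in range(MAX_SAFE + 1)]
        history.append(new_V)
        if max(abs(a - b) for a, b in zip(new_V, V)) < tol:
            break
        V = new_V
    return history


def greedy_policy(V):
    """Pick Toss wherever it is strictly better than Stop under V."""
    policy = {}
    for s in range(MAX_SAFE + 1):
        stop, toss = backup(V, s)
        policy[s] = "Toss" if toss > stop else "Stop"
    return policy


history = value_iteration()
V_star = history[-1]
policy = greedy_policy(V_star)

for k, V in enumerate(history):
    print(f"V_{k}:", [round(v, 5) for v in V])
print("policy:", policy)
```

Under this formulation the iterates stop changing after a handful of sweeps, and the greedy policy is to Toss at totals 0 through 4 and Stop at 5, 6, and 7. A different but equivalent state encoding (for example, explicit states for totals 8, 9, and 10) should give the same values for totals 0 through 7.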
