CS-456: Project- Dyna algo model

Hello,
As explained in section "4.2 Model building" of the PDF, your agent will build a (simple) model of the rewards and transition probabilities. For each state-action pair, the model should capture the estimated probability of each possible transition and the expected reward. Since the rewards and transitions may be stochastic, these estimates must be updated with each new observation.
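
To make this concrete, here is a minimal sketch of one possible count-based model, assuming a small discrete state and action space. The class name `TabularModel`, its method names, the choice to model the reward as a function of the state-action pair only, and the uniform fallback for unvisited pairs are all my own assumptions for illustration, not something prescribed by the PDF, so adapt them to the project's definitions.

```python
class TabularModel:
    """Hypothetical count-based model of transitions and rewards."""

    def __init__(self, n_states, n_actions):
        self.n_states = n_states
        # counts[s][a][s_next]: how often (s, a) led to s_next
        self.counts = [[[0] * n_states for _ in range(n_actions)]
                       for _ in range(n_states)]
        # reward_mean[s][a]: running average of rewards observed after (s, a)
        self.reward_mean = [[0.0] * n_actions for _ in range(n_states)]

    def update(self, s, a, r, s_next):
        """Refine the estimates with one observed transition (s, a, r, s_next)."""
        self.counts[s][a][s_next] += 1
        n = sum(self.counts[s][a])
        # Incremental mean: new_mean = old_mean + (r - old_mean) / n
        self.reward_mean[s][a] += (r - self.reward_mean[s][a]) / n

    def transition_probs(self, s, a):
        """Empirical estimate of P(s_next | s, a)."""
        n = sum(self.counts[s][a])
        if n == 0:
            # Assumption: uniform guess for pairs that were never visited
            return [1.0 / self.n_states] * self.n_states
        return [c / n for c in self.counts[s][a]]


# Usage: three observations of the same state-action pair
model = TabularModel(n_states=3, n_actions=2)
model.update(0, 1, 1.0, 2)
model.update(0, 1, 0.0, 2)
model.update(0, 1, 0.5, 1)
probs = model.transition_probs(0, 1)
```

Because the estimates are empirical frequencies and a running mean, every call to `update` shifts them slightly, which is how the model copes with stochastic rewards and transitions.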

I hope this answers your question.
