Exercise 5 3.c

by Jakhongir Saydaliev,
Number of replies: 1

In the solutions of exercise 5 3.c, the answer is given as follows:

When a=a', should we not have (1-gamma) as a coefficient? I don't understand how we are eliminating gamma when a=a'.

In reply to Jakhongir Saydaliev

Re: Exercise 5 3.c

by Ariane Delrocq,
The elimination of gamma is independent of a=a'.
$\gamma$ only appears in delta_t as a factor of Q(s', a'). When using the semi-gradient, Q(s', a') is considered fixed and independent of the weights. Therefore, when differentiating delta_t with respect to the weights, $\gamma$ multiplies the derivative of Q(s', a'), which is 0 (shown in the first equation of the answer). This term therefore disappears and only the derivative of Q(s, a) remains.
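The argument can be written out as a short derivation. This is only a sketch: it assumes the usual SARSA TD error and a squared-error loss $\tfrac{1}{2}\delta_t^2$ with weights $w$, which may differ in notation from the exercise sheet.

```latex
\begin{align*}
\delta_t &= r_t + \gamma\, Q_w(s', a') - Q_w(s, a) \\
\text{full gradient:}\quad
\frac{\partial \delta_t}{\partial w}
  &= \gamma\, \frac{\partial Q_w(s', a')}{\partial w}
     - \frac{\partial Q_w(s, a)}{\partial w} \\
\text{semi-gradient:}\quad
\frac{\partial Q_w(s', a')}{\partial w} &:= 0
  \;\Longrightarrow\;
  \frac{\partial \delta_t}{\partial w}
  = - \frac{\partial Q_w(s, a)}{\partial w} \\
\frac{\partial}{\partial w}\,\tfrac{1}{2}\delta_t^2
  &= \delta_t\, \frac{\partial \delta_t}{\partial w}
  = -\,\delta_t\, \frac{\partial Q_w(s, a)}{\partial w}
\end{align*}
```

The third line holds whether or not a = a', because the target term is treated as a constant even when it depends on the same weights as Q(s, a).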

Note that gamma still appears implicitly in the update, through the value of delta_t itself.
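A small numerical check may make this concrete. The sketch below assumes a linear Q-function with one weight vector per action, Q(s, a) = w[a] · x(s); this parametrisation, the function names, and all numbers are illustrative assumptions, not taken from the exercise.

```python
def q_value(w, x, a):
    """Linear Q-value: dot product of the weights of action a with features x."""
    return sum(wi * xi for wi, xi in zip(w[a], x))


def td_error(w, x, a, r, x_next, a_next, gamma):
    """SARSA TD error: r + gamma * Q(s', a') - Q(s, a)."""
    return r + gamma * q_value(w, x_next, a_next) - q_value(w, x, a)


def semi_gradient_update(w, x, a, r, x_next, a_next, gamma, lr):
    """One semi-gradient step: the target r + gamma * Q(s', a') is held fixed,
    so only the weights of action a move, along dQ(s, a)/dw[a] = x."""
    delta = td_error(w, x, a, r, x_next, a_next, gamma)
    new_w = [row[:] for row in w]  # copy so the input weights are untouched
    new_w[a] = [wi + lr * delta * xi for wi, xi in zip(w[a], x)]
    return new_w, delta


# Hypothetical numbers, chosen so that a == a' (the case asked about).
w = [[0.5, -0.2], [0.1, 0.3]]
x, x_next = [1.0, 2.0], [0.0, 1.0]
a = a_next = 0

for gamma in (0.0, 0.9):
    new_w, delta = semi_gradient_update(w, x, a, 1.0, x_next, a_next, gamma, 0.1)
    # Dividing the step by lr * delta recovers the update direction.
    direction = [(n - o) / (0.1 * delta) for n, o in zip(new_w[a], w[a])]
    print(f"gamma={gamma}: delta={delta:.3f}, direction={direction}")
```

In this sketch the update direction is x(s) for both values of gamma, with no (1 - gamma) factor even though a = a'. Only the size of delta changes with gamma.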