Why the value still needs the step index f

[Interactive figure: a slider sets the noise level f / F; readouts show V at xᶠ (on-trajectory) and the slope ∇V at xᶠ.]

If the on-trajectory scalar is the same at every f, why does V still take f as input? Because the policy update needs the gradient ∇_xᶠV, and xᶠ means something very different at different noise levels. Near f=0 it is almost pure noise, and the value surface is broad and gentle; near f=F it is nearly the finished action, and the surface is sharp. Slide f: the on-trajectory value (the dot's height) barely moves, but the shape and slope of V(s,·,f) change a lot, so the network needs f to point the gradient the right way at each level.
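The same point can be checked numerically with a toy example. The sketch below is an illustration only, not the actual value network: the one-dimensional surface, the `sharpness` schedule, the trajectory `x_on_traj`, and the constants `F` and `V0` are all hypothetical choices, made so that the on-trajectory value is identical at every f while the slope at that point grows as f approaches F.

```python
import math

F = 10    # hypothetical number of denoising steps
V0 = 1.0  # on-trajectory value, identical at every step by construction

def sharpness(f):
    # Hypothetical schedule: gentle surface near f=0, sharp surface near f=F.
    return 0.1 + 4.9 * (f / F)

def x_on_traj(f):
    # Hypothetical 1-D trajectory moving from 0.0 (noise) to 2.0 (finished action).
    return 2.0 * f / F

def V(x, f):
    # Toy value surface: equals V0 on the trajectory, with an f-dependent slope there.
    return V0 + math.tanh(sharpness(f) * (x - x_on_traj(f)))

def grad_V(x, f, eps=1e-6):
    # Central finite difference in x, holding the step index f fixed.
    return (V(x + eps, f) - V(x - eps, f)) / (2 * eps)

for f in (0, 5, 10):
    x = x_on_traj(f)
    print(f"f={f:2d}  V={V(x, f):.4f}  dV/dx={grad_V(x, f):.4f}")
```

In this toy, the printed value stays at V0 for every f while the printed slope increases with f. A function that ignored f could reproduce the constant value but could not reproduce the changing slope, which is the quantity the policy update actually uses.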