
James Sorokhaibam


Last seen: 3 days ago | Active since 2024

Followers: 0   Following: 0


Feeds


Question


High fluctuation in Q0 value for TD3 agent while training.
I am training a TD3 RL agent for a pick-and-place robot. The reward function is reward = exp(-E/d), where E is the total energy co...

6 months ago | 1 answer | 0
