
https://github.com/seohyunjun/RL_DDPG 

 

GitHub - seohyunjun/RL_DDPG: CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING (a.k.a DDPG)


 

DDPG 

 

*  Solves RL problems with a continuous action space (the existing DQN handles only a discrete action space)

*  Applies an actor-critic architecture on top of DQN

*  Off-policy

*  Uses target networks (for both the actor and the critic)

*  Soft update: when updating a target network, the ratio τ (tau) controls how much of the parameter update is applied
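The soft update in the last bullet can be sketched roughly as below. This is a minimal, framework-free illustration rather than the code from the linked repository: the names `soft_update` and `Params`, the plain-dict weight layout, and the value `tau=0.005` are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical stand-in for network weights: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def soft_update(target: Params, online: Params, tau: float) -> None:
    """Blend the online weights into the target weights in place."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    for name, online_values in online.items():
        target_values = target[name]
        for i, w in enumerate(online_values):
            # target <- tau * online + (1 - tau) * target
            target_values[i] = tau * w + (1.0 - tau) * target_values[i]


# Illustrative values only (assumed, not taken from the repository).
actor = {"w": [1.0, 2.0], "b": [0.5]}
target_actor = {"w": [0.0, 0.0], "b": [0.0]}

# A small tau moves the target only slightly toward the online network.
soft_update(target_actor, actor, tau=0.005)
print(target_actor)
```

With `tau=1.0` the target becomes an exact copy of the online network (a hard update), and with `tau=0.0` it never changes. Values close to zero keep the target slow-moving, which is the point of using a target network in the first place.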

 

[Example Mujoco_Humanoid-v4] Episode 30 

 

Humanoid-v4 TD3 total reward: 2501.346

 

๋ฐ˜์‘ํ˜•