https://github.com/seohyunjun/RL_DDPG
DDPG
* Solves RL problems with a continuous action space (the original DQN only handles a discrete action space)
* Builds on DQN by using an actor-critic architecture
* off-policy
* Uses target networks (for both the actor and the critic)
* Soft update: when updating a target network, the ratio τ (tau) controls how much of the online network's parameters is blended in at each step
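The soft update in the last bullet can be sketched in a few lines. This is a minimal illustration, not the code from the linked repository: it assumes the usual Polyak-averaging form `target = tau * online + (1 - tau) * target`, and it uses plain Python lists in place of framework tensors. The names `soft_update`, `online_actor`, `target_actor`, and the value `tau=0.005` are hypothetical and chosen only for the example.

```python
from typing import Dict, List

# Hypothetical stand-in for a network's parameters: name -> flat list of weights.
Params = Dict[str, List[float]]


def soft_update(target: Params, online: Params, tau: float) -> None:
    """Blend online parameters into the target in place.

    Assumed rule: target <- tau * online + (1 - tau) * target.
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    for name, online_values in online.items():
        target_values = target[name]
        for i, w in enumerate(online_values):
            target_values[i] = tau * w + (1.0 - tau) * target_values[i]


# Toy parameters (illustrative values only).
online_actor = {"fc.weight": [1.0, -2.0], "fc.bias": [0.5]}
target_actor = {"fc.weight": [0.0, 0.0], "fc.bias": [0.0]}

# A small tau moves the target only slightly toward the online network.
soft_update(target_actor, online_actor, tau=0.005)
```

With a small τ the target network trails the online network slowly, which keeps the critic's learning targets from shifting abruptly; setting τ = 1 would reduce this to a hard copy of the online parameters.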
[Example Mujoco_Humanoid-v4] Episode 30