[RL] Deep Deterministic Policy Gradient (A.K.A DDPG)
·
👾 Deep Learning
https://github.com/seohyunjun/RL_DDPG GitHub - seohyunjun/RL_DDPG: CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING (a.k.a DDPG) DDPG * Solves RL problems with a continuous action space (the earlier DQN handles a discrete action space) * From DQN, actor-critic ..
[leetcode 739] Daily Temperatures
·
🐢 One step
LeetCode-739 Daily Temperatures : How many days until the temperature rises? note : Answer :::python class Solution: def dailyTemperatures(self, temperatures: List[int]) -> List[int]: out = [] left, right = 0, 0 while left != len(temperatures)-1: day = 0 check = 0 while right temperatures[stack[-1]]: last = stack.pop() answer[last] = i - last stack.append(i) return answer Result : 1352 ms Memory: 28.6 MB
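The tail of the snippet above (`stack.pop()`, `answer[last] = i - last`) looks like a monotonic-stack solution, but the middle of the excerpt is lost. Below is a minimal sketch of that technique, assuming the standard problem statement; the function name `daily_temperatures` is hypothetical and this is not necessarily the author's exact code.

```python
from typing import List


def daily_temperatures(temperatures: List[int]) -> List[int]:
    """Return, for each day, how many days pass until a warmer day (0 if none)."""
    answer = [0] * len(temperatures)
    stack: List[int] = []  # indices of days still waiting for a warmer day

    for i, temp in enumerate(temperatures):
        # Every waiting day colder than today has found its answer.
        while stack and temp > temperatures[stack[-1]]:
            last = stack.pop()
            answer[last] = i - last
        stack.append(i)

    return answer


print(daily_temperatures([73, 74, 75, 71, 69, 72, 76, 73]))
# → [1, 1, 4, 2, 1, 1, 0, 0]
```

Each index is pushed and popped at most once, so the loop runs in O(n) time.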
[git] You've added another git repository inside your current repository.
·
🏃 Routine
hint: You've added another git repository inside your current repository. hint: Clones of the outer repository will not contain the contents of hint: the embedded repository and will not know how to obtain it. hint: If you meant to add a submodule, use: hint: hint: git submodule add <url> spinningup hint: hint: If you added this path by mistake, you can remove it from the hint: index with: hint: hint: git rm --cached spinn..
[LeetCode-316] Remove Duplicate Letters
·
🐢 One step
LeetCode-316 Remove Duplicate Letters : Given a string s, remove duplicate letters so that every letter appears once and only once. You must make sure your result is the smallest in lexicographical order among all possible results. note : Answer :::python class Solution: def removeDuplicateLetters(self, s: str) -> str: for char in sorted(set(s)): suffix = s[s.index(char):] if set(s)==set(suffix):..
[RL] Installing Mujoco_py on an M1 Mac (gcc@9 error)
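The excerpt is cut off after the `if set(s)==set(suffix):` check. One common way to finish this recursive idea is sketched below; the `return` lines and the standalone function name `remove_duplicate_letters` are assumptions, not the author's confirmed code.

```python
def remove_duplicate_letters(s: str) -> str:
    """Smallest lexicographic string that keeps each distinct letter of s once."""
    for char in sorted(set(s)):
        # Text from the first occurrence of the candidate letter onward.
        suffix = s[s.index(char):]
        # The candidate can lead only if no other letter is lost by skipping
        # everything before it.
        if set(s) == set(suffix):
            # Assumed completion: fix this letter, drop its other occurrences,
            # and solve the remainder the same way.
            return char + remove_duplicate_letters(suffix.replace(char, ""))
    return ""


print(remove_duplicate_letters("cbacdcbc"))
# → acdb
```

The recursion goes at most as deep as the number of distinct letters, since each call removes one letter from the string entirely.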
·
👾 Deep Learning
https://github.com/deepmind/mujoco/releases Releases · deepmind/mujoco: Multi-Joint dynamics with Contact. A general purpose physics simulator. Steps to take before installing Mujoco_py with pip: https://github.com/openai/mujoco-py/issues/662 Support MuJoCo 2.1.1 (including arm64 mac support) · Issue #662 · openai/mujoco-py I hope this can be a tracking issue for supporting MuJoCo 2.1.1, which ..
[leetcode-20] Valid Parentheses
·
🐢 One step
LeetCode-20 Valid Parentheses : Given a string s containing just the characters (, ), {, }, [ and ], determine if the input string is valid. note : s consists of parentheses only ()[]{} Answer :::python class Solution: def isValid(self, s: str) -> bool: valid = [] dict_valid = { "}":"{", "]":"[", ")":"(" } for l in s: if l not in dict_valid: valid.append(l) elif not valid or dict_valid[l] != vali..
[RL] A3C (Asynchronous Advantage Actor-Critic) Summary
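The snippet above is truncated partway through the `elif`. A minimal sketch of the same stack idea follows, assuming the standard problem statement; the names `is_valid` and `pairs` are hypothetical, and the final `return` is an assumed completion.

```python
def is_valid(s: str) -> bool:
    """Check that every bracket in s is closed by the matching type, in order."""
    pairs = {")": "(", "]": "[", "}": "{"}  # closing bracket -> expected opener
    stack = []  # openers that have not been closed yet

    for ch in s:
        if ch not in pairs:
            # An opening bracket: remember it.
            stack.append(ch)
        elif not stack or pairs[ch] != stack.pop():
            # A closer with nothing open, or with the wrong opener on top.
            return False

    # Valid only if nothing is left open.
    return not stack


print(is_valid("()[]{}"), is_valid("(]"))
# → True False
```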
·
👾 Deep Learning
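Policy-Based: Earlier value-based methods, which predict the Q-value, depend on the state and action, so they are constrained to keep working out trajectories (state-action-reward sequences). Policy-based methods, in contrast, estimate the policy as well as the Q-value. What we want is for the agent to find a strategy that leads it down the right path, and policy-based methods reflect this goal better. Advantages: - The policy is learned directly, so training is more stable (less sensitive to environment changes and noise). - A stochastic policy learns π* (the optimal policy) while balancing exploration and exploitation - Continuous spa..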
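As a rough illustration of how a stochastic policy can trade off exploration and exploitation, here is a small sketch of a softmax policy over action preferences. The softmax form, the `temperature` parameter, and the name `softmax_policy` are assumptions chosen for illustration; the post itself does not specify them.

```python
import math
from typing import List


def softmax_policy(preferences: List[float], temperature: float = 1.0) -> List[float]:
    """Turn per-action preferences into action probabilities.

    A high temperature spreads probability out (more exploration);
    a low temperature concentrates it on the best action (more exploitation).
    """
    scaled = [p / temperature for p in preferences]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


explore = softmax_policy([1.0, 2.0, 3.0], temperature=10.0)
exploit = softmax_policy([1.0, 2.0, 3.0], temperature=0.1)
print(explore)
print(exploit)
```

With the high temperature the three probabilities are close to uniform, while with the low temperature nearly all of the mass sits on the last action.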
[WARNING:torch.distributed.run] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid
·
🐍 Python
https://you.com/search?q=setting+omp_num_threads+environment+variable+for+each+process+to+be+1+in+default%2C+to+avoid+your+system+being+overloaded%2C+please+further+tune+the+variable+for+optimal+performance+in+your+application+as+needed.&tbm=youchat&cfr=chatb&cid=c2_fb5a239a-f7d6-44eb-88d6-662025275ef2 OMP_NUM_THREADS is an environment variable that sets the number of threads the OpenMP library uses for parallel processing. Setting it to 1 by default, multiple thre..
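The variable can also be set from inside a Python script. The sketch below assumes it runs before any OpenMP-backed library is imported, since such libraries typically read the value when they initialize; the helper name `set_omp_threads` is hypothetical.

```python
import os


def set_omp_threads(n: int) -> str:
    """Set OMP_NUM_THREADS for this process and return the stored value."""
    if n < 1:
        raise ValueError("thread count must be at least 1")
    # Environment variables are strings, so the count is converted first.
    os.environ["OMP_NUM_THREADS"] = str(n)
    return os.environ["OMP_NUM_THREADS"]


# Call this before importing libraries that use OpenMP.
print(set_omp_threads(4))
# → 4
```

A reasonable starting value is often the number of CPU cores divided by the number of worker processes, tuned from there as the warning suggests.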
다했다
Posts in the 'All Categories' category (Page 28)