Deep Learning

https://arxiv.org/abs/2304.06035 Choose Your Weapon: Survival Strategies for Depressed AI Academics
Are you an AI researcher at an academic institution? Are you anxious you are not coping with the current pace of AI advancements? Do you feel you have no (or very limited) access to the computational and human resources required for an AI research breakthr..
Abstract: Do you work in AI? Yep. The ever-growing A..
https://github.com/seohyunjun/RL_SAC/blob/main/README.md GitHub - seohyunjun/RL_SAC: Soft Actor-Critic
* SAC (Soft Actor-Critic): a method devised to find a stable policy in every action space, continuous or discrete. It goes one step beyond DDPG / TD3 by also looking at the action taken in the next state when choosing the next policy (feeding it only the good nutrients). * Pol..
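The idea of "also looking at the action in the next state" can be sketched as the soft Bellman target commonly used in SAC. This is a minimal sketch, not the code from the RL_SAC repository: the function name `sac_target`, the twin-critic minimum, and the default values of `gamma` and `alpha` are assumptions chosen for illustration.

```python
def sac_target(reward, done, q1_next, q2_next, logp_next, gamma=0.99, alpha=0.2):
    """Soft Bellman target for a single transition (illustrative sketch).

    q1_next, q2_next: target-critic values Q(s', a'), where the next action a'
                      is sampled from the current policy at the next state s'.
    logp_next:        log pi(a' | s') for that sampled action.
    gamma, alpha:     discount factor and entropy temperature (assumed values).
    """
    # Take the smaller of the two critic estimates to limit overestimation,
    # then add the entropy bonus (-alpha * log pi).
    soft_value = min(q1_next, q2_next) - alpha * logp_next
    # No bootstrapping past a terminal state (done == 1.0).
    return reward + gamma * (1.0 - done) * soft_value


if __name__ == "__main__":
    print(sac_target(reward=1.0, done=0.0, q1_next=2.0, q2_next=3.0, logp_next=-1.0))
```

The entropy term rewards the policy for staying stochastic, which is one reason SAC is usually described as more stable than purely deterministic actors.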
https://github.com/ggerganov/whisper.cpp GitHub - ggerganov/whisper.cpp: Port of OpenAI's Whisper model in C/C++
M1 Install 1. Installing the latest version via git clone fails on M1 with a .o architecture error, so download the [stable version] instead. https://github.com/ggerganov/whisper.cpp/releases/..
https://github.com/seohyunjun/RL_DDPG GitHub - seohyunjun/RL_DDPG: CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING (a.k.a DDPG)
DDPG * Solves RL problems with a continuous action space (the earlier DQN handles a discrete action space) * From DQN, actor-critic ..
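The difference between the two action spaces can be sketched as follows. This is an illustration under stated assumptions, not the RL_DDPG implementation: `dqn_greedy_action` and `ddpg_action` are hypothetical names, a single tanh-squashed linear layer stands in for the actor network, and Gaussian exploration noise is used as a simple stand-in for the Ornstein-Uhlenbeck noise of the original DDPG paper.

```python
import math
import random


def dqn_greedy_action(q_values):
    """DQN: choose the index of the largest Q-value among a finite set of actions."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def ddpg_action(state, weights, bias, max_action=1.0, noise_std=0.0, rng=random):
    """DDPG-style actor: map a state directly to one real-valued action.

    A single linear layer followed by tanh stands in for the actor network
    (an assumption made to keep the sketch self-contained).
    """
    pre_activation = sum(w * s for w, s in zip(weights, state)) + bias
    action = max_action * math.tanh(pre_activation)
    if noise_std > 0.0:
        # Gaussian exploration noise (assumed here instead of OU noise).
        action += rng.gauss(0.0, noise_std)
    # Keep the action inside the valid range after adding noise.
    return max(-max_action, min(max_action, action))


if __name__ == "__main__":
    print(dqn_greedy_action([0.1, 0.7, 0.3]))            # index of a discrete action
    print(ddpg_action([0.5, -0.2], [1.0, 2.0], 0.1))     # one continuous action
```

DQN needs one Q-value per action, which is why it does not extend to continuous actions without discretizing them; the actor sidesteps that by outputting the action itself.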
https://github.com/deepmind/mujoco/releases Releases · deepmind/mujoco: Multi-Joint dynamics with Contact. A general purpose physics simulator.
Steps to complete before installing mujoco_py with pip: https://github.com/openai/mujoco-py/issues/662 Support MuJoCo 2.1.1 (including arm64 mac support) · Issue #662 · openai/mujoco-py I hope this can be a tracking issue for supporting MuJoCo 2.1.1, which ..
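Policy-Based
Value-based methods, which predict Q-values, depend on the state and action, so they are constrained to keep working through trajectories (state-action-reward sequences). Policy-based methods instead estimate the policy itself, and not only the Q-value. What we want is a strategy that leads the agent down the right path, and policy-based methods reflect that goal more directly. Advantages:
- The policy is learned directly, which tends to make training more stable (less sensitive to environment changes and noise).
- π* (the optimal policy) is learned with a stochastic policy while balancing exploration and exploitation.
- Continuous spa..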
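Learning a stochastic policy directly can be sketched with a softmax policy and a single REINFORCE-style update. This is a minimal sketch under assumptions: the names `softmax`, `sample_action`, and `reinforce_update`, the tabular action preferences, and the learning rate are all hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(prefs, temperature=1.0):
    """Turn action preferences into probabilities; higher temperature explores more."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp((p - m) / temperature) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng=random):
    """Draw an action index from the stochastic policy."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(prefs, action, ret, lr=0.1):
    """One REINFORCE step for a softmax policy with temperature 1.

    The gradient of log pi(action) with respect to preference i is
    1[i == action] - pi_i, scaled here by the observed return `ret`.
    """
    probs = softmax(prefs)
    return [
        p + lr * ret * ((1.0 if i == action else 0.0) - probs[i])
        for i, p in enumerate(prefs)
    ]


if __name__ == "__main__":
    prefs = [0.0, 0.0]
    action = sample_action(softmax(prefs))
    prefs = reinforce_update(prefs, action, ret=1.0)
    print(action, softmax(prefs))
```

An action that led to a positive return becomes more likely, yet every action keeps a nonzero probability, so exploration is built into the policy rather than added by a separate rule such as epsilon-greedy.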
https://github.com/seohyunjun/RL_A3C GitHub - seohyunjun/RL_A3C: A3C (asynchronous advantage actor-critic)
https://github.com/seohyunjun/Deep_RL/tree/main/1_DQN GitHub - seohyunjun/Deep_RL: Reinforcement Learning
CartPole-v1
All done.
Posts in the 'Deep Learning' category (Page 2)