Posts in the 'Deep Learning' category

[Forensic Architecture] Justice Vision / "Delivering Justice through Vision"

2023.11.18· Deep Learning

https://forensic-architecture.org/ Investigations ← Forensic Architecture. Forums: Legal Process, Exhibition, Human Rights Report, Truth Commission, Web Platform. From Wikipedia: Forensic Architecture is a multidisciplinary research group based at Goldsmiths, University of London, that uses architectural techniques and technologies to investigate cases of state violence and human rights violations around the world. The group is led by the architect Eyal Weizman. [1] He Forensic Archite..

[DATA] ACNE04 Dataset download (acne lesion detection)

2023.10.16· Deep Learning

https://github.com/xpwu95/LDL

[OpenAI] Whisper - Robust Speech Recognition via Large-Scale Weak Supervision

2023.08.19· Deep Learning

https://arxiv.org/abs/2212.04356 Robust Speech Recognition via Large-Scale Weak Supervision. We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual and multitask supervision, the resulting models generalize well to standard..

Demand forecasting in logistics

2023.07.25· Deep Learning

http://www.diva-portal.org/smash/get/diva2:1337390/FULLTEXT02.pdf

CM3leon (Meta)

2023.07.16· Deep Learning

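CM3leon: Multi-modal Generative AI. In recent months, generative AI models have advanced considerably, chiefly in natural language processing and image generation. One of them, CM3leon, is a multimodal model that can generate text from images and images from text. Three key insights about CM3leon: CM3leon is a versatile, efficient multimodal model that achieves state-of-the-art performance in text-to-image generation. Although it was trained with less compute than previous transformer-based models, CM3leon surpasses them in both quality and efficiency. CM3leon is a masked mixed-modal (CM3) model that, conditioned on arbitrary sequences of other image and text content, te..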

[CS324] Introduction

2023.07.03· Deep Learning

https://stanford-cs324.github.io/winter2022/lectures/introduction/ Introduction: Understanding and developing large language models. Welcome to CS324! This is a new course on understanding and developing large language models. 1. What is a language model? 2. A brief history 3. Why does this course exist? 4. Structure of this course. What is a language model? The classic definition of a language model (LM) is a probability distribution over sequences of tokens. Suppose we have a set of tokens \(\mathcal{V}\). A language model \(p\), for each token sequen..
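The definition quoted in the excerpt above (a language model as a probability distribution over token sequences) can be illustrated with a toy example. This is only a minimal sketch, not code from the CS324 notes: the two-token vocabulary, the `START` and `NEXT` tables, and the restriction to fixed-length sequences are all assumptions made for illustration.

```python
import itertools

# Hypothetical toy tables (not from CS324): probability of the first token,
# and probability of each next token given the previous one.
START = {"a": 0.6, "b": 0.4}
NEXT = {
    "a": {"a": 0.1, "b": 0.9},
    "b": {"a": 0.5, "b": 0.5},
}


def sequence_probability(tokens):
    """Probability the toy model assigns to a non-empty token sequence."""
    prob = START[tokens[0]]
    # Multiply in one conditional probability per adjacent token pair.
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= NEXT[prev][cur]
    return prob


def total_probability(length):
    """Sum the probabilities of every sequence of the given length."""
    return sum(
        sequence_probability(seq)
        for seq in itertools.product(START, repeat=length)
    )
```

Because each row of the tables sums to 1, `total_probability(n)` comes out as 1 (up to floating-point error) for any `n >= 1`, which is what makes the assignment a distribution over sequences of that length.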

[Drag Your GAN] Interactive Point-based Manipulation on the Generative Image Manifold

2023.07.02· Deep Learning

https://vcai.mpi-inf.mpg.de/projects/DragGAN/ Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold. Abstract: Synthesizing visual content that meets users' needs often requires flexible and precise controllability of the pose, shape, expression, and layout of the generated objects. Existing approaches gain controllability of generative adversarial net..

[RL] Stable-baselines3 gym -> gymnasium

2023.04.20· Deep Learning

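In the lineage of RL, OpenAI and DeepMind did nearly all of the work between them, code and papers alike. Lately, though, attention has shifted from RL to NLP and LLMs, and the older OpenAI baselines repository and DeepMind's acme RL repository are no longer being updated. In the meantime, gym's sponsoring foundation changed, the package became gymnasium, and some of its return conventions changed. As a result, most code that is two to three years old does not work unless an old gym version of the package is installed. Fortunately, stable-baselines recently moved its code to gymnasium. With this package you can use most of the existing RL models such as PPO, HER, and DDPG, and you can also build custom environments..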
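The return-convention change mentioned above is, concretely, that old gym returned only the observation from `reset()` and a 4-tuple `(obs, reward, done, info)` from `step()`, whereas gymnasium returns `(obs, info)` from `reset()` and a 5-tuple `(obs, reward, terminated, truncated, info)` from `step()`. The following is a minimal sketch of that difference in plain Python, with no gym or gymnasium import. `OldStyleCounterEnv` and `NewStyleAdapter` are hypothetical classes written for illustration, not part of either library or of stable-baselines3.

```python
class OldStyleCounterEnv:
    """Hypothetical toy environment that follows the old gym conventions."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # old style: observation only

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps
        # Old gym's time-limit wrapper flagged cut-off episodes with this key.
        info = {"TimeLimit.truncated": done}
        return self.t, 1.0, done, info  # old style: 4-tuple


class NewStyleAdapter:
    """Hypothetical adapter exposing gymnasium-style return values."""

    def __init__(self, env):
        self.env = env

    def reset(self, seed=None, options=None):
        # seed and options are accepted for signature parity but ignored here.
        obs = self.env.reset()
        return obs, {}  # new style: (observation, info)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        truncated = bool(info.get("TimeLimit.truncated", False))
        terminated = done and not truncated
        return obs, reward, terminated, truncated, info  # new style: 5-tuple


def run_episode(env):
    """Run one episode against a gymnasium-style environment."""
    obs, info = env.reset()
    total, steps = 0.0, 0
    while True:
        obs, reward, terminated, truncated, info = env.step(0)
        total += reward
        steps += 1
        if terminated or truncated:
            return total, steps, terminated, truncated
```

Calling `run_episode(NewStyleAdapter(OldStyleCounterEnv()))` runs the old-style environment through a loop written for the new conventions. In real projects the libraries' own compatibility tooling is the better choice; this sketch only shows why a loop that unpacks four values from `step()` breaks after the migration.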
