B's

Loss function

2021.03.07·

👾 Deep Learning

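When training a model, we use the loss function as the indicator and monitor training through it. Accuracy is the actual goal, so why not use accuracy as the indicator? The reason is that models are mostly trained with derivatives, and the derivative of accuracy with respect to the parameters is 0 almost everywhere, so it gives no information about how the parameters should change. The loss function, by contrast, varies continuously with the parameters, so its derivative shows how much the loss changes. Accuracy barely responds to small changes in the parameters, and when it does respond, its value changes discontinuously. For example, suppose a step function is used as the activation function: its derivative is 0 everywhere except at 0. The step function therefore wipes out the effect of any small parameter change, and the value of the loss function does not change at all. With a sigmoid function there is no interval where the derivative is 0, so over every interval the parameters..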
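The point about accuracy versus loss can be checked numerically. The following is only a minimal sketch: the one-parameter sigmoid classifier, the toy data, and the names `DATA`, `accuracy`, and `cross_entropy` are illustrative assumptions, not taken from the original post. It nudges the parameter `w` by a small amount and compares how the two metrics react.

```python
import math

# Hypothetical toy data: (input, label). Not from the original post.
DATA = [(-2.0, 0), (-1.0, 0), (0.5, 1), (1.5, 1), (-0.3, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    # One-parameter model: probability that the label is 1.
    return sigmoid(w * x)

def accuracy(w):
    # Fraction of samples whose thresholded prediction matches the label.
    hits = sum((predict(w, x) >= 0.5) == bool(t) for x, t in DATA)
    return hits / len(DATA)

def cross_entropy(w):
    # Mean binary cross-entropy loss over the toy data.
    total = 0.0
    for x, t in DATA:
        p = predict(w, x)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(DATA)

w, eps = 1.0, 1e-3
print("accuracy:", accuracy(w), accuracy(w + eps))            # identical values
print("loss:    ", cross_entropy(w), cross_entropy(w + eps))  # slightly different
```

A small step in `w` leaves the accuracy untouched, so it gives no hint about which direction to move, while the loss shifts by a small but non-zero amount that a gradient-based optimizer can follow.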

How to use TensorBoard and manage allocated GPU memory

2021.03.06·

👾 Deep Learning

[tensorboard] Anaconda command prompt
> tensorboard --logdir=./path/logs/
[GPU memory management] tf version 1.xx
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
[GPU memory management] tf version 2.xx
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)
[GPU usage 80%]
config = tf.compat.v1.ConfigProto() conf..

OSError: [WinError 127] The specified procedure could not be found. Error loading \\torch\\lib\\*_ops_gpu.dll or one of its dependencies.

2021.03.06·

👾 Deep Learning

This error is resolved by downgrading PyTorch to version 1.5.1 or lower. Installation instructions for each version: pytorch.org/get-started/previous-versions/

TFBertModel parameter

2021.03.05·

👾 Deep Learning

huggingface.co/transformers/model_doc/bert.html (BERT, transformers 4.3.0 documentation) vocab_size (int, optional, defaults to 3..

BERT (Deep Bidirectional Transformers for Language Understanding)

2021.03.03·

🗣️ Natural Language Processing

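BERT is the model proposed in the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, published by Google in 2018. It is pre-trained with unsupervised learning before being used in a downstream deep learning model. What sets BERT apart from other pre-training approaches such as GPT or ELMo is that it learns bidirectionally, because it is trained as a masked language model. Like CBOW in word2vec, BERT infers the meaning of a word from the words around it. A masked language model is a way of training a language model bidirectionally: given an input sentence, some of the words are masked so that the model cannot see them, and the model then predicts what the masked words are. The sentence given as input ..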
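The masking step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `mask_tokens`, the whitespace tokenization, and the fixed seed are hypothetical, and the sketch simply replaces each selected token with `[MASK]`, which is a simplification of what BERT's real pre-training pipeline does.

```python
import random

MASK = "[MASK]"  # placeholder token, named after BERT's convention

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens and remember the originals.

    Returns the masked token list and a dict {position: original token},
    which is what the model would be asked to predict.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    masked = list(tokens)
    labels = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok
            masked[i] = MASK
    return masked, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(masked)  # input the model sees, with some words hidden
print(labels)  # hidden words the model has to recover
```

Because the hidden word can sit anywhere in the sentence, the model has to use the context on both its left and its right to recover it, which is where the bidirectionality comes from.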

Fixing the Softmax RuntimeWarning

2021.03.03·

👾 Deep Learning

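softmax implementation:
def softmax(x): return np.exp(x)/np.sum(np.exp(x))
This follows the formula faithfully, but it has one problem.
softmax([900,123,22]) # array([nan, 0., 0.])
An input of just 900 is already enough to make np.exp overflow, which produces a RuntimeWarning and a nan. Softmax does not change when the same constant is subtracted from every input, so the fix is to subtract the maximum of the inputs from each value: the result is unaffected, and the largest exponent becomes 0, so np.exp can no longer overflow.
def softmax(x): return np.exp(x-np.max(x))/np.sum(np.exp(x-np.max(x)))
softmax([900,123,22]) # array([1., 0., 0.])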
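The max-subtraction trick can also be checked without NumPy. The following is a minimal sketch in plain Python; the names `softmax_naive` and `softmax_stable` are illustrative and not from the original post. One difference from the NumPy version worth noting: `math.exp` raises `OverflowError` on large inputs instead of returning `inf` with a warning.

```python
import math

def softmax_naive(xs):
    # Direct translation of the formula; overflows for large inputs.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(xs):
    # Subtract the maximum first, so the largest exponent is exp(0) = 1.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax_stable([900, 123, 22]))  # → [1.0, 0.0, 0.0]

# Shifting every input by the same constant leaves the result unchanged
# (up to floating-point rounding).
print(softmax_naive([1, 2, 3]))
print(softmax_stable([1001, 1002, 1003]))
```

The last two lines print the same probabilities up to rounding, which is the property that makes subtracting the maximum safe.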
