[Transformer] Model Summary
ยท
๐Ÿ‘พ Deep Learning
class MultiHeadAttention(tf.keras.layers.Layer):
    def __init__(self, **kargs):
        super(MultiHeadAttention, self).__init__()
        self.num_heads = kargs['num_heads']
        self.d_model = kargs['d_model']
        assert self.d_model % self.num_heads == 0
        self.depth = self.d_model // self.num_heads
        self.wq = tf.keras.layers.Dense(kargs['d_model'])
        self.wk = tf.keras.layers.Dense(kargs['d_model'])
        self.wv = tf.keras.layers..
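The preview is cut off above. As a rough guide to where the class is headed, here is a minimal runnable sketch of a complete multi-head attention layer in the same TensorFlow style; the split_heads and call methods below are an assumption based on the common pattern, not the post's actual code.

import tensorflow as tf

class MultiHeadAttentionSketch(tf.keras.layers.Layer):
    # Hypothetical completion of the truncated layer; follows the usual
    # TensorFlow tutorial pattern, not necessarily the post.
    def __init__(self, num_heads, d_model):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_model = d_model
        self.depth = d_model // num_heads            # per-head dimension
        self.wq = tf.keras.layers.Dense(d_model)
        self.wk = tf.keras.layers.Dense(d_model)
        self.wv = tf.keras.layers.Dense(d_model)
        self.dense = tf.keras.layers.Dense(d_model)  # final output projection

    def split_heads(self, x, batch_size):
        # (batch, seq, d_model) -> (batch, num_heads, seq, depth)
        x = tf.reshape(x, (batch_size, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, q, k, v):
        batch_size = tf.shape(q)[0]
        q = self.split_heads(self.wq(q), batch_size)
        k = self.split_heads(self.wk(k), batch_size)
        v = self.split_heads(self.wv(v), batch_size)
        # Scaled dot-product attention, applied per head.
        scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(
            tf.cast(self.depth, tf.float32))
        weights = tf.nn.softmax(scores, axis=-1)
        out = tf.matmul(weights, v)                  # (batch, heads, seq, depth)
        out = tf.transpose(out, perm=[0, 2, 1, 3])   # (batch, seq, heads, depth)
        out = tf.reshape(out, (batch_size, -1, self.d_model))
        return self.dense(out)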
์„ ํ˜• ํŒ๋ณ„ ๋ถ„์„ ( LDA )
ยท
๐Ÿ—ฃ๏ธ Natural Language Processing
๋ถ„๋ฅ˜๋ช…์ด ๋ถ™์€ ๋ฌธ์ž ๋ฉ”์„ธ์ง€๋“ค๋กœ ์„ ํ˜• ํŒ๋ณ„ ๋ถ„์„ ๋ชจํ˜•์„ ํ›ˆ๋ จ LDA๋Š” LSA์™€ ๋น„์Šท ํ•œ ๊ณ ์ฐจ์› ๊ณต๊ฐ„์—์„œ ์ฐจ์›๋“ค(BOW, TF-IDF)์˜ ์ตœ๊ณ ์˜ ์ผ์ฐจ ๊ฒฐํ•ฉ์„ ์ฐพ์•„๋‚ด๋ ค๋ฉด ๋ถ„๋ฅ˜๋ช…์ด๋‚˜ ๊ธฐํƒ€ ์ ์ˆ˜๋“ค์ด ๋ฏธ๋ฆฌ ๋ถ€์—ฌ๋œ ํ›ˆ๋ จ๋œ ์ž๋ฃŒ๊ฐ€ ํ•„์š”ํ•˜๋‹ค. LSA - ์ƒˆ ๋ฒกํ„ฐ ๊ณต๊ฐ„์—์„œ ๋ชจ๋“  ๋ฒกํ„ฐ๊ฐ€ ์„œ๋กœ ์ตœ๋Œ€ํ•œ ๋–จ์–ด์ง€๊ฒŒ ๋ถ€์—ฌ LDA - ๋ถ„๋ฅ˜๋“ค ์‚ฌ์ด์˜ ๊ฑฐ๋ฆฌ ์ฆ‰ ํ•œ ๋ถ„๋ฅ˜์— ์†ํ•˜๋Š” ๋ฒกํ„ฐ๋“ค์˜ ๋ฌด๊ฒŒ ์ค‘์‹ฌ๊ณผ ๋‹ค๋ฅธ ๋ถ€๋ฅ˜์— ์†ํ•˜๋Š” ๋ฒกํ„ฐ๋“ค์˜ ๋ฌด๊ฒŒ์ค‘์‹ฌ ์‚ฌ์ด์˜ ๊ฑฐ๋ฆฌ๋ฅผ ์ตœ๋Œ€ํ™” LDA๋ฅผ ์ˆ˜ํ–‰ํ•˜๋ ค๋ฉด LDA ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋ถ„๋ฅ˜๋ช…์ด ๋ถ™์€ ๊ฒฌ๋ณธ๋“ค์„ ์ œ๊ณตํ•ด์„œ ์šฐ๋ฆฌ๊ฐ€ ๋ชจํ˜•ํ™”ํ•˜๊ณ ์žํ•˜๋Š” ์ฃผ์ œ๋ฅผ ์•Œ๋ ค์ค˜์•ผํ•œ๋‹ค. ( ์ŠคํŒธ 1 / ๋น„์ŠคํŒธ 0 ) Data Load # data load import pandas as pd from nlpia.data.loaders import get_d..
Types of VAE (Variational Autoencoder)
ยท
๐Ÿ‘พ Deep Learning
Conditional VAE: A conditional VAE feeds the label, not just the latent variables, into the decoder, so data can be generated with a specified label. Varying the two latent variables (horizontal and vertical) for each handwritten-digit image shows that the handwriting changes even for the same digit. A VAE is normally trained unsupervised, but adding a supervised element to the training lets you specify which data to reconstruct. β-VAE: β-VAE is characterized by 'disentanglement' of images, i.e., untangling entangled factors. It is an application technique that separates an image's features in the latent space. For example, for face images the first latent variable captures eye shape and the second latent variable captures face orientation. Using the latent variable for eye shape..
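To make the two variants concrete, here is a minimal sketch (names and shapes are illustrative, not from the post): the conditional VAE conditions the decoder by concatenating a one-hot label to the latent code, and β-VAE simply reweights the KL term in the VAE objective.

import tensorflow as tf

# Conditional VAE: the decoder sees the latent code AND the label,
# so you can choose which digit to generate.
def cvae_decoder_input(z, label, num_classes=10):
    return tf.concat([z, tf.one_hot(label, depth=num_classes)], axis=-1)

# beta-VAE: the standard VAE loss with the KL term scaled by beta > 1,
# which pressures the latent dimensions to disentangle.
def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    recon = tf.reduce_sum(tf.square(x - x_recon), axis=-1)
    kl = -0.5 * tf.reduce_sum(
        1.0 + log_var - tf.square(mu) - tf.exp(log_var), axis=-1)
    return tf.reduce_mean(recon + beta * kl)  # beta = 1 recovers the plain VAE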
LSA ๊ฑฐ๋ฆฌ์™€ ์œ ์‚ฌ๋„
ยท
๐Ÿ—ฃ๏ธ Natural Language Processing
LSA ์ฃผ์ œ ๋ชจํ˜•์ด ๊ณ ์ฐจ์› TF-IDF ๋ฒกํ„ฐ ๋ชจํ˜•๊ณผ ์–ด๋Š ์ •๋„๋‚˜ ์ผ์น˜ํ•˜๋Š”์ง€๋ฅผ ์œ ์‚ฌ๋„ ์ ์ˆ˜๋ฅผ ์ด์šฉํ•ด์„œ ๋น„๊ต LSA๋ฅผ ๊ฑฐ์นœ ๋ชจํ˜•(๋‹ค์ฐจ์›์„ ์ถ•์†Œ)์ด ๊ณ ์ฐจ์› ๋ฒกํ„ฐ๋“ค๊ณผ ๋น„์Šทํ•œ ์„ฑ๋Šฅ์„ ๋‚ผ ์ˆ˜ ์žˆ๋‹ค๋ฉด ์ข‹์€ ๋ชจํ˜•์ด๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ๋‹ค. ๋‘ ์ฃผ์ œ๋ฒกํ„ฐ ์‚ฌ์ด ๊ฑฐ๋ฆฌ์™€ ๊ฐ๋„์— ๋”ฐ๋ผ ์ฃผ์ œ์˜ ์˜๋ฏธ๊ฐ€ ์–ผ๋งˆ๋‚˜ ๋น„์Šทํ•œ์ง€ ์•Œ๋ ค์ค€๋‹ค. ์ข‹์€ ์ฃผ์ œ ๋ชจํ˜•์ด๋ผ๋ฉด ๋น„์Šทํ•œ ์ฃผ์ œ์˜ ๋ฌธ์„œ๋“ค์— ๋Œ€ํ•œ ๋ฒกํ„ฐ ๊ณต๊ฐ„ ์•ˆ์—์„œ ์„œ๋กœ ๊ฐ€๊นŒ์ด ์žˆ์–ด์•ผํ•œ๋‹ค. LSA๋Š” ๋ฒกํ„ฐ๋“ค ์‚ฌ์ด์˜ ํฐ ๊ฑฐ๋ฆฌ๋ฅผ ์œ ์ง€ํ•˜์ง€๋งŒ, ๊ฐ€๊นŒ์šด ๊ฑฐ๋ฆฌ๋ฅผ ํ•ญ์ƒ ์œ ์ง€ํ•˜์ง€๋Š” ์•Š๋Š”๋‹ค. ๊ทธ๋ง์€ LSA๋Š” ๋ฌธ์„œ๋“ค ์‚ฌ์ด์˜ ๊ด€๊ณ„์— ๋Œ€ํ•œ ์„ฑ๋ถ„์ด ์†Œ์‹ค ๋  ์ˆ˜์žˆ๋‹ค. LSA์˜ SVD ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์ƒˆ ์ฃผ์ œ ๋ฒกํ„ฐ ๊ณต๊ฐ„์—์„œ ๋ชจ๋“  ๋ฌธ์„œ์˜ ๋ถ„์‚ฐ์„ ์ตœ๋Œ€ํ™”ํ•˜๋Š” ๊ฒƒ์— ์ดˆ์ ์„ ๋‘์—ˆ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. ํŠน์ง• ๋ฒกํ„ฐ ์‚ฌ์ด์˜ ๊ฑฐ๋ฆฌ๋Š” NLP ํŒŒ์ดํ”„๋ผ์ธ์˜ ์„ฑ๊ณผ์— ํฐ ์˜..
[Transformer] Positional Encoding (3)
ยท
๐Ÿ‘พ Deep Learning
nlp.seas.harvard.edu/2018/04/01/attention.html#position-wise-feed-forward-networks (The Annotated Transformer) class Positiona..
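The class definition is cut off above; for reference, a minimal NumPy sketch of the sinusoidal positional encoding described in the linked Annotated Transformer post, where PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) is the cosine of the same angle:

import numpy as np

def positional_encoding(max_len, d_model):
    # Each position gets a unique pattern of sines (even dims) and
    # cosines (odd dims) at geometrically spaced frequencies.
    pos = np.arange(max_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])
    pe[:, 1::2] = np.cos(angle[:, 1::2])
    return pe  # added to the token embeddings before the first layer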
[Transformer] Multi-Head Attention (1)
ยท
๐Ÿ—ฃ๏ธ Natural Language Processing
nlp.seas.harvard.edu/2018/04/01/attention.html#position-wise-feed-forward-networks (The Annotated Transformer) class MultiHead..
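The preview cuts off at the class name; the computation at the core of each head is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal TensorFlow sketch (shapes and the mask convention are illustrative):

import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(dk)
    if mask is not None:
        scores += mask * -1e9          # push masked positions toward zero weight
    weights = tf.nn.softmax(scores, axis=-1)
    return tf.matmul(weights, v), weights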
All done
B's